London Stock Exchange Tackles System Problem
DMandPenfold writes "The London Stock Exchange has taken steps to resolve a system problem that occurred at 4.30pm Tuesday, which saw a delay to the start of the closing auction and knocked out automatic trades during a 42 second period. The problem occurred a day after the high profile launch of its new matching engine on the main equities market, based on the SUSE Linux system from Novell."
Imagine telling a trader in the 1970s that we had a 42 second outage in the stock market, it was all over the news, and a few companies probably lost hundreds of millions in revenue.
No test can ever be a 100% accurate representation of real use.
It's probably somewhere in the 0.1% mismatch where this problem occurred.
Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
This just shows that it's hard to build these highly available, low latency, massive usergroup systems. Previously there was a lot of chatter about the platforms (.NET, MSSQL 2003, etc...)
Yes. And let us not forget that a lot of that chatter came from Microsoft's PR department.
To be fair, if this was Windows you *know* the Linux fans (myself included) would be berating them for choosing such a crappy platform. And it was Windows, and we did berate them for it...
Of course, the truth is somewhere in the middle - isn't it always? The most important part of a trading computer system setup is, well, the trading computer system software. The system's biggest problems aren't going to be due to Coolwebsearch or a bluescreen, they'll be with the trading software itself. The OS doesn't really matter. Buggy software is buggy software.
That being said, Linux is just a better platform to build something like this on. Sure, you can do it with Windows and make it work, but it's just more difficult, and unnecessarily so.
I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
If I were to introduce a new trading system like that, I would probably run it in parallel with the old system for a couple of months, feeding it the same data and checking the results. I can't imagine them just doing some offline tests and then throwing the switch and hoping for the best? Surely not?
They should have just used a Mac. iStockExchange is a free download for MobileMe members.
Liffe[sic] the universe and everything.
This works for a component with one set of inputs and one set of outputs.
A trading system is essentially chaotic in the way it processes data because it gets so many inputs and their relative arrival times determine the system's behavior. You'd have to be replacing the old system with an identical new one, and then add heavy and slow synchronization to all the inputs going to both systems (so e.g. a trade A hitting the old system one microsecond before trade B also hits the new system the same way).
So yes, it comes down to running a whole lot of offline tests using real data and then bringing it online.
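To make the ordering point concrete, here is a minimal sketch of a toy matching engine. Everything in it (`run_engine`, the order tuples, the IDs) is made up for illustration and has nothing to do with the LSE's actual software. It only shows that replaying a recorded stream in the same order gives the same fills, while swapping two arrivals changes who trades.

```python
from collections import deque

def run_engine(orders):
    """Toy matching engine (hypothetical): buys fill against resting sells
    in arrival order. Each order is (order_id, side, quantity); price is
    ignored for brevity. Returns a list of (buy_id, sell_id, quantity) fills.
    """
    resting_sells = deque()
    fills = []
    for order_id, side, qty in orders:
        if side == "sell":
            resting_sells.append([order_id, qty])
            continue
        # A buy consumes resting sells, oldest first, until it is
        # filled or the book is empty. Unfilled quantity is dropped.
        while qty > 0 and resting_sells:
            sell = resting_sells[0]
            traded = min(qty, sell[1])
            fills.append((order_id, sell[0], traded))
            qty -= traded
            sell[1] -= traded
            if sell[1] == 0:
                resting_sells.popleft()
    return fills

# One seller, two buyers competing for the same 100 shares.
recorded = [("S1", "sell", 100), ("A", "buy", 100), ("B", "buy", 100)]
# Same orders, but B reaches the engine just before A.
reordered = [("S1", "sell", 100), ("B", "buy", 100), ("A", "buy", 100)]

old_system = run_engine(recorded)
new_system_same_order = run_engine(recorded)
new_system_swapped = run_engine(reordered)

print(old_system == new_system_same_order)  # True
print(old_system == new_system_swapped)     # False
```

With identical arrival order the two runs agree, which is what an offline replay of recorded data gives you. A live parallel run can't guarantee that order without the heavy synchronization described above.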
Why is this news? None of the several hour long outage calls I've been involved with were ever on the news.
One of the major exchanges in Chicago, as well as one of the bigger global banks. Not a small firm.
Ah, I see... well, maybe it would be newsworthy, but... too bad there's so much noise coming from the open outcry trading in the pit... nobody hears about an outage, and even during an outage business goes on as usual in that pit... everybody furiously shouting and gesticulating.
(just kidding)
Maybe it will help to compare the LSE trade-volume per day with the same for your workplace?
Questions raise, answers kill. Raise questions to stay alive.
Why is this news? None of the several hour long outage calls I've been involved with were ever on the news.
Simple: when a Linux system has a 42-second outage, that's news. When a Windows system has an outage lasting several hours, there is nothing new about it.
Odd that you didn't get that..
---
I've seen this too often in my 25+ professional years in IT. The system test manager produces an excellent plan, that fully simulates the anticipated workload. But it requires X testers, Y test case developers and Z machines. The program manager rejects the plan, "because he is under pressure to reduce costs." The program manager says, "The testing that the developers do should be enough." He then moves on, before the system goes into production.
The result? It always ends in tears.
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
But won't Apple keep 30% of the money that flows through it?
The system has been online since the middle of last year. Nice of you to drop by and dish out your golden nuggets of wisdom though.
Given that this was a showcase client for MS, it does not make them look good.
Given that MS was involved in developing the TradeElect system (the Windows-based one), even if the fault lies in TradeElect rather than the MS platform, it's still at least partly MS's fault.
Microsoft and Accenture developed a system that turned out not to be as good as the one developed by a small Sri Lankan company no-one had ever heard of.
Licensing costs are trivial in the context of these sorts of systems. TradeElect cost £40m, the new system was £50m - but that was to buy the company, not just the software.
The summary is even lamer. It didn't knock out any trading. What happened is that the closing auction announcement was delayed by 42 seconds. Normally any trades submitted prior to the announcement get ignored. However, in this case I'll quote from the article:
LSE traders are required to wait for the message before trading. Normally auction trade instructions sent before the message would not complete, however because of the surprise delay, on this occasion order book trades were allowed to complete later.
So, no outage, no downtime, no lost trades.
Previously there was a lot of chatter about the platforms (.NET, MSSQL 2003, etc...)
It's one thing to have a 42-second glitch on the first day a totally new system is powered up. That's perfectly normal, and had been predicted:
"Observers watching today's Linux-based launch will likely note that such a large change could bring about some teething problems, as with any technology overhaul."
It's a totally different thing to have it stop for a whole day after having been in operation for three months.
So, in conclusion, yes, it's about the platform. .NET, MSSQL 2003, etc aren't robust enough for this kind of job.
What were the consequences of the loss of the automatic trades?
Everybody won? :D
I worked on a trading system back in the early days. We hit lots of "edge" performance cases. To take full advantage of what a system offers us and to code around problems we usually have source code to look at. We didn't change it, but we had to have the access. MS would gladly give their source code to major customers, but frankly there is more expertise around Linux kernels than Windows.