London Stock Exchange Price Errors 'Emerged At Linux Launch'
DMandPenfold writes "Within the first 20 seconds of the London Stock Exchange's new matching engine going live on Monday, price data vendors began displaying incorrect prices, blank prices and wrong trading volumes, according to Computerworld UK sources. Thomson Reuters, Interactive Data and Netbuilder are among the largest data vendors, providing share prices to traders, that have been displaying pricing problems on some stocks throughout the week. Even the LSE's own data vendor, ProQuote, experienced problems. Concerns are being raised that there could be mistakenly set up connections or incorrect software interfaces at some of the large data vendors. Alternatively, there may be a data caching issue at the LSE that means data going out is not properly synchronised between different systems."
Surely they would have run test data before letting it go live. Maybe even feed it the actual data and simply not publish its results.
Google, Amazon, IBM, Redhat, ... Take your pick :-)
Turn it off and on again.
Don't forget the US DoD, The US Navy Submarine Fleet, the FAA, and Cisco:
http://www.focus.com/fyi/information-technology/50-places-linux-running-you-might-not-expect/
I don't see IBM running Watson on Windows 2008 Server because, well, you know....they want it to work.
My heart goes out to the devs "working long hours and night shifts" to suss this out.
That being said, the line that catches my eye most is: "The fact the majority of smaller vendors were fine demonstrated that those having trouble had made mistakes." In my experience, that means one of two things:
1) The devs configuring the system didn't properly account for the sheer scale of the stress on their systems.
2) The smaller vendors took the change more seriously, and being smaller and more flexible, successfully updated their systems to interact properly with the new systems.
Or, of course, both.
Did they reboot it 3 times? I've heard that works.
Oh wait this isn't windows 2008.
The amusing part of this troll is that it would be modded up to 5, instantly, if it were attacking Windows. The day when I see a pro-linux comment modded "troll" is the day that I take anything posted on Slashdot seriously. Until then, it's just self-important, self-congratulatory, mindless groupthink.
Btw, they are _upgrading_ to Linux, because the previous system (Windows and .net) failed several times, required MW of energy and was slow.
This new _upgraded_ system is many orders of magnitude faster.
Unfortunately, it appears, it wasn't properly tested... WTH? I'd expect for the system to run only 100x faster, instead of 1000x faster, during the first few weeks. Or maybe that some API from 1990 stopped working. Wrong data is just too much. Maybe the story isn't telling all the facts? ...Hmmm... Maybe the affected clients are running Windows?!!!!!!
I work on large scale air traffic control systems which run Linux and I don't envy the LSE in their task. Most of our interfaces are relatively simple and go out to organisations with a good history of validating interfaces. This trading system seems to have to interface to a lot of little offices around the place running various implementations. It's no surprise some of the interfaces weren't tested to the point where they are known to work 100%, though they may be 100% correct.
http://michaelsmith.id.au
Surely they would have run test data before letting it go live. Maybe even feed it the actual data and simply not publish its results.
Quote from the project manager:
"Dang. If only we had read Tubal-Cain's post before going live. Who would have thought to run test data through this darn thing?"
Hmmm.. So .Net was bad ? Bot a good start on Linux
You get what you pay for with Linux.
If they had gone with MS for the upgrade it would have been bulletproof, as they have a reputation to keep up.
Article is a bit misleading....
Reuters, Netbuilder et al. are quote services that run externally and fetch quotes (price/volume) from the LSE; they are *not* part of what would be considered the main trading platform. The problem isn't the actual OS (Linux) or the trading platform, which has been heralded as a big step forward from the MS solution, but rather how the resulting trade data is pushed from the exchange to the 3rd party sites and then ultimately to the end users.
botNet wab bab? Bot a bood bart on Binux
I wouldn't go that far, but I would like any Linux fanboys out there to remember this the next time they read a story about a Microsoft based system having problems. No OS is flawless.
using a solid os won't protect you from human stupidity. i thought that was a given. perhaps letting microsoft set up everything would be better for stupid people like lse. more expensive but at least no stupid errors.
meanwhile, intelligent people continue using linux to their advantage.
Wealth is the gift that keeps on giving.
I thought I'd stuffed up big time when I hooked all of our users (around 1000 users at the time) up to test market prices rather than live data (Australian Stock Exchange) over market opening one morning (our product is aimed at day traders too). My stuff up seems minor in comparison to this.
Data always seems to be the least respected item in the IT stack (TBV). Yet it's the lifeblood. Referential integrity, blah. At your peril.
The offending code was traced back to :
$stockprice = power( 2, rand()*10 + 0.0001 );
Data vendors quickly addressed the issue by cutting the LSE out entirely, and rolling their own rand() function.
Fucking stock markets... .what a joke!
-Billco, Fnarg.com
On the other hand, a moon mission interfaced a host of complex systems including the lives of the astronauts. Difference is the level of integration of QA into the project.
...We'd have heard a million nerds orgasming on Slashdot, reminding us how much Windows sucks.
As someone on the receiving end of this problem (as head of Development for one of the largest UK stockbrokers), it has been immensely frustrating trying to get an admission of blame from anyone involved in this issue. Our clients are complaining to us because elements of the data we're displaying on our site are just so out of whack they are laughable. In the end we've just had to remove those elements of data from our website, and they remain removed until the fix is confirmed working on Monday. This is after several attempts to fix the issues during the week.
It's also causing serious issues for things like our black box trading engine which is used to drive automatic execution of client orders when the price moves into range for execution. The data being bad means these systems just have to be turned off, and clients are being advised to simply delete their orders.
This is, frankly, a pretty appalling situation as the service to our clients is being impacted pretty severely.
For an exchange that is attempting to position itself as the true 'global' exchange, this is doing an immense amount of damage to its reputation.
is the concept of "high speed trading",
This horsecrap has to stop, the stock market has to be dismantled
albeit, shoes and a tie soaked with sweat as he leaps about like a trained monkey desperately trying to feign relevance.
I am curious which part of the Linux (kernel) they'll be going to blame and how Slashdot will turn it into FUD with adverts around it... xD
It's only a stock exchange. Why the fuck should they bother testing it first, right? This just goes to underline that QA does not exist in the software world.
Seven puppies were harmed during the making of this post.
As with most major issues there's bound to be a big ol' postmortem on this. As head of Dev you've probably got a unique insight into this, I'm curious as to your perspective on this, what you think the cause of failure might be? More strategic or more technical? Poor interface specification? Inability to handle queries under full load? From TFA there was supposedly 15 months of testing. So I'm curious as to why it failed, whether the testing simply wasn't realistic and/or thorough enough, or it was something that just wouldn't come up except on the live system?
I of course don't make mistakes, but what do you think would happen if for example you wiped 8.5 billion dollars off of the value of your company?
http://finance.yahoo.com/q/bc?s=NOK+Basic+Chart&t=3m
Do we have a world record here Mr. Elop?
If it didn't crash and didn't drop its network connections, Linux was doing its job.
If the application software had bugs, then the application software developers are to blame.
An attempt to rubbish Linux.
After 20 seconds they should have realised that they should have tested more, and fired the programmers for allowing such mistakes. This is nothing to do with Linux.
You'd think the LSE would have learnt from their last computer system rollout which also had massive problems. No quality control = management problem.
Take Nobody's Word For It.
Sure wish I had mod points, this man knows how to think.
- Dan.
~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
From TFA there was supposedly 15 months of testing. So I'm curious as to why it failed, whether the testing simply wasn't realistic and/or thorough enough, or it was something that just wouldn't come up except on the live system?
Well, he had 15 months to test it, but spent 14 of them on /. instead of actually testing the push feed.
http://www.linuxjournal.com/node/1000121
Windows is not ready till it breaks the LSE. Tim S. PS: Already rated a post in this thread so posting anom.
Aaaaannnd.... there it is! I was about to write a post saying that I wonder how someone would try to turn this into an apropos-of-nothing, Microsoft-bashing event.
Good luck to the developers and integrators who will undoubtedly be working 16-hour days until this is fixed.
That quoted phrase 'emerged at linux launch' is NOWHERE in the article. The article says "London Stock Exchange price data failures ‘emerged immediately at Millennium launch’". This is obviously an application issue, why is the summary sort of blaming the operating system?
WTF am I doing replying to an AC at 5 A.M on a Friday night?
Those *are* facts. The new system is better in every metric. The old system failed several times. Once it crashed and stayed down for the better part of the day.
And yet you cite not a single source with your supposed metrics or stories showing it failed "several times". There is a single downtime that one can find any mention of. Care to provide actual evidence of your claims now instead of just asserting they are facts?
If this had been a Windows story, there would be much frothing at the mouth and blame on Microsoft. Since it is Linux, the blame magically lies with the implementation.
Hypocrite much, Slashdot?
No surprise here for anyone working with them. Their market data specs have hundreds of pages while Chi-X EU's have about 20. Chi-X EU quoting and trading volume is far greater. Coincidence? I say no - people are voting with their feet and going from LSE to Chi-X EU (biggest trading platform in EU now), BATS EU, and other MTFs. Maybe the only thing keeping LSE still afloat is the fact that the FTSE 100 index is computed using LSE prices. Once that's gone, potentially as soon as MIFID II comes out, LSE is going to circle in the crapper.
I work on large scale air traffic control systems which run Linux and I don't envy the LSE in their task. Most of our interfaces are relatively simple and go out to organisations with a good history of validating interfaces. This trading system seems to have to interface to a lot of little offices around the place running various implementations. It's no surprise some of the interfaces weren't tested to the point where they are known to work 100%, though they may be 100% correct.
According to the article, the little organizations are running fine -- it's some of the big company systems that are having problems. My guess would be that their systems are so large and complex that they haven't been able to do sufficient testing, and that they weren't able to make the changes that they needed to make because the systems are all in use 24 x 7.
It's like the story of why a heart surgeon is paid more than a car mechanic. Try fixing a car engine while it's running.
If it is orders of magnitude faster, that's likely due more to better algorithms than choice of platform. The .net platform is no slouch, and I would be REALLY surprised to see anything greater than a 300% speed up if the same system was merely reimplemented under Linux/C++.
No, no OS is flawless, but these problems are nothing to do with the Linux OS in use. This is a problem with a lack of integration testing by multiple parties. In other words, it's a human OS screw up.
Don't be stupid.
I would wager that the vast majority of problems in Windows are also due to user error, yet Linux fanatics instantly claim it's the OS.
Seems to me like a few Slashdot users got into positions of authority at the London Stock Exchange. .NET & M$ Windoze has bad performance. We should rewrite with Linux and Opens Sores for Maximum 1337.
Since these were Open Sores blowhards instead of good developers they now have a stock exchange where you can't trust the numbers
They typically have decent to excellent central systems, and it feels like they're treating it seriously and spending lots of money to maintain and upgrade them.
Now at the periphery ... the interconnections, for electronic payment and so on, they're a disgusting mess. It doesn't help that some protocols are truly horrible and should have died 40 years ago, but even when they've moved into the IP era, they keep on using outdated shit and/or idiotic settings.
At very least you'd expect a large scale dry-run with simulated data that should bring these errors out before becoming live.
Maybe. Such testing only tests the type of errors you have thought of and are anticipating. I was once blessed with a competent and skilled QA department and months of pre-release testing, simulation, fuzzing, beta testing, and yet when a product goes live to millions bugs become apparent. You just can't think of and test for everything in a sufficiently complex system. The real failure of this LSE rollout may have been not running the old and new systems in parallel while looking for discrepancies in the publicly reported data.
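For what it's worth, the parallel-run idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `find_discrepancies`, the snapshot layout (symbol mapped to a price/volume pair) and the ticker symbols are all made up here, and nothing in it reflects how the LSE or any vendor actually publishes data.

```python
from decimal import Decimal


def find_discrepancies(old_feed, new_feed, price_tolerance=Decimal("0")):
    """Compare two published snapshots from systems run side by side.

    Each feed is a dict: symbol -> (price, volume). A price of None
    stands for a blank price. This layout is an assumption for the sketch.
    """
    problems = []
    for symbol in sorted(set(old_feed) | set(new_feed)):
        old = old_feed.get(symbol)
        new = new_feed.get(symbol)
        if old is None or new is None:
            # Symbol published by one system but not the other.
            problems.append((symbol, "missing", old, new))
            continue
        old_price, old_volume = old
        new_price, new_volume = new
        if old_price is None or new_price is None:
            # A blank on one side only is a discrepancy.
            if old_price != new_price:
                problems.append((symbol, "blank price", old, new))
        elif abs(new_price - old_price) > price_tolerance:
            problems.append((symbol, "price mismatch", old, new))
        if new_volume != old_volume:
            problems.append((symbol, "volume mismatch", old, new))
    return problems


if __name__ == "__main__":
    # Hypothetical symbols and values, purely for demonstration.
    old = {"AAA": (Decimal("101.5"), 1000), "BBB": (Decimal("55.0"), 200)}
    new = {"AAA": (Decimal("101.5"), 1000), "BBB": (None, 200)}
    for problem in find_discrepancies(old, new):
        print(problem)
```

The point is less the code than the process: the old system stays authoritative, the new one publishes to nobody, and every mismatch gets investigated before the cutover rather than after.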
Canadians have bigger issues with the LSE
Seriously, what other fact do you need?
Nah, everyone knows Windows is just for games.
They should have gone with OS/2
As one of the AC posters above, who was working for one of the involved companies, suggested, it's a bit rough to say that the programmers are the ones who should be sacked.
The companies involved have systems that are built through years of acquisitions and management decisions that favour short-term 'wins' over long-term stability.
The focus in such an organisation is often on shipping at all costs (there are, after all, regulatory, compliance and business obligations that MUST be met), and obviously in those situations, given a relatively constant set of team members skilled enough to do the work, the quality will suffer. The more systems added in to the mix from various acquisitions, the harder it gets to guarantee a quality outcome. Adding more people doesn't necessarily help on a project with a fixed timeline, as training the new team members distracts those who actually know what they're doing, as do the required communications/meetings to co-ordinate everybody.
A software developer working in an environment like that has far less control over the 'quality' variable than a complete outsider might expect, and I for one hope that the management wear their fair portion of blame when the time for reckoning comes.
Nobody is blaming the LSE's implementation. The LSE (and MillenniumIT) have provided a new system and interface. Many third-party vendors (Reuters, Trading Technology, Interactive Brokers, etc.) write adapter software which consumes that interface and exposes it through their own standard (but usually slower or less powerful) interfaces. It is typically cheaper and easier for traders to use these vendor interfaces to connect to a variety of exchanges, instead of forcing every trader to adopt the raw (high-performance and bandwidth-intensive) interfaces provided by the LSE. Those vendors now have bugs in their software, but the LSE is taking the heat. That's not right.
when multi-million dollar CEOs don't get their trades spot on.. or have to wait. (AAAHHH!) they fling the blame around.. from the golf course, to the call girls.. word gets around.
Some of the more important algorithms are in the kernel.
I know tobacco is bad for you, so I smoke weed with crack.
Crappy memory management, application protection and limitation, backwards compatibility issues, kitchen sink architecture...
I know tobacco is bad for you, so I smoke weed with crack.
Zero, since programmers of Linux don't get paid.
This is strictly an application server implementation issue, not a Linux one. They had more (though probably different) problems with their previous MS Server infrastructure, which is, as I understand it, the reason behind the migration to Linux. So, don't tie this fiasco to the use of Linux, but rather to inadequate testing of the new systems in a real-world load scenario. This is really difficult for large scale distributed systems such as this one. I spent a LOT of time and engineering effort to build a testing framework for a major manufacturing execution system. It took the efforts of a fair number of really talented software engineers to emulate and resolve many of the issues (race conditions, hardware/software failures, etc) before we had a dead-bang reliable system that can run a major semiconductor FAB on a 365x24 basis. You must start thinking about these things during the design phase - not after the system has been implemented.
Sometimes, real fast is almost as good as real-time.
That's the thing, Microsoft helped set up the original LSE. It was well publicised in their own get the facts campaign on why Microsoft's solutions are better than open source.
Again, back to the "orders of magnitude" claim. 2 orders of magnitude (the minimum required to be plural) indicates a 100x increase in speed. That kind of increase doesn't happen from changing operating systems.
Microsoft has had a lot of bad press lately as linux and java seem to be eating its lunch in the new world (high-performance, web-based computing, small form factor and mobile computing). I suspect these issues are true but exaggerated.
Remember that the LSE had serious problems using the MS-based system before. Remember the curious wording of the recent problems with the .NET-based system. Somehow, they tried to link this to the fact that the Linux-based system was being tested (of course it had nothing to do with that).
Windows Phone 7 is seriously buggy (many basic functions still don't work). Android (java) and IPhone (freebsd) have a lock on the smartphone market and MS's feeble attempt is too little, too late. Selling low-quality crap only works if you have a monopoly and it must be hard for them to see that they can't compete in the new emerging markets.
Even Microsoft has to provide its updates via Akamai linux-based servers. Its hotmail servers couldn't handle the load when switched to Windows Server and had to be switched back.
All high-end Web-based services (facebook, twitter, google, yahoo, etc) run on linux or freebsd systems.
Does it surprise anyone that an issue with a high-profile linux system is going to be big news for the Windows world?