London Stock Exchange Rejects .NET For Open Source
ChiefMonkeyGrinder writes "This summer, the London Stock Exchange decided to move away from its Microsoft .Net-based trading platform, TradElect. Instead, they'll be using the GNU/Linux-based MillenniumIT system. The switch is a pretty savage indictment of the costs of a complex .Net system. The GNU/Linux-based software is also faster, and offers several other major benefits. The details provide some fascinating insights into the world of very high performance — and very expensive — enterprise systems. ... [R]ather than being just any old deal that Microsoft happened to lose, this really is something of a total rout, and in an extremely demanding and high-profile sector. Enterprise wins for GNU/Linux don't come much better than this."
Why is this news? Sun/Solaris dominated the high-end financial sector for ages...any exchange/trading house/equity firm/etc that is using Windows is insane IMHO. Linux is just the most recent unix platform to show up in the sector, it's not revolutionary...
How disingenuous.
Yes, it is 2.3ms faster, but that is 0.4ms compared with 2.7ms, which makes it 6.75 *times* faster.
Sub ms latency in trading is a critical requirement for this application and .net on windows just wasn't up to the task.
As a performance expert, this doesn't surprise me. In my opinion, current .net implementations are fundamentally unsuited to hard RT.
Ian Ameline
Hehe:) For those that are interested, they still have an InfoElect case study from 2006 posted on their site, which I believe was the precursor to TradElect.
You didn't read the article did you?
It was cheaper for them to buy the WHOLE COMPANY that had built this technology, than it was to continue running/maintaining a .NET application. The .NET application was built and maintained by accenture, who can just as easily hire cheap devs in india or sri lanka as any other outsourced IT consultancy.
Also, they specifically state multiple times that the .NET solution would not scale to meet their needs, the quoted stats are 2.7ms/transaction in .NET and the linux app performs the same transaction in .4ms... So the linux system can handle 6-7 times the transactions on the same hardware...
They are talking about scaling up from 100 million transactions a day to 5-6 billion, so, yeah having to buy 6 times less hardware will probably save them some cash.
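The arithmetic behind those figures can be checked with a small sketch. It uses only the numbers quoted in this thread (2.7ms and 0.4ms per transaction, 100 million transactions a day growing to 5-6 billion). The function names are made up for illustration, and reading the latency ratio as a hardware ratio assumes per-transaction latency is the limiting cost on otherwise identical machines, which the article does not establish.

```cpp
#include <cassert>

// How many times faster the new per-transaction latency is than the old one.
inline double speedup(double old_latency_ms, double new_latency_ms) {
    assert(old_latency_ms > 0.0 && new_latency_ms > 0.0);
    return old_latency_ms / new_latency_ms;
}

// How many times larger the target daily volume is than the current one.
inline double growth(double from_per_day, double to_per_day) {
    assert(from_per_day > 0.0);
    return to_per_day / from_per_day;
}
```

With the thread's numbers, speedup(2.7, 0.4) comes to about 6.75, while growth(100e6, 5e9) and growth(100e6, 6e9) come to about 50 and 60. So even the faster platform would need far more capacity than today's; the claim is only that it needs roughly a sixth to a seventh of what the slower one would.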
If you had read the earlier articles on the TradElect fiasco, you would have known that it was basically written and designed by Microsoft itself. Accenture had a very heavy involvement in the project straight from Redmond.
So yes, this is an outright condemnation of the quality of Microsoft's products.
Mart
"I know I will be modded down for this": where's the option '-1, Asking for it'?
Uhh... I've never seen this level of RTFA and.. man this is slashdot where that is the norm.
The LSE ALREADY ENTERED A PURCHASE AGREEMENT TO BUY THE COMPANY that ALREADY BUILT A TRADING PLATFORM THAT IS BEING USED TODAY IN OTHER EXCHANGES! The deal closes in the next week or 2. The article says 95% of the "Non-Refundable" parts of the deal have already been transacted. Neither the LSE nor MillenniumIT (the Sri Lankan company that is being purchased) is walking away from this deal.
You don't spend $30 million and purchase a company if you aren't moving your software to that platform. The article states they already had a trial phase: they originally brought in 20 platforms, shortlisted 4, ran those for a period, and MillenniumIT won. They then decided to purchase the entire company. This process is much further along the road than you seem to think.
You are not accurate. The LSE bought a dev shop that ALREADY BUILT A TRADING PLATFORM, that is being used today in other exchanges. The platform in question ALREADY achieves 6 times the performance of their existing platform (built by accenture), and has MORE FEATURES.
And they are moving from an outsourced dev model to an in-house model, as they now own the devs and the software. Sure, the devs are still in Sri Lanka, but Accenture could just as easily hire people in India or Sri Lanka to get the same cost savings.
They never were with Microsoft, at least not in the Chicago Operations Center when I worked for them. They were pretty hard-core Solaris, and slowly began switching their systems to Linux.
Pie-in-the-sky is unobtainable by definition. Are you claiming that LSE won't be able to implement a trading platform with lower latency and better uptime than their current system? Or are you just claiming that LSE & MillenniumIT are being a little too optimistic in their press releases? Because the latter of those two is probably true.
You made a very generic post about pie-in-the-sky cheap outsourcing to Sri Lanka. You appeared to have little-to-no actual knowledge of the subject, since none was communicated in your post (except the mention of Sri Lanka, which was gleanable from the first comment to the article on the site where it was originally posted). You do not appear to be familiar with MillenniumIT.
You call yourself a realist... yet realistic perspective is dependent upon knowledge of the subject. It's well known that most trading platforms are faster than the piece of crap they had on the LSE... often more than 25ms faster, which means that it was faster to trade on Euronext.
But you know... whatever man... you can try to backtrack and defend your reactionary post however you want... you simply made claims that don't stack up to reality.
"Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
Yes, the list is contiguous in memory but that list is just a list of object pointers. The data is scattered around the heap just like the linked list data is scattered around the heap. Fast access to the object pointer does not yield any speed boosts. In C++ you could create an array of actual objects and then all the objects are contiguous in memory and incrementing to the next object is incrementing a pointer by sizeof(theObject). For small objects, you might be within the range of the memory cache on each increment. The managed object system most likely cannot possibly put the actual objects into contiguous memory and so you still have the cache misses when dereferencing the object pointers.
So, tell us again who understands access characteristics of linked lists and array lists better?
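The two layouts being argued about can be sketched in C++ as follows. This is only an illustration: the Quote record and all function names are made up, and it shows the memory layout rather than measuring cache behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Quote {   // hypothetical small record
    int price;
    int size;
};

std::vector<Quote> sample_quotes() {
    return {{10, 1}, {20, 2}, {30, 3}};
}

// Layout 1: the vector holds pointers; every Quote is its own heap allocation.
std::vector<std::unique_ptr<Quote>> box_all(const std::vector<Quote>& quotes) {
    std::vector<std::unique_ptr<Quote>> boxed;
    for (const Quote& q : quotes) boxed.push_back(std::unique_ptr<Quote>(new Quote(q)));
    return boxed;
}

long long sum_indirect(const std::vector<std::unique_ptr<Quote>>& quotes) {
    long long total = 0;
    for (const auto& q : quotes) {
        assert(q != nullptr);
        total += q->price;   // one extra dereference per element
    }
    return total;
}

// Layout 2: the vector holds the Quotes themselves, back to back.
long long sum_contiguous(const std::vector<Quote>& quotes) {
    long long total = 0;
    for (const Quote& q : quotes) total += q.price;
    return total;
}

// True when each element starts exactly sizeof(Quote) bytes after the previous one.
bool is_packed(const std::vector<Quote>& quotes) {
    for (std::size_t i = 0; i + 1 < quotes.size(); ++i) {
        const char* a = reinterpret_cast<const char*>(&quotes[i]);
        const char* b = reinterpret_cast<const char*>(&quotes[i + 1]);
        if (b - a != static_cast<std::ptrdiff_t>(sizeof(Quote))) return false;
    }
    return true;
}
```

Both sums give the same answer; the difference is only where the data lives. Whether the indirect layout actually costs cache misses depends on where the allocator happens to place each Quote, which this sketch does not control.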
SEC Proposes Ban on Allowing Stock Flash Orders (dated September 19th 2009)
Flash traders have direct connections to the NYSE exchange and pay large sums just for bandwidth to make sure the trades are almost real time. Goldman Sachs is a key participant in this.
That said, their trades often have no human interaction and generally are computers following trading algorithms only a block away from the exchange with a direct fiber line to the office. It would be impossible otherwise.
Some traders have been raising a stink over this, but generally the milliseconds do count.
From http://seekingalpha.com/article/150397-flash-trading-goldman-sachs-front-running-everyone-else
Of course I don't know how the LSE handles flash trading, or whether it even wants it, but I'm going to assume they need everything to be as real time as possible. You just don't hear the financial firms complaining about the disparities, simply because they have the money to set up their servers pretty much next to the exchange itself (if not in the same building).
"I am the king of the Romans, and am superior to rules of grammar!"
-Sigismund, Holy Roman Emperor (1368-1437)
If you're trying to shave run time on complex functions down to sub millisecond times, I would expect that bounds checking, type safety, and thread safety are low on your concerns.
Curiously enough, C# lets you drop both bounds checking and type safety to exact same extent as plain C, with corresponding performance gains.
It should also come as absolutely no surprise that a C++ pointer based linked list running native locally on the OS performs faster than a .Net Generics List running as CLR in the .Net run-time environment.
What do you mean by "performs faster"? Iteration? Indexing? Insertion at front? Insertion at end? Removal? This is a surprisingly vague statement...
I can bet you $1000 that System.Collections.Generic.List<int> will significantly outperform std::list<int> on indexed access on lists of significant size, for example, simply because the former is array-backed, and the latter is a doubly linked list. This is just to show how meaningless your comparison is.
Now, yes, if you write idiomatic C# code for a linked list (using GC heap allocated objects and tracked references), it will be slower than equivalent C++ code because of all the safety checks (like null checks). But, of course, you can also use C# raw pointers and structs to write exact same code you would write for a linked list in C, and that would work just as fast (since it would compile to pretty much the same native code in the end).
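The indexed-access point above can be seen from the C++ side alone: std::list does not even offer operator[], so reaching element i means walking i links. A minimal sketch, with at_index as a made-up helper name and std::vector standing in for any array-backed list:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Array-backed: element i is one address computation, O(1).
int at_index(const std::vector<int>& v, std::size_t i) {
    assert(i < v.size());
    return v[i];
}

// Doubly linked: no operator[]; std::advance follows i links, O(i).
int at_index(const std::list<int>& l, std::size_t i) {
    assert(i < l.size());
    auto it = l.begin();
    std::advance(it, static_cast<std::ptrdiff_t>(i));
    return *it;
}
```

Both overloads return the same value for the same index; only the amount of work differs, and it grows with i for the linked list. No timings are claimed here.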
Purportedly a single day of problems?
The exchange shut down during a high-volume trading session. That's not purported, that's fact. What's purported is the number of times HVTs observed execution delays on the LSE at other high-volume times... and that's one reason Euronext has been claiming increasing market share from LSE.
A generic list, even if it is array based, is going to be on the stack an array of pointers to other points of the stack and the heap.
If you use STL, then std::vector will also allocate its backing store array on heap.
On the other hand, if you use C#, you can use stackalloc to get a stack-allocated, non-GC-tracked array.
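The C++ half of that comparison can be checked directly. The sketch below (helper names are made up) tests whether a container's elements live inside the container object itself, as with std::array, or somewhere else, as with std::vector, whose elements come from its allocator, which is the heap by default. It says nothing about C#'s stackalloc, which is not shown here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// True when address p lies within the bytes occupied by obj itself.
template <typename T>
bool stored_inside(const T& obj, const void* p) {
    const char* begin = reinterpret_cast<const char*>(&obj);
    const char* end = begin + sizeof(T);
    const char* q = static_cast<const char*>(p);
    std::less<const char*> lt;   // gives a total order even for unrelated pointers
    return !lt(q, begin) && lt(q, end);
}

bool array_elements_are_inline() {
    std::array<int, 4> a = {{1, 2, 3, 4}};   // elements are part of the object (on the stack here)
    return stored_inside(a, a.data());
}

bool vector_elements_are_inline() {
    std::vector<int> v = {1, 2, 3, 4};       // the object holds bookkeeping; elements are allocated separately
    assert(!v.empty());
    return stored_inside(v, v.data());
}
```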
Managed .NET arrays (not stackalloc or unmanaged heap allocated) will still be slower because there are bounds checks for element access (though JIT can eliminate them sometimes when it sees that they can never fail).
Mutable generic collection classes are even slower, because they also have safeguards to do things like throwing an exception if you get an enumerator for a collection, then remove an item from that collection, and then try to move the enumerator (whereas in C++, doing the same thing for a vector would just render all active iterators invalid, and their use would lead to a crash at best, and silent data corruption at worst). This is achieved by storing a "version number" for a collection (just a plain int), which is incremented on every insertion/removal, and which enumerators check against every time you move them. Naturally, this increment happening on every insert also slows things down.
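The version-number safeguard described above can be imitated in C++ to show where the extra work goes. This is a sketch of the idea only, not the actual .NET implementation; VersionedList, Enumerator and detects_modification are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class VersionedList {   // hypothetical name
public:
    void add(int value) {
        items_.push_back(value);
        ++version_;          // extra write on every insertion
    }
    void remove_last() {
        assert(!items_.empty());
        items_.pop_back();
        ++version_;          // and on every removal
    }

    class Enumerator {
    public:
        explicit Enumerator(const VersionedList& list)
            : list_(list), seen_version_(list.version_), next_(0) {}

        // Moves to the next element; throws if the list changed since this
        // enumerator was created. The comparison runs on every single step.
        bool move_next() {
            if (seen_version_ != list_.version_)
                throw std::logic_error("collection was modified during enumeration");
            if (next_ >= list_.items_.size()) return false;
            current_ = list_.items_[next_++];
            return true;
        }
        int current() const { return current_; }

    private:
        const VersionedList& list_;
        int seen_version_;
        std::size_t next_;
        int current_ = 0;
    };

    Enumerator enumerate() const { return Enumerator(*this); }

private:
    std::vector<int> items_;
    int version_ = 0;
};

// Usage sketch: start enumerating, mutate the list, then try to advance.
bool detects_modification() {
    VersionedList list;
    list.add(1);
    list.add(2);
    VersionedList::Enumerator e = list.enumerate();
    e.move_next();
    list.remove_last();
    try {
        e.move_next();
    } catch (const std::logic_error&) {
        return true;     // the stale enumerator was caught
    }
    return false;
}
```

A plain std::vector iterator has neither the counter nor the check, which is why misuse there is undefined behaviour instead of an exception.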
Yes, the list is contiguous in memory but that list is just a list of object pointers ... In C++ you could create an array of actual objects and then all the objects are contiguous in memory and incrementing to the next object is incrementing a pointer by sizeof(theObject). For small objects, you might be within the range of the memory cache on each increment. The managed object system most likely cannot possibly put the actual objects into contiguous memory and so you still have the cache misses when dereferencing the object pointers.
Not necessarily - this isn't true for any primitive types like int or float, and this isn't true for any user-defined structs.
Unlike Java, C# lets you define your own types that don't have to be heap-allocated. For such types, exact same technique that you describe for C++ can also be used.
LSE isn't going to run setup.exe (sorry ./setup.so), they're going to have to do some large-scale integration work and customization to make it work with their system...?
Huh? Trading platforms are trivial applications. Send data down the wire. Commit it. Get data back. Typically, these systems have multiple servers per stock offered at the exchange, each of them acting as a market maker/auctioneer to each others (trivial, a 10KB binary can do it, VERY QUICKLY). Each of the machines buffer trading history until it can be sent to the clearing house.
There's little need to "customize" Linux. Linux already deals with the networking part just fine.
The issue is writing the software using an easily maintainable, testable, and rigorously provable language. Credit Suisse is using Haskell for this purpose, very successfully. The only real difficulty is implementing the exchange rules regarding sorting the stock orders. That's going to be a real issue in any language. Sorting large sets is always expensive (but can be done in parallel).
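To make the sorting step concrete, here is a sketch of one possible ordering rule for buy orders. The rule itself (higher price first, earlier arrival first on ties) is an assumption for illustration and is not taken from the LSE's rulebook or from the article; the BuyOrder fields and function names are likewise made up.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct BuyOrder {        // hypothetical fields
    long id;
    long price;          // in minor currency units
    long arrival_seq;    // lower means the order arrived earlier
};

// Assumed rule: a higher bid wins; at equal price the earlier order wins.
bool has_priority(const BuyOrder& a, const BuyOrder& b) {
    if (a.price != b.price) return a.price > b.price;
    return a.arrival_seq < b.arrival_seq;
}

// Returns the order ids in matching priority under the assumed rule.
std::vector<long> priority_ids(std::vector<BuyOrder> orders) {
    std::sort(orders.begin(), orders.end(), has_priority);
    assert(std::is_sorted(orders.begin(), orders.end(), has_priority));
    std::vector<long> ids;
    ids.reserve(orders.size());
    for (const BuyOrder& o : orders) ids.push_back(o.id);
    return ids;
}
```

A full sort like this is O(n log n) each time it runs; a real matching engine would more likely keep the book ordered incrementally, but that is beyond what this sketch tries to show.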
After all, I am strangely colored.
Well, someone certainly thought LSE was proof of something, why otherwise would they have bragged about it? Now that that bragging has been shown to be moot surely you can understand this modest amount of schadenfreude?
--frank[at]unternet.org
because, if you're going to write a trading platform that truly shows just how good .NET is, you'll want to get Microsoft to show you how. They wrote .NET after all, if they can't do it then no-one can.
so true.
In the end the TradElect platform cost £40m. Buying the entire MillenniumIT company cost $30m. Note the currency symbols.
This is all nice and stuff in theory. Every so often, people like to argue that code running under a VM, such as Java or C# with .NET, is "as fast" as or faster than machine-compiled code from C or C++ because of JIT and runtime optimizations and whatnot.
In case you haven't noticed, I'm not arguing that. I'm arguing that C# has all the low-level operations that C has, which allows you to write C# code on the same level of abstraction as C code. Naturally, pointer arithmetic gets compiled to the same native instructions in any language. The optimizer can improve things somewhat, and I won't argue that the .NET JIT optimizer is on par with, say, gcc, but that difference is very circumstantial, and small even in worst cases.
Unfortunately, the reality just doesn't follow the theory. In real-world benchmarks, managed code is not faster than pre-compiled machine code. Period.
I never claimed it's faster, either. It is still slower even with hand-tuning that I've mentioned, simply because you cannot kill GC entirely (though if you never allocate from managed heap, GC will simply never run).
Also, please don't drag Java into this. Java and C# are two very different languages by now, with C# having a much richer feature set, which is very much relevant to this discussion - since parts of that feature set are what enables C-like performance when needed. Furthermore, two most common runtime implementations for those languages - Microsoft .NET and Sun JVM - have radically different implementation strategies. As such, you cannot meaningfully translate your Java/JVM experience to C#/.NET, or vice versa.
On the other hand, why have Millennium and everyone else who has developed a reasonably good trading platform chosen to use something other than Windows? (Millennium originally used Solaris, btw.)
Of course the LSE is not interested in using open source, but the fact is that an open source OS was the best solution they could find.
I used to work for Millennium. I have already blogged about what I thought of the deal.
"It doesn't take a rocket scientist to work out that a GC-based, VM-based language that has layers of intermediate execution is going to be slower than is required for a trading system."
Actually, this is only true in an ever decreasing set of circumstances.
See here for an explanation of some of the common reasons why this is often not the case:
http://www.idiom.com/~zilla/Computer/javaCbenchmark.html
Also here are some benchmarks:
http://kano.net/javabench/
These sites are focussed on Java, but the points are applicable to .NET also as it's on par nowadays. In .NET you also get the option of using unmanaged code anyway so you can have areas that don't require the VM to underlie execution.
I'd imagine the real problem in this case was a combination of poor project management with poorly skilled developers in an attempt to make the profit margins for Microsoft and Accenture as big as possible. The net result though, as you can see, is quite bad. I do not believe for a second .NET was the problem as there is no reason it can't be used in a way that performs as well as or better than a C++ application. It would use a bit more memory to achieve that performance, but memory is cheap enough for this to not be an issue for most cases nowadays, particularly when you factor in the benefits of security and resilience you get from the managed parts of the codebase.
Get the facts, moron. Only 12% of the Linux kernel work is done by unpaid developers. Red Hat makes a lot of money and the London Stock Exchange will not suffer from these 2 crashes that Windows caused. Crashing capitalism is more or less a Microsoft thing I guess. So shut the fsck up.
Here be signatures
You can't do this in registry:
Speaking as a real-time programmer, GC and memory allocations are enormously damaging to system performance. You really do need to switch to an almost statically allocated approach, with no memory allocations in real-time execution segments. The x86 architecture has special instructions to make the use of Base Pointer, Stack Pointer, and Index Pointer based memory access usable. If you ever program on a less powerful processor, like an 8-bit PIC microcontroller, you would quickly discover that indirect memory accesses have significant timing penalties. Direct memory access, where data is at fixed locations in system memory, can be accessed in a single instruction on almost all architectures.
The second problem is that dynamic memory allocation has an unbounded maximum execution time. It can also be incredibly difficult to prove that memory does not fragment, and that the program can execute in bounded memory space. Proving finite execution time and finite resource usage is a major concern for a real-time system. In soft real-time systems, some forgiveness is tolerable. However, if you are in a language like C# and discover that one block of code is rate limiting because of memory allocation issues, how do you overcome the problem? In C/C++, you can statically allocate the memory blocks and work around the problem. In Java/C#, the issue is pretty much the end of the project.
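The static-allocation workaround mentioned for C/C++ might look like the fixed-capacity pool below. It is a sketch under assumptions: the Order record, the OrderPool name and the demo function are made up, and a real system would typically place the pool in static storage and size it from worst-case analysis.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Order {   // hypothetical record
    long id;
    long price;
    long quantity;
};

// Fixed-capacity pool: all storage is reserved up front, so acquire() and
// release() are a few array operations with a constant worst case and never
// call the general-purpose allocator.
template <std::size_t Capacity>
class OrderPool {
public:
    OrderPool() : free_count_(Capacity) {
        for (std::size_t i = 0; i < Capacity; ++i) free_[i] = &slots_[i];
    }

    // Returns nullptr when the pool is exhausted instead of growing.
    Order* acquire() {
        if (free_count_ == 0) return nullptr;
        return free_[--free_count_];
    }

    void release(Order* order) {
        assert(order >= &slots_[0] && order < &slots_[0] + Capacity);
        assert(free_count_ < Capacity);
        free_[free_count_++] = order;
    }

    std::size_t available() const { return free_count_; }

private:
    std::array<Order, Capacity> slots_;
    std::array<Order*, Capacity> free_;
    std::size_t free_count_;
};

// Usage sketch: a pool of two is exhausted by the third acquire and is
// usable again after one release.
bool pool_demo() {
    OrderPool<2> pool;
    Order* a = pool.acquire();
    Order* b = pool.acquire();
    Order* c = pool.acquire();   // pool is empty here
    if (a == nullptr || b == nullptr || c != nullptr) return false;
    pool.release(a);
    return pool.available() == 1 && pool.acquire() == a;
}
```

The design choice is that exhaustion becomes an explicit, bounded-time failure the caller must handle, instead of an allocation whose duration cannot be proven.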
Simply put, you can't have algorithms like that in programs with bounded maximum execution times. What happens if the XML file is corrupt? Excessively large? A pathological case deliberately designed to take down the London Stock Exchange? An unbounded tree based on a customer provided data file is a bug in a LSE style application.
Whenever I am looking at code blocks that need to execute quickly, the first thing I look for is blocks of code with unbounded memory, or unbounded execution times. C# encourages using these blocks of code. Real-time software requires using a small subset of available computer science techniques. Language and library support for this must be present.