The London Stock Exchange Goes Down For Whole Day
Colin Smith writes "TradElect, the Microsoft .Net based trading platform for the London Stock Exchange, was offline for about seven hours, meaning that their 5-nines SLAs are shot for approximately the next 100 years. The TradElect system was launched back in June of 2007 and was designed for increased speed and system capacity."
...now if only my wife would do that! /rimshot!
most of the american stock exchanges have been going down all year.
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 is the magic number.
Assuming 8.5 hour trading day (0700-1530) and 250 trading days/year. Maybe a squirrel caused the problem ... ;-)
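For anyone who wants to check the "100 years" figure in the summary, here is a rough sketch of the arithmetic. It assumes "5 nines" means 99.999% availability and tries two measurement bases: trading hours only (the 8.5 hours and 250 days above) and the full calendar year. Which basis the actual SLA uses is not stated anywhere in the story, so both are assumptions.

```python
# Assumption: "5 nines" means 99.999% availability. The measurement basis
# (trading hours vs. calendar hours) is not given in the story, so try both.
FIVE_NINES = 0.99999
OUTAGE_HOURS = 7.0


def allowed_downtime_hours(availability: float, hours_per_year: float) -> float:
    """Downtime budget per year, in hours, for a given availability target."""
    return (1.0 - availability) * hours_per_year


def years_to_absorb(outage_hours: float, availability: float, hours_per_year: float) -> float:
    """Years of otherwise perfect uptime needed to average back to the target."""
    return outage_hours / allowed_downtime_hours(availability, hours_per_year)


trading_hours = 8.5 * 250   # 0700-1530, 250 trading days per year
calendar_hours = 24 * 365   # round-the-clock measurement

for label, hours in (("trading hours", trading_hours), ("calendar hours", calendar_hours)):
    budget_seconds = allowed_downtime_hours(FIVE_NINES, hours) * 3600
    years = years_to_absorb(OUTAGE_HOURS, FIVE_NINES, hours)
    print(f"{label}: budget {budget_seconds:.1f} s/year, {years:.1f} years to absorb the outage")
```

On the trading-hours basis the yearly budget is a bit over a minute, so a 7 hour outage takes roughly 330 years to average out. On the calendar basis it is roughly 80 years, which is closer to the summary's "approximately 100".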
Hulk SMASH Celiac Disease
5 nines does not mean what you think it means.
So what happens when this happens again?
Ignore this signature. By order.
It was an ugly day of finger-pointing and near-fixes, but in the end, it just left all the financial firms standing there staring at the Exchange. Definitely was a big deal--and it seemed like a lot of volume spilled over to US markets, creating volume related issues here.
"Nature doesn't care how smart you are. You can still be wrong." - Richard Feynman
.... a method of controlling the market.
But Patch Tuesday is tomorrow?
Get your own free personal location tracker
Since when is 7 hours even close to "a whole day"? Maybe you meant "almost a whole business day"?
It could have been MS's fault, or it might not have been... not clear from the article.
Stupid sexy Flanders.
Looks like someone needs to brush up on their buzzwords, specifically "mission critical" and "services no longer required".
"As God is my witness, I thought turkeys could fly." A. Carlson
Please reboot me.
That's just nuts. I don't understand the rationale for that at all, in this day and age.
No different than what can happen on a unix box I suppose.
.Net is pants!
I wish people would get into the habit of linking to the single page version of the FA.
The summary implies that TradElect was responsible for the shutdown, but according to the stock exchange itself, it wasn't the case. They say instead it was a network problem.
Oh, she does... just not with you.
nudge nudge, wink wink.
Dedicated Cthulhu Cultist since 4523 BC.
is going to bring back a lost day of trading.
..people realise that Microsoft products just aren't suitable for mission critical stuff.
And why wasn't there a backup system?
Perhaps the bit you're missing is that windows isn't quite as bad as the /. crowd likes to say it is. Especially if it's an older (translation: fixed & stable) variety like win2k or even nt4.
"and was designed for increased speed and system capacity"
and see - it went down far faster and more completely than the previous system would have been able to. So that's progress. It's all in how you present it.
I guess that didn't work out so well.
The London Stock Exchange Goes Down For Whole Day (Score:0, Offtopic) ...now if only my wife would do that! /rimshot!
by Anonymous Coward on Monday September 08, @04:25PM (#24924597)
Re:The London Stock Exchange Goes Down For Whole D (Score:0, Offtopic)
by east coast (590680) on Monday September 08, @04:32PM (#24924749)
Oh, she does... just not with you.
Damn. Talk about humorless mods. I at least got a chuckle out of that.
My blog
I want a churro.
Too much enthusiasm. You need to convey a sense of exasperated, yet restrained disappointment.
The exchange insists the problem was connectivity, not the trading platform.
Not to sound overly cynical, but I'd hardly expect them to acknowledge the problem if it were the trading platform that was the issue. That'd kind of be business suicide.
So their 9.9999% uptime is screwed?
proud caffeine whore
Perhaps the bit you're missing is that windows isn't quite as bad as the /. crowd likes to say it is. Especially if it's an older (translation: fixed & stable) variety like win2k or even nt4.
I'm not sure if you're serious or not, but surely you aren't trying to compare NT4 uptime with the 5 9s of a solid System z platform?
Oh please. Persuasive marketers can get Windows installed just about anywhere including US war ships.
While it is commonly accepted by many techies (and strongly denied by others) that Microsoft Windows is not a suitable platform for that level of computing, sales people often bypass the techies who know better and sell to managers and executives who still believe "you can't get fired for using Microsoft."
With all this said, it will be quite some time (and possibly never) that we will ever know for certain what is at the root cause of the failure. You can be sure that Microsoft is all over this problem both technically and P.R.-wise. They won't let the facts get out if they are damaging. Recall the major power outage that many still believe was caused by a worm attacking Microsoft servers? As far as I can see, the true cause of that failure has yet to be revealed.
But if this was a planned event, or an unplanned disaster resulting from a planned event gone bad (updates, upgrade, other maintenance), you would think they would have provided for mishaps in some way or another.
But as this news story is all I have to go on, there is no indication of cause and so I will not presume this is a Microsoft problem. But it says a lot that NYSE runs on Linux and not Microsoft. It seems SOMEONE did listen to the techies.
No, it's not "quite as bad".
It's worse. With those old ones, it's far, far worse.
Oh, she does... just not with you.
nudge nudge, wink wink.
Your wife -- does she go?
After the malfunction, TradElect was immediately bought by the UK's government for $200 billion and all its debts waived. In an unrelated story, medicare tax was raised yet again because of an unexpected shortfall.
.....I mean, there couldn't be any other possible cause for the problem.
My goodness! You believe in God? ("Good lord") That's just nuts. I don't understand the rationale for that at all, in this day and age
To paraphrase what they used to say back in the day about IBM, nobody ever got fired for buying into Microsoft.
...are we scared yet?
Oh, she does... just not with you. nudge nudge, wink wink.
Your wife -- does she go?
More importantly, does she run?
More specifically, does she run Linux?
Ignore this signature. By order.
"Blimey!"
"Whom is the one to blame for this, for I shall kick their arse!"
"They made a bollocks of our stock exchange!"
Perhaps the bit you're missing is that windows isn't quite as bad as the /. crowd likes to say it is. Especially if it's an older (translation: fixed & stable) variety like win2k or even nt4.
A) Yes, in fact, it is quite that bad (just not as bad as when it was first released) and
B) There is no "fixed and stable" version of .NET yet. At least none I would hinge my mission critical business on.
"A person is smart. People are dumb, panicky dangerous animals and you know it." - K
Does anyone else remember the "The london stock exchange chose windows 2003 for reliability, they didn't choose linux" ad banners that used to run all over the place, including slashdot if i remember?
Funny how it's all come crashing down...
"The london stock exchange chose windows, but after 7 hours of downtime wishes they had chosen linux".
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
"5-nines SLA"
I had to look this up, so I imagine other people didn't know it either (I thought it was a stock exchange term). The first Google search result reveals the answer:
The Battle With "3 Nines" and The Goal of "5 Nines"
30 times burnt, 31st time shy.
There's only so much "But this one is actually really good" that I can accept.
Get your own free personal location tracker
bollocks
That's 'bollocks', mate. You'd know 'em if you had 'em.
Sig this!
I guess this is the response from Microsoft for the EU's finding that Microsoft has violated anti-monopoly laws in Europe. Lesson to be learned? Don't fuck with MS.
The LSE going down is a big deal. The US exchanges have been trying very hard to displace LSE's stronghold in the EUROPEAN markets. With the mergers of NYSE/Euronext and NASDAQ/OMX, this cuts market share and faith in LSE as every day passes. Additionally, with continued tech issues, NASDAQ could reinvigorate their bid for LSE again! I work for a major data vendor, and I know from experience the NYSE and NASDAQ are much more reliable than their European counterparts. Also, LSE going down today is huge, considering the news on Fannie/Freddie, WAMU, Lehman, and the WRONG news on United Airlines. Many arbitrage opportunities were lost for LSE traders.
The new platform has been designed to the highest levels of resilience with comprehensive back up, which includes dual processing at two sites and recovery from component failure within a second.
from LSE TradElect system goes live , OnWindows.com, 18 June 2007
http://www.computerweekly.com/Articles/2006/09/26/218637/city-prepares-to-test-new-trading-platform.htm
I bet the fingers are pointing today - Accenture (formerly Andersen Consulting) India vs HP vs Microsoft.
STFU Twitter you silly little cunt!!!
.NET garbage collector: "Oops, that wasn't garbage!"
'a';DROP TABLE users; SELECT * FROM DATA WHERE name LIKE '%'... if you're reading this, it didn't work.
Why is Microsoft getting dragged into this discussion? There's no mention of them in the main article, nor in the itworld.com article linked above. And yet this story gets tagged with "microsoft" and given the Bill/Borg icon just because TradElect uses .NET? I'll admit Microsoft has its share of issues, but let's reserve credit/blame for when it's actually due.
Oh, she does... just not with you.
nudge nudge, wink wink.
Your wife -- does she go?
More importantly, does she run?
More specifically, does she run Linux?
More relevantly, does she run TradElect?
Let me explain computers to you. See, the developer uses a set of platforms, languages, integration components, etc.. to deliver his functionality to the end user. A failure at any level can cause the application to fail. It could be application logic, network issues, hardware issues, integration with third party systems, a dipship systems administrator, etc...
And yet the 90-105 IQ SlashDweeb set comes out in numbers with no data and says "lolz Windoze! .NET haha!". Crikey.
Even if MS is able to make Windows good at what it is and generally reliable, what it is is not a high-SLA platform intended for mission critical systems, so there's really no excuse. I don't think NSA/CIA/DoD would say, "The security model of Windows isn't quite as bad as the /. crowd likes to say it is. Sure, we haven't reviewed it, but the IT guy says it will help us leverage synergy to effect better ROI."
AHeehehahahahahaa! Microlimp's operating system will never be "Enterprise Ready". They should have stayed running on OpenVMS and continued to enjoy flawless 10+years of uptime, instead of this pathetic "five-9's". VMS on any platform FTW!
If they only promised nine fives of reliability they'd be back up to snuff by Wednesday.
look for more info here: http://www.londonstockexchange.com/en-gb/products/membershiptrading/tradingservices/Incident/LIVE
That this story hasn't been picked up by the major networks' (CNN, Fox, MSNBC) websites yet. It's three bloody hours old.
Sig this!
What day was that? Or do you mean after the Second World War, when the UK was right royally shafted by the USA, and we have only just finished paying our debt off.
So in other words, Microsoft single-handedly brought down a London Stock Exchange? That's way too much power. FIGHT THE POWER!
"The best way to accelerate a Macintosh is at 9.8m/sec^2" -Marcus Dolengo
I want my wife. It is my ex that I want to disappear.
There's a lot of unknowns...
1. Were they running fault tolerant hardware, i.e. something like Stratus ftServers, or was it some cluster setup?
2. Was it the OS/Database/.Net app?
I'm sure all the details will come out.
Yes Francis, the world has gone crazy.
Was it back up before the bell rang, and did the Micro$oft stock plummet? In a Dutch dialect, dot-net translates into "doesn't work" ;-)
Oh, she does... just not with you. nudge nudge, wink wink.
Your wife -- does she go?
More importantly, does she run?
More specifically, does she run Linux?
More relevantly, does she run TradElect?
No, she goes down on it.
Ignore this signature. By order.
http://www.downoneveryoneorjustme.com/
(it's a parody of http://www.downforeveryoneorjustme.com/ )
I've been on projects time and time again where they keep wanting to use the latest and greatest, only to have it break a few months (if they're lucky) after deployment.
Assuming it was developed a little while ago they should have stuck to something more robust where people know the bugs and workarounds. .net 1.1 was buggy as hell, .net 2 is a lot better but I wouldn't call it mature. maybe in a few years .net will become a platform that you can write more critical systems on but not yet (and that's leaving out the windows side of things)
thank God the internet isn't a human right.
That's why, at my work, we promise 9-fives.
Windows does suck, building any mission-critical system on a fundamentally botched foundation is begging for trouble, and knowing that TradElect was built on quicksand is prima facie evidence of negligence. IOW, it probably failed because windows sucks, not the other way around.
Let me explain computers to you
Let me explain stock exchanges to you: if they go down during a trading day, a lot of people lose a lot of money. In years past, this kind of work was typically done on Tandem, Stratus or IBM systems which were so reliable that any unscheduled reboot merited a visit from the factory.
BTW, I've worked on trading systems for Salomon Brothers, Phibro Energy, JP Morgan, and UBS/Warburg. If anyone had suggested running mission-critical back-office apps (like the system of record of a major stock exchange) on windows, they would have been laughed out of the room. I'm astounded that the LSE could be so sloppy.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
Your big stick is useless. Your economy is toast. You are going down.
Now tell me again about McCain?
The exchange goes down at the same time that the US buys two huge, failing monoliths?
"Turn the bloody thing off! We'll just blame Microsoft and see how the rest of the markets shake out, shall we?"
If brevity is the soul of wit, then how does one explain Twitter?
What the hell are you smoking??!? I worked for one of the top switch/router manufacturers for 7 years and this is FAR from true in their shop and pretty much everyone else's. Talk to any technical call-center rep for the top 5 or so router manufacturers and I am sure they can tell you many horror stories...
No different than what can happen on a unix box I suppose.
Note that the current system is built around a large cluster of 2.2GHz servers, while the unix-based system it replaced (which coped perfectly happily with a substantial portion of the same traffic) ran from a smaller cluster of much slower servers.
The primary purpose for the new system, introduced less than a year ago, was to expand capacity. For it to have failed within a year due to lack of capacity basically means that it has failed in that objective.
If she's down then who will do the dishes and laundry? How do you reboot her? Does it really take 7 hours? Don't they make drugs for that?
What.. what's a wife?
I poked around a bit but there's not much public information out there. If the problem has been diagnosed no one's talking.
"The ability to delude yourself may be an important survival tool" - Jane Wagner -
President of Exchange: [Randolph Duke has just collapsed with shock] Mortimer, your brother is not well. We better call an ambulance.
Mortimer Duke: Fuck him! Now, you listen to me! I want trading reopened right now. Get those brokers back in here! Turn those machines back on!
[shouts - it echoes pathetically throughout the trading hall]
Mortimer Duke: Turn those machines back on!
"FDA staff reviewers expressed concern about the number of patients who were left out of the study because they died."
I think you need to brush up on your history.
Lets see.. if by "that day" you mean "Black Wednesday" the US did not bail the UK out, the UK simply dropped out the ERM.
It's so easy to bash Bush as you put it, because it's all he is worthy of. He has been, without a doubt your worst president ever, should we stand by and be silent whilst his crony Military Industrial Complex buddies suck the blood out of your economy by War and Legislation means. Check the balance of payments in your economy before Bush and today, Check how the US is rated for human rights, and how you appear as a nation to other nations.
No one came to "aide" the Georgians because they attacked South Ossetia (probably as a diversionary tactic so that Israel and the US can attack Iran). Trial balloon as it were, consolidate world opinion against Russia beforehand. There are 4000 Israeli military advisors in Georgia, ever wondered why?
A very dark "winder" will fall? The Russians are back? did they ever leave? (why is it always "Us" and "Them") If you aren't with us you are against us.
Socialism for America? is that really such a bad thing? America is the only wealthy western country without a national healthcare system, your private healthcare system denies basic healthcare to people without health insurance (and even some with health insurance in order to prop up the profits of private companies). Your GINI rating of 47 rates with the likes of Mozambique and Mexico. It makes me laugh how Socialism is considered a dirty word in the US.
I know I am wasting my keystrokes, You can't turn a brainwashed republican that has probably never even travelled outside the US extensively.
Oh and please stop selling us Windows, especially for running stockmarkets. Linux is far more stable, but maybe Linux is too "socialist" for you having been created by such communists.
isn't that how Microsoft does their Xbox numbers?
Why not do the same here since it'll only take a few million in advertising to get the general public to believe in the 5-nines trustworthy computing, most secure OS, blah blah blah stuff.
Is that this is good news for most of /. readers. A lot of big corporations are being reminded of how important it can be to not cut corners when hiring programmers and IT.
"The ability to delude yourself may be an important survival tool" - Jane Wagner -
Perhaps .NET is not directly at fault here.
Interestingly, the main reason .NET (specifically, C#) was being adopted in our company is that "it gives access to far cheaper programmers than your legacy C++ types."
I guess programmers are not only more expensive because they have a "legacy language" on their resume after all.
Hmm, and with MS a US-based company to boot... ;)
I'm not seriously into any conspiracy theory here, but it certainly is an interesting juxtaposition. Were I on the management team of any of the European exchanges, with the US exchanges breathing down our collective neck for our trading business, I'm not so sure I'd be happy selecting a US-based company as one of our key IT vendors. I don't know about what anyone else might think...
Cheers,
"What in the name of Fats Waller is that?"
"A four-foot prune."
No, he'd waggle his arse .
A fanny would be a vagina in Britain.
Come on +5 informative!
In what way is this post a "Troll"?
"In the past six years, there have been no production outages at the London Stock Exchange, and the new systems running on Microsoft technologies are critical to maintaining this 100 per cent reliability record."
http://www.microsoft.com/casestudies/casestudy.aspx?casestudyid=200042
"XML is like violence. If it doesn't solve your problem, use more." - Anonymous Coward
So what was wrong with the network.. rats chewing on the cables?
I don't get it, but lucky for me he doesn't either
Oh, ye of lesser cynicism. I also, long ago, used to believe that language features could improve software reliability. Nowadays the idea just makes me cackle -- in actuality the universe just invents better idiots.
- "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'
It collapsed under the weight of all the people trading in Shoe Circus shares.
It's called redundancy. Yes, a single router will fail. 2 at once? Likely not. 3? No.
The article here blames it on some sort of botched upgrade.
I sold my boss on nine 5's! Hah!
No, actually the Windows system (10 ms per transaction) was a 13x speedup over the older system (135 ms per transaction), followed quickly by an additional 50% speedup (6 ms per transaction). The Windows system was just recently updated to double performance again (3 ms per transaction), so it's now 45 times as fast as the unix-based system it replaced.
You may be able to fault it on reliability (though the old system wasn't perfect either), but you can't fault it on performance.
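A quick sanity check of the ratios quoted above, using only the per-transaction latencies given in this comment. The stage names are labels invented for the sketch, not anything from the exchange.

```python
# Per-transaction latencies quoted in the comment, in milliseconds.
# The stage names are made up for this sketch.
latency_ms = {
    "old_unix_system": 135.0,
    "windows_at_launch": 10.0,
    "first_update": 6.0,
    "latest_update": 3.0,
}


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times shorter the per-transaction latency became."""
    return before_ms / after_ms


def latency_reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage of the original latency that was removed."""
    return 100.0 * (before_ms - after_ms) / before_ms


stages = list(latency_ms.items())
for (prev_name, prev_ms), (name, ms) in zip(stages, stages[1:]):
    print(f"{prev_name} -> {name}: {speedup(prev_ms, ms):.2f}x, "
          f"{latency_reduction_pct(prev_ms, ms):.0f}% less latency")

print(f"overall: {speedup(latency_ms['old_unix_system'], latency_ms['latest_update']):.0f}x")
```

The 13x and 45x figures hold up (13.5x and 45x). The step from 10 ms to 6 ms is a 40% cut in latency, or about 1.67x, so "50% speedup" is a loose description either way.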
Socialism: a lie told by totalitarians and believed by fools.
"It could be application logic, network issues, hardware issues, integration with third party systems, a dipship systems administrator, etc"
But it wasn't any of the above. The Stock Exchange failed after a failed upgrade of the Microsoft .Net based trading platform ..
davecb5620@gmail.com
No, actually the Windows system (10 ms per transaction) was a 13x speedup over the older system (135 ms per transaction)
No. That means it was 13x quicker, not 13x faster.
Dewey, what part of this looks like authorities should be involved?
Often what I've found in performance upgrades like this: doing the same task, it is 10x faster; the marketing people, seeing that we now have this capacity, decide to pitch some new widget (mainly to automatically change a color, etc.) to management, which developers are rushed to deliver, and the color-changing widget now makes everything run 10x slower, so the net end performance result is the same (but you do have fancy colors).
Of course it is very unlikely that MS achieves five 9s on any installation, let alone as an average.
Engineering is the art of compromise.
You can talk up system z all you want, but when it comes right down to it, most outages are caused by incompetence, not hardware failure. Because of this, I've actually seen a Win2K based system beat zOS based systems a few years in a row. It frequently has little to do with the hardware, or even the OS.
"The UK's major banks and hundreds of City trading firms will begin testing the London Stock Exchange's new core trading platform early next month, ahead of its planned launch in the summer of 2007 ..
.. Tradelect .. will rely on high-speed middleware developed in-house, which was created using Microsoft's C# programming language and the .net Framework"
Accenture built the Tradelect platform in India between late 2004 and March this year
davecb5620@gmail.com
Yup. And I bet the people behind the system still got their obscene bonuses last year, and will probably get them this year as well (for "solving" a problem when they fix this and ensure it "can't happen again").
People build faulty systems, and sell them, even when they KNOW that they will fail.
I was involved in building a new infrastructure for a banking business. The infrastructure was supposed to meet certain acceptance criteria. My boss got lots of "proof of concepts" from lots of vendors (read: Suckers), to show it would meet that criteria. Guess what? The one or two that showed a glaring hole we missed? Those vendors got booted from the PoC. Their equipment returned, but none of their phone calls, and they were never mentioned again (even though they were correct and we weren't actually meeting the acceptance criteria).
Why did this happen? Because my boss didn't want to risk calling the delivery into question, and risk his big fat completion bonus (even though he was delivering a flawed system, and had even obtained proof of it).
People lie and steal and cheat, especially when there is money involved. Good thing there is no money in the stock market. ~
probably if you check the contract this will be chalked up to one of those events that isn't covered by the 5 nines..
I'm not sure I understand the distinction you're trying to draw, but total transaction capacity of the system increased along the same lines.
Socialism: a lie told by totalitarians and believed by fools.
d'oh. So it was all built on .NET 1.1, no wonder. They need to upgrade to .NET 3.5 and all will be good. promise.
6.40K transactions/second ought to be enough for everyone.
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
If only they built it with 6001 hulls!
"When I worked in academia .."
davecb5620@gmail.com
Is it "Talk like an Ass-Pirate Day" already?
No, TLAP Day is next week.
In times of universal deceit, telling the truth gets you modded -1 Troll
Mainframes, AS/400 - pricey but high reliability with 2-3 decades of baseline.
PC's / Windows - cheap, "good enough" for home use and light business use, new as hell and subject to unknown problems- and with a known history of issues too.
I use a pc daily... but there is no way I would put any 99% uptime application on it where huge amounts of money or lives were at stake. It's fine as a client.
She was like chocolate when she drank... semi-sweet at first and then increasingly bitter.
Here: http://www.londonstockexchange.com/en-gb/products/membershiptrading/tradingservices/Incident/LIVE
Notice that there were several unsuccessful attempts to bring it back up.
What's really pitiful, LSE has just a fraction of data/trade volume of major US exchanges like Nasdaq or NYSE and still, their systems are regularly getting hosed, albeit not as much as today's meltdown.
Hopefully in coming years LSE will lose market share to Nasdaq/Europe, BATS/Europe, Chi-X and other electronic markets - that should teach them well.
I'm not sure I understand the distinction you're trying to draw,
Latency versus throughput. If the new system processed those serially while the old could handle 130 in parallel, then the old system would be 10x faster even though the new was 10x quicker.
but total transaction capacity of the system increased along the same lines.
Yes, after throwing massive amounts of hardware at the problem.
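The latency versus throughput point can be put in numbers. This is only a sketch of the hypothetical above: the concurrency figures (130 in flight on the old system, strictly serial on the new one) are the thought experiment from this comment, not measurements of either system.

```python
def throughput_per_second(latency_ms: float, concurrent_transactions: int) -> float:
    """Steady-state transactions per second when `concurrent_transactions`
    are always in flight and each one takes `latency_ms` to complete."""
    return concurrent_transactions * 1000.0 / latency_ms


# Hypothetical concurrency levels from the thought experiment, not real data.
old_tps = throughput_per_second(latency_ms=135.0, concurrent_transactions=130)
new_tps = throughput_per_second(latency_ms=10.0, concurrent_transactions=1)

print(f"old: {old_tps:.0f} tx/s at 135 ms each")
print(f"new: {new_tps:.0f} tx/s at 10 ms each")
print(f"old/new throughput ratio: {old_tps / new_tps:.1f}x")
```

Under those assumptions the old system moves roughly ten times as many transactions per second even though each individual transaction takes 13.5 times longer, which is the "faster" versus "quicker" distinction.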
Dewey, what part of this looks like authorities should be involved?
but it does have a lot to do with the development. They chucked the old, stable (but "obsolete" and slow) systems for something shiny and new. In this case, Windows and .NET 1.1 written by a consultancy with Indian developers.
I doubt reliability really factored much into it, using the newest coolest stuff came first. Possibly for marketing reasons. Remember this was 4 years ago, .NET had pretty much just come out (we ignore v1.0 which was practically a preview release), you know it couldn't have been as good as the MS marketing man said it was.
Yes, only on Slashdot. While I do run sites on Linux, what makes you think that Windows is not capable for mission critical system? Just because some Slashdot flamers told you so?
Linux/Unix and Windows can both be used for mission critical systems based on my experience. The real issue is having a good architecture.
most of the american stock exchanges have been going down all year.
It may be funny but when billions of dollars and even lives are at stake, software unreliability is not that funny. Software should work like hardware and should never fail unless there is a physical breakdown. Hardware is reactive, parallel and synchronous. Likewise, software should be reactive, parallel and synchronous. Until computer scientists realize this simple truth, we will continue to have catastrophic software failures and pay the consequences. It's time to say goodbye to the antiquated computing models of the last century. It's time to put the obsolete ideas of Turing and Babbage to rest in the age of multicore processors and massive parallelism. And please, kill the damn threads already; there is a way to implement rock-solid parallelism in a computer that does not involve threads at all. More at the links below.
Why Software Is Bad and What We can do to Fix It
How to Solve the Parallel Programming Crisis
They'll have to aim for 0.99999% now, but with Microsoft's help, I'm sure they'll be able to do it!
Gddmmnt, I have vivid imagination, you just gave me a terrible nightmarish sort of a vision of the Borg bending over and waggling his naked VAGINA! How do I poke out my mental eye?
You can't handle the truth.
From the article,
"...The new technology platform has been developed using the Microsoft .NET Framework, with support from Microsoft..."
Yup, that ought to explain it...
Nothing sucks like a Vax, nothing blows like a PowerMac G4
It will be remembered, as the Ballmer Day.
7 hours? Is that all you can do? We managed a 3 day outage earlier this year at the Ho Chi Minh City stock exchange.
http://www.bloomberg.com/apps/news?pid=20601087&sid=aCTlooFV6H0Y&refer=home
>It's called redundancy. Yes, a single router will fail. 2 at once? Likely not. 3? No.
Saw triply redundant systems fail twice in my career as a net admin.
I'm willing to bet your life on the reliability of triple redundancy.
-fb Everything not expressly forbidden is now mandatory.
Well, if she runs Linux, i doubt she runs TradElect...
Oh, she does... just not with you.
nudge nudge, wink wink.
Your wife -- does she go?
More importantly, does she run?
More specifically, does she run Linux?
More relevantly, does she run TradElect?
No, she goes down on it.
Aaaaand we've come full circle. Well, at least she has. The OP is still going it alone.
That all sounds fine and good, but the reality is that the previous version (written in COBOL) did the EXACT same thing 8 years ago.
Mainframes are not immune to downtime and if they go down, they certainly don't recover as quickly as PCs do. Have you ever booted one up? Yikes! Hope you have half a day.
Anyway, it sounds like a version upgrade or a network problem hosed it, either of which was equally likely if a mainframe were at the core.
Peter predicted that you would "deliberately forget" creation 2000 years ago...
Not necessarily; if it failed due to a lack of capacity and capacity was truly added (when compared to the old system) then it succeeded perfectly (though the matter of how it failed might be another matter entirely).
Rishi Chopra
www.rishichopra.org
Oh, yes they will, they'll just fail in new and exciting ways. (Bug causing your router's CPU to hit 100%? Redundancy won't help you, 'cause once you fail over, the other one will just go up to 100%, too.)
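To put rough numbers on the redundancy argument: if router failures were independent, the chance of losing all of them at once would shrink geometrically with each added unit. A shared bug breaks that assumption. The sketch below uses a simplified beta-factor style model, where a fraction `beta` of failures are assumed to hit every unit together. The failure probability and the `beta` value are made-up illustrations, not data about any real router.

```python
def p_all_fail_independent(p_single: float, n_units: int) -> float:
    """Probability that all n units are down at once, assuming failures
    are completely independent of each other."""
    return p_single ** n_units


def p_all_fail_common_cause(p_single: float, n_units: int, beta: float) -> float:
    """Simplified beta-factor style approximation: a fraction `beta` of each
    unit's failure probability comes from a shared cause (same firmware bug,
    same bad config push) that takes out every unit together; the rest is
    treated as independent."""
    shared = beta * p_single
    independent = ((1.0 - beta) * p_single) ** n_units
    return shared + independent


# Hypothetical numbers: each router is down 1% of the time, and 10% of
# failures are assumed to be common-cause.
P_SINGLE = 0.01
BETA = 0.10

for n in (1, 2, 3):
    indep = p_all_fail_independent(P_SINGLE, n)
    common = p_all_fail_common_cause(P_SINGLE, n, BETA)
    print(f"{n} router(s): independent {indep:.2e}, with common cause {common:.2e}")
```

With these made-up inputs, the independent estimate keeps dropping by a factor of a hundred per extra router, while the common-cause estimate flattens out near `beta * p_single`. Past that point, adding more identical boxes buys very little.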
If she's down then who will do the dishes and laundry? How do you reboot her? Does it really take 7 hours? Don't they make drugs for that?
What.. what's a wife?
It's like a mother, but requires less therapy.
It goes from God, to Jerry, to me.
Say no more, say no more, know what I mean? Nod's as good as a wink to a blind bat. http://www.youtube.com/watch?v=4Kwh3R0YjuQ
I guess what they mean is that they complete a thousand trades in three seconds.
can only assume the mods aren't in the "trainline" then.
A horse can't be sick, you know, even if he wants to.
The "5 9's" of the System z platform weren't exactly meeting the needs of the NYSE (hence their switch to Linux & pSeries):
http://searchdatacenter.techtarget.com/news/article/0,289142,sid80_gci1254860,00.html
http://www.itjungle.com/big/big052008-story01.html
Though, to be fair, the NYSE also had a huge, embarrassing outage of its own in 2006 IIRC (not to mention a well-documented outage in 2001 from a software bug pushed to their mainframes) - I guess there's no such thing as 100% uptime...
Rishi Chopra
www.rishichopra.org
If the issue turned out to be network is /. going to post a retraction article? Both the summary and the comments are pretty pathetic.
I fully appreciate they have an agenda and a cause. No problem. But for such a popular site, if they're going to throw stones they should at least wait until the target is verified.
Damn traiters.
It's about the same thing when people say that "XP does not crash, it's faulty device drivers that crash".
If a system should be reliable, then it should be reliable, no excuses accepted. It does not matter if it's system bugs, application bugs, hardware failures or power outages, a system that pretends to achieve 99.999% availability should take all that into account.
The operating system is not at fault if the power goes down, of course, it's a sloppy engineer that designs a system without redundant power supply. But, likewise, a sloppy engineer will prefer a system that lets him configure and operate it by click-and-drag, instead of a carefully designed and tested set of procedures.
A critical system should NEVER depend on an operating system that does not have a proper batch language. That should be a compact and powerful script language, using TEXT files for configuration that can be hand edited if needed, that can be stored and archived in a version control system, so that bugs can be tracked.
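For scale, here is a rough sketch of what the 99.999% figure above actually allows per year, and how many years of that allowance a seven-hour outage uses up. The function names are made up for illustration, and the 8.5-hour, 250-day trading calendar is the assumption floated earlier in the thread, not an LSE number.

```python
# Hypothetical helpers: turn an availability target into a downtime budget.

ROUND_THE_CLOCK_HOURS = 365 * 24   # 8760 hours in a non-leap year
TRADING_HOURS = 8.5 * 250          # assumption: 8.5 h/day, 250 trading days/year


def downtime_budget_minutes(availability, hours_per_year):
    """Minutes of downtime per year permitted by the availability target."""
    return (1.0 - availability) * hours_per_year * 60.0


def years_of_budget_consumed(outage_hours, availability, hours_per_year):
    """How many years' worth of allowed downtime a single outage uses."""
    budget = downtime_budget_minutes(availability, hours_per_year)
    return outage_hours * 60.0 / budget


if __name__ == "__main__":
    for label, hours in (("24x7", ROUND_THE_CLOCK_HOURS),
                         ("trading hours only", TRADING_HOURS)):
        budget = downtime_budget_minutes(0.99999, hours)
        years = years_of_budget_consumed(7, 0.99999, hours)
        print(f"{label}: {budget:.2f} min/year allowed, "
              f"7 h outage = {years:.0f} years of budget")
```

Measured around the clock, five nines is a little over five minutes a year, so seven hours eats roughly eighty years of budget. Measured against trading hours only, the budget is smaller and the hole is deeper.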
.Not, crappy MS Appserver kits, etc. and all that aside ... we know the drill and we know that huge chunks of the MS runtime stack are a big heap of stinking do-do. No news here. And we also know that MS got their advertised-all-over-the-place LSE figurehead .Net environment (which as of now has turned into a major type A PR-screwup) by bribing the shit out of some LSE execs and decision makers.
But all that aside, ... seriously, WTF?
I mean, if I'd build a system like this, which, as far as I can tell is way up in or very close to the top ten of "mission-criticalness", just a few steps below 'Nuclear Power Plant' 'Air Traffic Control' 'Software in Space' and Medical Devices, I'd be super-über-f*cking-sure I can run a Steamroller over the entire live datacenter while the hot spare kicks in without missing a beat. No matter what software-tapestry some crack-smoking exec told me to build it on. I'd think they'd have MS pay the extra 250% of hardware it needs to keep this sort of thing running on .Net *and* have them help in laying out the cluster-topology and planning/paying/consulting/lobbying for the extra room, time and resources it takes to run it. To be honest, LSE on .Net dead for an entire workday would be implausible for a piece of fiction written by a MS hater. And yet it happened. Absolutely unbelievable. Even I did not expect .Net to be this bad. Apparently I was wrong.
Some high-up heads are gonna roll over this. At MS and at LSE. That's for sure.
We suffer more in our imagination than in reality. - Seneca
>It's called redundancy. Yes, a single router will fail. 2 at once? Likely not. 3? No.
Saw triply redundant systems fail twice in my career as a net admin.
I'm willing to bet your life on the reliability of triple redundancy.
was the triply redundant system running different routers from different manufacturers on completely separate power/ac/everything? Running 3 boxes from the same vendor with likely the same firmware rev and sometimes built the same day (since they were purchased/upgraded together) is not 'triply redundant'.
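The point about identical boxes can be put in numbers with a very simplified common-cause model. This is a sketch, not anything from the LSE setup: `p` is the chance a single router fails in some period, and `beta` is an assumed fraction of those failures caused by something shared (same firmware bug, same power feed) that takes out every unit at once.

```python
# Simplified "beta factor" style model of redundancy with shared failure causes.
# All numbers below are illustrative assumptions.


def all_fail_independent(p, n):
    """Probability that all n units fail when failures are fully independent."""
    return p ** n


def all_fail_with_common_cause(p, n, beta):
    """Probability that all n units fail when a fraction beta of each unit's
    failure probability comes from a cause shared by every unit."""
    common = beta * p               # one event that kills all units together
    independent = (1.0 - beta) * p  # per-unit failures that stay independent
    return common + (1.0 - common) * independent ** n


if __name__ == "__main__":
    p = 0.01
    print("independent, 3 units:", all_fail_independent(p, 3))
    print("10% common cause, 3 units:", all_fail_with_common_cause(p, 3, 0.1))
```

With these made-up inputs the shared cause dominates: three identical routers are about a thousand times more likely to all go down than the independent-failure arithmetic suggests, and adding a fourth barely changes that.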
-- the cake is a lie
I don't think that says anything. If system a runs on windows and system b runs on linux and if system a crashes, linux is better than windows? No.
You have hit the nail right on the head.
I'd like to suggest another factor: Stability-conscious developers -- those that know about race conditions, memory leaks, atomic transactions, and the like -- tend to gravitate towards operating systems that make it easy to put their ideas into practice.
That isn't to say Windows is inherently unstable, it just means that it is more difficult to write a stable and reliable application on that platform. And even if you think you got all your bases covered, you can still get blindsided by depending on poorly written code churned out by some .NET developer who was happy enough to ship something that appeared to work most of the time.
The good developers then shrug and say Windows is not suitable for critical computing, and go back to UNIX-ish platforms or whatever they are more comfortable with. Rushing into that void are legions of Windows developers who are also happy enough to ship something that appeared to work most of the time, and the cycle continues.
And then he went and picked Microsoft .NET for his system? Let me guess: he was gay, right?
Pass the popcorn !
music lover since 1969
I love to bash windows as much as the next guy, but I don't see the connection.
Developing an application on .NET does not transfer all your responsibility to Microsoft.
Modding me -1 troll doesn't make me wrong.
You find that on entry level hardware raid cards nowadays. You usually have 256M of cache, and a battery that lasts 12h without power. That makes writing a transaction as fast as writing to RAM.
I mean, that might be what they worked on, but it's kinda pointless; what's interesting is the # of transactions per second, and that can usually be improved at the expense of individual latency. For example, databases can be configured to wait a few milliseconds to group transactions, so as to write several to disk in one single write/sync.
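The trade-off described above (group commits raise throughput but cost each transaction some latency) can be sketched with a toy model. The parameter names and numbers are assumptions for illustration only, and the model ignores transactions that arrive while a sync is in progress.

```python
# Toy model of group commit: wait a few milliseconds, then write every
# transaction that arrived in that window with one sync to disk.


def commit_stats(sync_ms, group_wait_ms, arrivals_per_ms):
    """Return (transactions per second, worst-case commit latency in ms)."""
    # Transactions collected during the wait window; at least one per sync.
    batch = max(1, int(arrivals_per_ms * group_wait_ms))
    cycle_ms = group_wait_ms + sync_ms   # one wait window plus one sync
    throughput = batch * 1000 / cycle_ms
    worst_latency = cycle_ms             # first arrival waits the whole cycle
    return throughput, worst_latency


if __name__ == "__main__":
    print("no grouping:  ", commit_stats(sync_ms=5, group_wait_ms=0, arrivals_per_ms=10))
    print("5 ms grouping:", commit_stats(sync_ms=5, group_wait_ms=5, arrivals_per_ms=10))
```

In this toy setup, waiting 5 ms before each sync doubles the worst-case latency but multiplies throughput by 25, which is why a per-transaction latency figure on its own says little about capacity.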
/rimshot!
Here you go.
"No one has ever been fired for choosing Microsoft...."
I've heard this repeated over and over again throughout numerous industries, over the course of my 20 or so odd years involved with technology.
I think this age old saying is about to go the way of the dinosaurs.
Nick Illidge Financial Markets Sales Manager at Microsoft UK "We are delighted that the London Stock Exchange has selected the Windows platform to base a significant part of its business on. This is further evidence of the enterprise scalability of the Windows franchise. We see our relationship with the Exchange and Accenture as a strong partnership. The Exchange is bold in its technology vision, Accenture provides the capability to deliver this vision, and Microsoft is providing the core technology to help provide the business benefits that the Exchange is looking for."
David Lester CIO at the LSE says ... that the LSE "is the only exchange in the world not to have had a single outage in six years."
"This is all about the question, 'How are we going to take over the world?'" says Lester, "... I believe this system -- because it's fast, agile and reliable -- will help us compete better. Our current system has to go down for four hours every evening to get ready for the next day's trading," he says. "The batch processing is '80s and '90s technology. You can't run a global market with a system that has to be down for four hours."
Here's a great factoid
Before joining the Exchange in 2001, David worked for Thomson Financial and Accenture.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
I see no difference in Oracle performance on Windows vs Linux. Performance is usually more about architecture than the platform. You can make highly available networks with Windows, people do it all the time, much more often than a lot of Slashdotters seem to think.
Bad admins can ruin any deployment, bad coders can make the admins life hell trying to keep the app alive.
I haven't heard anything to make me think it was a capacity issue, I have heard that this system is an order of magnitude faster than the previous Unix system so I'm unsure of who's right since there doesn't seem to be any actual data yet. Of course one problem with making a system perform faster is that contention can become an issue if database access isn't properly handled. Not a problem with any particular DBMS.
Make no mistake, a project the size of this one is hard to do successfully, a fault in the initial deployment doesn't mean that the system can or should be scrapped.
In my particular case, we converted a mission critical VB app to asp.net C# and boy did we run into problems initially, of course it had nothing to do with the platform as it turns out most of the business rules weren't as solid as we were led to believe leading to literally hundreds of new requirements.
They should have kept the old RISCOS Acorn machines as backup. Unsure if they ever had any downtime on that system.
Here is their status page.
http://www.londonstockexchange.com/en-gb/products/membershiptrading/tradingservices/Incident/LIVE
same below, but may not render properly
Incident Updates
Time | Market Status | Exchange Action | Client Impact | Client action
6.43pm Market Closed
The Exchange regrets the earlier interruption to trading and is conducting further investigations. It is in the process of confirming all of the steps necessary to ensure trading can commence as scheduled tomorrow.
Further updates this evening will be published on this website.
Monitor this Website.
4.49pm Market Closed Closing auction has now finished. Closing prices, where relevant, have been disseminated.
4.21pm Closing Auction
This is to inform you that due to on-going connectivity issues to resume a fair and stable market the Closing Auction will commence from 16:21 onwards. The Closing Auction will uncross as scheduled at 16:35 onwards (subject to a 30 second random period).
4.00pm Continuous trading
Standard trading schedule will be followed for the remainder of the day.
3.45pm Auction
The auction will uncross at 16:00 BST (subject to a 30 second random period) at which time continuous trading will resume.
There will be no further change to the remainder of the trading day. Therefore, the Closing Auction will commence as scheduled at 16.30 and uncross at 16:35 (subject to a 30 second random period).
From this time market maker quotes in both quote and order driven markets will be firm.
Prepare to resume trading
3.30pm Auction
The International Order Book and International Bulletin Board will NOT be available for automatic execution for the rest of today.
No closing prices will be issued in these trading segments (IOB, IOBU, ITBB and ITBU) today.
The remaining Trading Segments will remain in an auction phase. A further update will be provided.
3.11pm Auction
We will be re-enabling connectivity from 3.15pm
Connectivity will be phased and following completion all order book segments will remain in an auction phase.
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
2.38pm Auction
To ensure consistent connectivity we are suspending connectivity to trading for a short period from 2.45pm
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
During this time customers are required to reset their log on connection status to ensure legitimate connections can be established once connectivity is re-enabled
2.20pm Auction
We are continuing to establish connectivity with our customers. This process is taking longer than expected.
A further update will be provided.
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
1.13pm Auction
We are continuing to establish connectivity with our customers.
A further update will be provided shortly.
Once connectivity is established orders can be entered and deleted, but no electronic execution will occur until the uncrossing and commencement of continuous trading.
12.30 Auction
We are continuing to establish connectivity with our customers.
Continuous trading will re-commence at the end of the auction period. We will provide at least 15 minutes notice of when we plan to end the
IIRC, Brazil Bovespa had a small glitch last month or two.
Back in the day when Wall Street and financial markets ran on Solaris systems (AFAIK), this shit wasn't common.
Now it's probably going to become *acceptable* for stock exchanges and aviation reservation software to crash.
Apparently, there's a new generation of a-holes in the system administration market who grew up with Windows and the Blue Screen of Death, and who think it's acceptable for operating systems to crash once in a while. Is it evolution?
Main difference between the BSD license and the GPL license: one is from California and the other is from Massachusetts
Picking Windows is a really bad way to start out.
Your aunt probably runs it, but is it really enterprise ready?
you had me at #!
n/t
Windows is just consumer junk, and not even very good consumer junk.
Kickbacks are almost certainly at work in a deployment like this.
meaning that their 5-nines SLAs are shot for approximately the next 100 years.
Bah, that's nothing. The school district I work for (very big one) bought a $50 million web-based maintenance and operations tracking software package that fails all the time. We like to joke that it has "almost two nines of availability!" The supplier of the software (whose salesmen claimed it would "do everything you need, right out of the box") says that it can be fixed for another $30 million. Your tax dollars at work!
If a job's not worth doing, it's not worth doing right.
There's all kinds of ways Microsoft is known to influence deals like this (especially high profile deals). No sane person would want their bank, Navy, doctor, hospital, airliner, public transport running Windows anywhere: But throw enough money under the table and suddenly really bad ideas look really attractive to decision makers. Little else can explain these Microsoft deployments.
i wonder what the spin would have been if it was a linux system or apple?
If you mod me down, I will become more powerful than you can imagine....
Microsoft is the posterchild; they even break the law to get those advantages.
yeah, and when linfags has fails there is always someone else to blame. had this been linfux you guys would have been screaming about india or the hardware or god knows what. get your head out of your boyfriends ass and realize that there is a lot of potential for other failure here yet.
ah, what the fuck would you know anyway. the most advanced thing you've probably ever worked on is an 802.11g home network with wireless printing. fucking sack of shit.
does anybody else hate those huge ms ads on linux.com as much as I do?
Saw triply redundant systems fail twice in my career as a net admin.
Give me the scenario. It probably involved power. Getting redundant power systems that work is such a hassle this is normally the pain point.
The London Stock Exchange did not run mainframes 8 years ago. (At least not what most people think of when they see the word "mainframes.") They ran Tandem/HP NonStop systems.
Like a disk can write that fast. =/
Well, other than the title, remember when M$ had all those ads stating "London Exchange using MS Exchange!" You know, the stupid one on computer websites showing a guy running around in the press room with the new big headline? I guess they should have stayed quiet on that one.
Is your wife into photography, eh?
Snap snap, grin grin, wink wink.....say no MORE!!
Light travels faster than sound. This is why some people appear bright until you hear them speak.........
How delightfully embarrassing to Microsoft in general and the .net platform in particular!
Wait...is .net good? Some /.ers respect and like it...
So...confused...Help!?
>Give me the scenario.
My favorite was a Network Appliance 760 Filer having three WAFL drives fail.
It did not involve power or cooling. The only time we ever lost power was during the Northridge quake.
-fb Everything not expressly forbidden is now mandatory.
Nope, not Twitter for once. Big chunks of the Baton Rouge area are still without power after Gustav. And if Ike turns north, it may be longer before we hear from him again.
I mean I don't wish ill on the guy, and I hope he stays safe through these hurricanes, but goshdarnit, Slashdot is a little better place without him.
Maybe he meant rim-job...
There are two rules for success:
1. Never tell everything you know.
Oooooh, some absurdly rich people didn't get any richer today... ohnoes!
en tee.
Hail Eris, full of mischief...
E pluribus sanguinem
Ok, so this means that the next time a mission-critical Linux-based app goes kaput, I'll see a headline on slashdot that reads, "Linux-based system goes kaput." Right? Oh, I forgot, nothing ever goes wrong in Linux world. All Linux programmers are creme de la creme, all Linux administrators are top-notch engineers, and nobody ever botches an upgrade or releases a sour version. In fact, every IT mishap of the last 30 years can be directly traced to pointy-eyebrowed, mustache-twirling marketing villains who convince hapless stakeholders to install Microsoft products. And then they tie the hero to the railroad tracks, foreclose on the family farm, and steal sweet little Lulabelle away from her fiancé.
I've never seen that definition of "faster" vs "quicker" before, but latency vs throughput are words people other than you use in this way. ;)
Yes, that's the whole premise behind commodity hardware: you can throw "massive amounts" of commodity boxes at a problem, and it's still cheaper than some Sun enterprise box. Of course, it takes non-trivial software to make things scale well. Google has demonstrated that this works. This is probably why Sun is barely in business these days.
Socialism: a lie told by totalitarians and believed by fools.
But it says a lot that NYSE runs on Linux and not Microsoft. It seems SOMEONE did listen to the techies.
Not really. I work for an IT sales company. There are plenty of times when we try to sell Linux solutions to our customers and management is all for it but it gets shot down by the "techies" because they want a Windows solution because that is what they are used to.
The stock exchange only lasts for a few hours a day. 40% uptime, maximum, would be needed.
I am the richest astronaut ever to win the superbowl.
That was their first mistake. What were they thinking? You need three highly available Unix clusters with three SANs. You need three to elect a quorum. If you don't know what a quorum is you shouldn't be attempting to design a system that is supposed to deliver on a 5-nine SLA. Each geographic location should include 1 cluster and 1 SAN. All three locations networked with dark fiber. Fiber routing should be set up so that a cluster can fail over to a SAN in another location. As far as hardware is concerned, I would go with a cluster of IBM P6-570 and use an EMC Symmetrix DMX SAN at each site. .Net trading platform.. I have to laugh! Microsoft .net = 5.none SLA! .Net is only good for people who would like to create a light duty website. Under a load it breaks. The London Stock Exchange proves my point.
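For anyone wondering why the magic number is three: a site may only keep running if it can see a strict majority of all sites, so that two halves of a split network can never both carry on. A minimal sketch of that rule follows; the function names are hypothetical and real cluster software has far more to it.

```python
# Majority quorum: a group of sites may stay active only if it holds
# strictly more than half of all sites.


def has_quorum(reachable_sites, total_sites):
    """True if a group of reachable_sites holds a strict majority."""
    return reachable_sites > total_sites // 2


def survivors_after_split(group_a, group_b):
    """Which side(s) of a network split may keep running."""
    total = group_a + group_b
    return has_quorum(group_a, total), has_quorum(group_b, total)


if __name__ == "__main__":
    print("2 sites split 1/1:", survivors_after_split(1, 1))
    print("3 sites split 2/1:", survivors_after_split(2, 1))
```

With two sites, a link failure leaves each side holding exactly half, so neither may safely continue. With three, any split leaves exactly one side with a majority.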
Who the heck designed this?
What.. what's a wife?
WIFE: Specialized form of WIFI, indicating one of two stations engaged in a (semi-)permanent point-to-point link, the other station typically called HUSBAND. Unsecured transmission often leads to packet loss 9 months after initial association, resulting in long-term elevated QoS requirements. Roaming is usually forbidden by link protocol, although experiments with mesh networks have been reported. DOS attacks often lead to severed links, litigation and possibly material and financial damages.
The Hacker's Guide To The Kernel: Don't panic()!
I think you're mixing your metaphors.
Quack, quack.
Um WTF? I agreed with you RIGHT up until you started talking about things like NT4. NT 4 may have been good in its day, for hardware that could run it reasonably, but by modern standards it is neither fixed nor stable. It also is lacking a lot of APIs and system calls which means that modern software often just won't run on it.
There's no place I could be, since I've found Serenity...
It's amusing that there are 404 comments right now
Not anymore there aren't you insensitive clod.
In ten years of parallel processing research that is the first time that I've seen someone draw that distinction between faster and quicker.
It must be a .... very localised distinction that you're aware of.
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
Technically, it went down on her....
-A
Pick any two.
Win2K (and any other Windows based solutions) cut lots of corners in regards to security, so being the one that stays up the longest (I will concede this for the sake of argument, in my experience Wintel does not come even close to even Linux or Solaris) is not necessarily a great thing if you are the most vulnerable.
Ask MS people where do they store cryptographic information related to Active Directory and prepare to be entertained.
And what about security?
You make it sound like it was a complete success of a Windows implementation.
IANAL but write like a drunk one.
If he knew that in many big shops MS is beginning to be a brand with serious problems at the highest level, he would re-evaluate his position.
... about the kind of people that post here.
I never found a techie in the City of London or Canary Wharf that wasn't reading and/or posting here.
A techie rumour here has to be given some serious consideration because there are far more insiders than would be apparent at first.
No, he'd waggle his arse .
A fanny would be a vagina in Britain.
What makes you think he didn't already know that?
In Soviet Russia?
Me lost me cookie at the disco.
I should have said more.
It goes from God, to Jerry, to me.
With an imaginary pencil, duh.
BTW: Thanks for Leet Key.
Say no more.
Looks like the LSE is the one taking it in the shorts today.
Got Code?
"you can't get fired for using Microsoft."
Maybe not but you sure can get fired when your system epic fails and stops trading for a day.
Seriously though given the length of the downtime, I doubt this was caused by an operating system problem. If that were the case, you'd just reboot the affected system and hope it doesn't go down again before you can get a patch, no? Even Vista doesn't take that long to boot.
I'll match your 5 nine mainframe with my 7 nines Tandem...
Regardless.. windows (just as nearly any platform) can easily support any x nine's application. It's just a question of how well that application scales horizontally.
From a hardware perspective, it's actually cheaper, but your cost is made up in ensuring your application is distributed redundantly enough to support the 5x fault rate.
Perhaps you haven't read the typical M$ EULA, it is always the customer's configuration fault, when it is not the customer's configuration fault it is a hardware fault, when it was neither the customer or the hardware's fault, then some other software installed on the system is responsible for at least 50% of faults, naturally enough as most business PC have M$ Office installed in parallel to M$ Windows that kinda makes sense ;D. So 5 '9's still intact and with the next rewrite of the M$ EULA 10 '9s' are guaranteed 'er' promised 'er' hinted at.
Chaos - everything, everywhere, everywhen
Bah, my YTD uptime for most of my systems is 99.995 without clustering running Windows 2003. We are a java shop not a .net shop though but java isn't perfect by any stretch of the imagination.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
... Talk about a 6 with five 9's. [BUTTHEAD]uh huh huh huh[/BUTTHEAD]
a bug. But I suppose you'd suggest that your redundant routers not use the same firmware versions...but then there are bugs for firmware version X talking to version Y when occurrence Z happens. If you've never experienced such a thing, then you haven't been doing real IT for very long...or you've just been really, really lucky.
Latency IS important, especially for institutional investors or trades on a mercantile exchange. One of the most critical being arbitrage, or buying something from one person and immediately selling to another at a higher price - instant profit. You just have to be the one to spot the price differential first, and it can come down to milliseconds.
A lot of finance firms buy boxes colocated in the exchange just for the latency.
But Bill Gates is a cunt. So in reality he'd just wiggle himself.
The Windows system was just recently updated to double performance again (3 ms per transaction), so it's now 45 times as fast as the unix-based system it replaced.
That smoking hole in the ground was once a very fast rocket :-)
Seriously, if it was a Linux 2.6 system I wager it would be faster on the same hardware, and would not have crashed and burned for an entire trading day.
Have you got your LWN subscription yet?
Why the heck they were using MS Windows for this type of environment is stunning... Transactional processing which is the bulk of this type of setup is where Solaris and Linux excel. Any company that builds a system like that on .Net should be thrown out on the street.
In short.. Not to rock on Windows, but different platforms always offer different strengths..
While I agree in part, frankly when it comes to large volume transaction processing a better choice would be an OpenVMS or z/OS platform (or something similar), although these types of platforms have a clunky feel (and are of course closed source) they are designed specifically for these kind of applications.
...now if only my wife would do that! /rimshot!
Don't feel too bad. My wife won't give me a rimming either.
http://instantrimshot.com/
You get what you pay for. Any decent storage system has mirrored caches, each with their own battery, and do weekly tests of the battery automatically. Assuming it is also on a UPS, you shouldn't need to have it use the battery very long except in Katrina-like circumstances.
Anyway, you must have a small shop, because you CAN'T disable the write cache on midrange and high end storage arrays.
Based on this description, seems to me that "arbitrage" is a nice word for inserting yourself into a trade which has nothing to do with you for the purpose of bleeding both the seller and the buyer out of some profit without producing or contributing anything of value. Making it more difficult would make the actual productive parties in the trade better off, and likely help the economy as a whole.
Or, to put it even more bluntly: arbitrage, as described by you, is a nicer name for parasitism.
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Let's all use this as an example next time the boss suggests moving some business critical stuff to .NET...
I mean, it's easy to blame MS and .NET for the problem (and include me in the people that wouldn't be surprised if it was something like WGA failing :-), but SEVEN hours?
AFAIK that's plenty of time to reboot (cough) so that must have been pretty catastrophic. I have a feeling it's going to cost them more than just compensation, not being able to trade on one of the most active days must cause a whole lot of people to walk. The timing couldn't have been worse.
Insert
I wish people would get into the habit of linking to the single page version of the FA.
Oh come off it. If it were a permanent link, then you'd have a point, but that Reuters article will vanish in a few days like all good wirefeeds do. It's not really worth it for Slashdot, or anywhere else, to link to wirefeeds since they're gone quickly. So it matters little if it's the all-in-one or multi-page if it's a wirefeed. For permanent articles, however, I'm with you: all in one is easier to manage and read.
What you need is a major paper who has taken the wirefeed and put their name at the top. That will be around a few months or even years from now.
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
The word network can be used in many contexts. Here it looks like TradElect fell flat.
Danska Banken and others pushing a M$ Uber Alles agenda (and willing to bankrupt their own company to achieve it) have had similar problems and also use similar cagey, vaguely worded statements that dance around the cause. You'll get a few articles that will point out that using M$ .Not, especially Sharepoint, is a major cause. The rest will avoid the topic or be vague.
Staring at Blank Screens Is
A Guinness Book of Records
FLASH MOB?
If it's running on .Net, it isn't running on a mainframe.
a battle between software engineers to build bigger and better idiot-proof programs and God building bigger and better idiots.
So far God is winning.
Well, if you're willing to sacrifice correctness, I can offer you a system with submillisecond (even subpicosecond) performance...
Today was a heck of a reason to change that belief. I bet there are a lot of absolutely pissed-off LSE customers who totally agree...
More like .NOT. The only systems that have achieved 9 nines (yes you read that right) are systems built on Erlang. Stupid choice no doubt encouraged by many a lavish MS back hander.
meaning that their 5-nines SLAs are shot for approximately the next 100 years
No worries. They'll just buy SLA credits. Like carbon credits, but for uptime.
WTF did a moderator mark this as flamebait? The poster was right, HA is a) hard and b) expensive.
I designed some of the HA stuff many years ago for Eurex. We used OpenVMS and had two clusters (over 40Km apart) for the main and standby with the standby system also being used for development with a flick of a switch the standby cluster could take over in production. We had no SANs in those days but used Digital's Hierarchical Storage Controllers. These days it runs with SANs but the host systems still run VMS and there are now product specific clusters.
The next level down there are access points containing communications servers providing connectivity to member systems and routing to the hosts which are scattered around the globe. A member normally has connectivity to two access points. The only single point of failure for a member is where both lines come together for the last few metres into their building and some idiot digs a hole in the road.
See my journal, I write things there
The statement on the cause of the crash mentioned "connectivity", which is consistent with the Infolect message passing system. The Infolect system is based on MS .Net, SQL Server and HP ProLiant servers.
http://whitepapers.silicon.com/0,39024759,60237581p,00.htm
Infolect was also blamed for outage in November 2007
http://www.computerworlduk.com/technology/networking/messaging/news/index.cfm?newsid=6089
NYSE recently moved to Red Hat.
Did somebody do an ALT F4 on the system?..
Well, I have to laugh at your ignorance of .NET. It's perfectly suitable for a 5-nine SLA; pick on Windows or pick on the hardware they chose, but .NET as a development platform for a heavy transaction server is perfectly fine and better than picking a dynamically typed scripting language, the buggy abomination which is Java or the error prone C++.
Arbitrage is important but these days on a major market like LSE, it is less so than algorithmic trading (i.e., VWAP or TWAP). This is where you deliberately split a large order into multiple smaller orders so it does not affect the market price too much. As according to MiFID, brokers should have connectivity to multiple exchanges where you need to sniff out hidden liquidity and execute orders where the price is the best. This means really low latency and exchange proximity services. Proximity services don't really help arbitrageurs because you can only be close to one exchange.
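For anyone not in the business, the order-splitting idea described above looks roughly like this in its simplest (TWAP-style) form, with the volume-weighted average price (VWAP) as the benchmark the fills get measured against. This is only a sketch with made-up names and numbers; real execution algorithms weigh volume curves, venues and hidden liquidity.

```python
# Sketch of TWAP-style order slicing plus a VWAP calculation over the fills.


def twap_slices(total_qty, n_slices):
    """Split an order into n_slices near-equal child orders that sum exactly
    to total_qty (any remainder goes to the earliest slices)."""
    base, extra = divmod(total_qty, n_slices)
    return [base + 1 if i < extra else base for i in range(n_slices)]


def vwap(fills):
    """Volume-weighted average price of a list of (price, quantity) fills."""
    total_qty = sum(qty for _, qty in fills)
    return sum(price * qty for price, qty in fills) / total_qty


if __name__ == "__main__":
    print(twap_slices(1000, 3))                 # three child orders
    print(vwap([(10.0, 100), (11.0, 300)]))     # average price paid per share
```

Each child order is small enough not to move the price much, and the desk is then judged on how its average fill price compares with the market's VWAP over the same period.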
..."A fanny would be a vagina in Britain" ... and down under. (Australia you pervert!)
Like this? http://www.rim.jobs/
No sig for now.
I work in London as a freelancer in IT in Investment Banking. My professional experience was mostly with IT Products/Services companies.
Although I haven't worked in the LSE, from the places I've worked in around here I came out with the impression that most people in IT in this industry are amateurs (and that includes those in other geographical locations).
Any kind of more advanced IT concepts, such as technical analysis, software/hardware architecture or iterative software development processes, are pretty much either not done or done by people who don't have a clue about what they're doing.
I'm hardly surprised with what happened in the LSE.
Outsourced development.
If you don't care how reliable it is, any fool can write you a system that's twice as fast as your current one. And you'd be a fool to buy it
This is a demonstration of how Microsoft treats the most important businesses. You should not consider Microsoft products for anything requiring serious uptime; I'm surprised that an organisation with the reputation of the London Stock Exchange took such a risk.
Windows is, however, good for games - so credit where credit is due.
My ism, it's full of beliefs.
A trading system without the control of the source code? Are they mad or insane?
Why modded troll? It is possible to get high uptime figures with a lone system. You can't take it offline, but hell, I could probably run my PC for a year end to end without issue. The problem occurs when I try to scale that and make, say, 200 PCs all run for a whole year without issue.
Well, TBH, .NET only runs on Windows. Picking on Windows and picking on the .NET choice is as such the same thing.
News about the Kettle Open Source project: on my blog
was it showing blue screen of death on every single screen in LSE?
They (Accenture) also developed the first implementations of Eurex & Xetra. They worked but had to be substantially rewritten before volume ramped up.
Welcome to banking. You'll love it here.
"The outrage came at an embarrassing time..."
The Freudian slip helped make that sentence more correct than they'd anticipated.
It should have said "The outage came at an unfortunate time..." The "outrage" came a few minutes later, when they were embarrassed.
Where is the data that supports these statements?
It is not inserting into a trade, it is creating a trade when a difference in value is noticed. Arbitrage levels out prices between different exchanges, allowing people to trade on either without worrying if they would be getting a better price elsewhere. It is parasitic in the same sense that the oil pump is parasitic in a car - it doesn't add any power, but things turn much more freely with it in place, and it exacts a charge for doing so. Essentially, an arbitrageur is constantly shopping around for bargains and evening them out, meaning that ordinary traders don't need to do so.
Consciousness is an illusion caused by an excess of self consciousness.
You're right about that. Also, the smaller tick sizes and narrower spreads these days prevent as much useful arbitrage too.
I would assume, though, that if you had to choose, you would still want the lowest-latency LSE feed you can achieve as a priority over others (for a London-focused business at least). As you know, it still sees the highest trading volumes and so reflects the current market price with a slightly finer granularity. I'd want to have the lowest latency on that information that I could get.
Interestingly, since MiFID, people are doing a lot more volume on Chi-X and Xetra, and the ratio vs LSE is increasing rapidly. It's not just a case of guaranteeing best execution but also of reducing trading costs. Now there's also Turquoise, which in theory should be the cheapest exchange to do business on, so we'll see where that goes. Yesterday's incident can only play into the hands of the new kids on the block...
When I was developing on both platforms, both environments needed "rebooting" at about the same frequency. I don't manage the server farm at our company, but the linux servers tend to be down more frequently than the windows servers (and there's a lot more windows servers). Also when they go down for something other than a reboot, they tend to stay down far longer. Guess our management bypassed the "reality check" when they installed Linux? Or maybe they installed the OS particular software ran on?... hmmm
The exchange was running the whole time. You think the guys that built TradElect haven't heard of clusters? Yeah, they're probably running it on a single-node Dell box, right?
It was a connectivity outage numnut.
Your wife -- does she go?
No, she comes.
Or had someone else by them...
I'm not going to change your sheets again, Mr. Hastings.
I'd like to suggest another factor: Stability-conscious developers -- those that know about race conditions, memory leaks, atomic transactions, and the like -- tend to gravitate towards operating systems that make it easy to put their ideas into practice.
And what exactly do other systems have that Windows doesn't in those areas?
It surprises me how friends in the banking industry say how free of process their jobs are. If something is wrong, they just fix it (with electronic gaffer tape), no tickets, no process, no reviews, on the floor testing.
Scary really.
I find it unbelievable how much crap people are willing to put up with, especially over the last 18 years with all versions of Windows available.
I got burned by Windows 2.0 and 3.0 in 1990, and vowed to never use anything from Microsoft again. Although it was my first job, I did have experience with more stable things.
I only once ran Windows 95 for some time, when I got my first CD-burner (back in 1998), but that was all I ever have used it for. I first used DR-DOS, then OS/2, then finally Linux (and my wife has a Mac now).
The problem is that an issuer can't go direct to a multilateral trading facility (like Chi-X or Turquoise). They must go to a regulated market like the LSE or Xetra for a primary listing.
will you apologize?
This may require removing Bill Gates' love plug from your mouth.
It doesn't matter how much site, hardware and storage redundancy you have if your application is retarded.
Hardware failures should mean a few minutes' downtime nowadays; clusters might save you a few times with failovers, but generally put a noose around your neck when it comes to every other aspect of working on them. SRDF just means you replicate fuckups at the speed of light to your DR; BCVs just mean you have a rollback point and lose transactions; backups mean the same thing but take longer to bring online.
Point is, infrastructure protection only takes you so far - if it's your application or data that is broken, prepare for a long night, or several long days.
Um, if you set up a network with this fiber, it would no longer be dark. I'm not sure what you think dark fiber is, fiber optics that are cool, edgy and a little bit menacing I suppose.
When Argumentum ad Hominem falls short, try Argumentum ad Matrem
I have a feeling that the 'normal' IT situation was to blame for this.
Preamble: Technical Expertise provided a wonderful architecture that was HA and robust, fast, and scalable.
Bean Counters looked at the cost and said "You Tech guys spend too much money."
IT architects: "How much is your data worth?"
Bean Counters: "Not this much. Look we don't really need all of these systems. My home system has been working for 4 years with no problems. And I've talked with Microsoft Execs and they will cut us a deal for their platform. Now go away, I've just decided how the architecture will be done. Why did we hire you anyways?"
There are no loopholes. It's either legal or it's not.
The speedup is more due to new hardware than anything magic about Windows. In fact, a .net system is likely to need larger servers than the old C/COBOL thing they had before.
3ms is still quite a bit slower than the NYSE, who (I think) claim 1ms for their linux-based system. But I imagine there are other factors here, like physical distance, the precise definition of 'transaction' and whether that's the guaranteed or typical speed.
But does she blend?
Unicode in Slashdot
Good job they did not use .64K which Steve Jobs said was enough!
You need to see this particular Dilbert cartoon, which is very much like what you describe :)
http://www.dilbert.com/fast/2008-09-09/
Coz eternity my friend, is a long *ing time.
I see no difference in Oracle performance on Windows vs Linux. Performance is usually more about architecture than the platform. You can make highly available networks with Windows, people do it all the time, much more often than a lot of Slashdotters seem to think.
Sure you can. The problem is that when you see a high profile Windows deployment like this (and, in fact, a lot of smaller deployments as well), you'll frequently find that MS has paid the consultants and told them what technology to use. So you often end up with a compromised architecture that's designed to show off a fancy new feature of the latest MS server software, rather than a properly engineered architecture that's as simple as possible (but no simpler), which is the way architecture should be done.
An example I know about in more detail: my company was hired about 6 or 7 years back to fix a catastrophic web site fuckup... the existing consultants hadn't finished the site in about a year and didn't seem to be capable of doing so, so the business owner came to us with 4 weeks before it was due to launch. It was a fairly simple web system, nothing fancy, a reverse auction system that allowed suppliers to bid using a back-end system on requests for supply that customers entered in the front end. The problem? Microsoft had paid the consultants to implement the back end with ASP and Access (!) while Macromedia had paid them to implement the front end entirely in flash version 4 (that's before flash supported actionscript). Every minor change to the user interface (e.g., "no I don't want to show 10 results on that list, let's go for 15") required hours of work including negotiations backwards and forwards between the ASP programmers and the Flash designers concerning the data structures (flat by necessity, so filled with items like 'result13title' and 'result13description' and stuff like that) that were passed between the client and the server.
Needless to say we ripped it all out, reimplemented in HTML & Javascript, and they launched with a much better site. But this is a cautionary tale: if you aren't careful, you can end up with a solution paid for by somebody who's trying to sell something, rather than one that's actually designed to fit your needs. And that solution may do something particularly well that you don't need, while it utterly fails to supply something you do need. In this case, it seems reliability (I would guess LSE's top requirement) has been sacrificed for performance (probably on the list, but I doubt it was top of it).
3ms is still quite a bit slower than the NYSE, who (I think) claim 1ms for their linux-based system. But I imagine there are other factors here, like physical distance, the precise definition of 'transaction' and whether that's the guaranteed or typical speed.
I don't know about NYSE, but this system was supposed to have redundant implementations at two different locations with a 1-second failover between them. Obviously, in the event of a failure, that didn't work, but I presume they did at least implement it. This would likely have a substantial effect on transaction latency. I dunno if NYSE has a similar system or not?
one more blackhearted sentiment from a seasoned IT guy. Microsoft has no business stating "five nines" of anything other than profit. not since Xenix.
the last fortune 500 i worked for had to have the exchange cluster failed over at least twice a month, each failover costing in the neighborhood of 15 minutes (if it worked, which half the time it didnt.)
the sharepoint server cluster routinely, with all the grace of a candelabra about the skull, would try to failover and fence but managed to fence more active nodes than dead ones. brings new meaning to STONITH.
im sure Red Hat is having a good laugh over this NYSE mess.
Good people go to bed earlier.
Um, wrong - Ever heard of the Mono Project?
Mono provides the necessary software to develop and run .NET client and server applications on Linux, Solaris, Mac OS X, Windows, and Unix.
http://www.mono-project.com/
In ten years of parallel processing research, that is the first time that I've seen someone draw that distinction between faster and quicker.
It's not technical jargon, but it's one heard a fair amount in real life. Most recently, in the Olympics, we'd hear about athletes who were quick starters but who lost to faster runners after the launch.
Dewey, what part of this looks like authorities should be involved?
I love armchair admins.
Um, wrong - Ever heard of the Mono Project?
Mono provides the necessary software to develop and run .NET client and server applications on Linux, Solaris, Mac OS X, Windows, and Unix.
http://www.mono-project.com/
Glad you fell for Microsoft's marketing campaign. There is a reason they don't crush Mono. It gives an illusion that there is choice. Name me
Unfortunately, I have to agree. This is mostly true simply because of the expedience of having to fix something, or add something and roll it out so that the trader or the desk can start making money right away.
In most places that I have worked, the average lifespan of an IT manager/developer etc. working on anything more advanced than cookie-cutter Java web apps is under 3 years. Projects fail, traders skip jobs, desks are reshuffled, priorities are changed on a near-daily basis... Not the ideal environment to set up a "real" IT shop.
The LSE boo-boo looks more like bad/inadequate planning and design than anything else. I don't know if .NET is the culprit here, and I don't see how it could be. It is simply a technology platform and, from what I know about it, it does what it says it does, and it is possible to get a good estimate of how the platform will behave under "real" conditions. I find it hard to blame the technology for this fiasco; this was human failure, not failure of tech.
http://www.computerweekly.com/Articles/2006/09/26/218637/city-prepares-to-test-new-trading-platform.htm
I bet the fingers are pointing today - Accenture (formerly Andersen Consulting) India vs HP vs Microsoft.
Andersen Consulting has always been a JOKE in the consulting world. The joke about them was "When you hire Andersen Consulting, they back a school bus up to your back door." And that is pretty true. I worked with them years ago on a consulting gig. They were hired to implement data replication at a HUGE client (Cyprus Amax). They brought in all these fresh-out-of-college folks and conned the client into purchasing a half-million-dollar 12-CPU HP-UX box to do it with. They had some young girl running around with a clipboard, a pad of paper and a pen asking us what we needed the system to do. We tried to tell her WE weren't the ones who were going to be using it (we were working on a separate project), and that she should be talking to the BUSINESS people - the ones who needed it. 6 months later they went to the client and said "we can't quite figure out how to get data replication working. But this would make a nice batch processing box."
Yeah, when I hear the name Andersen, I immediately think: FAILURE! Didn't they get into hot water not long ago for a huge BOTCHED SAP implementation?
The Truth is a Virus!!!
It's scary how close to the truth parent is. I was working with our telco provider once, debugging some latency I was seeing on our dedicated Internet circuit. A BC came up to me and suggested we use a cable modem because it's cheaper and he gets fantastic download speeds at home.
In my experience, for stability with market data, VOS on Stratus seems to hold the edge over the open systems and Windows.
http://www.stratus.com
For those unfamiliar with Stratus (which is probably just about everybody) they, with now defunct (?) Tandem, have long competed for a share of the financial markets requiring fault tolerant solutions. There are a number of bourses and equity houses that still rely on Stratus. However, I know of a couple of places that are trying to replace their Stratus platforms with Linux, with mixed results.
Like mainframes, they're rock solid, but the talent pool to support these technologies is dwindling, forcing companies to make some real tough choices in their platform strategies.
Since we don't have the full story yet of what happened at LSE, I'm not going to jump to any conclusions about what failed them. Admittedly, as a Unix guy, I am hoping it turns out to be a Windows failure.
At one point around 2005 I applied for a job at LSE. They had just brought on a new CEO (whose name escapes me) and were in the process of building this system. They were about to replace a Tandem machine that distributed about 500 stock advice messages a second with a cluster of 120 (!) servers running a .NET system. That implies a load of around 4 messages per second per server.
When the architecture was described to me I remember thinking 'that's brave'. I did express an opinion that .NET wouldn't have been my first choice for something like this. Apparently the decision was driven by the new CEO wanting to modernise the LSE and offer services that could not be built on the back of the Tandem platform. It had the feel of a technical decision being driven by non-technical management and didn't inspire confidence in the LSE.
It was being built by one of the big-5 firms (Accenture IIRC), who insisted that the platform would work and provide the uptime. I would have thought 4 messages per server per second on a modern Wintel server should allow headroom for substantial spikes in volume, but evidently it didn't.
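For anyone who wants to check the back-of-envelope numbers, here is a small Python sketch. The 500 messages/second and 120 servers come from the description above; the per-server capacity figure is purely a made-up assumption to show how a headroom factor would be worked out, not a measured number for any real server.

```python
def per_server_load(total_msgs_per_sec: float, servers: int) -> float:
    """Average message rate each server sees if load is spread evenly."""
    if servers <= 0:
        raise ValueError("need at least one server")
    return total_msgs_per_sec / servers


def headroom_factor(capacity_per_server: float, total_msgs_per_sec: float, servers: int) -> float:
    """How many times the normal load the cluster could absorb in a spike."""
    if total_msgs_per_sec <= 0:
        raise ValueError("total load must be positive")
    return capacity_per_server * servers / total_msgs_per_sec


if __name__ == "__main__":
    load = per_server_load(500, 120)
    print(f"average load: {load:.2f} messages/second/server")

    # ASSUMPTION: 1000 msg/s per server is a hypothetical capacity for illustration.
    print(f"headroom at 1000 msg/s per server: {headroom_factor(1000, 500, 120):.0f}x")
```

An even spread is itself an assumption; if the messages are partitioned by instrument, a few hot servers could see far more than the average.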
No, arbitrage is a necessary component of orderly markets. A perfectly balanced market has 0 arbitrage opportunity.
Example: ETF Conversion/Redemption. The ETF is priced in real-time based on the price of its components. Fast systems are able to detect small discrepancies in the price of some of the components and the basket as a whole and are able to execute trades ( say BUY ) on the components and the contra trade ( SELL ) on the ETF and then redeem the ETF from the components to pair off the contra trade.
This arbitrage always works to keep the components perfectly in line with the ETF itself. I think that is a good thing.
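A rough Python sketch of the detection step, for the curious. Everything here is illustrative: the tickers, basket weights and cost threshold are invented, and a real system would work from live bid/ask quotes and account for fees, creation/redemption unit sizes and execution risk.

```python
from typing import Dict, Optional


def basket_value(prices: Dict[str, float], weights: Dict[str, float]) -> float:
    """Value of one ETF share's worth of underlying components."""
    return sum(prices[symbol] * weight for symbol, weight in weights.items())


def arbitrage_signal(etf_price: float,
                     prices: Dict[str, float],
                     weights: Dict[str, float],
                     cost_per_share: float) -> Optional[str]:
    """Return a trade direction if the ETF/basket gap exceeds trading costs."""
    gap = etf_price - basket_value(prices, weights)
    if gap > cost_per_share:
        # ETF is rich: buy the components, sell the ETF.
        return "BUY_COMPONENTS_SELL_ETF"
    if gap < -cost_per_share:
        # ETF is cheap: buy the ETF, sell the components.
        return "BUY_ETF_SELL_COMPONENTS"
    return None  # gap too small to cover costs


if __name__ == "__main__":
    weights = {"AAA": 2.0, "BBB": 1.0}   # hypothetical basket
    prices = {"AAA": 10.0, "BBB": 5.0}   # basket is worth 25.0 per ETF share
    for etf_price in (25.30, 25.05, 24.70):
        print(etf_price, arbitrage_signal(etf_price, prices, weights, 0.10))
```

Each trade in the "rich" or "cheap" direction pushes the two prices back together, which is the levelling effect described above.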
Well... for the guys who need to develop robust systems, don't ever dream of .NET; it's tech for idiots.
Every BANK in the world that cares about reliable, robust systems runs on JAVA with big fat SERVERS that are not INTEL made.
Microsoft + PC are made for the little company that can't invest in real computer technology... and the big point here is that JAVA and its tech are FREE to use... or mostly... So stop paying for software licenses and put that money into reliable servers.
Jourdespoir
But unlike Linux, applications run on Windows, which certainly can impact uptime. And as for it being everywhere...well, that's because everyone knows the basics of how to use it. Self-perpetuating cycle. cri mor nub.
Oh but it does... Microsoft (and the TCI) made a decision that native coders could not be trusted, so a system based on Microsoft technology has code written in a single sourced, closed source language compiler that produces intermediate code that is interpreted by closed source runtime and runs on a closed source operating system. Now any way you slice it, that reduces the scope of the parts that can be verified to the highest level source code, which can be examined and judged on the basis of, "it looked like it would work".
I don't believe that this was a technical glitch.
I think people should investigate whether or not certain parties benefited, financially or otherwise, from such a prolonged suspension of trading. I think analyzing the trades that most rapidly transitioned to alternative trading exchanges, and had the highest volumes or dollars, would be very interesting and revealing...
This news story makes me think of a fictional "singularity" themed story I read yesterday:
http://www.ssec.wisc.edu/~billh/g/mcnrsts.html
Is it happening?
Yes, that happens a lot.
The root of the problem is you have very skilled technical people trying to deal with business... a proper IT manager can and would sell the proper solution to the higher-ups.
They can still save face by claiming 6-eights up time, which is more than 5-nines.
Will this affect the adage that nobody ever got fired for picking Microsoft?
Really, it has been widely known for a couple of decades that no Microsoft system really has industrial grade reliability. If the trading system was down, then many millions if not billions were lost. Somebody should be fired for this.
Everybody knows 3 people with my name.
.Net is only good for people who would like to create a light duty website.
I beg to differ. I have been involved as an architect with several enterprise .NET applications some of which are web sites that use various types of clustering technology. In all cases, each was able to handle a high amount of traffic with a reasonable amount of downtime.
The one in particular that stands out is a highly scalable and available credit card transaction processing system. It processes millions of credit card transactions a day, utilizes a distributed computing platform that allows for active failover with little delay in transaction processing, and allows software updates to be applied without requiring the entire system to be taken down.
To develop enterprise level applications, it takes good developers and careful planning. With most languages and frameworks, the quality of the application built is related to the quality of the developers.
While you may be able to say that many of the off-the-shelf components included with .NET are not enterprise ready out of the box, you can certainly build enterprise level software with .NET. The .NET framework is very robust and reliable, it's up to the developers to use it to its full potential.
We'll make great pets
No, actually the Windows system (10 ms per transaction) was a 13x speedup over the older system (135 ms per transaction), followed quickly by an additional 50% speedup (6 ms per transaction). The Windows system was just recently updated to double performance again (3 ms per transaction), so it's now 45 times as fast as the unix-based system it replaced.
You may be able to fault it on reliability (though the old system wasn't perfect either), but you can't fault it on performance.
I hope you're kidding, because that explanation is total bullshit, nothing more!
Do you think they use the same old hardware and network?
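The ratios are easy to check with a few lines of Python. The latency figures are the ones quoted above; the two helper functions are just for this illustration. Note that "speedup" (old latency divided by new) and "latency reduction" (percentage of time saved) are different measures, which is why the 10 ms to 6 ms step can be described with more than one number.

```python
def speedup(old_ms: float, new_ms: float) -> float:
    """How many times faster the new latency is than the old one."""
    return old_ms / new_ms


def latency_reduction_pct(old_ms: float, new_ms: float) -> float:
    """Percentage of the old latency that was eliminated."""
    return 100 * (old_ms - new_ms) / old_ms


if __name__ == "__main__":
    steps = [(135, 10), (10, 6), (6, 3), (135, 3)]
    for old, new in steps:
        print(f"{old} ms -> {new} ms: "
              f"{speedup(old, new):.2f}x speedup, "
              f"{latency_reduction_pct(old, new):.1f}% less latency")
```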
Why, do you like wasting money or are you simply a masochist?
Also he said support was crucial for his company. If something went down, he wanted to be able to call someone immediately. He couldn't afford to just post a question on a message board and hope someone replies. He wanted contracts with 3rd party support that had experience with similar huge enterprise systems that he had.
RedHat is more than happy to take your cash, and offer 24/7 support.
What's with this "message board" bullshit? This guy is just a moron, or trying to rationalize the perks he gets from having his company buy MS. Yes, newsflash: those big companies pay bribes. I've seen it happen, not with MS but with a company closely related to it.
I think the unclear point is: Why can't the person you bought from, and the person you sell to just trade between themselves without someone intervening to take their cut.
If the answer is "They could, if not for some guy getting in the way" then arbitrage does seem to be an unneeded burden
If you can't see the value in jet powered ants you should turn in your nerd card. - Dunbal (464142)
Your scenario has a little problem:
"IT architects"
As far as I can tell, in IT finance around here, any experienced IT architects they might have had (there are some signs that at some point in the far past somebody actually thought about things like code reusability) have long been fired (many years ago) or maybe left in disgust.
Even if they still have any IT Architects they're either:
Around here the typical IT career path for a technical person usually ends at a Senior Developer kind of position - you need to become a manager to go above a certain pay level - and there are no advanced technical positions like Technical Analyst or Technical Architect (the closest is Business Analyst, but around here those are usually filled by people promoted from Secretarial positions).
They have no highly qualified/experienced insider technical IT people who are respected enough to even talk to the Bean Counters.
That's a big part of the problem and that's why unholy alliances like Accenture-Microsoft can push Windows and .NET as appropriate for high-availability mission critical systems - there is no qualified, respected insider technical voice that will say "This is not going to work and here is why".
They could, except that it would be a PITA. You would have to have accounts open on all the exchanges which traded the security you want to deal in, and check round every time you wanted to do a deal. And so would everybody else who wanted to do the same thing. It is a bit like supermarkets doing their own price comparisons against the opposition. It is simply more efficient for half a dozen arbitrageurs to do that kind of repeated checking around than for 5,000 insurers, pension funds and banks to do so. Arbitrageurs typically make of the order of 1% on their transactions, and it is not worth the other investors scavenging this tiny percentage. If price differences rise above tiny, the investors would certainly shop around. But, basically, if it is worth a busy investor, with other things on his mind, shopping around, it is even more worth an arbitrageur doing so. It is no more parasitic, to my mind, than a retailer buying in bulk and selling on at a higher price. The end user always has the option of buying in bulk, but usually doesn't want to.
MOST of your systems, that says it all. If you design a system, you must be a little bit more certain than that "most" of it will be up for the availability you are promising.
I've got a consumer HD that has so far lasted for 6 years, non-stop. By your logic, therefore, consumer HDs are fine for a server environment because they last for 6 years running 24/7.
If 100% of your systems run at 99.999% availability, THEN you can come back. The acceptable error rate is 0.001% FOR ALL YOUR SYSTEMS. Let's say that you have 10 2003 systems and MOST is 9; then you only got 90% availability. You fail, back to the drawing board.
It is hard to get true HA. MS can't do it, but that doesn't mean you can't use it for little projects. I have seen many a successful project launch on PCs thrown together at a local computer store and shoved into a rack. Worked fine, but only an idiot would guarantee any availability on it. Prove me wrong: put in your contracts that you guarantee HA for your clients. See how your boss/lawyer reacts.
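To put a number on how little slack 99.999% leaves, here is a short Python sketch of the downtime budget. It assumes round-the-clock operation over a 365-day year; a market that only counts trading hours would have an even smaller budget. The function name is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ASSUMPTION: 24/7 service, 365-day year


def allowed_downtime_minutes(availability: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Downtime budget implied by an availability target over a period."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_minutes


if __name__ == "__main__":
    for label, target in [("99.9%", 0.999), ("99.99%", 0.9999), ("99.999%", 0.99999)]:
        print(f"{label}: {allowed_downtime_minutes(target):.2f} minutes/year")
```

Five nines works out to a little over five minutes a year, so a single outage measured in hours uses up many years of budget.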
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
I watched NBC and ABC national TV news last night and saw absolutely no coverage of this event. There was even a mention of markets around the world with the news of the government backing of Freddie and Fannie. Anyone else notice this? This was a major event that went unnoticed in a good portion of the US.
Microsoft had a big ad campaign last year celebrating their conquest and migration of the London stock exchange to Windoze from unix.
I hope that the linux based NYSE doesn't get this problem. Bad for publicity.
See also:
http://tipotheday.com/2008/09/08/microsofts-foot-in-mouth-london-stock-exchange/
Your experience echoes mine in that it wasn't Windows' fault that the design sucked. You can create perfectly robust systems, and if you let them pay you to implement something wrong then you should be fired.
Of course I'm not referring to you specifically in my statement. Here's another example for you, since it applies to my company. HP sponsors us, so we get a lot of money from HP to buy HP equipment. Does that mean we always buy HP even when it's not appropriate? Of course not! We went with a NetApp SAN instead because it fit our needs far better than an EVA would.
The bottom line is that mismanaged projects will not meet goals while well managed projects on most platforms these days are quite capable of staying up. For instance the uptime on my NetApp SAN is 100% for the year due to clustered filers allowing me to upgrade firmware without taking the system down. The same applies to my Windows clusters, take a node down at a time, apply updates, bring it back online and repeat until the cluster is updated.
On a side note, I never understood why so many people were so hard for flash, it's rare to ever see a good use of it.
I don't agree with Microsoft's practice of paying consultants to recommend products, but it's up to the company hiring them to perform due diligence to ensure they are getting their money's worth, which should guarantee that the consultants would be recommending solutions that would actually work. Of course, in reality so much blind faith is put into consultants that it's astounding anything ever gets done at all. Many consultants have been brought in over the years I've been here, and ultimately I end up deploying everything instead.
Follow these stories a bit more closely and it is ALWAYS the network that is blamed, no matter what. It is easy: normal industry blames IT, IT blames the network.
Getting five nines is far more than just choosing an OS. After all, you've got to account for ALL hardware failures (and they will fail), power failures and even people digging up your cables. The OS barely gets a look in: when you have to account for a server being able to fail for any number of reasons, a crashed Windows hardly matters anymore. Your system should be able to work around any failure, hardware, software or human.
That is hard, very very hard.
But the fact they chose .NET is telling. If you followed this, it has been very clear MS has been throwing a lot of money around to get high-profile projects to switch to Windows from other systems. That is suspect: was .NET chosen for its capabilities or because MS bought its use?
This is setup where you want the absolute best and frankly MS doesn't, yet, have the reputation in this area. That is nothing anti-MS, if you go and ask MS for a brochure they will simply not have anything for sale here. You talk to IBM or Sun for this kinda stuff. Not MS.
If you absolutely must use Windows/.NET, you would run it as virtual servers on a cluster or mainframe, let the big guys deal with the hardware, and keep the actual program itself purely in a virtual environment where it can be transferred and kept running constantly no matter what happens in the real world.
IF this system had actual physical boxes running Windows 2003, they are insane.
The actual problem, by the way, seems to have been an update gone wrong. This is more IT speak. It means the company that supplied the network refused to take the blame this time.
We most likely will never know what the real problem was, because it is often a combination of circumstances that could have been avoided with proper management, and anyone silly enough to blame it on his boss won't be around long.
At the moment we've got a story of a high-profile project, which MS was very happy to advertise had chosen its product, failing rather miserably. To a lot of us who have had to deal with MS's less than reliable products in the past, it is nice to see MS embarrassed a bit.
Think of it like this, you might buy Ikea, even like their product and have no trouble with them. But if you made me work in a restaurant with an Ikea kitchen I would serve your still beating heart to your mother for her birthday on an Ikea platter.
MS is fine for small stuff, but you do NOT use it when you guarantee 99.999% reliability. The fact that you claim at the end that they run on .NET/Windows Server shows you know very little about the problem: HA projects don't run on Windows or, for that matter, Linux.
MMO Quests are like orgasms:
You may solo them, I prefer them in a group.
This is simple statistics here. This isolated incident is an obvious aberration, an isolated anomaly, incongruous with the result set.
As such we need to use a weighted sample variance, and discard the anomaly from the results.
Opinion:=TMyOpinion.Create(Me);
Isn't a 1 year uptime really good for Windows software?
Having to work for a living is the root of all evil.
Clippy: "It appears your trading system has crashed. Would you like to:
1. Schedule a press conference?
2. Warm up the corporate jet and head to the Bahamas?
3. Email shareholders an apology letter?
4. Sign up for premier-level support?
-- Posted from my parent's basement
Which version of .NET were you using? I've found it to be really buggy.
thank God the internet isn't a human right.
Symbionese Liberation Army?
I don't know either, but maybe someone didn't find his home system comparable to the load the London Stock Exchange supports.
Bah, my YTD uptime for most of my systems is 99.995% without clustering, running Windows 2003. We are a Java shop, not a .NET shop, though Java isn't perfect by any stretch of the imagination.
That's easy. I have machines that have been 100% uptime since November 2003, when the power went, the UPS drained, and the generators didn't start.
As systems get more complex, downtime gets longer.
A really simple system will rely on 20 components. If each of those has 99.995% uptime, that leaves you with 99.9% uptime.
A medium-complexity system will rely on 200 components; your uptime is now down to 99%.
A truly complex system will rely on 2000 components, and will be down 35 days a year.
Of course, you have a variety of "downtime". At any one time the "internet" is "down" -- out of 100,000 webservers, even with 5-9s uptime each, at least one will be down more often than not.
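The arithmetic behind those figures can be checked with a short sketch. It assumes every component must be up for the system to be up and that failures are independent, so availabilities simply multiply; the function names are made up for illustration.

```python
# Availability of a system that needs every one of its components to be up.
# Assumption: component failures are independent, so availabilities multiply.


def chain_availability(component_availability: float, components: int) -> float:
    """Availability of a chain in which all components must be up at once."""
    return component_availability ** components


def downtime_days_per_year(availability: float) -> float:
    """Expected days of downtime in a 365-day year."""
    return (1.0 - availability) * 365


for n in (20, 200, 2000):
    a = chain_availability(0.99995, n)
    print(f"{n:5d} components: {a * 100:.2f}% up, {downtime_days_per_year(a):.1f} days down per year")

# Chance that at least one of 100,000 five-nines webservers is down at a given moment.
p_any_down = 1.0 - chain_availability(0.99999, 100_000)
print(f"at least one of 100,000 servers down: {p_any_down * 100:.0f}% of the time")
```

Under these assumptions the 2000-component case comes out at roughly 90.5% availability, which is where the 35 days a year comes from.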
FWIW we had more downtime from a clustered Windows file server than we had from the temporary replacement desktop PC with a Linux software RAID-5. Much of that downtime was due to McAfee, but a fair amount (about 5 hours in a year) was due to cumulative Windows issues.
Why is Bill Gates the icon for this? Because it ran on .NET? We don't know what the problem was--it was probably programmer error.
Should we put Bjarne Stroustrup's mug up there every time a program written in C++ crashes? Of course not.
Don't get me wrong, I'm a Mac loving-M$ hating-slashdot user just like the rest of you, but I think the instant insinuation that M$ is to blame is ridiculous.
The current generation is 45 times as fast as the previous generation? Stop the presses! For this to be a valid comparison you'd need to compare against the current generation. Current generation Linux systems get sub-millisecond transaction times, according to http://searchenterpriselinux.techtarget.com/news/article/0,289142,sid39_gci1312063,00.html
"In late April, Red Hat Inc. set two new benchmarks in financial services: processing transactions in less than a millisecond with greater accuracy than previously recorded."
So Linux is faster anyway...
The point is you shouldn't be running mission-critical systems on new and shiny (it's bound to have bugs); you should be running them on old and reliable (or at least where the bugs and workarounds are well known).
That's only true when your mission is old and boring.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Home system, hardly. I run the datacenter for an S&P 500 company.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Can you provide some examples please? Perhaps specific Microsoft KB articles? I've used 1.0 all the way up to 3.5. I've never encountered any issues with the .NET CLR that would lead to random instability or crashes (GPF's, page faults, whatever). Most of the bugs (there used to be a huge laundry list on Microsoft's website) that I've seen were that parts of the built-in classes provided in the .NET Framework didn't behave as intended or in the most useful manner. In any case where Microsoft came up short with their out of the box tools, we wrote better versions of whatever it was that didn't work the way we wanted it to and we did it with... .NET (C# language specifically).
I never said the .NET Framework is perfect. Every language and its built-in toolkit has shortcomings, but to say that .NET is not suited to build anything other than a "light duty website" is a very strong statement. In .NET's case, this claim is patently false. Can you build an enterprise-level application that has high availability characteristics using only the built-in tool set Microsoft has provided with .NET? Most likely not. Can you make up for that deficiency by writing your own set of classes (isn't that what software development is about anyway?) that provide the functionality you want? Absolutely.
We'll make great pets
Glad you fell for Microsoft's marketing campaign. There is a reason they don't crush Mono. It gives an illusion that there is choice. Name me
OK, nicolas.kassis, I hereby name you, Fluffy, Viscount of New Budapest.
The site broke just this once since it was launched in September of 2005. I'd say that the Microsoft servers did a fantastic job hosting the load.
My lame blog.
we are all sloppy and careless, because we can get away with it
If you're not micromanaging those factors (memory allocation, bounds checking, type checking) yourself, it means your concerns have evolved to a level above them.
That's the whole point of abstraction.
I am writing this as a programmer who is reasonably comfortable with the following levels:
The interplay of these levels is interesting: A grammar as used in a compiler can be expressed in a declarative language, as can code transformation rules; etc.
It strikes me that the definition is probably recursive, as data itself is perhaps the highest level of all - it is self-descriptive (programs being data, of course). So all programs and data can be positioned in the hierarchy and can be refactored accordingly. In other words, an example of data or program can have a particular semantic meaning in one level (e.g. machine code), but can be re-expressed (usually more concisely) in a higher level, more expressive form.
Wow, I'm rambling, it's late.
you had me at #!
Bill Parish described it in detail.
you had me at #!
Strange things happen ...
I thought this was so funny I even blogged about it ...
Get The Facts - The Highly Reliable Times (are over)
-- "As a human being I claim the right to be widely inconsistent", John Peel
Ah that makes sense now. Language can be quite subtle at times. Thanks for the example.
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
It's "you can get rich (under the table) by choosing Microsoft."
Seriously - you think they don't have uses for that cash stockpile? Ask around. Soft bribes are all over this. Let's face it, why would you specify MS if nobody paid you a lot to go against common sense!
you had me at #!
Add options and vote for them
Compare it with Linux at the NYSE, which serves 500 million transactions per day, more than 17,000 transactions per second. And it is capable of handling much more (at spikes).
LSE is much smaller than NYSE, did you understand that?
This makes a lot more sense with the trans-exchange trading aspect pointed out. Thanks for the explanation.
If you can't see the value in jet powered ants you should turn in your nerd card. - Dunbal (464142)
A connectivity outage that lasts a whole working day? Yeah, right.
yes. software connectivity.
this wasn't a hardware failure or site failure.
Regardless, the guy's assertion that only Unix can do clustering is retarded.
And a resilient cluster doesn't protect you from software bugs. The same bugs exist in the same software installed on the other hosts.
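A toy model makes the point about shared bugs concrete. Everything here is an illustrative assumption: hardware failures are treated as independent per node, a software bug is treated as taking out every node at once because they all run the same code, and the availability numbers are invented.

```python
# Toy model: redundant nodes help with independent hardware failures only.
# Assumptions (all illustrative): hardware failures are independent per node,
# while a software bug hits every node at once because they run the same code.


def cluster_availability(node_hw: float, nodes: int, shared_sw: float) -> float:
    """Availability of a cluster that needs one healthy node and working software."""
    hw_up = 1.0 - (1.0 - node_hw) ** nodes  # at least one node's hardware is up
    return hw_up * shared_sw                # the common software must also be up


for nodes in (1, 2, 3, 4):
    a = cluster_availability(node_hw=0.999, nodes=nodes, shared_sw=0.9999)
    print(f"{nodes} node(s): {a * 100:.4f}% available")
```

In this model, adding nodes pushes the hardware term toward 100%, but the total can never exceed the availability of the software they all share.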
Hi, I am a journalist at Computer Weekly and I have been following the story of the London Stock Exchange going down. I am trying to get feedback from people affected. Do you think the stock exchange should come clean about exactly what happened technically? I wrote the story below following feedback from various people. Thanks, Karl http://www.computerweekly.com/Articles/2008/09/10/232269/stock-exchange-told-to-come-clean.htm
With .NET 1.1 I've had things like projects only running in debug mode and crashing in Release mode, and functions just not working properly that then work when you test the code block in isolation, e.g. the XML stuff replacing + with a space.
Encryption routines failing for 1 in about 1,000,000 calls.
Path.IsPathRooted failing randomly.
Most of the bugs went away when I moved everything over to .NET 2.0, but then there were a few different bugs in .NET 2.0.
I wouldn't trust anything that unstable for any real-world application; it's just too much of a pain to develop for.
thank God the internet isn't a human right.
I'll give you an example. The Java compiler cannot guarantee that a program will not raise a NullPointerException during its execution. In contrast, the Glasgow Haskell Compiler can guarantee that no null references will ever exist if the program passes compilation.
And I can "solve" a halting problem by inventing a language that runs all programs in a loop, so they can't halt.
Re-reading my statement I can see I missed out a word that changes the whole meaning, so it's no wonder you seemed less than impressed ;)
I meant to type:
The Glasgow Haskell Compiler can guarantee that no null reference exceptions will ever exist if the program passes compilation.
Haskell allows null references, or the equivalent thereof, but the compiler will ensure that you never refer to a null reference when you expect a value.
For instance, the following Haskell program will not compile:
The compiler knows that (5 `div` 0) could be Nothing (the equivalent of null), and Nothing * 10 is a nonsensical statement. You have to explicitly account for both possibilities before it compiles:
This is a pretty common pattern, so Haskell provides a shortcut:
But whilst shorter, this isn't exactly more readable. So Haskell provides the 'do-notation' syntax sugar to make it look better:
And there we have code guaranteed to be free of null pointer exceptions by the compiler, whilst still allowing null pointers to exist.
That's not "triple redundant" by any stretch of the imagination. That's 3 drives in the SAME DEVICE using the SAME virtual filesystem. On top of that, this is NetApp's buggy proprietary crap. Use a real filesystem.