Data Storm Caused Nuclear Plant To Shut Down

Re: The reason? by Clockworkalien · 2007-05-19 09:02 · Score: 5, Funny

All of the plant employees were looking up Starcraft 2 news.

--
I am on the road crew. This is my stop sign.

Shut down? by Anonymous Coward · 2007-05-19 09:04 · Score: 5, Insightful

>Investigators want to know whether the data storm could have been initiated from outside the plant.

Do investigators also want to know how a "data storm" could have caused a nuclear plant to shut down?

Re:Shut down? by Detritus · 2007-05-19 09:30 · Score: 2, Informative

RTFA, bozo.

--
Mea navis aericumbens anguillis abundat
Re:Shut down? by Sj0 · 2007-05-19 15:49 · Score: 2, Informative

It looks like it was a Modbus Plus network. We're talking a proprietary physical layer on up, specifically designed for PLCs to communicate with one another.

If there was a communications problem and a PLC blinks out of existence on a mission critical system, it's only the safe thing to fail the entire system to prevent damage to people, the environment, and equipment.
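A minimal sketch of that fail-safe rule, assuming a hypothetical supervisor that tracks the last heartbeat from each PLC. The `Watchdog` class, the peer names and the 0.5 s timeout are all made up for illustration and are not taken from any Modbus Plus product:

```python
class Watchdog:
    """Trips the whole system to a safe state if any peer goes silent.

    Hypothetical illustration: peer names and the timeout are assumptions.
    """

    def __init__(self, peers, timeout_s, now=0.0):
        self.timeout_s = timeout_s
        # Every peer starts out "just heard from" at construction time.
        self.last_seen = {peer: now for peer in peers}
        self.tripped = False

    def heartbeat(self, peer, now):
        """Record that `peer` was heard from at time `now` (seconds)."""
        if peer in self.last_seen:
            self.last_seen[peer] = now

    def check(self, now):
        """Return the silent peers; latch the trip if there are any."""
        silent = sorted(p for p, t in self.last_seen.items()
                        if now - t > self.timeout_s)
        if silent:
            self.tripped = True  # latched: only an operator reset clears it
        return silent


wd = Watchdog(["pump_a", "pump_b"], timeout_s=0.5)
wd.heartbeat("pump_a", now=0.4)
wd.heartbeat("pump_b", now=0.4)
print(wd.check(now=0.8), wd.tripped)   # both peers heard from recently
wd.heartbeat("pump_a", now=1.0)
print(wd.check(now=1.2), wd.tripped)   # pump_b has gone quiet
```

The trip is latched on purpose: once a controller has vanished, the sketch stays in the safe state even if the peer reappears, which mirrors the "fail the entire system" behaviour described above.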

--
It's been a long time.

nothing to see, move along. by SuperBanana · 2007-05-19 09:10 · Score: 5, Insightful

Some choice quotes, emphasis added:

An investigation into the failure found that the controllers for the pumps locked up following a spike in data traffic -- referred to as a "data storm" in the NRC notice -- on the power plant's internal control system network. The deluge of data was apparently caused by a separate malfunctioning control device, known as a programmable logic controller (PLC).

"Conversations between the Homeland Security Committee staff and the NRC representatives suggest that it is possible that this incident could have come from outside the plant," Committee Chairman Bennie G. Thompson (D-Miss.) and Subcommittee Chairman James R. Langevin (D-RI) stated in the letter. "Unless and until the cause of the excessive network load can be explained, there is no way for either the licensee (power company) or the NRC to know that this was not an external distributed denial-of-service attack."

Wow. Just...wow. As if you needed more proof that this wasn't a hacking attempt:

"The integrated control system (ICS) network is not connected to the network outside the plant, but it is connected to a very large number of controllers and devices in the plant," Johnson said. "You can end up with a lot of information, and it appears to be more than it could handle."

Seriously, how stupid do you have to be to think "OMG, Haxxors?" Answer: work at Homeland inSecurity, or be a Congresscritter. They already figured it out. It was a controller for a specific piece of equipment that flooded the network and triggered a bug in the variable-frequency-drive controllers for pumps.

--
Please help metamoderate.

Re:nothing to see, move along. by MECC · 2007-05-19 09:31 · Score: 2, Funny

Never hire windows admins brandishing the moniker "network admin".

--
"We are all geniuses when we dream"
- E.M. Cioran
Re:nothing to see, move along. by A+Bugg · 2007-05-19 09:45 · Score: 5, Informative

I work at a nuke plant as a system engineer. One of my systems is the reactor recirculation pumps, the same type of pumps. I know for a fact there is no way hackers could "data storm" my pumps and there is extreme doubt in my mind that the same thing could happen at Browns Ferry. The pumps' digital control system isn't even near any outside network.

However, I will fully put the blame on the PLCs. Those little suckers come in handy but if you don't completely understand every line of code and every instruction they can f_ck you over.

I also love how they say "well if you can't prove it wasn't, then it must have been".
Re:nothing to see, move along. by Anonymous Coward · 2007-05-19 10:17 · Score: 5, Informative

You just have to love Browns Ferry don't you? This is the same plant that had wired its control cabling for two nuclear reactors through the same area. Then they had workers check the air tightness by using candles near their flammable insulation. It wasn't air tight and the flame of a candle was sucked into the insulation. Thus a fire broke out, $100 million of damage occurred, and control was lost of their two nuclear reactors for something around 8 or more hours. Why 8 hours? Because their fire team tried to fight the fire with portable CO2 extinguishers. Yes, for 8 hours. Until the local fire department (which they previously obstructed) put it out with water in 5 minutes. Idiot designers and idiot employees. I'm surprised that plant didn't have a meltdown before TMI. But boiling water reactors are a little harder to destroy.
Re:nothing to see, move along. by jd · 2007-05-19 13:31 · Score: 3, Interesting

I believe this is the nuke plant that is supposedly using Windows NT to handle SCADA (Supervisory Control and Data Acquisition) functions, and I know it's the plant that has shown gross incompetence in relation to fires (see assorted other postings). Internal systems are not supposed to be connected to external networks. It's unlikely this one was - not because they're smart, but because I'm not certain they'd know how. We can therefore eliminate external causes. Sadly, we have passed from the Age of Enlightenment and the Age of Reason into the Age of Paranoia and the Age of Dementia, so that is likely the attribution we can expect from the Department of Homeland Insecurity.
A random fluctuation in internal traffic levels seems equally unlikely. Why? Because it has worked for some time, and I doubt the reactor was doing anything unusual at the time. A true network storm is unlikely - the term exists, but describes an astronomically rare situation. If a network is flooded, it is either near or at capacity. A network storm is when capacity is exceeded in a way that is self-perpetuating. The last time I remember the term being used in a public forum was I think over twelve years ago when a public demonstration of the multibone caused a cascading router flap that shut down a large segment of the Internet backbone due to total gridlock. It wasn't just that nothing else could get through - nothing AT ALL could get through.
What does this leave us? It makes it extremely unlikely that the network traffic per se had anything to do with the shutdown. Much more likely is a cumulative error in the devices involved that merely happened to turn into a fatal bug at roughly the same time as the network spiked. It might be network related, but nobody here can seriously believe it was network caused. Networks may be polled, in which case network traffic that escapes being polled is simply never seen. Network drivers may also be event-driven, but if the interrupt handler is buggy - which would usually mean the handler can be interrupted by itself indefinitely - it's hardly the fault of the network.
In other words, this is a gross programming error that the coders and managers are desperately trying to blame on something - anything - other than their own ineptness. It might merit Scott Adams making a Dilbert cartoon over, but that's it.

--
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
Re:nothing to see, move along. by (negative+video) · 2007-05-19 14:02 · Score: 2, Insightful

A random fluctuation in internal traffic levels seems equally unlikely.

Look up "Poisson distribution". At low packet rates, large rate fluctuations by random chance are the rule. You also have to consider events that can trigger a common packet rate spike, such as a non-critical subnet being power cycled. Combine this with a device that has an overflowable packet buffer and you have a recipe for inevitable failure.
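A rough numerical illustration of the Poisson point, assuming packet arrivals in a fixed interval really are Poisson-distributed (an idealisation; the mean rates below are arbitrary). For a Poisson count with mean m the standard deviation is sqrt(m), so the relative fluctuation 1/sqrt(m) is largest at low rates:

```python
import math

def poisson_tail(mean, k):
    """P(N >= k) for N ~ Poisson(mean), summing the upper tail directly."""
    # P(N = k), computed in log space to avoid overflow in mean**k / k!
    term = math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while term > total * 1e-17:
        total += term
        i += 1
        term *= mean / i   # P(N = i) from P(N = i - 1)
    return total

for mean in (2, 20, 200):
    rel_sd = 1 / math.sqrt(mean)
    p_double = poisson_tail(mean, 2 * mean)
    print(f"mean={mean:4d}  rel. std dev={rel_sd:.2f}  "
          f"P(count >= 2x mean)={p_double:.2e}")
```

With a mean of 2 packets per interval, seeing double the usual load happens roughly one interval in seven; at a mean of 200 it is vanishingly rare. A buffer sized for "typical" traffic on a quiet network is therefore the one most likely to see a chance burst.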

A true network storm is unlikely - the term exists, but describes an astronomically rare situation. ... A network storm is when capacity is exceeded in a way that is self-perpetuating.

At work we recently had a cheap router near the edge that decided to start echoing broadcast packets. ARP traffic was not pretty, and DHCP got so confused that the Windows clients went all plug-n-play and started making up their own addresses. The core routers automatically detected the repeated packets and decided to go into cycle-breaking mode: automatic rolling network bisection. Unfortunately they had the smarts to find cycles on their own ports but not echoes from a misbehaving device, so that actually made the network more confusing. Eventually IS had to manually bisect the network until the talky node could be found.

In other words, this is a gross programming error that the coders and managers are desperately trying to blame on something - anything - other than their own ineptness.

It's an honest description of the final event that resulted in the system failure.
Re:nothing to see, move along. by Firethorn · 2007-05-19 16:55 · Score: 2, Informative

Having to work with a separated network myself, I'd have to agree about doing as little as possible with it.

In my case it's for two reasons. One, the disconnected network is considered the critical one, and is far more locked down than the one connected to the internet. Second, the one connected to the internet is the one used 99% of the time.

Anytime we touch a system there's a chance we'll screw it up/break it. Our treatment of the isolated network is pretty much 'don't fix what isn't broken'. It wasn't too long ago that we had a P200 still acting as a PDC on it. It worked, we didn't touch it.

--
I don't read AC A human right
Re:nothing to see, move along. by kasperd · 2007-05-19 23:31 · Score: 2, Insightful

A random fluctuation in internal traffic levels seems equally unlikely. Why? Because it has worked for some time, and I doubt the reactor was doing anything unusual at the time.
This is not about the network being highly loaded with lots of packets coming from all sorts of places. This is about a single device for some reason flooding the network. I have seen the results of units flooding a network with broadcast traffic. I don't consider it highly unlikely for one unit to eventually start doing that because of a design flaw. Somebody should take a closer look at the design of that PLC to see if there is a likely explanation. Maybe a physical defect could have caused it to send a broadcast packet and afterwards think it had not been sent yet and send it again and again. Maybe the explanation is something else. There is no way I can say for sure without having seen the PLC.

Network drivers may also be event-driven, but if the interrupt handler is buggy - which would usually mean the handler can be interrupted by itself indefinitely - it's hardly the fault of the network.
If the handler could interrupt itself, it would probably result in a stack overflow and crash the unit. But that is not the most likely bug to introduce. A more likely and almost as bad problem would be if by the time the interrupt handling ended, it would immediately take another pending interrupt. In that case it would never be processing more than one interrupt at the same time, but yet it would spend all of its CPU time handling interrupts. The unit would appear locked up, but would come back to life shortly after the flooding stops. I have seen the latter happen with Linux machines (I don't remember which kernel version, I think 2.4.something). I later repeated the experiment with a Windows ME machine, which also locked up, but didn't come back to life when the network cable was disconnected. This situation was quite easy to test, just loop a cheap 100Mbit/s switch back to itself. It would probably take a 1000Mbit/s network to actually cause this with the last generation of CPUs. I don't know if switches and/or network drivers have been improved to avoid the exact scenario I tested.
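The back-to-back interrupt scenario can be sketched as a toy simulation. Everything here is an assumption for illustration: the tick length, the per-packet cost and the `rx_budget` cap are invented numbers, and backlog between ticks is not modelled. An uncapped handler services every pending packet before the control task gets to run, while a capped handler drops the excess:

```python
def run(arrivals, cpu_per_tick, cost_per_packet, rx_budget=None):
    """Return how many CPU units the control task gets on each tick.

    arrivals: packets arriving per tick. rx_budget=None models a handler
    that services every pending packet before returning; an integer caps
    packets handled per tick and drops the excess.
    """
    control_time = []
    for n in arrivals:
        handled = n if rx_budget is None else min(n, rx_budget)
        # Packet handling preempts everything else, up to the whole tick.
        spent = min(cpu_per_tick, handled * cost_per_packet)
        control_time.append(cpu_per_tick - spent)
    return control_time

traffic = [2, 2, 500, 500, 500, 2]          # a flood in the middle
print(run(traffic, cpu_per_tick=100, cost_per_packet=1))
print(run(traffic, cpu_per_tick=100, cost_per_packet=1, rx_budget=20))
```

In the uncapped run the control task gets zero time during the flood and recovers as soon as it ends, which matches the "locked up, then came back to life" behaviour described for the Linux box. The capped run loses packets but keeps the control loop alive.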

In my case this was not a problem, but of course in some critical systems, it can be. I see at least two problems. Units not tested against this scenario, and having redundant units communicate to each other over the same ethernet. Of course just having two ethernets does not solve the problem of one of them being able to take down units. Redundant units protect you against physical defects in one unit, not against design flaws.

--

Do you care about the security of your wireless mouse?
Re:nothing to see, move along. by whoever57 · 2007-05-20 01:38 · Score: 2, Informative

Why should the readers have to bear the burden of proof? It's your assertion, you get to show evidence.
Gawd, another one.
1. It wasn't my assertion -- I did not make the original post about Browns Ferry. Try reading next time!
2. I just happened to hear an article on PBS about Browns Ferry the day of this post.
3. As I mentioned before, you can confirm it using Google. Here, I'll even show you how to find it using google
4. What is it about "/. is not an encyclopedia" that you don't understand?
There may be many cases where one might claim that a post on /. is pure BS, but in the case of the great-grandparent post, the facts are easily confirmed.

--
The real "Libtards" are the Libertarians!
Re:nothing to see, move along. by DerekLyons · 2007-05-20 20:20 · Score: 2, Insightful

A random fluctuation in internal traffic levels seems equally unlikely. Why? Because it has worked for some time, and I doubt the reactor was doing anything unusual at the time. A true network storm is unlikely - the term exists, but describes an astronomically rare situation.

When investigating an accident you cannot ground rule out an occurrence that is unlikely or rare - unless you have positive evidence that said unlikely or rare condition did not occur, or positive evidence of another cause. "Unlikely" and "rare" are not synonyms for impossible.

In other words, this is a gross programming error that the coders and managers are desperately trying to blame on something - anything - other than their own ineptness.

Absent any facts (as opposed to opinions presented as facts), what precisely is your evidence for this conclusion?

Standards! by 26199 · 2007-05-19 09:12 · Score: 5, Insightful

You'd hope that in something as critical as a nuclear power plant the answer would be, very quickly, "no, it didn't come from an external source because that's impossible". Followed by detailed analysis of the logs to determine which internal system screwed up.

That said, the article is a bit sparse on actual technical details, so my derision may be unwarranted.

Re:Standards! by AudioInfecktion · 2007-05-19 09:36 · Score: 3, Interesting

As it should be. The point is this. Any of the computer/network equipment that actually runs the plant should not be connected to the outside, period. All normal computers for office work, typing up non-classified reports and reading slashdot should be on a whole separate network. Ideally, there should be 3 networks since the plant should only have certified equipment connected to it that won't cause what happened here to take place unless something was truly malfunctioning. I'd be a little scared to find windows boxes, and even most unix/linux things connected to the plant network.
Re:Standards! by mrchaotica · 2007-05-19 09:37 · Score: 4, Insightful

You'd hope that in something as critical as a nuclear power plant the answer would be, very quickly, "no, it didn't come from an external source because that's impossible".

Actually, power plants have to have a connection to the outside world. Why? Load-balancing for the power grid. If another plant goes down somewhere, this plant needs to know about it so that it can adjust output to compensate. For that, all the plants need to be hooked to a communications grid, which could conceivably be hacked (even though -- I would hope -- it's not connected to the Internet).

--
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
Re:Standards! by legirons · 2007-05-19 10:15 · Score: 2, Interesting

You'd hope that in something as critical as a nuclear power plant the answer would be, very quickly, "no, it didn't come from an external source because that's impossible".

Indeed.

Unfortunately, sometimes our favorite software supplier is involved...
Re:Standards! by Artifakt · 2007-05-19 10:47 · Score: 5, Interesting

This actually can be avoided (and AFAIK current designs do). Fast, electronic level response to avoid blackouts and such requires very much less time than changing reactor output would either allow or facilitate anyway, so the direct machine to machine communication links don't really need to go to the power cycle control systems at all. Instead, rapid response grid balancing is done at external switchpoints. For the newer designs, these are outside the whole plant at substations, let alone just outside the core areas. Between these links and reactor control systems, there's supposed to always be an air gap.
Given that, any hacking would have to include a social engineering element designed to fool the operators into making the wrong decisions. If we include that stipulation, yes, it's quite conceivable. If we postulate someone bridging the air gap, maybe by something as simple as hooking a laptop that also contains a wireless card into the control network, then a non-social engineering attack becomes conceivable, but not really otherwise.
DOE and NRC doctrine is that adjusting reactor output based solely on a trigger event outside the core instrumentation is supposed to always require a high level human decision. Supervisors are also at least supposed to be trained to the point where they can make these decisions without adding any more response time than a conventional, (i.e. hydroelectric or coal based), plant would need for their human level decision events. (Yes they have them. For example the four TVA dams that supply Alcoa aluminum face a whole series of individual and joint human level decisions every time Alcoa's main furnace system glitches, and these have to include how long Alcoa expects them to need to dump power elsewhere, and for each of them, what options the other three dams are considering).
The DOE does not legally presume that reactors are even as responsible for balancing the grid as conventional plants, but given how much older a lot of the conventional plants are, it's pretty easy to do much, much better than is strictly required, and it should be noted that, in the last New York blackout all the cascade effects and switching failures happened in 1940's era or earlier fossil fuel plants, and the worst points were 1930's or even 1920's era designs. Still, the rules are that if the conventional plants are failing at load balancing, even if the grid is experiencing severe cascade failures, the nuclear sites will let the whole thing crash rather than take the risks of trying to stabilize the grid by actually modulating their reactions.

--
Who is John Cabal?
Re:Standards! by dbIII · 2007-05-19 16:12 · Score: 2, Interesting

Fast, electronic level response to avoid blackouts and such requires very much less time than changing reactor output

It's called hydro - or sometimes even pump storage. Conventional thermal power is cheap but it takes a long time to increase output unless there is already spinning reserve. Non-conventional thermal power still takes time BECAUSE IT IS NOT MAGIC unlike what we are led to believe by those that want to build a few hundred 1950's style plants painted green. Nuclear power possibly would be a mature technology by now if some effort had been put in over the last few decades, but for now it's just a new and expensive way to boil water sold as the peaceful side of the bomb.
Re:Standards! by dbIII · 2007-05-19 19:27 · Score: 4, Informative

As one of those who would like to see hundreds of new nuke plants,

After some R&D and building some prototypes of promising new designs I'd be right with you - but our current best bets are things out of South Africa (pebble bed) and India (accelerated thorium) done on very small budgets with very small teams and they need more work. The mainstream is just chasing taxpayer supplied pork. If they were after more than a handout they would be putting in some effort - instead they spend orders of magnitude more on PR, advertising and outright bribes than on R&D.
As for costs - you can't just conveniently ignore capital costs. If you could, hydro, wind, solar etc would win every time even in those places where it would be a stupid idea or where the capital costs are far too large for the return. Nuclear power is a possibility in those places that have the infrastructure of a weapons program but everywhere else you would have to build up an entire industry from scratch. Iran is the best example currently where that is taking place and it has cost them a fortune to do so - hence few people think it is for purely civilian purposes there. In South Africa it was possible to take people from the weapons program to develop pebble bed. It is also far too big an investment for private enterprise - hence no new plants getting built while governments had cold feet on the issue and the "new generation" designs from companies like Westinghouse are just tweaked 1950s designs painted green.

You missed one.... by iknownuttin · 2007-05-19 09:13 · Score: 4, Interesting

FTFA: "What is happening in this marketplace is that vendors will build their own (network) stacks to make it cheaper," Peterson said. "And it works, but when (the device) gets anything that it didn't expect, it will gag."

Sounds to me that the vendors under-engineered their network and still charged mega-bucks for it. The auditors, I'm sure, are making the most out of this to justify their fee.

Nothing to see, move along - I'll say!

--
I prefer Flambe as opposed to flamebait.

Political FUD by Bellum+Aeternus · 2007-05-19 09:23 · Score: 4, Interesting

As usual, the American government is looking to extend its control over things. "Oh noes, look what terrorists might have done. Homeland security needs more funding and less oversight to prevent this in the future." When will people learn to assume the government is lying first, then wait for them to prove themselves right later?

--
- I voted for Nintendo and against Bush

Re:Redesign the entire infrastructure by Detritus · 2007-05-19 09:25 · Score: 2, Insightful

When you get back to the real world, let us know. You don't just wave a magic wand and completely redesign and reimplement a highly complex safety-critical system.

--
Mea navis aericumbens anguillis abundat

Re:Redesign the entire infrastructure by Joe+The+Dragon · 2007-05-19 09:27 · Score: 2, Informative

It's not the IT people. PLCs are coded by EEs, not IT people.

What network technology were they using? by Angostura · 2007-05-19 09:29 · Score: 4, Interesting

Isn't it a bit odd that they were using a non-deterministic network - something like Ethernet, by the sound of it. Back in the early 90s, I was always told that networks like Ethernet were great for office apps, but not where you wanted guaranteed times for message delivery. For that token ring, FDDI and the like were better. What is the network infrastructure of choice in a nuclear power station?

Re:What network technology were they using? by mplex · 2007-05-19 12:00 · Score: 3, Insightful

Using Ethernet is not odd, that's literally all there is these days. Sure, there are technologies like Infiniband, but Ethernet is far and away the cheapest and most widely supported networking standard. It sounds like they were experiencing a broadcast storm from a locked up device. I can't tell you the number of times I've seen stand-alone devices lock up on a busy network because of a bad TCP/IP stack. Often times they will flood packets, especially broadcast frames. There are protections against bad devices such as broadcast limiters and a number of features that protect and limit unauthorized or undesirable traffic.
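One common way to build a broadcast limiter is a token bucket. The sketch below is a generic illustration, not any switch vendor's implementation; the `BroadcastLimiter` name, the 100 frames/s rate and the burst of 10 are assumptions chosen for the example:

```python
class BroadcastLimiter:
    """Token bucket: allow at most `rate` broadcast frames/s on average,
    with bursts of up to `burst` frames."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        """Return True to forward a frame arriving at time `now` (seconds)."""
        # Refill for the elapsed time, never beyond the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False   # over the limit: drop the frame


limiter = BroadcastLimiter(rate=100, burst=10)
# A stuck device sends 1000 broadcasts in one second (one per millisecond).
forwarded = sum(limiter.allow(now=i / 1000) for i in range(1000))
print(forwarded, "of 1000 broadcast frames forwarded")
```

Only about a tenth of the flood gets through, so a single babbling device can no longer saturate every other node on the segment, while ordinary low-rate broadcast traffic such as ARP passes untouched.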

Ethernet isn't perfect but it's the only realistic option. Managed properly, it can be very reliable. The biggest problem I see from this article is that there is a lack of regulation and testing of the equipment that goes into these plants. These poor TCP/IP stacks should have never gotten past the testing phase when it comes to a nuclear power plant.
Re:What network technology were they using? by Eravnrekaree · 2007-05-19 12:16 · Score: 3, Interesting

I find it particularly astounding that a nuclear power plant control network would have any connectivity to an external network. The article mentions the traffic flow may have come externally. That a nuclear power control system is anywhere near the internet really is quite disturbing. The article also mentions infected Windows computers contributing to the outages in 2003. I find it interesting that computers involved in the electrical grid would be connected to the internet or have such lax security, and even run Windows of all operating systems at all. It really is inexcusable for security to be so poor. Simply keeping network programs running as non-privileged users in a jail one would think would be basic, to protect against exploits and systems becoming corrupted.
Re:What network technology were they using? by Bo'Bob'O · 2007-05-19 17:42 · Score: 2, Insightful

These are PLCs we're talking about, there are loads of network, protocol and connection systems, proprietary or otherwise, for all ranges of complexity.

Even stupider by packetmon · 2007-05-19 09:30 · Score: 4, Insightful

After yet another re-reading, I find this government even more insanely stupid than I would have hoped for...

Such failures are common among PLC and supervisory control and data acquisition (SCADA) systems, because the manufacturers do not test the devices' handling of bad data, said Dale Peterson, CEO of industrial system security firm DigitalBond.

"What is happening in this marketplace is that vendors will build their own (network) stacks to make it cheaper," Peterson said. "And it works, but when (the device) gets anything that it didn't expect, it will gag."

So you mean to tell me pretty much there is no enforcement for manufacturers to maintain compliance on their products even if those products are going into a nuclear *ANYTHING... Which in the worst case scenario could cause catastrophe, yet we have regulatory commissions on the flow of ketchup, regulatory commissions/directions/etc., on weight loss products, lipsticks, etc. (FDA), but this place is not concerned with nuclear plants. Sinful.

--
Infiltrated dot Net

Re:Even stupider by fluffy99 · 2007-05-19 10:40 · Score: 2, Informative

This is pretty common. Also consider that the PLCs are usually custom programmed by the end-user and bad data is usually not tested by the programmers either. Heck, there are tons of commercial network devices that behave very badly when faced with too much or incorrect data. Try running a full-blown security scan on your network and see what pukes. I have to go power cycle a bunch of Intel piece-of-crap print servers every time I do a port scan. Don't even get me started on the crappy snmp implementation on some major brand UPSs and HP JetDirect cards.
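For what a bad-data test might look like, here is a small sketch. The frame layout (one length byte, the payload, then a checksum equal to the payload sum mod 256) is invented for this example and is not Modbus or any vendor's format. The parser rejects anything malformed, and a crude fuzzer throws random bytes at it:

```python
import random

def parse_frame(data):
    """Parse [length][payload...][checksum]; return payload or None.

    The format is invented for this example. The point is that every
    malformed input is rejected instead of raising or hanging.
    """
    if len(data) < 2:
        return None
    length = data[0]
    if len(data) != length + 2:
        return None                      # truncated or oversized frame
    payload, checksum = data[1:-1], data[-1]
    if sum(payload) % 256 != checksum:
        return None                      # corrupted in transit
    return bytes(payload)

def fuzz(rounds, seed=0):
    """Throw random byte strings at the parser; count accepted frames."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(rounds):
        blob = bytes(rng.randrange(256)
                     for _ in range(rng.randrange(0, 40)))
        if parse_frame(blob) is not None:
            accepted += 1
    return accepted

good = bytes([3, 10, 20, 30, 60])
print(parse_frame(good))            # well-formed frame
print(parse_frame(good[:-1]))       # truncated frame
print(fuzz(10_000), "random blobs accepted without a crash")
```

A real device test would also replay oversized frames and sustained floods against the actual hardware; the sketch only shows the cheapest first step, which is the one apparently being skipped.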

Brown's Ferry *AGAIN!?!??!* by ewhac · 2007-05-19 09:32 · Score: 3, Informative

People with longer memories may recall that Brown's Ferry had a massive fire a couple decades ago that burned in the wire racks underneath the reactor control room, very nearly destroying the staff's ability to control the reactor at all. It became a cause celebre among the anti-nuclear crowd alongside Three Mile Island.

At least their reactor failed to "off" this time...

Schwab

--
Editor, A1-AAA AmeriCaptions

Re:Brown's Ferry *AGAIN!?!??!* by cascadingstylesheet · 2007-05-19 11:41 · Score: 2, Informative

>At least their reactor failed to "off" this time...

It didn't just "fail to off", they manually shut it down. They followed procedures and placed it in a safe condition. No need to sensationalize it.

Re:Redesign the entire infrastructure by mrcdeckard · 2007-05-19 09:34 · Score: 4, Insightful

i think the fact that an unforeseen erroneous condition caused the plant to *shutdown* and not *meltdown* is a pretty good indication that it was designed quite well.

There will always be unforeseen situations. The key is for the system to shut down in an orderly fashion. In programming, this is accomplished through use of error traps.
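A bare-bones sketch of such an error trap, with made-up names: `control_step`, the sensor range check and the toy pump formula are all hypothetical. Any unexpected failure inside the control step is caught and routed to the shutdown path instead of leaving the loop in an unknown state:

```python
def control_step(read_sensor, set_pump_speed, trip):
    """Run one control iteration; on any unexpected error, shut down."""
    try:
        level = read_sensor()
        if not 0.0 <= level <= 1.0:
            raise ValueError(f"implausible level reading: {level!r}")
        set_pump_speed(0.5 + (0.5 - level))   # toy proportional response
        return True
    except Exception as exc:
        trip(f"control fault: {exc}")         # orderly shutdown path
        return False

log = []

def bad_sensor():
    # Stand-in for a controller that has stopped answering.
    raise TimeoutError("no reply from PLC")

ok = control_step(bad_sensor, set_pump_speed=log.append, trip=log.append)
print(ok, log)
```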

Now, the hysteria surrounding terrorism is another thing the plant engineers have to worry about.

i just wonder if and when we get to put this hysteria behind us, and get along with our lives. unfortunately, terry gilliam's brazil is on a constant loop in my mind these days. . . .

mr c

--
"Physics is like sex. Sure, it may give some practical results, but that's not why we do it." - R. Feynman

a cat by eille-la · 2007-05-19 09:38 · Score: 2, Funny

A cat fell asleep on a keyboard

Re:No kidding by StarfishOne · 2007-05-19 09:49 · Score: 4, Funny

Tor networks are generally not *that* fast.. so causing a data storm is not likely. ;)

Sometimes such connections are sooo slow, it makes users cry. They don't call it onion routing for nothing, eh? ;P

It's not stupid. by twitter · 2007-05-19 09:51 · Score: 5, Insightful

Seriously, how stupid do you have to be to think "OMG, Haxxors?" Answer: work at Homeland inSecurity, or be a Congresscritter. They already figured it out. It was a controller for a specific piece of equipment that flooded the network and triggered a bug in the variable-frequency-drive controllers for pumps.

As someone who used to work in systems engineering for a sister BWR, I think the inspection is a good idea. Oh, there's dumb and there's nuclear dumb but this is not a case of either. Nuclear dumb involves putting machine gun nests inside the plant. Finding the root cause of the accident is a good idea.

Handwaving about a PLC device won't do. What ultimately caused the PLC malfunction needs to be answered at a component level. There's going to be something wrong with it and that should be reported and every other device like it needs to be ripped out and trashed. If there is no component failure, there's a software problem which also must be understood.

Yes, it could have been hackers. The "internal control network" might at some point hit a desk that's connected to the wider world. It could be something mundane and unintentional, like an operator's virused up laptop.

An outage like that is something that's going to have both NRC and corporate ass-chewers looking at everything. Corporate might want to paint a nice picture for the NRC, but the poor devil that lies to them goes to jail. In either case, the problem will be identified and eliminated.

You might also have noted in the article that this is not the first plant to go thumbs down over some winblows borne virus. In 2003, the Slammer worm caused havoc at an offline Ohio plant. Yes, that was hackers. They did not mean to do it, but the plant's systems were open to it and failed. That's not acceptable from any standpoint.

Despite the better advice of the computer people at the plants, Entergy is a big M$ Partner. They take the big dogs out fishing and sell them the works. Ten years ago, M$ had something worth while and interesting. It was used in places it should not have been. Worse, the flaws from ten years ago have not been addressed or fixed. A good clean up is in order.

--

Friends don't help friends install M$ junk.

Re:The last thing we need by Cctoide · 2007-05-19 09:52 · Score: 4, Funny

ENL4:RG3 UR FU3L R0:DS! Z1R:C0NIUM R3:INF0RC3M3NT - CH3:4P35T PR1:CES!

--
"Let's face it, it's a good story. Accuracy would kill it."

Life at a power plant. by twitter · 2007-05-19 10:03 · Score: 2, Informative

Firstly I would re-design that entire infrastructure and rid that power plant of incompetent IT people.

You need to find the root cause. You don't know it yet, so you don't really know what to do.

Chances are, the cause has been written up by the four or five systems engineering people in charge of the plant. They ARE competent, but they are never given the resources they need.

> Why wasn't there any failover who knows.

There was a failover - they overrode the broken thing. Had the operators been gassed, the plant would have turned itself off when the water level got too high or low. This is a big deal but ultimately the plant was safely shut down and no one got hurt. It's designed to do that even if you could shear the feed water pipe off and they did not let the new fangled control network mess with that.

--

Friends don't help friends install M$ junk.

Saturday is bought to you by the color Orange by BillGatesLoveChild · 2007-05-19 10:08 · Score: 2, Interesting

> data storm

Is that a nice way of saying they were downloading pr0n?

> US House of Representative's Committee on Homeland Security called further investigate

Boss: "So we don't have the backups for the first two weeks in April"
Employee: "Yes Boss. They were obviously misplaced by terrorists"

When Homeland Security is done, my refrigerator door was left ajar last night. I think it was terrorists too. Think I'll phone this one in.

The way I think the conversation went by rush22 · 2007-05-19 10:13 · Score: 4, Funny

"Ok, techie, give me the gist of it."
"It seems the problem was with the NC9828A chip"
"Oh? And what was the problem?"
"It melted, basically. It went bonkers."
"Ah, and then what happened?"
"Err... it caused the shutdown."
"But how?"
"Well, I presume the AH-982's got deluged with data, so they shut off."
"Ah, so it was some sort of data thing."
"Kind of, the failing chip would start sending data in the network t--"
"Hey, it's like a storm of data! Hah! I get it!"
"Umm, basically."
"Oh man. A data storm! I better tell the NRC"
"Ok, sure."

Later...

"Sir, I have the cause of the shutdown, it was caused by what the tech guys here would call a data storm."
"A data storm? Wow. So your reactors got a bunch of bad datas, right?"
"Errr.. kind of, the microchips melted."
"Data can do that?"
"Yeah, it's like a storm on our, uh, logic networks. I guess that can melt the microchips"
"Uh oh. Maybe this storm came from outside the plant! One of those hacker attacks!"
"Hmmmm, the guy said it melted, but I suppo--"
"Oh crap I better inform Homeland Security!"
"Ok, sure."

Later still...

"Yeah, we had a data storm and it melted the reactor networks."
"How did this data storm happen?"
"I don't think they know yet, but it messed up big time."
"My God. Do you realize this could be Al Qaeda?!!"
"Could realize wha--"
"Al Qaeda! Terrorists. Internets terrorists."
"I don't know if the reactors are hooked up to the Interne--"
"Listen. Keep this quiet, but make sure you tell everyone you know. These reactors are not safe! No one is safe from the terror!"
"Well, it was a data storm. Can terrorists make data storms?"
"Yes. They caused your meltdown."
"No, no, the microchips melted down because of the storm. A meltdow--"
"In the terror business, there's more than one type of meltdown, you just let us handle this."
"Ok, sure."

Storm in the tubes by cyberianpan · 2007-05-19 10:36 · Score: 4, Interesting

I've worked in IT a while now & have never heard of a "data storm". This reminds me of

> And again, the Internet is not something you just dump something on. It's not a big truck. It's a series of tubes.
> Ted Stevens

We have plant managers concocting an odd metaphor that will only further confuse senators. Why can't they just use actual language? Is it because they are deliberately trying to confuse the issue to avoid blame? The same way the red herring of terrorism is being floated re this? In fact it is more serious that

1) They can't describe what happened

2) They can't tell if outside interference, whatever the nature occurred

3) That this might have an internal/design cause
... than if "terrorists" did it.

Re:Storm in the tubes by ichigo+2.0 · 2007-05-19 11:49 · Score: 5, Insightful

Because "spike in network traffic" sounds lame. Data storm, OTOH, sounds cool and dangerous. Contact Jack Bauer quickly! We need to open a new port for the nucular plant, so the terrorists don't destroy us! And while you're at it, give us more money so we can prevent these awful storms in the future!
Re:Storm in the tubes by Jugalator · 2007-05-19 14:57 · Score: 2, Insightful

> I've worked in IT a while now & have never heard of a "data storm".

Maybe it's the precursor to a logic bomb!

Wow, can't you request article deletion from Wikipedia on the basis of "ridiculous term"?
Or better yet, mind erasing for the very same reason... :-p

--
Beware: In C++, your friends can see your privates!
Re:Storm in the tubes by binarysins · 2007-05-19 15:17 · Score: 2, Informative

I usually hear them called packet storms, but they happen and "storm" is usually somewhere in the description. In fact, we were just troubleshooting exactly that at my work last week and the network admin used the exact phrase "packet storm".
Re:Storm in the tubes by Anon99 · 2007-05-19 18:41 · Score: 5, Informative

>I've worked in IT a while now & have never heard of a "data storm".

I used to work as an embedded developer, and we used that term.

It was used in embedded communications when one or several devices went bonkers and flooded the common bus.
A bit like a packet storm, but without IP or another packet protocol, so it was called a data storm.

It stands to reason that in a nuclear plant there are a lot of old fogeys, so company jargon might be a bit outdated and odd-sounding to an outsider.
Re:Storm in the tubes by bloobloo · 2007-05-20 06:51 · Score: 2, Informative

The plant I'm working on the design of at the moment will have a VPN connection so that we can monitor its performance from abroad. Running private cables over 7000 miles would not be feasible.
Re:Storm in the tubes by RockDoctor · 2007-05-20 20:56 · Score: 2, Informative

> 4) Apparently the computers which control a nuclear plant are connected to the public Internet, allowing anyone in the world to send them commands, viruses, or random garbage,

Might I recommend you to RTFA?
The "data storm" appears to have been on an internal network (not seemingly connected to anything apart from other internal networks), where a data acquisition and control device barfed on some bad data and started to spew garbage onto the network. Inadequate data validation combined with inappropriate or ineffective error handling. Software fault.
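
A minimal Python sketch of the validation and error handling described above, under loud assumptions: the 10-byte frame layout, the CRC32 check, the range limits, and the names `encode`, `decode` and `Channel` are all made up for illustration and have nothing to do with the plant's actual devices or protocol. The idea is only that a receiver drops anything malformed, holds its last good value, and escalates to a fault state after repeated bad frames instead of passing garbage along.

```python
import struct
import zlib

# Hypothetical frame: 2-byte sensor id, 4-byte float reading, 4-byte CRC32.
FRAME = struct.Struct(">HfI")


def encode(sensor_id, value):
    """Build a frame in the hypothetical layout above."""
    body = struct.pack(">Hf", sensor_id, value)
    return body + struct.pack(">I", zlib.crc32(body))


def decode(frame, low, high):
    """Return (sensor_id, value), or None if the frame must be ignored."""
    if len(frame) != FRAME.size:
        return None                      # wrong length: not our frame
    sensor_id, value, crc = FRAME.unpack(frame)
    if zlib.crc32(frame[:6]) != crc:
        return None                      # corrupted in transit
    if not (low <= value <= high):
        return None                      # out of range (also rejects NaN)
    return sensor_id, value


class Channel:
    """Holds the last good reading and counts consecutive bad frames."""

    def __init__(self, low, high, max_bad):
        self.low, self.high, self.max_bad = low, high, max_bad
        self.bad = 0
        self.last_good = None

    def feed(self, frame):
        decoded = decode(frame, self.low, self.high)
        if decoded is None:
            self.bad += 1
            # Keep using last_good; after max_bad failures report a fault.
            return "FAULT" if self.bad >= self.max_bad else "HOLD"
        self.bad = 0
        self.last_good = decoded[1]
        return "OK"
```

What the device does on "FAULT" (trip to a safe state, raise an alarm) is a design decision this sketch does not try to answer.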

--
Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"

Re:Redesign the entire infrastructure by Ajehals · 2007-05-19 12:29 · Score: 3, Interesting

i think the fact that an unforeseen erroneous condition caused the plant to *shutdown* and not *meltdown* is a pretty good indication that it was designed quite well. really? you think that the loss of a power plant for a period of time due to network traffic is a sign of "quite good design"?

This might sound unreasonable but I would never expect a power plant (which has a lot of things depending on it) to shut down unless there was a major failure of a component or some other safety risk. Network traffic on its own, or its effects shouldn't ever be the cause. In a nuclear power plant you control ALL the nodes attached to the network, the nodes attached should not be in a position where they can saturate any individual node to the point of failure, especially if that failure causes a shut down of something as critical as a power station.
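
One way to keep any single node from saturating another is a per-sender rate limit at the receiver or at the switch. Below is a rough Python sketch using a token bucket; the class names, the rate, and the burst size are assumptions for illustration, not anything taken from the article or from a real control network.

```python
class TokenBucket:
    """Allows `burst` frames at once, then `rate_per_s` frames per second."""

    def __init__(self, rate_per_s, burst, now):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerSenderLimiter:
    """One bucket per sender, so a babbling node only exhausts its own budget."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.burst = burst
        self.buckets = {}

    def allow(self, sender, now):
        bucket = self.buckets.get(sender)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self.buckets[sender] = bucket
        return bucket.allow(now)
```

Time is passed in explicitly so the behaviour is easy to check; a real receiver would read a monotonic clock instead.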

I can think of times where I have seen massive network spikes usually caused by issues with routing on fairly non-trivial networks, or loops where mistakes have been made and policies have not been followed, (lack of sleep or lack of patience), but then comparing an advertising company's internal network at 3am, or a paper factory's network at midnight to a nuclear power station is taking it a little far.

There will always be unforeseen situations. The key is for the system to shutdown in an orderly fashion. In programming, this is accomplished through use of error traps.

That would be fair if we were talking about a software failure after some sort of unforeseen environmental issue. It would even be OK if an auto plant stopped production because of an unforeseen fault. And whilst power plants should certainly fail safe, they should be robust enough that a situation where failure is the only option is extremely difficult to achieve. Whatever happened to redundancy?

Now, the hysteria surrounding terrorism is another thing the plant engineers have to worry about. As for the external angle, terrorism or not, I doubt it. If there is a system that can be brought down by weight of traffic, and that system is important enough that failure requires a power-plant reboot (:)) then there needs to be an air-gap. Someone up thread suggested an employee's laptop with a virus as a possible method of infection. Who in the hell allows an unchecked laptop of any description onto their LAN? Never mind a network that also contains components that run a power plant!!

I would suggest that this is hype to 1) keep terrorism at the top of everyone's agenda, and make people feel unsafe, after all that sells papers and grabs viewers (which in turn sell advertising) 2) deflect some of the negativity that this incident would produce (I wish that I could blame terrorists for my mistakes sometimes... "no that project plan... I haven't got it, but I'm checking to see if my poor time management is caused by terrorism or simply my inability to organise my resources properly") and 3) Security risks presumably attract additional funding, surely it would be nice to get an extra few million in the next budget.

Honestly, this probably shows a component failure and some poor design, understandable, but unacceptable in this area. If, and I say if with some considerable doubt, this turns out to be, or is reported as, an external event, then whoever enabled external network access to what appear to be critical systems within a nuclear power plant on the US mainland needs to be identified and punished, together with the contractors who built or maintained it, the managers or consultants that assessed and managed it and the politicians who have responsibility for public safety. But as I said, it will probably turn out to be a simple component failure and some poor design.

Good news. by Sj0 · 2007-05-19 15:46 · Score: 3, Informative

Great news, guys. This is going to be a non-issue. People are freaking out because a digital device is involved, and freaking out because a nuclear power plant was involved, but I do industrial control system and DCS design for a living, and I'll tell you right now that you simply can't access control networks from the outside. There are separate, often redundant networks, and even then, depending on the way the plant was designed, we're talking Modbus Plus or something that PCs don't normally access.

--
It's been a long time.

Wait a cotton pickin' minute here... by Torodung · 2007-05-19 18:02 · Score: 2, Interesting

Who in God's name connects a plant's coolant regulation systems to the Internet? How could it be an outside agent when the "data storm" happened on the plant's INTERNAL network?

The article says that explicitly. "Internal network." The DHS is worried about outside agents penetrating the plant personnel, not someone with a laptop uploading a virus like Jeff Goldblum in "Independence Day."

If there *was* such a "data storm" attack, it would _have_ to be caused by an inside saboteur. The plant needs to focus on HUMAN security, not computer security. Either that or they need to reconsider a faulty design.

But can we try, just try, not to write completely hysterical baloney? Hysterical baloney is a trademark of "Homeland Security," and they might see fit to sue.

--
Toro

Network stack has too high priority by Esben · 2007-05-19 20:50 · Score: 3, Insightful

I have actually seen such a problem myself: Controllers crashing because someone was testing the network. The problem was, of course, that the CPU spent a lot of time handling the number of packets on the network and therefore didn't have enough time for its real-time application. (It didn't help that the platform didn't support DMA.)

Solution: Make the network interrupt handler threaded and prioritize it below the real-time application. Sure, that doesn't help the SCADA performance, but you have to make sure that the real-time application meets its deadlines no matter what is going on on the network. I simply don't buy that you can secure a network stretching over more than 1 meter against "data storms."
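
A rough Python sketch of the same principle. To be clear about what this is: not a threaded interrupt handler, just a user-space analogue in which the control step always runs first and network handling gets a fixed per-cycle budget, with a bounded queue that drops frames when full. The names `BoundedRxQueue` and `run_cycle`, and the capacity and budget numbers, are assumptions made up for illustration.

```python
from collections import deque


class BoundedRxQueue:
    """Fixed-size receive queue: when full, new frames are dropped."""

    def __init__(self, capacity):
        self.frames = deque()
        self.capacity = capacity
        self.dropped = 0

    def push(self, frame):
        if len(self.frames) >= self.capacity:
            self.dropped += 1            # shed load instead of growing
            return False
        self.frames.append(frame)
        return True


def run_cycle(rx, control_step, handle_frame, packet_budget):
    """One control period: control work first, then a bounded amount of
    network work. Returns how many frames were handled this cycle."""
    control_step()                       # the real-time job is never skipped
    handled = 0
    while handled < packet_budget and rx.frames:
        handle_frame(rx.frames.popleft())
        handled += 1
    return handled
```

With a capacity of 64 and a budget of 8, a burst of 1000 frames costs the control loop at most 8 frame handlers per cycle; everything beyond the queue capacity is dropped and counted in `dropped`.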

Data Storms Have Lots Of Causes by maz2331 · 2007-05-20 15:36 · Score: 3, Informative

A "data storm" can be caused by lots of things, even an unstable driver causing a NIC to spew garbage packets. Or an application that hits a bug and begins spewing to the network. Or a failure of Spanning Tree causing network loops to arise (which can really mess up an Ethernet).

The weirdest I ever saw was a situation at a school where the entire network (built around high-end Cisco switches) crashed hard. It took 3 hours of troubleshooting and disconnecting various segments to finally pin down the cause. It was a little mini-switch that some teacher attached to the LAN that somehow had a meltdown and began spewing "valid" Ethernet packets with all kinds of random garbage source and destination MAC addresses, random payload, and valid checksums. No hosts were attached to the mini switch, so it had to be something in its microcontroller going haywire. This caused every switch to go nuts trying to maintain its forwarding tables ("show cpu" was 100% utilization) and resulted in no traffic going anywhere. It even crossed VLAN boundaries since all the switches had "trunk" ports using tagged VLANs, so the garbage packets still made it through the entire LAN.

These things happen sometimes. Network gear is generally pretty robust, but can still fail in weird ways.
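
For what it's worth, the usual defence against that kind of random-source-MAC flood is a cap on how many addresses a switch will learn per port. Here is a toy Python model of the idea; `LearningTable` and its fields are hypothetical names for illustration and do not correspond to any vendor's implementation or CLI.

```python
class LearningTable:
    """Toy MAC learning table with a per-port limit on learned addresses."""

    def __init__(self, max_per_port):
        self.max_per_port = max_per_port
        self.port_of = {}       # MAC address -> port it was learned on
        self.count = {}         # port -> number of addresses learned
        self.violations = {}    # port -> rejected learn attempts

    def learn(self, src_mac, port):
        """Record src_mac on port; return False if the port is over its limit."""
        known_port = self.port_of.get(src_mac)
        if known_port == port:
            return True                      # already known, nothing to do
        if self.count.get(port, 0) >= self.max_per_port:
            self.violations[port] = self.violations.get(port, 0) + 1
            return False                     # refuse to learn more here
        if known_port is not None:
            self.count[known_port] -= 1      # station moved to another port
        self.port_of[src_mac] = port
        self.count[port] = self.count.get(port, 0) + 1
        return True
```

With a limit of 4, a port spewing 10000 distinct source addresses fills only 4 table entries; the other 9996 attempts show up in `violations`, which is also a handy signal for finding the misbehaving port.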
