Data Storm Caused Nuclear Plant To Shut Down
rs232 writes to let us know that the US House of Representatives Committee on Homeland Security called this week for the Nuclear Regulatory Commission to further investigate the cause of excessive network traffic that shut down an Alabama nuclear plant. Investigators want to know whether the data storm could have been initiated from outside the plant.
Sounds to me like the vendors under-engineered their network and still charged mega-bucks for it. The auditors, I'm sure, are making the most of this to justify their fee.
Nothing to see, move along - I'll say!
I prefer flambé as opposed to flamebait.
As usual, the American government is looking to extend its control over things. "Oh noes, look what terrorists might have done. Homeland security needs more funding and less oversight to prevent this in the future." When will people learn to assume the government is lying first, then wait for them to prove themselves right later?
- I voted for Nintendo and against Bush
Isn't it a bit odd that they were using a non-deterministic network - something like Ethernet, by the sound of it. Back in the early 90s, I was always told that networks like Ethernet were great for office apps, but not where you wanted guaranteed times for message delivery. For that token ring, FDDI and the like were better. What is the network infrastructure of choice in a nuclear power station?
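To put rough numbers on the deterministic vs. non-deterministic point, here is a minimal sketch. The Ethernet side uses the classic truncated binary exponential backoff for 10 Mb/s shared Ethernet; the token ring side is a deliberately simplified upper bound, and the station count, token holding time and ring latency in the usage line are made-up illustrative values, not anything known about the plant.

```python
import random

SLOT_TIME_US = 51.2   # slot time of classic 10 Mb/s shared Ethernet
MAX_ATTEMPTS = 16     # the frame is discarded after this many attempts
BACKOFF_CAP = 10      # the backoff exponent stops growing after 10 collisions


def ethernet_backoff_us(collisions, rng=random):
    """Random wait after the n-th consecutive collision (truncated binary exponential backoff)."""
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("frame dropped after too many collisions")
    k = min(collisions, BACKOFF_CAP)
    # Pick a whole number of slot times in [0, 2**k - 1]
    return rng.randrange(2 ** k) * SLOT_TIME_US


def token_ring_worst_wait_us(stations, token_hold_us, ring_latency_us):
    """Simplified bound: every other station keeps the token for its full allowance."""
    return (stations - 1) * token_hold_us + ring_latency_us


# Hypothetical ring: 10 stations, 10 ms holding time, 50 us ring latency
print(token_ring_worst_wait_us(10, 10_000, 50))
```

The token ring figure is a hard ceiling you can design around. The Ethernet wait is a random draw that can end in a dropped frame, which is the whole objection for control traffic.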
As it should be. The point is this. Any of the computer/network equipment that actually runs the plant should not be connected to the outside, period. All normal computers for office work, typing up non-classified reports and reading slashdot should be on a whole separate network. Ideally, there should be 3 networks, since the plant should only have certified equipment connected to it that won't cause what happened here to take place unless something was truly malfunctioning. I'd be a little scared to find windows boxes, and even most unix/linux things, connected to the plant network.
> data storm
Is that a nice way of saying they were downloading pr0n?
> US House of Representatives Committee on Homeland Security called ... further investigate
Boss: "So we don't have the backups for the first two weeks in April"
Employee: "Yes Boss. They were obviously misplaced by terrorists"
When Homeland Security is done, my refrigerator door was left ajar last night. I think it was terrorists too. Think I'll phone this one in.
You'd hope that in something as critical as a nuclear power plant the answer would be, very quickly, "no, it didn't come from an external source because that's impossible."
Indeed.
Unfortunately, sometimes our favorite software supplier is involved...
1) They can't describe what happened
2) They can't tell whether outside interference, whatever its nature, occurred
3) That this might have an internal/design cause
This actually can be avoided (and AFAIK current designs do). Fast, electronic level response to avoid blackouts and such requires very much less time than changing reactor output would either allow or facilitate anyway, so the direct machine to machine communication links don't really need to go to the power cycle control systems at all. Instead, rapid response grid balancing is done at external switchpoints. For the newer designs, these are outside the whole plant at substations, let alone just outside the core areas. Between these links and reactor control systems, there's supposed to always be an air gap.
Given that, any hacking would have to include a social engineering element designed to fool the operators into making the wrong decisions. If we include that stipulation, yes, it's quite conceivable. If we postulate someone bridging the air gap, maybe by something as simple as hooking a laptop that also contains a wireless card into the control network, then a non-social engineering attack becomes conceivable, but not really otherwise.
DOE and NRC doctrine is that adjusting reactor output based solely on a trigger event outside the core instrumentation is supposed to always require a high level human decision. Supervisors are also at least supposed to be trained to the point where they can make these decisions without adding any more response time than a conventional (i.e. hydroelectric or coal based) plant would need for its human level decision events. (Yes, they have them. For example, the four TVA dams that supply Alcoa aluminum face a whole series of individual and joint human level decisions every time Alcoa's main furnace system glitches, and these have to include how long Alcoa expects them to need to dump power elsewhere, and for each of them, what options the other three dams are considering.)
The DOE does not legally presume that reactors are even as responsible for balancing the grid as conventional plants, but given how much older a lot of the conventional plants are, it's pretty easy to do much, much better than is strictly required, and it should be noted that, in the last New York blackout all the cascade effects and switching failures happened in 1940's era or earlier fossil fuel plants, and the worst points were 1930's or even 1920's era designs. Still, the rules are that if the conventional plants are failing at load balancing, even if the grid is experiencing severe cascade failures, the nuclear sites will let the whole thing crash rather than take the risks of trying to stabilize the grid by actually modulating their reactions.
Who is John Cabal?
This might sound unreasonable, but I would never expect a power plant (which has a lot of things depending on it) to shut down unless there was a major failure of a component or some other safety risk. Network traffic on its own, or its effects, shouldn't ever be the cause. In a nuclear power plant you control ALL the nodes attached to the network. The nodes attached should not be in a position where they can saturate any individual node to the point of failure, especially if that failure causes a shutdown of something as critical as a power station.
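One common way to keep a single node from saturating another is to rate-limit what each sender may inject. A minimal token bucket sketch follows, with the caveat that the class name, the rate and the burst size are hypothetical and say nothing about what the plant actually runs.

```python
class TokenBucket:
    """Per-sender limiter: `rate_per_s` frames per second, bursts up to `burst` frames."""

    def __init__(self, rate_per_s, burst):
        self.rate = rate_per_s
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous arrival, in seconds

    def allow(self, now):
        """Return True if a frame arriving at time `now` (seconds) may pass."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond the burst size
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_s=10, burst=5)
passed = sum(bucket.allow(0.0) for _ in range(100))  # 100 frames arrive at once
print(passed)
```

With a limiter like this at the edge of each port, a misbehaving device can waste its own allowance but cannot push a neighbour past what it was sized for.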
I can think of times where I have seen massive network spikes, usually caused by issues with routing on fairly non-trivial networks, or loops where mistakes have been made and policies have not been followed (lack of sleep or lack of patience). But comparing an advertising company's internal network at 3am, or a paper factory's network at midnight, to a nuclear power station is taking it a little far.
> There will always be unforeseen situations. The key is for the system to shut down in an orderly fashion. In programming, this is accomplished through use of error traps.
That would be fair if we were talking about a software failure after some sort of unforeseen environmental issue. It would even be OK if an auto plant stopped production because of an unforeseen fault. But whilst power plants should certainly fail safe, they should be robust enough that a situation where failure is the only option is extremely difficult to reach. Whatever happened to redundancy?
Now, the hysteria surrounding terrorism is another thing the plant engineers have to worry about. As for the external angle, terrorism or not, I doubt it. If there is a system that can be brought down by weight of traffic, and that system is important enough that failure requires a power-plant reboot (:)), then there needs to be an air gap. Someone up thread suggested an employee's laptop with a virus as a possible method of infection. Who in the hell allows an unchecked laptop of any description onto their LAN? Never mind a network that also contains components that run a power plant!!
I would suggest that this is hype to 1) keep terrorism at the top of everyone's agenda, and make people feel unsafe, after all that sells papers and grabs viewers (which in turn sell advertising), 2) deflect some of the negativity that this incident would produce (I wish that I could blame terrorists for my mistakes sometimes... "no that project plan... I haven't got it, but I'm checking to see if my poor time management is caused by terrorism or simply my inability to organise my resources properly") and 3) attract additional funding, as security risks presumably do; surely it would be nice to get an extra few million in the next budget.
Honestly, this probably shows a component failure and some poor design: understandable, but unacceptable in this area. If, and I say "if" with some considerable doubt, this turns out to be, or is reported as, an external event, then whoever enabled external network access to what appear to be critical systems within a nuclear power plant on the US mainland needs to be identified and punished, together with the contractors who built or maintained it, the managers or consultants who assessed and managed it, and the politicians who have responsibility for public safety. But as I said, it will probably turn out to be a simple component failure and some poor design.
A random fluctuation in internal traffic levels seems equally unlikely. Why? Because it has worked for some time, and I doubt the reactor was doing anything unusual at the time. A true network storm is unlikely - the term exists, but describes an astronomically rare situation. If a network is flooded, it is either near or at capacity. A network storm is when capacity is exceeded in a way that is self-perpetuating. The last time I remember the term being used in a public forum was, I think, over twelve years ago, when a public demonstration of the MBone caused a cascading router flap that shut down a large segment of the Internet backbone due to total gridlock. It wasn't just that nothing else could get through - nothing AT ALL could get through.
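For what "self-perpetuating" means in practice, here is a toy model of one broadcast frame on a switched LAN. It assumes switches that flood blindly (no spanning tree, no address learning), and both topologies are invented for illustration; it is not a model of the plant's network.

```python
def frames_in_flight(links, source, ticks):
    """Count copies of one broadcast frame per tick when switches flood blindly.

    links maps each switch to the list of its neighbours. A switch forwards a
    received frame to every neighbour except the one it arrived from.
    """
    # (receiver, sender) pairs for the copies currently on the wire
    wire = [(nbr, source) for nbr in links[source]]
    history = [len(wire)]
    for _ in range(ticks - 1):
        wire = [(nbr, node)
                for node, came_from in wire
                for nbr in links[node] if nbr != came_from]
        history.append(len(wire))
    return history


TREE = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}                   # no loops
MESH = {s: [t for t in "ABCD" if t != s] for s in "ABCD"}          # redundant links

print(frames_in_flight(TREE, "A", 3))   # [2, 0, 0]
print(frames_in_flight(MESH, "A", 4))   # [3, 6, 12, 24]
```

On the loop-free tree the frame is delivered and disappears. On the mesh every copy spawns two more each hop, so a single frame is enough to fill the links, and no amount of waiting lets the network drain on its own.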
What does this leave us? It makes it extremely unlikely that the network traffic per se had anything to do with the shutdown. Much more likely is a cumulative error in the devices involved that merely happened to turn into a fatal bug at roughly the same time as the network spiked. It might be network related, but nobody here can seriously believe it was network caused. Networks may be polled, in which case network traffic that escapes being polled is simply never seen. Network drivers may also be event-driven, but if the interrupt handler is buggy - which would usually mean the handler can be interrupted by itself indefinitely - it's hardly the fault of the network.
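The polled case can be sketched as follows: a receive queue of fixed size that the software drains a few frames at a time. The class, the queue size and the per-poll budget are hypothetical and chosen only to show the behaviour, not taken from any real driver.

```python
from collections import deque


class PolledReceiver:
    """Fixed-size receive queue drained by polling with a per-call budget."""

    def __init__(self, queue_size, budget):
        self.queue = deque()
        self.queue_size = queue_size
        self.budget = budget
        self.dropped = 0

    def on_wire(self, frame):
        """Hardware side: keep the frame only if the queue has room."""
        if len(self.queue) < self.queue_size:
            self.queue.append(frame)
        else:
            self.dropped += 1  # the software never sees this frame

    def poll(self):
        """Software side: take at most `budget` frames, then get back to other work."""
        batch = []
        while self.queue and len(batch) < self.budget:
            batch.append(self.queue.popleft())
        return batch


rx = PolledReceiver(queue_size=8, budget=4)
for i in range(1000):        # a flood arrives between two polls
    rx.on_wire(i)
print(rx.dropped, rx.poll())
```

However large the flood, each poll costs the controller a bounded amount of work, and the excess shows up as a drop counter instead of a hung device.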
In other words, this is a gross programming error that the coders and managers are desperately trying to blame on something - anything - other than their own ineptness. It might merit Scott Adams making a Dilbert cartoon over, but that's it.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
It's called hydro - or sometimes even pump storage. Conventional thermal power is cheap but it takes a long time to increase output unless there is already spinning reserve. Non-conventional thermal power still takes time BECAUSE IT IS NOT MAGIC unlike what we are led to believe by those that want to build a few hundred 1950's style plants painted green. Nuclear power possibly would be a mature technology by now if some effort had been put in over the last few decades, but for now it's just a new and expensive way to boil water sold as the peaceful side of the bomb.
Who in God's name connects a plant's coolant regulation systems to the Internet? How could it be an outside agent when the "data storm" happened on the plant's INTERNAL network?
The article says that explicitly. "Internal network." The DHS is worried about outside agents infiltrating the plant's personnel, not someone with a laptop uploading a virus like Jeff Goldblum in "Independence Day."
If there *was* such a "data storm" attack, it would _have_ to be caused by an inside saboteur. The plant needs to focus on HUMAN security, not computer security. Either that or they need to reconsider a faulty design.
But can we try, just try, not to write completely hysterical baloney? Hysterical baloney is a trademark of "Homeland Security," and they might see fit to sue.
--
Toro