BP Gulf of Mexico Rig Lacked Alarm Systems
DMandPenfold writes "BP's monitoring IT systems on the failed Deepwater Horizon oil rig relied too heavily on engineers following complex data for long periods of time, instead of providing automatic warning alerts. That is a key verdict of the Oil Spill Commission, the authority tasked by President Barack Obama to investigate the Gulf of Mexico disaster."
Three Mile Island, where the complaint was that there were too many alarms going off.
Things will always fail in weird, unexpected ways - that's why you need humans in the loop.
http://www.nytimes.com/2010/06/21/us/21blowout.html?_r=1&pagewanted=all
'Failure of management' and regulators given blame for disaster
http://www.chron.com/disp/story.mpl/business/7367856.html
How British oil giant BP used all the political muscle money can buy to fend off regulators and influence investigations into corporate neglect.
http://www.newsweek.com/2010/05/07/slick-operator.html
This wasn't a technical failure - it was a failure brought about by greed and corruption. The blow-out was only the symptom, and addressing the symptom isn't going to prevent similar incidents from happening again.
We've seen this before - the mortgage disaster and bank bailouts, the savings and loan disaster, etc.
Start by fixing campaign financing - private donations only, strict annual limit per capita, no 3rd party involvement, etc.
-- Barbara
When will we get a governing body that can impose fines or other punishments for this, and actually enforce them? Seriously, we need to evolve along with these types of companies that spit all over international laws (or the lack thereof).
Exactly; the private sector cannot be trusted to do things safer/more efficiently/better. This is exactly why strong government regulation, especially when it comes to environmental and health issues, is needed.
Haven't they been on Nagios Exchange recently? check_catastrophe.pl has been out for like 3 years!
check_catastrophe.pl -H blowout-preventer716.haliburton.com -w ANY_LEAKS -c ANY_FRIGGIN_LEAKS
I think everyone's familiar with the phenomenon of the alarm that cried wolf, thanks to all the car alarms. People rarely even turn their heads when they hear a car alarm.
Competent professionals don't do that. The problem with car alarms is that they aren't aimed at professionals, competent or otherwise; they're aimed at the general public, and the trigger mechanism they use gives little assurance that anything is actually going on.
Competent professionals like the ones that are supposed to be running rigs should know to check alarms out every time and not turn an alarm off without ascertaining that it is in fact false. Disabling an alarm should only be done when adequate contingency plans are in place for spotting the condition and responding if it does happen.
I used to work security at a high rise, and we'd often have alarms turned off on portions of the building. Under certain circumstances it was the only way to ensure that work being done wouldn't cause a false alarm. It was done in a controlled way, with plans in place to make sure that somebody was keeping an eye on the area while the work was being done and that the alarms would be turned back on as soon as they could be.
And every time that building had an alarm go off without a known cause, it was investigated promptly. Alarms that go off repeatedly need to be fixed, not disabled.
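The controlled bypass described above could be sketched roughly like this. It is a minimal illustration, not how any real building or rig system works: the `Alarm` and `Bypass` names, the two-hour window and the "floor-12-smoke" tag are all made up for the example. The point is only that a bypass carries a reason, a named watcher and an expiry, and that the alarm re-arms itself when the window runs out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Bypass:
    reason: str        # why the alarm is being silenced (e.g. planned work)
    watcher: str       # person keeping an eye on the area meanwhile
    expires: datetime  # when the alarm re-arms on its own


class Alarm:
    """Hypothetical alarm that can only be bypassed with a plan attached."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.bypass: Optional[Bypass] = None

    def disable(self, reason: str, watcher: str,
                now: datetime, duration: timedelta) -> None:
        # Refuse a bypass that has no stated reason or nobody watching.
        if not reason or not watcher:
            raise ValueError("a bypass needs a reason and a named watcher")
        self.bypass = Bypass(reason, watcher, now + duration)

    def is_armed(self, now: datetime) -> bool:
        # An expired bypass is dropped, so the alarm turns itself back on.
        if self.bypass is not None and now >= self.bypass.expires:
            self.bypass = None
        return self.bypass is None

    def trigger(self, now: datetime) -> str:
        if self.is_armed(now):
            return f"INVESTIGATE {self.name}"
        return f"SUPPRESSED {self.name} (watched by {self.bypass.watcher})"


if __name__ == "__main__":
    start = datetime(2010, 1, 1, 8, 0)
    smoke = Alarm("floor-12-smoke")
    smoke.disable("welding on floor 12", "night guard", start, timedelta(hours=2))
    print(smoke.trigger(start + timedelta(hours=1)))
    print(smoke.trigger(start + timedelta(hours=3)))
```

Making the expiry mandatory is the design choice that matters here: nobody has to remember to turn the alarm back on.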
I don't even want to know how much taxpayer money was pissed away for that "key verdict". Having worked with quite a few monitoring and alarm systems over the years, I can tell you that most of the time "automatic alarms" get ignored, and they can in fact cause worse problems when an actual alarm does occur because of how the operators tune them out. It seems like they completely missed the mark on this. The real problem is most likely where you would expect it: the people running the system. Human error, I am sure!
You don't even have to ignore the alarm that isn't there. But I don't think the "alert" that we're discussing is the big klaxon/flashing sign reading "OIL LEAK," or an oil pressure light with electrical tape over it. What the article indicates was missing was an automatic method of indicating that a failure was imminent. As far as the cost of determining this: learning from mistakes can be expensive. Not learning from mistakes is likely even more so.
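To make the distinction concrete, here is a rough sketch of a trend-based warning as opposed to a plain threshold alarm. It is only an illustration under assumed numbers: the function names, the one-minute sampling interval, the limit of 180 and the ten-minute horizon are all hypothetical, and nothing here is taken from the commission's report or from any real rig system.

```python
from typing import Optional, Sequence


def minutes_to_limit(readings: Sequence[float], limit: float,
                     interval_min: float = 1.0) -> Optional[float]:
    """Extrapolate the average recent slope to estimate time until `limit`.

    Returns None when there is too little data or the value is not rising.
    """
    if len(readings) < 2:
        return None
    elapsed = (len(readings) - 1) * interval_min
    slope = (readings[-1] - readings[0]) / elapsed
    if slope <= 0:
        return None
    remaining = max(limit - readings[-1], 0.0)
    return remaining / slope


def early_warning(readings: Sequence[float], limit: float,
                  horizon_min: float = 10.0) -> bool:
    """Warn if the trend would cross the limit within the horizon."""
    eta = minutes_to_limit(readings, limit)
    return eta is not None and eta <= horizon_min


if __name__ == "__main__":
    rising = [100.0, 110.0, 120.0, 130.0]  # made-up readings, one per minute
    print(minutes_to_limit(rising, limit=180.0))
    print(early_warning(rising, limit=180.0))
```

A plain threshold alarm on the same data would stay silent until the reading actually hit 180; the trend check speaks up while there is still time to act.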
I am not a crackpot.
Actually, there were blowout preventers in a redundant configuration, but when control was lost the main one failed to operate and the backup's batteries were in too poor a condition to work. As with most disasters, there were a myriad of contributing factors. After looking at numerous reports (everyone is certainly trying to make sure their investigations are public), it looks like:
1. Familiarity breeds contempt. Alarms were shut down or ignored, partly because of annoyance and partly because incorrect conclusions were drawn about the state the well was in, leading to a dangerous situation and disastrous consequences. It is not unlike pilots in poor visibility who distrust and ignore their instruments, leading to controlled flight into terrain.
2. Money trumps safety. There was tremendous corporate pressure to bring the well in. In the oil production world, almost everything is done by contract, with the petroleum producer owning and operating very little of what is going on. Rigs, crews and services are all contracted to do certain jobs and the competition is fierce. No one wants to be the company that could not do the task or was late getting it done. Consider: if some different decisions had been made and the well had been brought in safely, but say two or three months late and with several million more dollars spent, we would never have heard about any of it, and some of the well contractors, including individuals such as the rig boss and contract engineers, might have been looking for work elsewhere.
I'm interested to see if anything changes after all of the investigations, a la airline safety after a TSB investigation.
Have a peek at the Norwegian sector. We've been doing this shit since the 70s and try damn hard to not have another Alexander Kielland...
http://en.wikipedia.org/wiki/Alexander_L._Kielland_(platform)
The Norwegian petroleum oversight is something... The regulators are ruthless when it comes to compliance and, better yet, they are not directly controlled by politicians ;)
The cost of one fuckup is too much to allow people to cut corners.
I sure as hell don't in my job... and I do it for a living. When we have the option of doing it right or doing it fast, we pick right. Every time. I don't care if the customer is pissed at things being delayed. We do it -right-.
Unfortunately, a single alarm configuration on a "tag" could cost anywhere from 10k to 100k dollars.
The configuration isn't all that hard or time consuming, but the testing of the system after the modification is brutal. At least here, where it has to be certified to be allowed into operation ;)
Operator: "I can't do that, that has to be run through the PCDA office and certified by the technical staff first."
Manager: "Ok, I'll submit the paperwork"
PCDA: "This is a bad idea, let's fix it instead..."
Or something like that is how it goes here :p
If it even passes the manager. Most of the time the technical staff handles the alarms without telling any 'manager'. The operator responsible for the shift has authority over the day to day operation without any manager interference.
You can't operate if non-techies have more control than the techies over tech questions. It has been tried and abandoned ;)
Transocean Gulf of Mexico Rig, leased to BP, lacked Alarm Systems
They had this exact problem with Texas City-- they didn't do maintenance on the systems, so a subsystem overfilled with volatile hydrocarbons with no alarms going off at all-- and when one alert sounded at the monitoring area, they ignored it. They didn't invest the (relatively) small cost of installing a flare (to burn off excess), so the excess hydrocarbons spilled out into the open. Cost-cutting and an incredibly cavalier approach to maintenance from the London management generated a fucking fuel-air bomb in Texas.
This is one instance where the Brit management, when they changed to Hayward, should have told their investors to "fuck off-- er, give us a few years" and spent the necessary money to get their facilities up to snuff, or decommissioned the facilities that are too costly to maintain. Alas, profit motive proved more powerful than basic empathy or responsibility.
"We are Microsoft. You shall be assimilated. Competition is futile."
Doing the change: 3-4 hours of work.
Organizing the update to the controller in the field?
- Requires a look into what could be influenced by the change
- Requires in some cases an 'offline' load of the controller which can only be done at a time of a maintenance downtime (once a year at most, sometimes every 2-4 years)
Documentation:
- Documentation of what functionality changes for operators
- Update of system configuration diagrams
- Update of various tag info in the plant documentation system
Install:
- A job package must be written detailing every change made to the system.
- A test package must be written with a full test suite to check that nothing broke during the change. People make mistakes and this is important.
Now... How much will all this cost?
When I'm working on jobs like this the company I work for charges about 170 bucks an hour...
4*3 hours (The change, verification and signoff, various overhead)
5*2 hours (Field work, including travel time etc., x2 for 2 people)
8*3 hours (documentation, x3 due to document controllers, various overhead)
6*2 hours (job/test package)
5*5 hours (testing)
83 hours, 170 bucks an hour, 14110 USD.
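For anyone who wants to check or tweak the numbers, the estimate above can be reproduced with a few lines. The hours, multipliers and the 170 USD rate come straight from the list; the variable and function names are just illustrative.

```python
HOURLY_RATE_USD = 170

# (task, base hours, multiplier) exactly as listed in the estimate
LINE_ITEMS = [
    ("change, verification and signoff", 4, 3),
    ("field work incl. travel, two people", 5, 2),
    ("documentation", 8, 3),
    ("job/test package", 6, 2),
    ("testing", 5, 5),
]


def total_hours(items):
    """Sum base hours times multiplier over all line items."""
    return sum(base * mult for _task, base, mult in items)


def total_cost(items, rate):
    """Total billable cost at a flat hourly rate."""
    return total_hours(items) * rate


if __name__ == "__main__":
    print(total_hours(LINE_ITEMS), "hours")                   # 83 hours
    print(total_cost(LINE_ITEMS, HOURLY_RATE_USD), "USD")     # 14110 USD
```

Note that testing alone accounts for 25 of the 83 hours, more than any other line item.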
This is a fairly average estimate of what something would cost on -our- side of a very small change. If hardware is involved it rapidly skyrockets in cost.
In addition there is a myriad of people that need to check and verify the change on the -other- side of the fence. Namely the owner and/or operator of the plant.
All these time-consuming road-blocks put in place are barriers against making changes that could breach safety. They look arcane and silly to quite a lot of people but they are there for a reason.
Most of the accidents where I work happen when someone does a quick tiny change. One that "won't cause any issues", except that it turns out it does.
To see why small changes can have huge impacts have a look at this book: http://www.amazon.com/What-Went-Wrong-Histories-Disasters/dp/0884150275
I realize it would be horribly boring reading for anyone not interested in it :p
The problem is who is the competent professional who is working on alarms?
Is it the maintenance team who is backlogged with bullshit alarms that go off under normal process conditions because someone decided that it would work to prevent some disaster which may occur?
Is it the process / technical team who decided yet another alarm will be cheaper than re-designing the process to meet the safety guidelines?
Is it the console operator who has gone mental at the alarm going off constantly in the middle of the night and has requested the bypass?
Is it the control engineer who has approved the bypass for the same reasons without a process safety review?
Ask different people what they do with alarms and you'll get different answers, even within the same discipline. We have two process safety engineers at our refinery with two distinctly differing opinions. One thinks advanced warning is god and requests that a process alarm be put on everything, so that every instrument becomes a layer of protection. The other wishes that this was 1960, when signals were pneumatic and adding an alarm to the operations console cost a frigging fortune, because back then we had only sane and highly critical alarms.
The former is winning in my opinion. An alarm goes off in our control room every 2 minutes. There is a list of standing alarms for each area on each console operator's screen, and some alarms even go unacknowledged. Many more are bypassed. But all for what? In reality when something goes wrong the operator screen is flooded with priority 1 critical alarms and the operator can acknowledge maybe 1 or 2 before they just start playing on instinct and training to bring things under control.
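The difference between a steady trickle (one alarm every 2 minutes) and a flood can be put into numbers with a simple sliding window. This is only a sketch: the `FloodDetector` name, the 10-minute window and the threshold of 10 alarms are assumptions picked for illustration, not values from our control system.

```python
from collections import deque


class FloodDetector:
    """Counts alarms in a sliding time window and flags a flood."""

    def __init__(self, window_s: float = 600.0, threshold: int = 10) -> None:
        self.window_s = window_s      # assumed window length in seconds
        self.threshold = threshold    # assumed max alarms an operator can handle
        self.times = deque()

    def record(self, t: float) -> bool:
        """Record an alarm at time t (seconds); return True if in flood."""
        self.times.append(t)
        # Drop alarms that have fallen out of the window.
        while self.times and self.times[0] <= t - self.window_s:
            self.times.popleft()
        return len(self.times) > self.threshold


if __name__ == "__main__":
    detector = FloodDetector()
    # One alarm every 2 minutes for an hour: annoying, but not a flood.
    steady = [detector.record(t) for t in range(0, 3600, 120)]
    # Then an upset: 20 alarms within 20 seconds.
    burst = [detector.record(3600 + i) for i in range(20)]
    print(any(steady), burst[-1])
```

With these assumed settings the steady background never trips the detector, while the upset does within a few seconds, which is exactly when the operator has stopped reading individual alarms.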
If you get to the stage where you are relying on an alarm you have lost. Relying on operator intervention for process safety is the absolute last resort.