British Airways Says IT Collapse Came After Servers Damaged By Power Problem (reuters.com)
A huge IT failure that stranded 75,000 British Airways passengers followed damage to servers that were overwhelmed when the power returned after an outage, the airline said on Wednesday. From a report: BA is seeking to limit the damage to its reputation and has apologised to customers after hundreds of flights were canceled over a long holiday weekend. The airline provided a few more details of the incident in its latest statement on Wednesday. While there was a power failure at a data center near London's Heathrow airport, the damage was caused by an overwhelming surge once the electricity was restored, it said. "There was a total loss of power at the data center. The power then returned in an uncontrolled way causing physical damage to the IT servers," BA said in a statement. "It was not an IT issue, it was a power issue."
We all know that this outage was caused by bad faith outsourcing to unqualified persons. Who are they kidding?
https://www.theguardian.com/bu...
Oh yeah, power surges are to blame! haha no.
The dangers of knowledge trigger emotional distress in human beings.
BA has a DR site independent of the primary that suffered the power issue. But volume groups were not being mirrored correctly to the DR site. When they brought the DR site online, they were getting 3 or more destinations when scanning boarding passes. And since the integrity of the DR site was an issue, it could not be used.
Then the only option is to fix the primary DC, which would have involved installing new servers / routers / switches / etc, configuring them, restoring the data to the last known good state and then bringing it back online. Good luck to anyone trying to deploy new/replacement equipment en masse during the chaos of a disaster. And then restoring data!
Takes days, not hours... unlike whatever RTO/RPO they claimed to be able to meet.