Slashdot Mirror


Lightning Wipes Storage Disks At Google Data Center

An anonymous reader writes: Lightning struck a Google data center in Belgium four times in rapid succession last week, permanently erasing a small amount of users' data from the cloud. The affected disks were part of Google Compute Engine (GCE), a utility that lets people run virtual computers in the cloud on Google's servers. Despite the uncontrollable nature of the incident, Google has accepted full responsibility for the blackout and promises to upgrade its data center storage hardware, increasing its resilience against power outages.

77 of 141 comments

  1. Whaaa? by SpankiMonki · · Score: 4, Funny

    Permanently erased? How can this be? Doesn't Google keep an off-site backup of my pr0n on tape or DVDs or sumpthin? So much for best practices, I guess.

    1. Re:Whaaa? by GigaplexNZ · · Score: 5, Interesting

      The affected service was Google Compute Engine, meaning the data may have been changing at the time. Replication isn't instantaneous, so I'd imagine the lost data was pending modifications.
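To illustrate the point about pending modifications, here is a toy sketch of asynchronous replication. It is not Google's actual design; the class `AsyncReplicatedDisk` and its methods are hypothetical names made up for this example. Writes are acknowledged as soon as they land on the primary, and anything still queued when the power fails is lost.

```python
from collections import deque


class AsyncReplicatedDisk:
    """Toy model: writes are acknowledged locally, then replicated later."""

    def __init__(self):
        self.local = {}         # copy on the primary (assumed lost on power failure)
        self.replica = {}       # copy that survives the failure
        self.pending = deque()  # acknowledged writes not yet replicated

    def write(self, key, value):
        self.local[key] = value
        self.pending.append((key, value))  # acknowledged before replication

    def replicate(self, max_items):
        """Ship up to max_items pending writes to the replica."""
        for _ in range(min(max_items, len(self.pending))):
            key, value = self.pending.popleft()
            self.replica[key] = value

    def power_failure(self):
        """Primary dies; return the keys whose writes never reached the replica."""
        lost = [key for key, _ in self.pending]
        self.pending.clear()
        self.local.clear()
        return lost


disk = AsyncReplicatedDisk()
for i in range(5):
    disk.write(f"block{i}", i)
disk.replicate(max_items=3)   # only the first three writes get shipped
lost = disk.power_failure()
print(lost)  # ['block3', 'block4']
```

Only the most recent writes are lost, which matches the reports that the missing data was very new.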

    2. Re:Whaaa? by Anonymous Coward · · Score: 5, Interesting

      From what I read elsewhere it was new/current data, not even an hour old, and the lightening may have caused things to run off batteries for a bit too long due to the multiple strikes. Seems not unreasonable as an explanation, though it might be entirely wrong. Articles implied that users can also back up to their own sites to ensure that they are not beholden to anyone.

    3. Re:Whaaa? by Dog-Cow · · Score: 1, Troll

      No, the problem is that replication isn't instantaneous. If you can't accept reality, please kill yourself.

    4. Re:Whaaa? by sexconker · · Score: 4, Insightful

      All datacenter class storage devices should be backed by battery units with enough capacity to flush all pending writes to disk.
      I have never bought a server that didn't have battery-backed hardware RAID.

      Google, however, runs the cheapest, commodity parts, often refurbished / purchased used, and relies on software RAID and massive replication schemes. Such schemes don't work for new data, as they've found out.

      I wouldn't blame them if their shit got directly hit by lightning and that caused damage (you can't expect anything to survive that), but if we're saying the extended power outage caused data loss, then it's absolutely Google's fault.
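The sizing argument above (enough battery to flush all pending writes) comes down to simple arithmetic. A minimal sketch, assuming made-up figures; the function names, the 1 GiB cache, the 128 MiB/s write rate, and the 2x safety margin are all hypothetical and not taken from any real controller:

```python
def flush_seconds(dirty_bytes, write_bytes_per_sec):
    """Time needed to write all pending (dirty) cache data to disk."""
    if write_bytes_per_sec <= 0:
        raise ValueError("write throughput must be positive")
    return dirty_bytes / write_bytes_per_sec


def battery_is_sufficient(dirty_bytes, write_bytes_per_sec, holdup_seconds, margin=2.0):
    """True if the battery hold-up time covers the flush with a safety margin."""
    return holdup_seconds >= margin * flush_seconds(dirty_bytes, write_bytes_per_sec)


# Hypothetical figures: 1 GiB of dirty cache, 128 MiB/s sustained writes.
dirty = 1024 * 1024 * 1024
rate = 128 * 1024 * 1024
needed = flush_seconds(dirty, rate)
print(needed)                                    # 8.0 seconds to flush
print(battery_is_sufficient(dirty, rate, 30.0))  # True: 30 s covers 2 x 8 s
print(battery_is_sufficient(dirty, rate, 10.0))  # False: 10 s does not
```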

    5. Re:Whaaa? by Hognoxious · · Score: 5, Funny

      lightening may have caused things to run off batteries for a bit too long

      Darkening reduces the output of solar panels, so you can't win either way.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    6. Re: Whaaa? by Anonymous Coward · · Score: 3, Informative

      In this case it was disks the customer had specifically requested to be unreplicated for performance. So all the data could be gone in a flood and Google would still be within its obligations.

    7. Re:Whaaa? by Anonymous Coward · · Score: 1

      It was on the VAXcluster I administered in the '80s. The "reality" is that until the replication has happened, the data shouldn't in principle be regarded as ready for processing. In practice, a few seconds' write caching to ALL nodes is the limit of acceptability.

      I've come to learn that these days most people doing stuff "with computers" are doing nothing of value, and coincidentally this means that the fucking atrocious reliability and security of outsourcing are considered good enough.

    8. Re:Whaaa? by HxBro · · Score: 1

      I demand my data is backed up before I've sent it to google!

    9. Re:Whaaa? by DNS-and-BIND · · Score: 2

      The purpose of a battery backup isn't to let things run when the power goes out. The purpose of a battery backup is to allow an orderly shutdown when the power goes out, instead of a sudden outage that results in data loss and damaged systems.

      --
      Shutting down free speech with violence isn't fighting fascism. It IS fascism!
    10. Re:Whaaa? by TheCarp · · Score: 2

      Except you don't have to blame them, they took responsibility.

      Pretty sure what they have 'found out' is that paying for the fallout from the occasional freak occurrence and minor data loss is cheaper in the long run than buying more expensive hardware to guard against occurrences so rare that they end up on news sites.

      --
      "I opened my eyes, and everything went dark again"
    11. Re: Whaaa? by Anonymous Coward · · Score: 1

      Our datacenter generator failed to start at the last power outage we had; the culprit was a squirrel.

    12. Re:Whaaa? by gweihir · · Score: 1

      Very likely. Cloud providers are routinely lying about their performance, or you need to dig very deep to find what the actual technical assurances are.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    13. Re:Whaaa? by bobbied · · Score: 1

      Brief is apparently long enough to cause data loss. But power disruptions to running systems don't have to be very long to shut them down.

      However, Google apparently hasn't really engineered a proper solution here. In the data center, batteries are there only to bridge you over until the onsite power generator can come on line or until you can perform a safe and orderly shutdown. Plus, you *DON'T* shut down the generator or reboot shut-down systems after power comes back until you have enough battery capacity and generator fuel built back up to survive the next power transient.

      Few people think beyond the first event, but Google should know better.

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
    14. Re:Whaaa? by gstoddart · · Score: 1

      But but ... it's the cloud. Why would you need a backup in the cloud?

      You mean we've been lied to?

      --
      Lost at C:>. Found at C.
    15. Re:Whaaa? by Darinbob · · Score: 1

      It's a good thing that users routinely make local backups before uploading to the cloud.

    16. Re: Whaaa? by Chexsum · · Score: 1

      it's my understanding that google backs up their backups. cue backup song

      --
      Pixels keep you awake!
  2. Re:Cannot be trusted by Anonymous Coward · · Score: 5, Interesting

    Just use Amazon like everyone else. Google cannot be trusted, and I have said that many times. The problems are 1) they frequently decide to shut down services users rely on (one of the persistence mechanisms we depended on recently got the head shot, costing us so much money that we decided to move to Amazon, which has a standardized stack), 2) data loss, and 3) non-existent customer service. Try contacting Google with a pressing issue.... you'll eventually give up.

  3. Oh realy? by SeaFox · · Score: 5, Interesting

    Lightning struck the same place not twice, but four times?

    1. Re:Oh realy? by Anonymous Coward · · Score: 5, Funny

      Cloud to Cloud lightning is about 3 times more common than cloud to ground, so it's not that crazy.

    2. Re:Oh realy? by Anonymous Coward · · Score: 5, Informative

      Contrary to popular belief, it's common for lightning to strike several times in the same place. Something that attracted lightning, if not destroyed by the first strike, will still be a major target for the next ones.

    3. Re:Oh realy? by Anonymous Coward · · Score: 4, Funny

      It's cloud computing after all. Of course there will be frequent lightning.

    4. Re:Oh realy? by Anonymous Coward · · Score: 5, Funny

      Another reason not to store your important data in the cloud.

    5. Re:Oh realy? by Anonymous Coward · · Score: 1

      I hate it when I miss an obvious pun. I shall now steal this for every argument against cloud computing, along with all the serious ones.

    6. Re:Oh realy? by Anonymous Coward · · Score: 1

      Agreed ... I have a tree that my city won't let me cut down because it is an old growth tree. It has been hit 3x, and every time it happens the EMP from the bolt nukes half of the running electronics on that side of the house.

    7. Re:Oh realy? by DigiShaman · · Score: 4, Interesting

      How rapid? I'm of the understanding that the first bolt ionizes the pathway in the air thereby reducing the resistance for the subsequent strikes to follow. All of this occurring within a few seconds.

      --
      Life is not for the lazy.
    8. Re:Oh realy? by Big+Hairy+Ian · · Score: 1

      Somebody was on the roof wearing armour and yelling "Gods a bastard!"

      --

      Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.

    9. Re:Oh realy? by slimshady76 · · Score: 2, Funny

      Another reason not to store your important data in the cloud.

      Arrrgh, somebody please mod this up!!! You made me snort my coffee.

    10. Re:Oh realy? by pubwvj · · Score: 4, Interesting

      Lightning striking in the same spot repeatedly is a lot more likely than people think. The reason lightning may have struck a spot is due to there being a good path. Thus lightning is likely to strike that easy path again.

      We have that. We live on a mountain where there is a large copper vein running under us. I have watched lightning strike repeatedly in the same spot.

      There are videos of lightning repeatedly striking tall buildings during a single storm.

      Moreover, lightning does not need to be very close to do a lot of damage. In a recent storm we had nine nearby strikes, not all in the same spot but spread out over at least a square mile of our land. We lost many miles of wire because the EMP that the lightning strikes generated got picked up by the wires and overloaded them, causing the wires to melt. Some sections of fence wire simply vanished. Google could have had a few nearby strikes that did that. This happens.

      See:
      http://sugarmtnfarm.com/2015/0...
      and
      http://sugarmtnfarm.com/2015/0...

    11. Re:Oh realy? by bobbied · · Score: 1

      If you don't get directly hit by lightning, it can still do a lot of damage to things... There are two ways this happens...

      1. Induced currents in wires close to the strike - This is where the huge amount of current flowing into the ground from the lightning strike induces currents in other wires. It's basically cross talk, but with a really large impulse input in one wire, you get some pretty impressive signals in parallel wires. This is what fries your corded phones and plugged-in electronics when lightning hits the tree down the street.

      2. Disruption of the local "ground" reference - Lightning pumps a boat load of charge into the ground at one point and it takes a bit of time for this to dissipate. This causes a very large voltage potential to exist between points on the ground. So if you have multiple ground reference points spread around, any device that bridges the gap between two ground references can experience huge voltages. A common example is an electric fence charger which has a ground rod. Its local ground reference and the power company ground may be only feet apart, yet it can see hundreds of thousands of volts between the "common" and its local ground and arc over.

      You avoid lightning damage by first keeping yourself from being hit directly and then being very careful to lower your exposure to the above two risks.

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  4. In other news... by LatePaul · · Score: 5, Funny

    ... Alphabet's new "personal re-vivification" project is making good progress. The project leader, V.Frankenstein was unavailable for comment however.

  5. Re:Cannot be trusted by Anonymous Coward · · Score: 5, Informative

    The announcement is about Google Compute Engine. Not about Google's own services (gmail, search, photos, docs, that sort of thing). AFAIK, none of Google's own services announced any loss - presumably because they don't rely on a single location.

    From the post:

    > In particular, it was possible at all times to recreate new Persistent Disks from existing snapshots.

    i.e. snapshots were fine.

    > This outage is wholly Google's responsibility. However, we would like to take this opportunity to highlight an important reminder for our customers: GCE instances and Persistent Disks within a zone exist in a single Google datacenter and are therefore unavoidably vulnerable to datacenter-scale disasters. Customers who need maximum availability should be prepared to switch their operations to another GCE zone. For maximum durability we recommend GCE snapshots and Google Cloud Storage as resilient, geographically replicated repositories for your data.

    So, if some poor users of GCE thought a single geographical location can withstand disasters, they now know.

  6. Another Flash Vulnerability ? by Dave+Whiteside · · Score: 5, Funny

    n/t

    --
    who where what when now?
    1. Re:Another Flash Vulnerability ? by Anonymous Coward · · Score: 1

      I bet that sent a shockwave

    2. Re:Another Flash Vulnerability ? by Carewolf · · Score: 1

      No I think this was using Thunderbolt

    3. Re:Another Flash Vulnerability ? by Bongo · · Score: 1

      Oh wait, I never realised my Lightning connector was for charging my iPhone outdoors.

    4. Re:Another Flash Vulnerability ? by Anonymous Coward · · Score: 1

      Witness said "it was like a silver light"

  7. Re:What do you mean permenantly erased...only 1 DC by Anonymous Coward · · Score: 2, Informative

    If you RTFA you'll see they mention it only affected "recently written data" that had not yet made it to persistent storage. So probably only a few hours old at most.

  8. Re:What do you mean permenantly erased...only 1 DC by Anonymous Coward · · Score: 1

    GCE is not a backup solution.

    Google cloud data is geographically replicated.

  9. If my calculations are correct... by h33t+l4x0r · · Score: 3, Funny

    That comes to 4.84 Jiggawatts! No wonder there was outage.

    1. Re:If my calculations are correct... by binarylarry · · Score: 1

      Great Scott!

      --
      Mod me down, my New Earth Global Warmingist friends!
  10. But google said... by Chrisq · · Score: 1

    But google said that it "....replicates data three times for redundancy. It can afford to be cavalier about hardware failures. So a drive fails. Log it, switch queries on that data to a replica and move on. It's all pretty instant".

    1. Re:But google said... by F.Ultra · · Score: 1

      That article is not about the Google Compute Cloud though.

    2. Re:But google said... by F.Ultra · · Score: 1

      Exactly, I've had Amazon take down EC2 instances and then all data stored there went to /dev/null so if the data is important you have to do backup. That people are shocked about that is very strange.

  11. Re:What do you mean permenantly erased...only 1 DC by Viol8 · · Score: 2

    That's no excuse. It should be distributed amongst separate machines in separate centres instantaneously.

  12. Re:What do you mean permenantly erased...only 1 DC by bledri · · Score: 5, Insightful

    That's no excuse. It should be distributed amongst separate machines in separate centres instantaneously.

    So faster than the speed of light using the infinitely-wide infinite improbability data bus?
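Joking aside, the speed of light does put a floor under synchronous replication between sites. A back-of-the-envelope sketch, assuming a hypothetical 1000 km fiber path and the common rule of thumb that light in fiber travels at roughly two thirds of its vacuum speed (the function name and the distance are illustrative only):

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FRACTION = 2 / 3             # rough rule of thumb for optical fiber


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber of the given length."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FRACTION)
    return 2 * one_way_s * 1000


rtt = min_round_trip_ms(1000)  # hypothetical 1000 km between sites
print(round(rtt, 2))           # about 10 ms, before any switching or disk time
```

Every write that waits for a remote acknowledgement pays at least that much, which is why "instantaneous" replication across centres is not on offer.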

    --
    Some privacy policy Slashdot.
  13. Also, adding a couple of lightning rods by Rinikusu · · Score: 1

    And a mad scientist with a cobbled together corpse.

    --
    If you were me, you'd be good lookin'. - six string samurai
  14. Strike one for the cloud! by Alumoi · · Score: 1

    Nuf said.

    1. Re:Strike one for the cloud! by Zak3056 · · Score: 1

      From the summary, it was at least strike four...

      --
      What part of "shall not be infringed" is so hard to understand?
  15. location location location by Skapare · · Score: 1

    location location location ... all data should be replicated in at least THREE locations around the world.

    --
    now we need to go OSS in diesel cars
  16. Re:What do you mean permenantly erased...only 1 DC by Skapare · · Score: 2

    hours to replicate data? why so long? are they still running __________? (insert your most hated system/language)

    --
    now we need to go OSS in diesel cars
  17. Headline Failure by smallfries · · Score: 2

    Come on people. This has the potential to be legend... ary. What a complete failure.

    Even just from a quick punt we could glimpse such lyrical word play as:
    "Lightning strike inside Cloud"
    "Cloud damaged by lightning"
    "Cloud not lightning-proof"

    Please read the fucking Register until you gets it.

    --
    Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
  18. Re: Cannot be trusted by Lord+Bitman · · Score: 4, Insightful

    I know that's what I base my critical data storage choices on - how fast a tangentially-related service's static front page loaded 15 years ago on dialup.

    --
    -- 'The' Lord and Master Bitman On High, Master Of All
  19. Offsite backups by Meneth · · Score: 1

    This is why you need offsite backups, preferably on hardware under your own direct control.

  20. Re: lightning strike by MasterOfGoingFaster · · Score: 5, Insightful

    Bullshit. I used to design high voltage connections, and tested using a 300kV impulse generator. I've seen a lot of crazy stuff analyzing field failures. You can greatly reduce the risk, but you cannot remove all risk in an above ground facility, as a practical matter.

    I do see lots of silly stuff done, based on myth and lack of knowledge.

    --
    Place nail here >+
  21. Re:What do you mean permenantly erased...only 1 DC by Viol8 · · Score: 3, Informative

    *GUFFAW*

    You're wasted here, you should do stand up.

  22. Re: Cannot be trusted by Antique+Geekmeister · · Score: 1

    I'm afraid that had nothing to do with write speed for storage. It had a great deal to do with deliberately keeping the design very light, using very effective proxies for the very light and consistent images, and keeping tight reins on web designers who might want to front load the Google pages with exciting content that had no relationship to the service.

    I'm confused why you mention it, unless you think that the wise practices and designs that led to such effective and quick interfaces affected their storage designs, as well.

  23. What did you expect? by johnsnails · · Score: 1

    That's why you don't put things in the cloud, put them in the aether.

  24. Re:What do you mean permenantly erased...only 1 DC by ScentCone · · Score: 2

    That's no excuse. It should be distributed amongst separate machines in separate centres instantaneously.

    You can have that service if you want to pay for it. You get that, right?

    --
    Don't disappoint your bird dog. Go to the range.
  25. Did you wipe the server clean? by Muntzsky · · Score: 1

    "What, like with a cloth or something?"

  26. Re: lightning strike by gx5000 · · Score: 1

    Place Lightning Rod "A" in area "B", repeat until coverage and grounding complete.
    Or maybe they should start digging, copy the NSA and put everything underground. That would solve most issues, no?

    --
    End of Line.
  27. Oxymoron by Dcnjoe60 · · Score: 1

    Despite the uncontrollable nature of the incident, Google has accepted full responsibility for the blackout and promises to upgrade its data center storage hardware, increasing its resilience against power outages.

    If it is uncontrollable, then any changes Google makes won't matter. On the other hand, if using other equipment, hardening the system, installing better grounding, etc. would have kept the loss from happening, then it is controllable. Maybe what they meant to say was unpredictable. Of course, then they would have had to explain why they didn't plan for the possibility.

    1. Re:Oxymoron by bobbied · · Score: 1

      No, not really. As I understand this, what happened is they experienced multiple failures of the AC power coming into the facility, which happened to be lightning-induced. When the power went out the first few times, the UPSes switched to battery power and everything kept going as the batteries provided the necessary power. Eventually, after repeated dependence on the UPS batteries during the multiple failures, the batteries had no more power left and the UPS output power stopped.

      This is really a process problem. Apparently nobody recognized that they had lost their redundant power system's battery capacity and took the necessary steps to move the processing to a system where full redundancy still remained. Their process failed them, although it apparently took *four* power failures to expose the issue.
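The depletion scenario described above can be sketched as a small simulation. All the numbers here (battery capacity, load, recharge rate, outage lengths) are invented for illustration and say nothing about Google's actual equipment; `simulate_ups` is a hypothetical helper, not a real library function.

```python
def simulate_ups(capacity_wh, load_w, recharge_w, events):
    """Track UPS charge across alternating outage and recharge periods.

    events: list of ("outage" | "mains", duration_seconds) tuples.
    Returns the index of the first event during which the battery
    runs empty, or None if the load was carried throughout.
    """
    charge_wh = capacity_wh
    for index, (kind, seconds) in enumerate(events):
        hours = seconds / 3600
        if kind == "outage":
            charge_wh -= load_w * hours          # battery carries the load
            if charge_wh <= 0:
                return index
        elif kind == "mains":
            charge_wh = min(capacity_wh, charge_wh + recharge_w * hours)
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return None


# Hypothetical: four 100 s outages with only 60 s of mains power in between.
events = [("outage", 100), ("mains", 60)] * 3 + [("outage", 100)]
failed_at = simulate_ups(capacity_wh=1000, load_w=10_000, recharge_w=1000, events=events)
print(failed_at)  # 6, i.e. the fourth outage
```

With these made-up figures the battery handles any single outage comfortably, but the short recharge windows mean it goes flat during the fourth one.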

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  28. like lightning on your wedding day by Thud457 · · Score: 1

    Wait...
    Are you saying Google users lost their data from the cloud due to lightning?
    Whoever scripts your reality down there is the most inept of hacks.

    --

    the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

  29. Re: Cannot be trusted by Coren22 · · Score: 1

    I base all my decisions on ACs responding to themselves. Splitting a single post up into three posts all acting like different people agreeing makes an AC look that much more trustworthy.

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  30. Re:How about learning to spell? by Coren22 · · Score: 1

    Around here, there is a guy that runs for sheriff (never really cared if he won) named Moran, it is entirely possible that the poster is a joke.

    http://www.washingtonpost.com/...

    All his signs say "Moran for sheriff", I always read it as Moron for sheriff and wondered why anyone would vote for a moron for sheriff.

    --
    APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
  31. Re:Lightening is wierd stuff by bobbied · · Score: 1

    Actually, this seems to be a problem with their power redundancy systems, not really a lightning protection problem.

    After three lightning-induced power outages, the UPS ran out of reserve capacity, so on the fourth power outage the UPS dropped the system's power.

    Really, what they have is a process failure... They should have kept their standby AC generator running until the UPS batteries were charged enough to safely handle another failover. Either that, or they should have quickly moved the processing out of the facility until full redundancy could be restored.

    --
    "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
  32. So... by drunk_punk · · Score: 2

    If I understand this correctly, to get your personal data removed from Google search engines it requires 4 lightning strikes to the exact same location?

    Must have missed that part in the EULA...

  33. no generators? by swschrad · · Score: 1

    come ON, google eyes... slack-writing disks is always a danger. you don't have generator backup for the battery backup, and you're slopping data all over the floor as a result?

    losers.

    --
    if this is supposed to be a new economy, how come they still want my old fashioned money?
  34. first bump should have taken the center offline by swschrad · · Score: 1

    and all data flushed to disk, and resources examined, before they opened for input again.

    --
    if this is supposed to be a new economy, how come they still want my old fashioned money?
  35. you've never been near high places, then, son by swschrad · · Score: 1

    at some point in your life, have you ever listened to AM radio or broadcast television? their stuff gets hit with lightning all the time. if a really good thunderstorm happens to float over their tower, AM radio tends to sound like "Weather Desk Radar shows the //SPLATbzzHummm// county, with //SPLATbzzHummm// t's the latest repor //SPLATbzzHummm//" with the transmitter knocked off the air every few seconds by a direct hit. in NTSC television, the return would be about 3-4 seconds of variable contrast and variable quality sound all over the place.

    many are the films and photos of multiple lightning hits on the Empire State Building, because it's tall enough and prominent enough that folks single it out for their photo assignments.

    --
    if this is supposed to be a new economy, how come they still want my old fashioned money?
  36. instead, check out polyphaser.com by swschrad · · Score: 2

    you left out about 99-44/100 percent of the technology and art of lightning protection.

    --
    if this is supposed to be a new economy, how come they still want my old fashioned money?
  37. depends on who lost data in slack-write centers by swschrad · · Score: 1

    if it's yours, I could give a shit.

    if it's mine, it's war.

    --
    if this is supposed to be a new economy, how come they still want my old fashioned money?
  38. Re:What do you mean permenantly erased...only 1 DC by sysrammer · · Score: 1

    You're wasted here, you should do stand up.

    And get hit by lightning?

    --
    His ignorance covered the whole earth like a blanket, and there was hardly a hole in it anywhere. - Mark Twain
  39. Re:How about learning to spell? by dcw3 · · Score: 1

    There's a Senator, and Representative by the same name. Speaks volumes for Congress.

    --
    Just another day in Paradise
  40. Really? 4 times? Court ordered data? by jimbob6 · · Score: 1

    Well, that's what Google gets for building a data center at the top of castle Frankenstein.

  41. Re: Cannot be trusted by coolsnowmen · · Score: 1

    Oh Lord Bitman. One person said ALL of google is shit, based on shit. The responder said you must be too young to remember what google gave us, and how it was better than everything. How its search engine set the bar, and how its free email was a gift to many, many people. And you say... non-sequitur in this case. My critical storage needs!!!