Slashdot Mirror


Extreme Heat Knocks Out Internet In Australia

An anonymous reader writes with news that bad weather caused internet connectivity problems for users in Perth, Western Australia on Monday. But it wasn't raging storms or lightning that caused this outage; it was extreme heat. Monday was the 6th hottest day on record for Perth, peaking around 44.4 C (111.9 F). Thousands of iiNet customers across Australia found themselves offline for about six and a half hours after the company shut down some of its systems at its Perth data center at about 4.30pm AEDST because of record-breaking temperatures. ... "[W]e shut down our servers as a precautionary measure," an iiNet spokesman said late Monday night. "Although redundancy plans ensured over 98 per cent of customers remained unaffected, some customers experienced issues reconnecting to the internet." ... Users in Western Australia, NSW, Victoria and South Australia took to Twitter, Facebook and broadband forum Whirlpool to post their frustrations to the country's second largest DSL internet service provider.

64 of 103 comments

  1. One ISP is not 'the internet' by pipedwho · · Score: 4, Informative

    Yes, many people were affected, but iiNet is not 'the internet'. All the other big providers were still running just fine.

    You could post the same headline every time someone's modem cable gets knocked out or their router crashes.

    1. Re:One ISP is not 'the internet' by TapeCutter · · Score: 2

      The recent heatwave will have knocked plenty of people offline at the pole; the first hot day in a Melbourne summer is always chaos on the trains for similar reasons (hot metal expands). However, this incident just appears to be iiNet's server room air-con that fell over, nothing to do with the weather.

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    2. Re:One ISP is not 'the internet' by kv9 · · Score: 1

      What do you mean, it has "inet" right in the name!

    3. Re:One ISP is not 'the internet' by bill_mcgonigle · · Score: 1

      You could post the same headline every time someone's modem cable gets knocked out or their router crashes.

      A large chunk of Northern VT and NH was knocked out for most of a day a few weeks back when some server monkey at Fairpoint bent the connectors on a blade server (and they apparently have no redundancy). The restore time was the drive time from Boston on their SLA plus a winter storm going on (coincidentally, but not good timing for maintenance).

      But, shit happens. Was there a Slashdot front pager about it? Of course not - what kind of drama queens are submitting this crap? Oh, people around here are heavily armed and content, that must be why they don't get so upset (:poke, poke, nudge, nudge:, how's that crime rate?).

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  2. Re:(in)Tolerance by talis9 · · Score: 5, Informative

    they had multiple failures. The primary air con failed, and so did the backups

    http://blog.iinet.net.au/statement-chief-technology-officer-mark-dioguardi/

  3. Statement from CTO of iiNet by Anonymous Coward · · Score: 4, Informative

    http://blog.iinet.net.au/statement-chief-technology-officer-mark-dioguardi/

    Basically both main and backup aircon went down.

    1. Re:Statement from CTO of iiNet by turbidostato · · Score: 2

      "Basically both main and backup aircon went down."

      Which is only, oh, so unusual.

      There are two (basic) kinds of high availability, load balancing and redundancy, each with its typical failure mode:
      1) load balancing: the surviving part can't cope with the aggregated load and it also goes down.
      2) redundancy: once the main fails, either the migration protocol fails or the reserve doesn't work.
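
      The two failure modes above can be sketched as a simple capacity check. This is only an illustration: the function name and the kW figures are hypothetical, and nothing here reflects iiNet's actual plant.

```python
def survives_failure(unit_capacities_kw, load_kw, failed=1):
    """Return True if the remaining cooling units can still carry the load
    after the `failed` largest units drop out (the worst case)."""
    keep = max(0, len(unit_capacities_kw) - failed)
    remaining = sorted(unit_capacities_kw)[:keep]  # drop the largest units
    return sum(remaining) >= load_kw


# Hypothetical figures, not taken from the article.
# Load balancing: two 60 kW units share a 100 kW load; one alone cannot cope.
print(survives_failure([60, 60], load_kw=100))
# Full redundancy: either 100 kW unit can carry the load by itself.
print(survives_failure([100, 100], load_kw=100))
# Half-size standby: fine for maintenance windows, not for losing the main.
print(survives_failure([100, 50], load_kw=100))
```

      The check says nothing about the second failure mode's other half (the switchover itself failing); it only shows whether the surviving capacity is enough on paper.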

      This is HA 101 knowledge, but as long as CTOs can go with an "oh! who could expect two failures in a row!" instead of being fired on the spot, we'll see this kind of thing happening again and again.

      The URL above should have been his resignation letter.

    2. Re:Statement from CTO of iiNet by paziek · · Score: 1

      Why would you fire someone who just got more experienced - especially in such an important field - than when he was hired?

    3. Re:Statement from CTO of iiNet by thegarbz · · Score: 1

      Because a customer's internet went down due to equipment failure during record breaking weather conditions the CTO should automatically resign?

      May I suggest you get a grip and come on back down to reality. From here you can assign and prove fault / wilful negligence.

      As for your HA 101 comment, the world doesn't work on hazards, it works on risk, and it works on minimum acceptable risk. I would rather see the CTO get fired if he spent countless dollars gold plating a system so a few measly customers can get four 9s uptime on their internet. It's a waste of money.
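
      For scale, here is a rough sketch of what "four 9s" allows per year. It assumes a 365-day year and, purely for illustration, treats the six and a half hours from the summary as the only outage of the year and as hitting every customer, which the summary does not claim.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_pct):
    """Hours of downtime per year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def availability_pct(downtime_hours):
    """Annual availability implied by a given number of downtime hours."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows {hours:.2f} h/year ({hours * 60:.0f} min)")

# Illustrative assumption: one 6.5 h outage and nothing else all year.
print(f"6.5 h of downtime is {availability_pct(6.5):.3f}% availability")
```

      On those assumptions a single outage of that length already rules out four 9s for the year, while still sitting above three 9s.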

    4. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "Because he was supposed to have known better"

      Exactly. And more often than not, it's not only that he doesn't know better but that he doesn't care. These failures come from spending more time on politics and CYA than on pushing to do a good job.

    5. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "Because a customer's internet went down due to equipment failure during record breaking weather conditions"

      No, the problem didn't come from record-breaking conditions but because the N+1 high availability system designed to cope with those conditions not only failed, but failed in the most predictable way due to lack of forethought.

    6. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "I would rather see the CTO get fired if he spent countless dollars gold plating a system"

      What do you call, then, paying for an aircon system bigger than needed that turns out not to work when needed? He was doing exactly what you claim he shouldn't do.

    7. Re:Statement from CTO of iiNet by aaarrrgggh · · Score: 2

      0.4% design temperature for Perth is 36.2C. A DX system with condensers on the roof would typically be designed for a temperature of 41-43C. Once you get much above that, there isn't much you can do with DX; you will overload your compressors quickly. A cooling tower should be more robust, but your envelope load could have exceeded the primary system capacity.

      Typically in extreme temperatures a Tier 3 data center will need to eat into its redundancy for cooling. Tier 4 facilities should be more robust, but you would not expect 2N redundancy when you have record temperatures.

      As for load balancing and other edge conditions, it really depends on how heavily loaded a facility/portfolio is. What I expect happened was one part of their facility went down with a "hot spot" that didn't have adequate redundancy in the first place. It is an edge condition that likely required more capital than it was worth to resolve.

    8. Re:Statement from CTO of iiNet by hjf · · Score: 1

      CTO/CIO IS a political position. CTO sits at the table with the CEO and the board. Do you think a scruffy bearded neck with a hawaiian shirt and flip flops has ANY chance of doing that?

    9. Re:Statement from CTO of iiNet by thegarbz · · Score: 1

      You're assuming a true N+1 system was what was designed. Back to risk since you clearly don't understand the concept. We have here a case of a very small number of customers experiencing a problem for a very small time period. You can now go up to upper management and justify to them why they should spend money on this.

      Not everything needs to be gold plated, and not everything needs to be N+1. Quite the opposite actually, if you head into many substations or building complexes in the country you'll likely find the "redundant" air-conditioning is half the size of the duty unit installed mainly for maintenance purposes, not so that it can ride through extreme heat by itself.

    10. Re:Statement from CTO of iiNet by thegarbz · · Score: 1

      Why are you looking at the aircon? You should be looking at the customers. Now how many customers did iiNet lose as a result of this short outage affecting only a tiny percent? None?

      So tell me again what the design case was? Companies exist to make money, not to provide four 9s uptime for their clients. This so far was a one-off case. iiNet is one of the more reliable providers in the country, so what is the business decision behind gold plating the system so it never goes down? How much money are you willing to spend when you get absolutely nothing in return?

      If the number is greater than zero, let me know and I'll send you my bitcoin wallet number to help relieve you of your burden.

    11. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "CTO/CIO IS a political position. CTO sits at the table with the CEO and the board."

      It _is_ a political role. But it is not *only* a political role. This comes from forgetting where the T comes from.

    12. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "You're assuming a true N+1 system was what was designed."

      I do, because that is exactly what the CTO himself said: "Our Perth data centre was subject to a partial failure of both the mains and backup air-conditioning systems yesterday".

      "Not everything needs to be gold plated, and not everything needs to be N+1."

      Quite true. But here there *was* an N+1 system just to allow the "+1" part to fail, which begs the question of why the expenditure on the "+1" part was allowed.

    13. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "Why are you looking at the aircon? You should be looking at the customers."

      No, I shouldn't. The air conditioner was an already-paid-for expenditure that was approved on a now demonstrably failed ROI. And the fact that the CTO mixes apples and oranges makes it clear that he is just in CYA mode, both about himself and his company.

    14. Re:Statement from CTO of iiNet by thegarbz · · Score: 1

      Backup does not mean anything. What backup, what capacity? Does the backup assume the same worst case as the main? Is the backup susceptible to the same failure as the primary?

      You see, you don't actually know anything. You think you do because you read two soundbites and now you're asking for the CTO's head.

    15. Re:Statement from CTO of iiNet by thegarbz · · Score: 1

      Tell me again how it failed its ROI. It has successfully kept the datacentre cool for all but a couple of hours of the year.

      You're starting to use all fancy words and are beginning to talk more and more crap in the process. But since you know everything, tell me again what all the design decisions were that went into the investment. You seem to know it failed its ROI so I assume you're intimately familiar with the design of that datacentre and the business case for operating it. So please do share with us.

    16. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "You seem to know it failed its ROI"

      Of course I do. The point of investing in a backup, whatever the investment plan is, is to cover for the time when the mains is in failing or maintenance mode. Here the mains went nuts and the backup was not there to cover it. There's no need for more "intimate familiarity" than that.

    17. Re:Statement from CTO of iiNet by thegarbz · · Score: 1

      Please continue to show your ignorance. A "backup" has many definitions especially those used for maintenance. If you think all "backup" systems should run regardless of what is happening to the primary system then you're deluded. Many backups installed are half-size, or even quarter-size, to allow work in scheduled periods. Systems that are full-size are not designed to cope with preventing failure due to overload and never have been.

      But you said of course you know the ROI equation so please tell me, how many times was the backup used? What was it used for? How much did it cost to install? How much did its unavailability cost the company now? How much would the lack of a backup have cost for past times where it was used? Come on, answers, man; you clearly claim to know the full design criteria of the plant so let's see it.

      You sound like the type of person who installs stuff and maintains stuff. You're told something is a "backup" without ever having been part of the project team and you make countless assumptions due to this. Am I right that you've never seen a design criteria document or project costing document? That's not your fault; you're not expected to know that.

      The only thing you're guilty of is the endless streams of assumptions you are making.

      Anyway I'm out. This conversation is going nowhere. I suggest you maybe enrol in a project management course so you understand what the terms like ROI actually mean and why a failure of the system has very little to do with ROI especially when you amortise the cost over the life of the project.

    18. Re:Statement from CTO of iiNet by turbidostato · · Score: 1

      "Please continue to show your ignorance. A "backup" has many definitions especially those used for maintenance."

      It might be the general case but it certainly is not the current one: the CTO stated that the backup was used to cover for the mains failure (and failed at that), so it is not a "maintenance-only" setup.

      "Anyway I'm out. This conversation is going nowhere. I suggest you maybe enrol in a project management"

      That's true. You just keep saying the same stupid things, babbling out some disconnected concepts you probably read about on Wikipedia, despite *you* being the one implying things you don't have a damn clue about.

  4. Coincidence? by Anonymous Coward · · Score: 2, Funny

    Someone call Al Gore--as he's an expert on both the Internet and Global Warming--he'll know what to do.

    1. Re:Coincidence? by Hognoxious · · Score: 1

      Get Bennet Haselton to ship in all the ice from burning man.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  5. Re:(in)Tolerance by Joe_Dragon · · Score: 1

    Did the servers get shut down, or did they do a hard power off after tripping the overheat shutdown system in the box?

  6. Internet in Oz down by rossdee · · Score: 1

    " Users in Western Australia, NSW, Victoria and South Australia took to Twitter, Facebook and broadband forum Whirlpool to post their frustrations to the country's second largest DSL internet service provider."

    Obviously it wasn't the whole of Australia's internet that was affected

    How do they get on in the outback? It must get near 50C there

    One thing about global warming though - when it gets hot enough the ocean will dry up and they should be able to spot MH370 easily

    1. Re:Internet in Oz down by mjwx · · Score: 2

      How do they get on in the outback? It must get near 50C there

      The hottest place in Australia is Marble Bar.
      http://en.wikipedia.org/wiki/M...

      It averages maximum summer temperatures in excess of 41 C. Average yearly temps are around 35 C so it doesn't get much cooler in the winter.

      I'm certain that they would see the odd day above 50 C there.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    2. Re:Internet in Oz down by _merlin · · Score: 1

      I've experienced 50'C outside in Finley once. It makes you feel horribly lethargic, just don't want to do anything.

    3. Re:Internet in Oz down by mjwx · · Score: 1

      I've experienced 50'C outside in Finley once. It makes you feel horribly lethargic, just don't want to do anything.

      I used to live up north in a Pilbara mining town. I saw 50 C a few times. Not a day you really want to spend outside unless your job was in a giant tin shed.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
  7. last summer by slashmydots · · Score: 2

    Last summer in Wisconsin, believe it or not, it got around 100F for several days and it knocked out our internet. It wasn't some morons with inadequate server cooling though. Apparently Time Warner equipment runs on 90V lines and our energy company's equipment that drops to 90V was overheating. Unbelievable! Our digital phones were down too.

  8. Re:Slow news day? by ChunderDownunder · · Score: 2

    You're a galah - the summary clearly says it wasn't a storm.

  9. Re:(in)Tolerance by WillKemp · · Score: 1

    As i understand it, they shut some servers down because they were worried about overheating.

  10. From TFA by Etherwalk · · Score: 1

    It's unbelievable that a data centre can't cope with an extra degree or two. What sort of idiot designs these places? Haven't they heard of tolerances?

    They had air conditioners fail. They probably needed more redundancy, but they shut down some systems as a precaution when the AC failed.

  11. It is a big cover up. by Anonymous Coward · · Score: 1

    They were really clearing a nest of drop-bears from the server room and had to reverse the air-con to drive them out, so the servers shut themselves down to prevent overheating.

    1. Re:It is a big cover up. by talis9 · · Score: 1

      Every Australian knows, if you take the proper precautions and put Vegemite behind your ears then you don't get attacked by Drop Bears

  12. Re:Slow news day? by Anonymous Coward · · Score: 1

    It's a ClimateChangeDoom sort of thing. The real story is environmental control and contingency planning at the Perth Data Center. If it were my data center we'd rent coolers before we shut anything down. It was 44.2C on 26-Dec-2007, not as if it could never happen again.

  13. It's not the heat! by Vinegar+Joe · · Score: 2

    It's the humidity.

    --
    "The average reporter we talk to is 27 years old......They literally know nothing." - Ben Rhodes
    1. Re:It's not the heat! by mjwx · · Score: 2

      It's the humidity.

      And if that doesn't get you we have sharks, snakes, spiders, jelly fish, drop bears and backpacker murderers.

      --
      Calling someone a "hater" only means you can not rationally rebut their argument.
    2. Re:It's not the heat! by Anonymous Coward · · Score: 1

      It's a dry heat

  14. Plan ahead by blogagog · · Score: 1

    I used to work for an ISP in very hot Texas, but we planned ahead. We kept all of our servers indoors so they were out of the heat. It's a really good idea. I think everyone should do it.

  15. Re:(in)Tolerance by itzly · · Score: 4, Insightful

    Yes, it's unbelievable that something, somewhere goes wrong.

  16. Nice headline by rebelwarlock · · Score: 1

    Let me just fix that for you:

    Extreme heat, as compared to the climate in a specific part of Australia, causes one ISP to panic and shut down internet for a small percentage of customers in said specific part of Australia.

    1. Re:Nice headline by Anonymous Coward · · Score: 1

      It was users all over Australia. There only are 6 states so those examples of where people complained from actually cover the whole place! Everyone I know using iiNet at home, on the other side of the country from Perth (6 hour flight) was affected.

  17. Click bait...Nothing to see...Move along people by Anonymous Coward · · Score: 5, Informative

    Sigh... first world problems. A few services were shut down as a precaution as a result of A/C failure (primary/backup).

    Hi All,

    Due to heat in Perth we have lost a number of services and precautionary shut others down.

    Customer will notice impact to the following services:

    - iiNet Toolbox and Westnet MyAccount [RESTORED 7:00PM WST]
    - Westnet Email [RESTORED 7:45PM WST]
    - Westnet Hosted Websites [RESTORED 9:00PM WST]
    - iiNet hosted email [RESTORED 7:45PM WST]
    - iiNet/Westnet/Adam/Netspace Webmail [RESTORED 7:00PM WST]
    - Customers may be unable to re-authenticate after disconnecting from the internet [RESTORED 8:00PM WST]

    A number of internal tools are also affected, which will impact our ability to respond to certain customer enquiries.

    Update 6pm WST: Due to issues with staff access, some contact centre queues have been closed. Affected queues will be reopened once the incident has been resolved.

    Update 8pm WST: Most services have been restored, Engineers are continuing to review all services impacted by the incident. Customers that were off-line are recommended to perform a modem power-cycle to get back on-line.

    Thanks,

    Basically a few Gen Ys screaming that they can't post their sweaty selfies for a few hours.
     

  18. At 44'C ambient a lot of air conditioners fail by ZombieEngineer · · Score: 2, Informative

    Depending on the refrigerant used it is possible that the condenser temperature (the bit exposed to the outside air) exceeded the critical point of the gas at which point it is impossible to tell the difference between liquid or gas. The trouble is phase change cooling works best (most efficient) the closer to the critical point you can go but not past it.

    The second problem is the condenser pressure would increase with increasing ambient air temperature. In the past this was enough to stall the compressor motors on a hot day.

    My guess is they went for a high-efficiency system that should work 99.9% of the time; that last 0.1% is the 8 hours of the year when the temperature is above 42'C (for Perth it is normally only hours before the sea breeze kicks in and drops the temperature by at least 5'C). This time the temperature went up and stayed up for a period of time.

  19. Re:(in)Tolerance by Anonymous Coward · · Score: 1

    http://blog.iinet.net.au/statement-chief-technology-officer-mark-dioguardi/

    Take the statements with a huge cup of salt. "Network redundancy plans ensured over 98% of our customers’ broadband services were unaffected" is using some big weasel words.

    They turned off the PPP auth servers. So any users that weren't already logged in, couldn't log in. And with the PPP servers offline, they have no way to know how many people were affected. The layer 2 broadband service was fine, it was just those pesky layer 3+ that were affected.

  20. Re: ha ha ha... by mobarobber · · Score: 1

    If you ever go to the weather observatory in New Delhi, it's near Lodhi Gardens, and the instruments themselves are in an actual garden. That place is actually a couple of degrees cooler than the rest of Delhi. The hottest recorded may be 48.6 but it does stay over 45 for a week or two and over 40 for around a month easy.

  21. Re:ha ha ha... by Anonymous Coward · · Score: 1

    On the downside though, its sewerage and sanitation systems aren't 'online' yet.

  22. Fuck you slashdot you media hungry whore by Anonymous Coward · · Score: 1

    I remember when slashdot
    - didn't treat their readers like mindless media starved zombies
    - had content targeting intelligent geeks, not the retarded Inbred 7 year old crowd.
    - had stories with more than a catchy but blatantly false headline.

    Fuck you.

  23. Re: Slow news day? by redback · · Score: 1

    Perth is a state capital, so it's not exactly remote.

  24. Re:Slow news day? by natd · · Score: 2

    It was more widespread than just three states. I'm in the most populous state (NSW, where Sydney is) and many people at work today commented that they were down, as I was. iiNet is the #2 DSL provider and I suspect it was a lot more than this 2% they spout. Bottom line, a significant number of Australian residential customers and businesses had no Internet. Someone is saying "iiNet isn't the Internet" but if the pipe from someone's computer to the Internet isn't working, "the Internet is down" to that person.

    --
    Only big ligs use sigs.
  25. Re: Slow news day? by serviscope_minor · · Score: 2, Informative

    Perth is very remote. Sure, it's the state capital, but once you're out of the metropolitan area it's a thousand kilometers to anywhere else.

    --
    SJW n. One who posts facts.
  26. Re:ha ha ha... by BlacKSacrificE · · Score: 1

    Average temperature seems to sit between about 32 and 37 degrees Celsius for Perth. I'm certain if the average were a full 10+ degrees over this, the DC would have been built to accommodate this significantly higher temperature. Not really a useful comparison.

    --
    [Sorry, this signature is unavailable in your country/region]
  27. Fiber by Bengie · · Score: 1

    They should have been using fiber. Specs on a popular GPON chassis says maximum operating ambient air temp is 131f @ 95% relative humidity.

    The problem seems to be routers. A regular router seems to be limited to about 121f and "core" routers are more around 104f.
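
    As a quick sanity check on those figures, a small sketch converting the quoted limits to Celsius and comparing them with the 44.4 C peak from the summary. The equipment labels just repeat the ones above; comparing against the outdoor peak is an assumption for illustration, since the air inside the room could be warmer or cooler than outside.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


PERTH_PEAK_C = 44.4  # outdoor peak quoted in the summary

# Ambient limits as quoted in the comment above, in Fahrenheit.
limits_f = {"GPON chassis": 131, "regular router": 121, "core router": 104}

for name, limit_f in limits_f.items():
    limit_c = f_to_c(limit_f)
    margin = limit_c - PERTH_PEAK_C
    print(f"{name}: limit {limit_c:.1f} C, margin {margin:+.1f} C vs outdoor peak")
```

    On those numbers only the "core" router limit (40 C) falls below the outdoor peak, which is consistent with the point that the routers, not the access gear, are the weak spot once cooling is lost.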

  28. Re: Slow news day? by crimson+tsunami · · Score: 1

    It's one of the most remote cities in the world. That said it affected all of Australia not just Perth.

  29. Here in Arizona by Applehu+Akbar · · Score: 1

    We don't start complaining about the heat until it hits 50C.

    1. Re:Here in Arizona by Anonymous Coward · · Score: 1

      The (generally) higher humidity in Australia puts more load on AC, but even so, this was a freak incident. Temps like this aren't uncommon in most parts of Aus during summer (and usually with higher humidity than Perth gets, being one of our southernmost cities), and critical infrastructure usually doesn't crumple.

      Basically it just sounded like a cool (anti-pun not intended) gimmick piece, so some rando journalists jumped all over it.

    2. Re:Here in Arizona by LVSlushdat · · Score: 1

      I'm betting their humidity is waaaay higher than ours (Las Vegas) and yours... High heat with high humidity REALLY sucks for humans (and technology)

      --
      THANK YOU, Edward Snowden!! Americans owe you a debt of gratitude (whether they know it or not..)
  30. Re:(in)Tolerance by jbengt · · Score: 1

    For a newly commissioned, purpose built DC, why didn't they install a ground loop system?

    $.
    Also, there are still modes of failure for ground loop systems. None of the links say what specifically happened, so it's hard to judge.

  31. Las Vegas summers... by LVSlushdat · · Score: 1

    Sounds like the temps we get here each and every year... Las Vegas Nevada *lives* between mid June and late October with temps over 100F, and frequent transients to over 110F... Guess the Aussies aren't used to such temps...

    --
    THANK YOU, Edward Snowden!! Americans owe you a debt of gratitude (whether they know it or not..)
    1. Re:Las Vegas summers... by raind · · Score: 1

      They have and will be seeing more of these temps, better get used to it.

      --
      Get up!
  32. moderator fault by CmdrTamale · · Score: 1

    posting to suppress mis-moderation