Slashdot Mirror


Anyone Besides Zune Owners With New Year's Crashes?

aputerguy writes "My Fedora 8 Linux server crashed sometime between 18:59:40 EST (GMT -5:00) and 19:00:00 EST (GMT -5:00) on Dec 31, 2008, which remarkably falls within 20 seconds of the New Year in GMT. I have been running this same hardware non-stop for more than six years, and other than the occasional reboot for kernel (or distro) upgrades, it has not crashed more than 1 or 2 times in 2237 days of cumulative uptime. Nothing other than background processes was running at the time of the crash. Could this be a coincidence or was there some 2008/2009 rollover issue going on here? Has anyone (other than Zune 30GB owners) noticed similar year-end issues with their computers or electronic devices?"

20 of 480 comments

  1. Errrrrrr by segedunum · · Score: 5, Insightful

    Why don't you actually boot it, or failing that, take the hard drive out, perhaps look at some logs and actually find out rather than aligning it with a certain set of mystical circumstances?

    1. Re:Errrrrrr by LunarCrisis · · Score: 2, Insightful

      Especially today.

      --
      Mr. Period: Nine is the one that's right by ten!
      Nine: One day I will kill him. Then, I will be Ten.
  2. Probably coincidence. by Thiez · · Score: 5, Insightful

    > Could this be a coincidence

    Yes. People are wired to see causality everywhere, even where there is none. Had your server crashed a week ago you wouldn't think anything of it (maybe 5% of all servers mysteriously crashed exactly one week ago, but because it was an 'ordinary' day nobody noticed). Anyway, since you noticed your server crashed at new year and reported it on /., and with 6 billion people on this planet we will soon hear stories about other computers that mysteriously crashed around midnight. Not because there has to be anything special, but because computers are crashing all the time and new year (and your post) made it appear special.

    I doubt it has anything to do with leap seconds: if your computer ran for 6 years, it survived the leap second of 2005.

    1. Re:Probably coincidence. by gpw213 · · Score: 2, Insightful

      Assuming your math is correct (I didn't bother to check it), that would be the odds of a random server failing in a random 20 second interval.

      You didn't pick a random server, you picked one already known to have crashed. And you didn't pick a random 20 second interval either. The odds of that server crashing in that 20 second interval were 100%, because it was already known to have happened. This is a classic mis-application of statistics.

      Admittedly, the interval right at New Year's is a bit suspicious, since there is some specific code to handle leap years, etc. But given that there wasn't a rash of outages reported, I am going along with the coincidence theory.

      --
      However beautiful the strategy, you should occasionally look at the results. -- Winston Churchill
    2. Re:Probably coincidence. by retchdog · · Score: 2, Insightful

      No, it's not a misapplication. It's a textbook-standard application of statistics, which looks at the probability of an event (which did occur) happening under a "null hypothesis", in this case including 1) no extraordinary event associated with year roll-over (no time dependence); 2) all servers are stochastically identical (i.e. they each have the same failure rate).

      The hypotheses are a bit strong, but it's not a mis-application.

      Statistics often answers the question "How likely was that to have happened without an extraordinary explanation?" By nature, this deals with events of "100% probability" as you misleadingly call them.

      --
      "They were pure niggers." – Noam Chomsky
    3. Re:Probably coincidence. by AbyssWyrm · · Score: 4, Insightful

      I think your logic is incorrect. The original poster did not say "my server went down around midnight, could this be a coincidence?" rather he said "my server, which has a particularly excellent track record of not going down, did so near midnight with very high precision. Could this not be a coincidence?" Given that this happening at any specific time is very unlikely compared to the relative abundance of rollover errors, this is a very legitimate hypothesis. Furthermore your argument is essentially saying that anything with a non-zero probability of occurring randomly is probably not a coincidence. Otherwise, instead of comparing to some 50 million servers you ought to be comparing to a much smaller number of servers meeting the description of the original poster's. I don't think you pose any legitimate argument that this is coincidental, and it strikes me as very probable that it is not.

    4. Re:Probably coincidence. by aputerguy · · Score: 2, Insightful

      But there have now been reports (just adding up the comments posted on slashdot and emails to me) of hundreds of machines going down at precisely 00:00:00 GMT (across multiple timezones). That combined set of data points, plus the obvious potential issue of a leap second being introduced at that precise time, would seem to make your coincidence theory astronomically unlikely.

    5. Re:Probably coincidence. by aputerguy · · Score: 2, Insightful

      Well given that there have been reports of several hundred such crashes, I guess it can't be a coincidence unless there are a billion or so Linux servers ;)

    6. Re:Probably coincidence. by Guido+von+Guido · · Score: 3, Insightful

      > I think your logic is incorrect. The original poster did not say "my server went down around midnight, could this be a coincidence?" rather he said "my server, which has a particularly excellent track record of not going down, did so near midnight with very high precision. Could this not be a coincidence?" Given that this happening at any specific time is very unlikely compared to the relative abundance of rollover errors, this is a very legitimate hypothesis. Furthermore your argument is essentially saying that anything with a non-zero probability of occurring randomly is probably not a coincidence. Otherwise, instead of comparing to some 50 million servers you ought to be comparing to a much smaller number of servers meeting the description of the original poster's. I don't think you pose any legitimate argument that this is coincidental, and it strikes me as very probable that it is not.

      If you want to show that this is anything but a coincidence, you either need to show that this happened to more than one server, or you need to demonstrate the mechanism. At this point we have exactly one server and we can't point to a specific bug. Until that changes, "coincidence" is the best answer.

      For instance, this could be an entirely local problem. The motherboard or some other hardware component is beginning to fail, and the server will start crashing more frequently until that component dies completely. Or it could have been caused by a power surge, or a problem resulting from some bad wiring. Or the guy who manages the server above it came in to swap out some hardware and accidentally unplugged the server, and won't admit to it. (I have a former boss who did exactly that, after he went to work for a customer.)

      Sure, it could still be related to the time. Without any additional evidence, though, it's just speculation.

  3. Given an infinite number of server monkeys... by melonman · · Score: 4, Insightful

    How many servers in total are watched over by people posting on Slashdot? I suspect that the answer is high enough that it would be amazing if at least one of them didn't crash within 20 seconds of the New Year.

    --
    Virtually serving coffee
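
The back-of-envelope argument in this sub-thread can be made concrete. The sketch below is only an illustration under the null hypothesis retchdog describes (no time dependence, every server with the same failure rate). It uses the submitter's own figures (at most 2 crashes in 2237 days, a 20-second window) and the "some 50 million servers" mentioned by AbyssWyrm. The function names and the constant-rate assumption are illustrative, not anything the posters wrote.

```python
import math

SECONDS_PER_DAY = 86_400

def p_crash_in_window(crashes, uptime_days, window_seconds):
    """Chance that one server crashes inside one specific window,
    assuming crashes arrive at a constant rate (an assumption)."""
    windows = uptime_days * SECONDS_PER_DAY / window_seconds
    return crashes / windows

def p_at_least_one(p_single, servers):
    """Chance that at least one of `servers` independent, identical
    machines crashes in that window: 1 - (1 - p)^N, computed stably."""
    return -math.expm1(servers * math.log1p(-p_single))

# Figures from the thread: 2 crashes, 2237 days, 20 s window, 50 million servers.
p = p_crash_in_window(crashes=2, uptime_days=2237, window_seconds=20)
expected = p * 50_000_000
p_any = p_at_least_one(p, servers=50_000_000)

print(f"one given server, one given 20 s window: {p:.2e}")
print(f"expected crashes among 50 million servers: {expected:.1f}")
print(f"at least one such crash somewhere: {p_any:.5f}")
```

Under these assumptions a crash of one named server in that window is very unlikely, while a crash of some server somewhere is close to certain. The sketch says nothing about many machines failing at the same instant, which would need a different model.
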
  4. test by wizardforce · · Score: 5, Insightful

    > Could this be a coincidence or was there some 2008/2009 rollover issue going on here?

    Set the system time back to a few minutes before the crash occurred and see if your server crashes again. Otherwise it's idle speculation.

    --
    Sigs are too short to say anything truly profound so read the above post instead.
    1. Re:test by $RANDOMLUSER · · Score: 5, Insightful

      First good idea in this whole discussion. Don't forget the hardware clock as well.

      --
      No folly is more costly than the folly of intolerant idealism. - Winston Churchill
  5. driver by TheSHAD0W · · Score: 5, Insightful

    The Zune crash was due to a specific hardware driver. Perhaps you also have an unusual hardware driver on your setup that was affected?

  6. Re:Well this is obvious... by Anonymous Coward · · Score: 5, Insightful

    What's with all the 4chan idiocy on Slashdot recently?
    4chan is funny when you're a teenage boy, but for those of us that aren't...

  7. Nothing crashed on me -- madplayer hiccuped however by billsf · · Score: 3, Insightful

    Madplayer hiccuped three times at about 0100 CET, over a span of about 20 seconds. I thought it might have been my RAID system, which I had just repaired. (There was a bad sas/sata controller.) I only use Unix/Unix-like systems and to the best of my knowledge there are no embedded MS devices in this house.

    Unix, Linux, etc. handle things like this well. All time sync services like NTP, DCF-77, MSF, WWVB, GPS and the rest give fair warning. I personally am in favour of ditching 'leap seconds'. Time corrections would best be made day to day, the length of today being based on yesterday. That's better, but surely someone can think up the real solution?

    BillSF

    PS: Frequent updates to Java caused by US daylight saving time are pathetic.

  8. Re:I Second That by athakur999 · · Score: 3, Insightful

    My Mythbuntu-based HTPC also froze up last night.

    This is what my /var/log/messages file looks like:
    Dec 31 16:03:45 puppet -- MARK --
    Dec 31 16:23:45 puppet -- MARK --
    Dec 31 16:43:45 puppet -- MARK --
    Dec 31 17:03:45 puppet -- MARK --
    Dec 31 17:23:45 puppet -- MARK --
    Dec 31 17:43:45 puppet -- MARK --
    (... below is when I noticed the box was hung and restarted it ...)
    Jan 1 14:02:31 puppet syslogd 1.5.0#2ubuntu6: restart.
    Jan 1 14:02:31 puppet kernel: Inspecting /boot/System.map-2.6.27-9-generic

    Every 20 minutes, I get those "-- MARK --" messages and the last one is at 5:43PM local time which would be 11:43PM UTC (also my system clock is set to UTC, not local time). The next "-- MARK --" should have been at 12:03AM UTC, so there's a good chance the leap second messed something up.

    --
    "People that quote themselves in their signatures bother me" - athakur999
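
The reasoning in the comment above (last "-- MARK --" seen, next one missing, leap second in between) can be checked mechanically. This is a rough sketch, not a diagnosis: it assumes the year 2008 (syslog omits it), that the timestamps are local time at UTC-6 (inferred from the comment's "5:43PM local time which would be 11:43PM UTC"), and it stands in for the leap second 23:59:60 UTC with the following midnight, because Python's `datetime` cannot represent second 60. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

# MARK lines copied from the log excerpt in the comment above.
LOG = """\
Dec 31 16:03:45 puppet -- MARK --
Dec 31 16:23:45 puppet -- MARK --
Dec 31 16:43:45 puppet -- MARK --
Dec 31 17:03:45 puppet -- MARK --
Dec 31 17:23:45 puppet -- MARK --
Dec 31 17:43:45 puppet -- MARK --
"""

def mark_times(log_text, year=2008):
    """Return the timestamps of all '-- MARK --' lines (year is assumed)."""
    times = []
    for line in log_text.splitlines():
        if "-- MARK --" in line:
            stamp = " ".join(line.split()[:3])  # e.g. "Dec 31 17:43:45"
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times

def hang_window(times):
    """The hang happened after the last MARK and before the next expected one."""
    interval = times[-1] - times[-2]
    return times[-1], times[-1] + interval

last_seen, next_expected = hang_window(mark_times(LOG))

# Assumption: local time is UTC-6, as implied by the comment.
leap_local = datetime(2009, 1, 1, 0, 0, 0) + timedelta(hours=-6)
in_window = last_seen < leap_local <= next_expected

print("last MARK seen:    ", last_seen)
print("next MARK expected:", next_expected)
print("leap second falls inside that window:", in_window)
```

A positive result only shows the timing is consistent with the leap second; any other fault in the same 20-minute window would produce the same log.
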
  9. Likely by Demena · · Score: 1, Insightful

    It was the leap second

  10. Re:Time Mathematics and Microsoft by ThePhilips · · Score: 4, Insightful

    Try coding the conversion from "seconds since 1/1/1970 00:00:00" to any other user-digestible presentation yourself, just once.

    It's not as easy as it might seem.

    --
    All hope abandon ye who enter here.
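
To give a feel for what that conversion involves, here is a minimal sketch in Python. It uses a well-known days-to-civil-date algorithm (as published by Howard Hinnant), not anything from the Zune or Linux code discussed in this thread. It assumes whole seconds, UTC, the proleptic Gregorian calendar, and POSIX-style time, which has no representation for leap seconds; the function name is illustrative.

```python
def civil_from_unix(ts):
    """Convert whole seconds since 1970-01-01 00:00:00 UTC into
    (year, month, day, hour, minute, second). Leap seconds are ignored,
    as they are in POSIX time."""
    days, rem = divmod(ts, 86_400)          # floor division handles ts < 0
    hour, rem = divmod(rem, 3_600)
    minute, second = divmod(rem, 60)

    z = days + 719_468                      # days since 0000-03-01
    era, doe = divmod(z, 146_097)           # 400-year era, day within era
    yoe = (doe - doe // 1_460 + doe // 36_524 - doe // 146_096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)  # day of March-based year
    mp = (5 * doy + 2) // 153               # month index, 0 = March
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day, hour, minute, second

print(civil_from_unix(1_230_768_000))  # (2009, 1, 1, 0, 0, 0)
```

Starting the internal year in March puts the leap day at the end of the year, which is what keeps the month arithmetic free of special cases. Time zones, daylight saving, and locale formatting are all extra work on top of this.
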
  11. Re:nope... by jibjibjib · · Score: 3, Insightful

    But if it /did/ crash, then that would be very strong evidence that it /was/ date-related, and then he could find the cause and make sure it didn't happen next time. So, it might still be a useful thing to do.

  12. Re:Catastrophe by HJED · · Score: 2, Insightful

    Perhaps it is the leap second that is causing problems for computers using NTP and other time servers.
