Slashdot Mirror


Leap Second Bug Causes Crashes

An anonymous reader writes in with a Wired story about the problems caused by the leap second last night. "Reddit, Mozilla, and possibly many other web outfits experienced brief technical problems on Saturday evening, when software underpinning their online operations choked on the “leap second” that was added to the world’s atomic clocks. On Saturday, at midnight Greenwich Mean Time, as June turned into July, the Earth’s official time keepers held their clocks back by a single second in order to keep them in sync with the planet’s daily rotation, and according to reports from across the web, some of the net’s fundamental software platforms — including the Linux operating system and the Java application platform — were unable to cope with the extra second."

9 of 230 comments

  1. All of my servers were fine by Anonymous Coward · · Score: 5, Insightful

    And I didn't do anything special, just kept their software up-to-date.

    1. Re:All of my servers were fine by Anonymous Coward · · Score: 5, Informative

      the patch was posted back in March.

      https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6b43ae8a619d17c4935c3320d2ef9e92bdeed05d

    2. Re:All of my servers were fine by Gil-galad55 · · Score: 5, Informative

      They lost commercial power due to the big storm system that went through the DC area.

      --

      To follow knowledge like a sinking star, / Beyond the utmost bound of human thought. ("Ulysses", Tennyson)

  2. Re: by Anonymous Coward · · Score: 5, Funny

    >hick-up.

    The hick up watching the servers when the leap second came was you.

  3. Extremely weird by Anonymous Coward · · Score: 5, Informative

    From my own machines and comparing notes with some other people (all in all, about 3k servers) the bug seems to affect machines randomly. Known facts:

    There's a kernel patch that fixes the supposed issue: https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6b43ae8a619d17c4935c3320d2ef9e92bdeed05d

    Affects Debian stable a lot.

    Affects Java and Virtualbox (starts using too much CPU).

    Affected my browser (iceweasel on debian testing).

    Affects SOME mysql installs (5.1 and 5.5, but not all, and of two identical installs one might be affected, the other not).

    The fix has been posted in a lot of places: /etc/init.d/ntp stop; date; date `date +"%m%d%H%M%C%y.%S"`; date; /etc/init.d/ntp start
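
The one-liner above packs three steps together: stop ntp, set the clock to the time it already reports, then start ntp again. A minimal sketch of the same sequence as a script follows. The function names and the DRY_RUN switch are illustrative additions, not part of the posted fix, and the init script path is the one quoted above (it may differ on other distributions). By default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch of the posted workaround. DRY_RUN=1 (the default here) only prints
# the commands; set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

# Current local time in the MMDDhhmmCCYY.ss form that date(1) accepts
# as a "set the clock" argument (same format string as the posted fix).
current_stamp() {
    date +"%m%d%H%M%C%y.%S"
}

# Run a command, or just print it when DRY_RUN is enabled.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Same order as the one-liner: stop ntp, re-set the clock, start ntp.
reset_clock() {
    run /etc/init.d/ntp stop
    run date "$(current_stamp)"
    run /etc/init.d/ntp start
}

reset_clock
```

Running it unchanged prints three "would run:" lines, which is a cheap way to check the generated timestamp before touching a production clock.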

    (I'm all for switching unix time to a simple counter and leaving it to the calendar libs to put the leap seconds where necessary)
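
For context on the parenthetical above: Unix time as POSIX defines it treats every day as exactly 86400 seconds, so the inserted second 23:59:60 gets no timestamp of its own. The sketch below illustrates this for the June 2012 leap second. It assumes GNU date (for the -d and -u options), and the variable names are only for illustration.

```shell
#!/bin/sh
# Assumes GNU date (-d parses a timestamp, -u selects UTC).
# Epoch seconds for the last ordinary second of June 2012 and the
# first second of July 2012.
before=$(date -u -d '2012-06-30 23:59:59' +%s)
after=$(date -u -d '2012-07-01 00:00:00' +%s)

# Two SI seconds elapsed across the leap second
# (23:59:59 -> 23:59:60 -> 00:00:00), but Unix time advances by only 1.
echo "before=$before after=$after diff=$((after - before))"
```

The reported difference is 1, not 2: the counter itself is what absorbs the leap second, which is why the kernel has to step or repeat a second rather than leaving the adjustment to calendar code.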

  4. Re:Linux by Anonymous Coward · · Score: 5, Informative

    What you describe is a bug in the Linux kernel that causes problems for the Java VM that OpenManage uses.
    It is not a bug in OpenManage at all.

  5. You probably don't do much Java, then by burne · · Score: 5, Informative

    As it turns out, my biggest problem was customer-supplied software that ships its own Java JRE. We install a JRE by default and update it whenever possible, but some software (Adeptia, VLTrader, Alfresco) comes with its own ancient JRE and scripts that call it instead of the system-supplied Java.

    Not a single machine crashed (we are very explicitly in charge of which OS version is running), but a lot of Java processes locked up and had to be restarted.

    I can even see a small bump in the power usage around two o'clock (0:00 GMT).

  6. Re:Linux kernel unable to cope? I think not. by Anonymous Coward · · Score: 5, Interesting

    I run Arch Linux with kernel 3.4.4 and it went haywire. My machine was very heavily loaded at the time, and when the leap second happened, mysqld, firefox, and ksoftirqd processes started consuming 100% CPU. The load average was well over 10 and the machine was grinding along. It didn't actually fail, but it was loaded down.

    Even restarting the processes didn't fix it. The high load would go away once I stopped the processes, but as soon as I started them again the load would come right back. I had Firefox open on a blank page not doing anything and it was slammed at 100% CPU, with a couple of ksoftirqd tasks slammed at 100% CPU each too.

    I had to reboot the machine to get it back to normal.

    I have Ubuntu and Debian servers that for whatever reason did not add the leap second so they were fine. Their time was a second off today though (at least until ntp slowly corrected it or I manually intervened).

  7. Re:Linux kernel unable to cope? I think not. by kwardroid · · Score: 5, Informative

    Restarting ntp wasn't enough for me; I had to reset the date with:
    date -s "`date`"
    Only one machine went haywire though.