June 30th Leap Second Could Trigger Unexpected Issues
dkatana writes: On January 31, 2013, approximately 400 milliseconds before the official release of the EIA Natural Gas Report, trading activity exploded in Natural Gas Futures. It is believed that was the result of some fast computer trading systems being programmed to act, and have a one-second advance access to the report. On June 30th a leap second will be added to the Network Time Protocol (NTP) to keep it synchronized with the slowly lengthening solar day. In this article, Charles Babcock gives a detailed account of the issues, and some disturbing possibilities: The last time a second needed to be added to the day was on June 30, 2012. For Qantas Airlines in Australia, it was a memorable event. Its systems, including flight reservations, went down for two hours as internal system clocks fell out of synch with external clocks.
The original author of the NTP protocol, Prof. David Mills at the University of Delaware, set a direct and simple way to add the second: Count the last second of June 30 twice, using a special notation on the second count for the record. Google will use a different approach: Over a 20-hour period on June 30, Google will add a couple of milliseconds to each of its NTP servers' updates. By the end of the day, a full second has been added. As the NTP protocol and Google timekeepers enter the first second of July, their methods may differ, but they both agree on the time.
But that could also be problematic. In adding a second to its NTP servers in 2005, Google ran into timekeeping problems on some of its widely distributed systems. The Mills sleight-of-hand was confusing to some of its clusters, as they fell out of synch with NTP time. Does Google's smear approach make more sense to you, or does Mills's idea of counting the last second twice work better? Do you have a better idea of how to handle this?
The original author of the NTP protocol, Prof. David Mills at the University of Delaware, set a direct and simple way to add the second: Count the last second of June 30 twice, using a special notation on the second count for the record. Google will use a different approach: Over a 20-hour period on June 30, Google will add a couple of milliseconds to each of its NTP servers' updates. By the end of the day, a full second has been added. As the NTP protocol and Google timekeepers enter the first second of July, their methods may differ, but they both agree on the time.
But that could also be problematic. In adding a second to its NTP servers in 2005, Google ran into timekeeping problems on some of its widely distributed systems. The Mills sleight-of-hand was confusing to some of its clusters, as they fell out of synch with NTP time. Does Google's smear approach make more sense to you, or does Mills's idea of counting the last second twice work better? Do you have a better idea of how to handle this?
The only problem mentioned is that they fall out of sync with each other. If they're both otherwise fine, just pick one. Sounds like the disadvantages of either one aren't as big as the disadvantage of them not working well together.
A problem for sysadmins is that the status quo of the standards requires that we choose which standard we want to violate. We can violate the specification of UTC by not counting 23:59:60 or we can violate POSIX by counting it or we can violate POSIX and the SI second by not actually keeping the system clock on UTC using smeared seconds that are not suitable for tracking projectiles and other real-time applications. This problem is old, 50 years old, as seen in the 3 plots on this web page.
I find it strange than a possible 1 second different could cause so much issues.
It's not the time difference that causes problems per se, it's time going backwards. You presumably missed the fact that many Java servers crashed over the last leap second because of a kernel bug that screwed up their internal timers?
We had problems last time due to faults reported by external hardware when it saw the time jump backwards. I'll be at my desk when it happens this time to deal with any problems that come up this time.
And, given the chaos every leap second causes, hopefully we can finally convince the 'experts' to stop fiddling with time.
+1 - Mod parent up.
Also this is an awesome graph, and illustrates that the Earth is a horrible clock: https://upload.wikimedia.org/w...
1^2=1; (-1)^2=1; 1^2=(-1)^2; 1=-1; 1=0.
I thought Slashdot was dead. I thought they killed the comments until someone told me where to look.
God spoke to me
I'm not sure exactly what arguments each Linux distribution uses, but this is from the man page on ntpd:
My reading of that is that the normal adjustment uses slew. Step is used only when there's a big discrepancy, and you can use -x to use slew even in that case.
Please look at this tzdist internet draft which is close to becoming an RFC. The tzdist protocol can communicate the list of leap seconds along with the list of time zones.