Azure Failure Was a Leap Year Glitch
judgecorp writes "Microsoft's Windows Azure cloud service was down much of yesterday, and the cause was a leap year bug as the service failed to handle the 29th day of February. Faults propagated making this a severe outage for many customers, including the UK Government's recently launched G-cloud service."
30 years ago, Arthur David Olson started engineering a solution to this problem that persists to this day, and which he supported personally for all but the last few months. The systems I have that run his software have never even burped through legislative changes of the calendar, leap-seconds, and the Century leap-year day, which is a separate cycle from the 4-year one.
Bruce Perens.
Now, I'm not necessarily a Microsoft apologist, but I have to point out that it wasn't so long ago that other things near and dear to us geeks were experiencing similar problems.
I was trying to run some ant scripts yesterday that interact with an FTP server to delete some files. Those damned files wouldn't get deleted. They weren't even returned from a listing command. As it turns out, I was using a particularly old version of Apache Commons-Net library (this jar file was from 2005) which had a leap-year bug. It simply would not show me files with modification dates of 2/29. I was looking at the FTP server configuration, logging in with other clients, moving and renaming files, and all about ready to break out Wireshark... and then it occurred to me that it was leap day. Hoo-fucking-ray. "touch"ed the file, and sure enough, it was suddenly available. Those are a few hours of my life I'll never get back.
I only post comments when someone on the internet is wrong.
Some of the common leap year bugs that I've seen over the years:
1. A matrix with the number of days per month:
e.g. smallint dayspermonth[12]={31,28,31,30,31,30,31,31,30,31,30,31};
Indexing into the matrix for February (index 1) ignores leap years.
1. A matrix with 365 elements to represent a year's worth of something:
e.g. smallint hightemps[365];
This usually doesn't fail until Dec 31, when hightemp[mydate.dayofyear()-1] points to a non-existent element.
Of course, if dayofyear is calculated using the matrix in the prior bug, it will fail invisibly since that will be incorrect
as well.
2. Quck-n-dirty subtract one year math:
e.g. Convert date to char in YYYYDDMM format, convert char to int, subtract 10000, convert back to a char and then date.
Why people do this when you can dateadd(year,mydate,-1) is that easy, I have no clue. But it breaks horridly when
you use it to determine "one year ago today" from Feb 29.
The leap year specification was only written in 1582. So it isn't 2k years old.