Slashdot Mirror


The BBC Looks At Rollover Bugs, Past and Approaching

New submitter Merovech points out an article at the BBC which makes a good followup to the recent news (mentioned within) about a bug in Boeing's new 787. The piece explores various ways that rollover bugs in software have led to failures -- some of them truly disastrous, others just annoying. The 2038 bug is sure to bite some people; hopefully it will be even less of an issue than the Year 2000 rollover. From the article: "It was in 1999 that I first wrote about this," comments [programmer William] Porquet. "I acquired the domain name 2038.org and at first it was very tongue-in-cheek. It was almost a piece of satire, a kind of an in-joke with a lot of computer boffins who say, 'oh yes we'll fix that in 2037.' But then I realised there are actually some issues with this."

16 of 59 comments

  1. Ask Mel by Sarten-X · · Score: 5, Funny

    It's not a rollover bug; It's a rollover feature!

    --
    You do not have a moral or legal right to do absolutely anything you want.
  2. You can't win... by jellomizer · · Score: 4, Insightful

    If you reuse code, you get rollover bugs.
    If you start over from scratch you get brand new bugs.

    Reusing the code, you have a lot of the issues from the past already fixed, so you are not reintroducing bugs that you had in the past.
    Making new code, you can modernise the code set, so you don't run into particularly troublesome old code, and the result is easier to follow.

    Programmers are human beings; they make mistakes, and they can't give 110% every day. Even the best of them will often have a stupid bug that they can't believe they let slip.

    --
    If something is so important that you feel the need to post it on the internet... It probably isn't that important.
  3. OpenBSD is 2038-ready by QuietLagoon · · Score: 3, Informative

    Since the OpenBSD 5.5 release a year ago, the OS is fully ready for the onslaught of 2038.

  4. some that never made the list. by nimbius · · Score: 4, Funny

    Minuteman III nuclear missile: Due to an unsigned integer bug, missile resets to the year 1900 and targets Grover Cleveland's Presbyterian ministry as part of the clandestine war on Christmas
    US Presidential Limousine: Function call returns a flawed short int that causes the vehicle to lose entropy in its timekeeping, routinely deploying countermeasures and refusing to operate in the presence of a black president.
    UCLA Scheduling mainframe: strconv slurps an undersized signed int, causing date/time tracking problems and resulting in comfortable, plausible and very useful class scheduling to occur.
    Russian RT-23 Molodets missile: timer returns a null or negative value, resulting in an active launch that's aborted by Sergei, usually after he has his first cup of coffee...but sometimes after the paper.

    --
    Good people go to bed earlier.
  5. Volunteers by dfgfgfdgsgsdfsdf · · Score: 3, Insightful

    Given that so much of the GNU/Linux code base is written by volunteers, I wonder who exactly is going to fix all of the code. I mean, back when it was written, computer programming was much less of a gold rush. Nowadays everyone is competing for jobs that pay $120,000. Who is willing to go through all of the old code and fix it for free?

    1. Re:Volunteers by Greyfox · · Score: 2

      Oh, we went to a 64 bit time_t ages ago. You should just have to recompile, even if you use long instead of time_t. That assumes you ever upgraded your machine to a 64 bit platform, which won't be a problem for most people by 2038. Even the US military and NASA should be on 64 bit systems by then. So essentially we've already fixed the problem for Linux. Specific installations that don't upgrade might have some problems, but most of those systems won't last another couple of decades and will require replacement sooner. Specific in-house software that was compiled as 32 bit and whose source has been lost might also have problems. Any remaining SCO installations might also still have problems. I actually kind of hope I can spend my last couple of years before retirement stamping out the remaining SCO installations, naturally while billing $200 an hour.

      --

      I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
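For anyone who wants to see what the recompile actually buys, here is a minimal sketch (not taken from any kernel or libc source) of a seconds-since-1970 count stored in a signed 32 bit field versus a 64 bit one. The helper names `as_int32`, `as_int64` and `to_utc` are made up for this illustration; `ctypes` is used only because its integer types wrap the way a fixed-width C integer typically does.

```python
import ctypes
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold


def as_int32(seconds):
    """Wrap a seconds count the way a signed 32-bit field would (illustrative helper)."""
    return ctypes.c_int32(seconds).value


def as_int64(seconds):
    """Same count stored in a signed 64-bit field (illustrative helper)."""
    return ctypes.c_int64(seconds).value


def to_utc(seconds):
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_good = to_utc(INT32_MAX)             # last second a 32-bit counter can represent
wrapped = as_int32(INT32_MAX + 1)         # one tick later the 32-bit value goes negative
after_wrap = to_utc(wrapped)              # ...which lands back in 1901
still_fine = to_utc(as_int64(INT32_MAX + 1))  # the 64-bit field just keeps counting

print(last_good.isoformat())
print(wrapped, after_wrap.isoformat())
print(still_fine.isoformat())
```

The 32 bit field runs out at 03:14:07 UTC on 19 January 2038; the 64 bit field carries on to 03:14:08 as if nothing happened.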

    2. Re:Volunteers by belthize · · Score: 3, Insightful

      Given that so much of the non-GNU/Linux code is written by paid programmers, I wonder who exactly is going to fix all the code. I mean, back when it was written, computer programming was much less of a gold rush. Nowadays everyone is competing for jobs that pay $120,000. Who is willing to pay programmers to go through all of the old code to fix it?

      It's really not an issue. It's already fixed in OpenBSD. Certainly there's some user space code that also counts seconds since 1970 but if folks would simply start now there's no future fix necessary. The set of code written today which will be in use in 2038 will be vanishingly small. The remaining folks will pay some gray hair to knock it into shape. Missed code will make itself apparent sometime that Tuesday morning.

  6. Why rollover? by jovius · · Score: 2

    Isn't mouseover the modern term?

  7. Windows one is my fave by cant_get_a_good_nick · · Score: 2, Informative

    There was a counter in Windows that rolled over after 28 days, I think (like the 787 bug, but 1000 ticks/second, not 100).

    Even Microsoft knew that no Windows box could stay up that long.

    (And before you mod me as a troll, think about it and know that MS could have made a bigger counter, but didn't feel the need to)

    1. Re:Windows one is my fave by singularity · · Score: 3, Informative

      The version of Windows was Windows 95, and the number of days was 49.7.

      https://support.microsoft.com/...

      --
      - (c) 2018 Hank Zimmerman
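The 49.7 figure falls straight out of the arithmetic. A rough sketch, assuming an unsigned 32 bit counter of milliseconds for the Windows case and, for comparison, a signed 32 bit counter of hundredths of a second, which is how the 787 issue was widely reported (the function name `rollover_days` is made up for this example):

```python
SECONDS_PER_DAY = 86_400


def rollover_days(usable_bits, ticks_per_second):
    """Days until a counter with `usable_bits` value bits overflows (illustrative helper)."""
    return 2**usable_bits / ticks_per_second / SECONDS_PER_DAY


# Unsigned 32-bit counter ticking 1000 times a second (the Windows 95 case).
win95_days = rollover_days(32, 1000)

# Signed 32-bit counter (31 value bits) ticking 100 times a second (assumed 787 case).
centisecond_days = rollover_days(31, 100)

print(f"{win95_days:.1f} days")        # about 49.7
print(f"{centisecond_days:.2f} days")  # about 248.55
```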
  8. 2038 is working itself out already by jandrese · · Score: 2

    Several years ago I was really concerned about the 2038 rollover because so many protocols have hard-baked 32-bit timestamp fields in them. Even if systems were updated, the protocols might not be. But I've come to realize that once the systems are updated, the protocols tend to follow suit in the next revision, and in the next 23 years pretty much every protocol is going to go through at least one revision. There are still going to be a few holdouts that have trouble in 2038, but I'm expecting it to be as much of an event as the year 2000. A few fringe things act weird or even stop working, but pretty much everything important is OK.

    --

    I read the internet for the articles.
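To make the "hard-baked 32-bit field" worry concrete, here is a small sketch of a hypothetical wire format, not any real protocol: `encode_v1` packs the timestamp as a signed 32 bit big-endian integer, and `encode_v2` is the imagined next revision with a 64 bit field. Python's `struct` refuses out-of-range values, which stands in for whatever a real implementation would do (truncate, wrap, or reject).

```python
import struct

FIRST_BAD_SECOND = 2**31  # first Unix timestamp that no longer fits in signed 32 bits


def encode_v1(timestamp):
    """Hypothetical revision 1: signed 32-bit big-endian timestamp field."""
    return struct.pack("!i", timestamp)


def encode_v2(timestamp):
    """Hypothetical revision 2: the same field widened to signed 64 bits."""
    return struct.pack("!q", timestamp)


def fits_v1(timestamp):
    """Report whether a timestamp can still be carried by the revision 1 field."""
    try:
        encode_v1(timestamp)
        return True
    except struct.error:
        return False


print(fits_v1(FIRST_BAD_SECOND - 1))     # True: last representable second
print(fits_v1(FIRST_BAD_SECOND))         # False: the field is too narrow
print(len(encode_v2(FIRST_BAD_SECOND)))  # 8 bytes on the wire instead of 4
```

The catch is the extra four bytes: both ends have to agree on the new layout, which is why the fix has to wait for a protocol revision rather than a recompile.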
    1. Re:2038 is working itself out already by omglolbah · · Score: 3, Insightful

      In the business I work in, "profibus" is considered a "new" technology. The standard was published in 1989.

      We still run a token ring coax network for most critical systems on a significant part of the oil rigs in the North Sea and on onshore installations supporting them.

      Some of the controllers are 20 years old and just milling along happily. We did a replacement of NVRAM recently and that is all the service the modules need.
      I fully expect this crud to still be in use in 20 years. Conservative bastards >.<

    2. Re:2038 is working itself out already by Viol8 · · Score: 4, Insightful

      If the hardware is still fully operational after 20 years in a hostile environment like an oil rig, I'd say it's anything but "crud". It was probably some of the best kit on the market.

      This might come as a shock but a lot of businesses want kit that Just Works reliably 24/7, not the latest trendy junk that would impress a Hipster cycling past on his fixie bike but lasts about 5 minutes in the real world.

    3. Re:2038 is working itself out already by omglolbah · · Score: 4, Interesting

      Oh, it is good gear, but the list of 'bugs' and 'erratas' on the gear is growing longer and longer for every month it stays in service. Spare parts are almost impossible to come by, and even the toolchain needed to update the programs is old enough to require special dedicated workstations.

      It is not a matter of 'working' it is a matter of 'will work in the future'. Right now all the gear has reached "end of life" and spare parts are very close to being "ebay if you're lucky" in terms of procurement. Trying to get the customer to upgrade BEFORE we're already screwed and have to 'rush' an upgrade is the game we're in now.

      Doing a 3 year project in 6 months (while in some cases doable..) leads to badly rushed design and future redesigns. We've seen this over and over in the past 10 years.

      An example is that the new hardware has built in EX barriers on each channel, the termination boards are much better and a variety of other improvements. This translates into -4- massive cabinets being reduced to one. Real-estate offshore is hugely expensive and this would save staggering amounts of money compared to expanding equipment rooms... but they want the stuff they're used to, not the stuff that is current.

      The hilarity of the whole thing is that the 'current' stuff is now installed all over the rig where old hardware is not available so now we have both systems running in parallel with a ton of 'interfacing' and single points of failure introduced as a result.

      It can drive an engineer mad.

  9. So Is Mac OS X. by tlambert · · Score: 4, Informative

    So Is Mac OS X.

    I converted time_t to 64 bits on 64 bit systems (which include the most recent iPhones) as part of the changes for 64 bit binary support on the G5 when I wrote the 64 bit binary loader support into exec/fork/spawn, and again as part of UNIX Conformance. It's basically been fixed since Tiger.

  10. Y2K was -not- a small issue by mccalli · · Score: 5, Insightful

    The reason so little went wrong is because people spent ages testing and upgrading/fixing beforehand. Had we left it all to 1st Jan 2000 there would have been issues.

    It annoys me to see Y2K trotted out time and time again as a non-event. It was a very big event, and for the most part it was very successfully handled.