Slashdot Mirror


Lessons From Your Toughest Software Bugs

Nerval's Lobster writes: Most programmers experience some tough bugs in their careers, but only occasionally do they encounter something truly memorable. In developer David Bolton's new posting, he discusses the bugs that he still remembers years later. One messed up the figures for a day's worth of oil trading by $800 million. ('The code was correct, but the exception happened because a new financial instrument being traded had a zero value for "number of days," and nobody had told us,' he writes.) Another program kept shutting down because a professor working on the project decided to sneak in and do a little DIY coding. While care and testing can sometimes allow you to snuff out serious bugs before they occur, some truly spectacular ones occasionally end up in the release... despite your best efforts.

285 comments

  1. Cool by Anonymous Coward · · Score: 0

    Thanks for posting

  2. 2nd link goes to Kotaku by Anonymous Coward · · Score: 1, Informative

    Here's an archived copy: https://archive.is/VpWQl

  3. Compiler optimizer bugs by Dan+East · · Score: 4, Interesting

    Some of the bugs I've beat my head against the wall over the most are compiler bugs. It's easy to have the mindset that the compiler is infallible, and so programmers don't usually debug in a way that tests whether fundamentals like operators are really working right. This was particularly bad developing for Windows CE back around 2000 when you had to build for 3 different processors (Arm, MIPS and SH3). I ran into a number of optimizer bugs usually related to binary operators. The usual solution was precompiler directives to disable the optimizer around a specific block of code.

    --
    Better known as 318230.
    1. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 2, Interesting

      Just after I graduated and I was working at my first job writing my first program ever that was not a homework assignment, I decided to write it as a multi-threaded program. I had a race condition that was causing a datastructure to give bad data. Took me almost 30 minutes to track it down. Now that I've gotten better at programming, race conditions take me much less time and rarely involve any debugging.

    2. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 1

      So you could say, you "race" to fix the bugs ;)

    3. Re:Compiler optimizer bugs by eulernet · · Score: 3, Interesting

      I had a worse experience: hardware bugs.

      Back in the 90s, I was working on a trucks game.
      Strangely, when playing via network, the trucks on some computers sometimes desynchronized.
      I spent one week locating the problem by digging into verbose logs: it was due to the FDIV bug, which was subtly changing the positions of some trucks.

      More recently, I spent a lot of time figuring out why some programs crashed on my computer.
      After a few weeks, I realized that some bits in the RAM were dead: writing into them returned random values.

    4. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 1

      I wish all bugs could be fixed in 30 minutes. It took Microsoft about 10 years to fix the following line of code in SSRS's delimited file writer:
      value.Replace(this.m_qualifier, this.m_qualifier + this.m_qualifier);

      Hint: they didn't do anything with the result.
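
      A minimal sketch of this class of bug, transposed to Python for illustration (the quoted line is C#, and both languages have immutable strings). The function names and the default qualifier are hypothetical, not taken from SSRS:

```python
def quote_field_buggy(value: str, qualifier: str = '"') -> str:
    # Bug: str.replace returns a new string; the result is thrown away,
    # so embedded qualifiers are never doubled.
    value.replace(qualifier, qualifier + qualifier)
    return qualifier + value + qualifier


def quote_field_fixed(value: str, qualifier: str = '"') -> str:
    # Fix: keep the string that replace() returns.
    escaped = value.replace(qualifier, qualifier + qualifier)
    return qualifier + escaped + qualifier
```

      The buggy version runs without any error, which is presumably part of why a mistake like this can survive for years.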

    5. Re:Compiler optimizer bugs by epyT-R · · Score: 1

      Was that 'trucks' game called "Over the Road Racing" by any chance?

      https://www.youtube.com/watch?...

    6. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      thankfully, i've just recently noticed this by chance. ahead of it raising any unanswerable questions. :)

    7. Re:Compiler optimizer bugs by arglebargle_xiv · · Score: 4, Interesting

      Some of the bugs I've beat my head against the wall over the most are compiler bugs.

      Ah yes, the gift that keeps on giving. Every new version of gcc that gets deployed has new optimizer bugs, to the point that, several years ago, we stopped using O3 and above since the small loss in performance (if there even was any) was easier than handling a long tail of compiler bugs across dozens of different CPU types with every new release ("dozens" may be an under-estimate depending on how you want to count families of ARM, MIPS, Power, and other embedded CPUs).

    8. Re: Compiler optimizer bugs by Anonymous Coward · · Score: 5, Interesting

      A compiler guy here, who used to work for one of the RISC companies. Most compiler bugs are not that difficult to debug. But I worked on instruction scheduling and register allocation, hence always got assigned all the weird bugs. The most memorable one for me was actually a hardware bug - most people don't realize it, but most commercial microprocessors have a lot of bugs in them. See the published errata and you will find many bugs. A few years after the particular generation of this processor was on the market, I got assigned a weird crash bug from a commercial DBMS vendor (i.e. a very important customer). It took me forever to figure out, but it turned out to be a bug in the processor that corrupts a particular register (due to the register renaming logic screwing up in a rare combination of instructions) and that depends on the timing and the instruction combination. It became another errata item, and I ended up implementing a workaround - if you notice some benign but odd code sequence a compiler generates, there might be a good reason behind it :)

    9. Re:Compiler optimizer bugs by Darinbob · · Score: 1

      This is tough in college when it happens. No one believes the student who says things aren't working because of a hardware problem, the other students don't even believe it. There are a lot of software people who are trained to assume hardware never has problems; some even think operating systems don't have problems.

      So in my class at school we always got the new minicomputers, the ones that had never been tested on a full classroom yet. One of them had a bug in a divide instruction, and when used incorrectly it would crash the machine. But the OS never used that instruction incorrectly. The problem was that this machine was used for the compiler class, and we were generating machine code directly. We were getting crashes which kicked off 30 students at once which meant that it very quickly became an issue to try and figure out what was going wrong rather than the usual practice of waiting and hoping it goes away. Eventually one student group figured out it must be them, because it always crashed when they executed their program and it was probably their code. Even then the system admins were dubious at first, because it wasn't the sort of thing to cause crashes. They did figure out which instruction it was though and everyone avoided it after that.

    10. Re:Compiler optimizer bugs by Jeremi · · Score: 5, Insightful

      I was working at my first job writing my first program ever that was not a homework assignment, I decided to write it as a multi-threaded program

      ^^^ 2015 nominee for most terrifying sentence on Slashdot :)

      --
      I don't care if it's 90,000 hectares. That lake was not my doing.
    11. Re:Compiler optimizer bugs by Z00L00K · · Score: 1

      One memorable one is when someone used Pascal coding in C;

              int c='a' - 'z';

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    12. Re:Compiler optimizer bugs by Solandri · · Score: 2

      Yeah, I spent two weeks trying to track down an instant blue-screen bug in a 3D simulation. Running it in a debugger didn't help - it would still blue-screen, though it did allow me to narrow it down to an innocuous-looking piece of code. I went over it with a fine-toothed comb and couldn't find anything wrong with it.

      After two weeks, a co-worker was assigned a task similar to mine. She asked for my code so she wouldn't have to start from scratch. I gave it to her with the warning that it was blue-screening and I couldn't figure out why. A half hour later she called to say the code worked just fine on her computer. I couldn't believe it and trotted over to her office to see for myself. It did indeed run on her computer exactly like it was supposed to. I copied her compiled executable to my computer, ran it, and it blue-screened.

      Armed with that knowledge, I began testing by eliminating different parts of my computer. The breakthrough came when I disabled hardware 3D acceleration and ran it in software emulation, and it ran just fine. The culprit was a hardware bug in the nvidia 3D video card. (This was when 3dfx was king. My company had tried to save some money and bought me a discount video card made by some company nobody had ever heard of.)

      Lesson: Sometimes it's not your fault. If you've looked over your code and can't figure out why it's crashing, try running it on another computer.

    13. Re:Compiler optimizer bugs by flargleblarg · · Score: 2

      What's wrong with that? c now contains the delta from 'z' to 'a', which is a well-defined –25 because char literals are signed ints.

    14. Re:Compiler optimizer bugs by goose-incarnated · · Score: 0

      Lesson: Sometimes it's not your fault. If you've looked over your code and can't figure out why it's crashing, try running it on another computer.

      That won't help at all - invoking undefined behaviour could make it behave when running on another computer. Seeing your code work on another computer tells you nothing at all.

      --
      I'm a minority race. Save your vitriol for white people.
    15. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      Half an hour to fix a parallel concurrency issue?

      I've spent half an hour in CUDA trying to deduce why a concurrency problem exists, after nailing down where it happens. The solution generally consists in finding the exactly one place where nVidia's documentation mentions some particular, nonobvious but critical limitation or requirement.

      The kind of Heisenbugs that deterministically occur 100% of the time the same way when not debugging, and never occur when in cuda-gdb, are the best/worst. They are also fiendishly likely to occur when debugging a multi-GPU program, because stepping over kernel invocations fundamentally alters the nature of the concurrency: The time it takes you to tap 'next' is so vast that kernel execution is effectively serial and synchronous. The only way I know of to get useful information in these situations is to vomit printf() all over the code and hope it doesn't perturb the Heisenbug into the 'not manifesting' state.

    16. Re:Compiler optimizer bugs by inasity_rules · · Score: 1

      I had a fun one with a nameless company's "gateway" onto a wireless network. You could write a repeater path onto the gateway, but if any of the addresses contained a 0xFF byte (most did) the gateway would write 0x0F. It took a while to track down as I had to use an external tool to read everything back, and when I reported it to the nameless company, they informed me they were "too busy" to fix it. This is about when I learned to swear in French again.

      --
      I have determined that my sig is indeterminate.
    17. Re:Compiler optimizer bugs by adhdengineer · · Score: 3, Insightful

      The number of times i've had fellow developers complain that their bug *must* be caused by the compiler, or the OS, or the framework, or the hardware only for it to turn out to be their fault all along is the reason why i always suspect my code before i blame anything else.

    18. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      He said "back in the 90's" whereas Big Rigs was published in 2003, you insensitive bug!

    19. Re:Compiler optimizer bugs by eulernet · · Score: 1

      No, the game was named "Trucks". It ran on DOS+Vesa. DirectX was released at the same moment.

    20. Re:Compiler optimizer bugs by Xest · · Score: 1

      "I spent one week locating the problem by digging into verbose logs: it was due to the FDIV bug, which was subtly changing the positions of some trucks."

      Similar issues are actually a fairly common occurrence in network code for video games during development when the developer is fairly new to the task. A lot of people writing network code for games run into it before learning their lesson.

      See this SE question and the associated links for example for some interesting points:

      http://gamedev.stackexchange.c...
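
      A small Python sketch of why this bites lockstep network code, assuming (hypothetically) that two peers end up evaluating the same arithmetic in a different order, for example because of different compilers or instruction sets. IEEE 754 addition is not associative, so the peers can disagree in the last bit, and the error compounds as the simulation advances:

```python
# Two peers sum the same three displacements in a different order.
peer_a = (0.1 + 0.2) + 0.3
peer_b = 0.1 + (0.2 + 0.3)

print(peer_a == peer_b)  # False on IEEE 754 doubles

# A common workaround is to keep simulation state in integers
# (fixed-point), where the order of additions does not matter.
SCALE = 1000  # hypothetical resolution: 1/1000 of a world unit
steps = [round(x * SCALE) for x in (0.1, 0.2, 0.3)]
fixed_a = (steps[0] + steps[1]) + steps[2]
fixed_b = steps[0] + (steps[1] + steps[2])

print(fixed_a == fixed_b)  # True
```

      This is only one source of divergence; a faulty divider such as the FDIV bug gives the same symptom for a different reason.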

    21. Re:Compiler optimizer bugs by mwvdlee · · Score: 1

      It's well defined in ASCII. This code will produce different results depending on the native character set.
      On an EBCDIC machine (IBM z/OS mainframes), it will not return -25.
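
      This can be checked without a mainframe. A small Python sketch, assuming the `cp037` codec (one common EBCDIC code page that ships with Python) stands in for the machine's native character set:

```python
def letter_delta(first: str, second: str, encoding: str) -> int:
    # Mimic C's 'a' - 'z' by subtracting the byte values the two
    # characters have in the given character set.
    return first.encode(encoding)[0] - second.encode(encoding)[0]


ascii_delta = letter_delta("a", "z", "ascii")   # 97 - 122
ebcdic_delta = letter_delta("a", "z", "cp037")  # 0x81 - 0xA9

print(ascii_delta, ebcdic_delta)  # -25 -40
```

      The EBCDIC letters are not contiguous (there are gaps after 'i' and after 'r'), which is why the distance is not 25.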

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    22. Re:Compiler optimizer bugs by TheRaven64 · · Score: 1

      My day job involves a research OS, a research compiler, and research hardware. When something breaks and we can narrow it down to just two possibilities out of an OS bug, a compiler bug or a hardware bug, it's a good day...

      --
      I am TheRaven on Soylent News
    23. Re: Compiler optimizer bugs by TheRaven64 · · Score: 3, Interesting

      Most compiler bugs are not that difficult to debug

      Another compiler guy here: Some compiler bugs are not that difficult to debug if you have a reduced test case that triggers the issue. Most are caused by subtle interactions of assumptions in different optimisations and so go away with very small changes to the code and are horrible to debug (and involve staring at the before and after parts for each step in the pipeline to find out exactly where the incorrect code was introduced, which is often not where the bug is, so then backtracking to find what produced the code that contained the invalid assumption that was then exploited later).

      --
      I am TheRaven on Soylent News
    24. Re: Compiler optimizer bugs by speedplane · · Score: 1

      I worked in the embedded space for a few years with an in-house built compiler and came across a few compiler bugs. They were all very easy to debug. You would write some code like "int foo = bar++;" and the program would crash on that line. You'd scratch your head for a minute, but check out the assembly and see some weird optimization it was doing.

      --
      Fast Federal Court and I.T.C. updates
    25. Re: Compiler optimizer bugs by arglebargle_xiv · · Score: 1

      but check out the assembly and see some weird optimization it was doing.

      That's how, and why, I learned RS6000 assembly language, to figure out an Aches compiler bug...

    26. Re:Compiler optimizer bugs by arglebargle_xiv · · Score: 3, Interesting

      Having said that, there was one gcc compiler bug that got me a trip to Europe. A client had spent about three months trying to track down an impossible data corruption bug on their NIOS II embedded device, and eventually flew me over to try and sort it out. Our code is paranoid enough to run checksums on internal memory blocks, and that was reporting a memory-corruption problem. After about a week of work (with half-hour turnaround times on the prototype hardware whenever we made a change) we found that gcc was adjusting some memory offset by 32 bits. Everything looked fine at a high level, e.g. in a debugger, but if you took a cycle-by-cycle memory snapshot then at some stage writes started being out by four bytes. It was only the memory-checksumming code that caught it initially, it knew there was a fault but you couldn't see it using any normal debugging tools. We fixed it by detecting when the memory block had "moved" due to the alignment bug and memcpy'ing it 32 bits over so it was where gcc thought it was.
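
      The checksumming code itself is not shown here; the following is a hypothetical Python sketch of the general idea, with CRC-32 and the class name chosen purely for illustration. Legitimate writes refresh a stored checksum, so a store that lands in the wrong place shows up as a mismatch even when everything looks fine in a debugger:

```python
import zlib


class GuardedBlock:
    """A byte buffer that remembers the CRC-32 of its last legitimate write."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.crc = zlib.crc32(self.data)

    def write(self, offset: int, payload: bytes) -> None:
        # Legitimate writes go through here and refresh the checksum.
        self.data[offset:offset + len(payload)] = payload
        self.crc = zlib.crc32(self.data)

    def is_intact(self) -> bool:
        return zlib.crc32(self.data) == self.crc


block = GuardedBlock(32)
block.write(8, b"\xde\xad\xbe\xef")
ok_before = block.is_intact()

# Simulate a miscompiled store landing four bytes past where it should,
# bypassing write() so the stored checksum is not updated.
block.data[12:16] = b"\xde\xad\xbe\xef"
ok_after = block.is_intact()

print(ok_before, ok_after)  # True False
```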

    27. Re: Compiler optimizer bugs by Anonymous Coward · · Score: 0

      I have seen real compiler bugs, but I have also seen code where the author thinks it is an external bug when it really was an obvious out-of-bounds bug or reusing memory that was already freed. If you suspect a compiler bug, get verification from the compiler developers; don't just write workarounds.

    28. Re:Compiler optimizer bugs by Antique+Geekmeister · · Score: 1

      > Seeing your code work on another computer tells you nothing at all

      Oh, it can be quite useful. "Another computer" often means "a system that has not had the interesting local undocumented developer modifications that have replaced basic perl modules with too-new or too-old CPAN dependencies". Or it can mean systems that have not had the latest software update with the new regression in a system library, or a system where a developer has not been tuning sysctl parameters and SELinux. I've run into all of those, in the last week.

      I'm afraid I'm unable to post some of my best failures, they're too personally identifiable to me or to a client or colleague. I will mention my most galling, most frequent style of bug in the last five years: It's the complete refusal to bundle software. To just "compile from source" or haphazardly integrate components from CPAN, from pip, from maven, from apt or RPM or other sourceforge or github or any unmaintained, untested repository scattered anywhere in the world without the slightest dependency testing or component verification. Cleaning up the mess is paying a great deal of my salary right now.

    29. Re:Compiler optimizer bugs by drinkypoo · · Score: 1

      I had a fun one with a nameless company's "gateway" onto a wireless network.

      How unusual! Were they simply called "Nameless, Inc."?

      This is about when I learned to swear in French again.

      Guess that's when you learned to surrender, huh? You left out the interesting part of the story as a result.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    30. Re:Compiler optimizer bugs by drinkypoo · · Score: 2

      ^^^ 2015 nominee for most terrifying sentence on Slashdot :)

      I don't get scared when I read that stuff. I just say, "Oh, that explains Adobe" or whatever. The truth is that the world is a fractally more fucked up place than you think it is. Most people are doing it wrong and proud, regardless of their job. Or, they're phoning it in, and they know it. But since our world is not even close to being a meritocracy, we're going to have more of that.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    31. Re: Compiler optimizer bugs by RabidReindeer · · Score: 1

      The worst bugs to fix are the bugs that are not bugs.

      I learned a long time ago that the place where most time gets eaten in a project isn't the tricky fancy functions and algorithms, it's the niggling little things. Stuff like missing commas or mis-capitalized names. You can stare at them for hours or even days and miss them, and this is why it's important to get someone - anyone at all, regardless of how inexperienced - to look at the offending code. Because you see what "should" be there, and not what actually is there.

      Worse even than that are the non-bugs. Where the code is doing exactly what it should, but you are making incorrect assumptions about the results and cannot find anything to fix because there is, in fact, nothing to fix.

    32. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      I agree, some of the most frustrating bugs I have ever run into were in GCC's optimization routines. Optimizing turned off compiled fine, optimizations turned on caused the application to crash. Switching to using Clang (with or without optimizations) resulted in a working application.

      People tend to assume the compiler produces the correct output, but it sometimes does not (especially when optimize flags are set). I remember /. covered a story a few years ago where code in the Linux kernel was getting optimized out and it was causing problems. Compilers are tricky things.

    33. Re:Compiler optimizer bugs by inasity_rules · · Score: 2

      I don't normally name and shame. I am the worst of customers - I quietly leave for other suppliers and you don't know I am gone. So, in a way I did surrender. I surrendered my business to a German company... I guess it would be more awkward for them if I informed them of this...

      --
      I have determined that my sig is indeterminate.
    34. Re:Compiler optimizer bugs by AmiMoJo · · Score: 1

      Depends, these days a lot of C# programmers use multi-threading by default and in fact many popular C# frameworks require it. For example, in WPF you pretty much have to use separate processes for everything, and many WPF components will generate new threads for your callbacks to execute in etc.

      It's been made a lot easier and simpler. It is homework level stuff, a basic requirement of keeping apps responsive with modern UI frameworks and software development patterns.

      On the other hand he might just have been fork()ing like crazy, which would indeed be pretty scary. Task.TaskFactory, BackgroundWorker and lambda expressions, not so much.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    35. Re: Compiler optimizer bugs by Daniel+Hoffmann · · Score: 1

      I bet the hardware guys say that the physics has bugs that they have to work around too.

    36. Re: Compiler optimizer bugs by colablizzard · · Score: 1

      and the program would crash on that line.

      Not all compiler bugs cause a crash. A crash on the line is the ideal bug. I had to deal with a getc()/ungetc() bug that caused a wrong character to get inserted into the file stream. I spent days "printf-ing" the file parsing code. Once I found the bug, I was able to demonstrate it to the compiler guys in a 30 line program. That looked so easy to colleagues and the compiler guys after the fact.

    37. Re: Compiler optimizer bugs by TheRaven64 · · Score: 1

      That's not debugging a compiler bug, that's step one in producing a reduced test case, which is step one in debugging a compiler bug. It's also starting from the easiest kind of compiler bug to fix (ones where it causes a crash at all, and within that category, ones where the crash and the miscompilation are in the same place).

      --
      I am TheRaven on Soylent News
    38. Re: Compiler optimizer bugs by TheRaven64 · · Score: 2

      The worst bug I had to fix recently was in some hand-written assembly where the immediate in a store and a load overflowed and the assembler silently truncated it. I read the code multiple times and it was only when I traced the execution and looked at the disassembly that it became clear that the assembly and disassembly didn't show the same thing. This caused crashes, but it was in context switch code, so it only happened in processes that happened to use those particular registers and often quite a long way after returning to the process.
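
      A hypothetical Python sketch of the difference between silently truncating an immediate and rejecting it. The 16-bit signed field is an assumption for illustration; the actual instruction set and field width are not stated above:

```python
def encode_imm_truncating(value: int, bits: int = 16) -> int:
    # What a careless assembler does: keep only the low bits.
    return value & ((1 << bits) - 1)


def encode_imm_checked(value: int, bits: int = 16) -> int:
    # Reject anything that does not fit a signed field of this width.
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"immediate {value} does not fit in {bits} signed bits")
    return value & ((1 << bits) - 1)


print(hex(encode_imm_truncating(0x12340)))  # 0x2340, a different offset
```

      With the checked version the source fails to assemble, which is a much cheaper way to find out than a crash long after a context switch.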

      --
      I am TheRaven on Soylent News
    39. Re:Compiler optimizer bugs by drinkypoo · · Score: 1

      I don't normally name and shame.

      That's too bad, because that's the only thing that would help us, the slashdotting public. And, you know, the general public, as well.

      I guess it would be more awkward for them if I informed them of this...

      More than being awkward for them, it's useful for us if you inform us. I care more about moving forward than looking back, but a glance at your notes now and again can be useful. Forgive, yes. Forget, no.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    40. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      In the original PS2 devkit there was a bug in the configuration of the RDRAM controller which *only* affected the extra memory in the devkit (they had 128MB vs the 32MB in the retail unit). If you did a DMA across the boundary between the two memory areas, it faulted for no apparent reason.

      I spent THREE MONTHS trying to track down why our game was randomly crashing. I even hacked into the OS (no range checking on the software interrupt index) to install a new interrupt table, to make sure nothing screwy was going on in an interrupt handler. Eventually we got a repro case where you "only" had to run a little program for FOUR HOURS to make the system crash, sent it to Sony, and they diagnosed the issue and fixed it in like a day.

      I hope I never have to do this again.

    41. Re:Compiler optimizer bugs by inasity_rules · · Score: 1

      Fair enough. The company is Webdyne (you can google it). They produce(d?) an annoying little box that acted as a ftp(why do all water guys think ftp is the greatest invention ever?) gateway that would query devices on a Wavenis network. The protocol itself is fairly straightforward, and I would have preferred to implement it myself directly, but they would not give me that functionality. The gateway ran linux, but they provided no way to upload any modified code to the box, and refused to give the admin password. They were particularly unhelpful, and I do have an archive of my correspondence with them.

      I no longer deal with them in any way, so I don't know if they are still selling the fatally flawed product. Forgive? Maybe, but they would have to prove their solution at their cost if they ever wanted to do business with me again.

      --
      I have determined that my sig is indeterminate.
    42. Re:Compiler optimizer bugs by parkinglot777 · · Score: 1

      So you could say, you "race" to fix the bugs ;)

      The GP didn't say "fix" but rather said "track it down." Huge differences! ;)

    43. Re:Compiler optimizer bugs by __aaclcg7560 · · Score: 1

      This was when 3dfx was king.

      My favorite gaming combination back then was a Nvidia TNT2 and a pair of 3Dfx Voodoo 2 SLI video cards. Although I had AMD K5 CPU (not as good as the K6-2 that came later), I was able to smoke my roommates with their more expensive Intel Pentium CPUs and software rendering video cards in Quake 2.

    44. Re:Compiler optimizer bugs by wvmarle · · Score: 1

      Seeing your code work on another computer tells you nothing at all.

      To me it would tell: "your code is correct", for starters. It also tells me to look for the problem elsewhere, outside of MY code. External software libraries that are used. Or the hardware it runs on. It won't give you the solution right away, but it does eliminate one possible culprit - your own code - which is normally the prime suspect.

    45. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      30 minutes? That's a bad bug? Admit it, this first job of yours started last week, right?

    46. Re:Compiler optimizer bugs by requerdanos · · Score: 1

      >> Seeing your code work on another computer tells you nothing at all.

      > To me it would tell: "your code is correct", for starters.

      No, it doesn't indicate that your code is correct, only that in the differing environment of the "other" computer, you aren't running into the condition that exposes the bug. For example, GP points out that you may have coded undefined behavior that could be arbitrarily executed a number of different ways depending on the specific environment. Your program may be inadvertently opening too many file handles and the other computer has fewer files open or more file handles available. Your program may inadvertently run itself out of memory, but fail to do so on another computer with a different memory situation.

      Lots of things could cause a bug-infested bit of code to run fine in one environment and die in another. So, if your program crashes or runs in an unexpected way, it's probably better to vet your code first before looking for other problems.

    47. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      Haha, considering no undergraduate education remotely prepares anyone to write a real-world app in the first place (without having such horrible thoughts as 'i'll make it multi-threaded!') I think that still falls on, for instance, Adobe, when their engineers do something like that. I am working as a solo developer out of college and I can't say I didn't make that exact mistake, pretty much in that exact situation ;p

    48. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      My answer is usually along the lines of "If an application can bluescreen the machine (without intentionally doing so), then it's not an application bug."

      Besides things like killing csrss.exe, there is no excuse for allowing an application, even a runaway one, to bluescreen the kernel. It's probably a security bug as well.

    49. Re:Compiler optimizer bugs by flargleblarg · · Score: 1

      Ah, I see. And back in the Pascal days, there were actually non-ASCII systems in use.

    50. Re:Compiler optimizer bugs by mwvdlee · · Score: 1

      There still are. It's the native encoding on big IBM hardware, which means you're probably triggering EBCDIC-based code every time you communicate with your bank, insurance company or other large company.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    51. Re:Compiler optimizer bugs by flargleblarg · · Score: 1

      Whoa. Did not know that!
      I want to share this fact with friends. Is there any specific hardware platform that you know of? Something I can search for with Google and give a link to my cow orkers? They'll be shocked.

    52. Re:Compiler optimizer bugs by david_thornley · · Score: 1

      One time, I had the reverse problem. Everything worked on my computer, but not on the users' computers, which were pretty close to identical to mine. That's even more frustrating.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    53. Re: Compiler optimizer bugs by david_thornley · · Score: 1

      Theoretically, the reduced test case is where you hand the bug over to whoever makes the compiler. Of course, there was the time when the response was, "Yes, that's a bug." Even a clue as to how to work around it would have been more useful.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    54. Re:Compiler optimizer bugs by ImprovOmega · · Score: 1

      That's the one I remember the most. Ran perfectly when compiled with a debug profile, broke to pieces when compiled in a release profile. Finally tracked it down to a complicated "if" statement that the optimizer failed to correctly account for. This was back in Visual Studio 6.0 though, it's been a good long while. I seem to remember fixing it by refactoring the if statement though, telling it not to optimize that section was too much of a performance hit.

    55. Re:Compiler optimizer bugs by KGIII · · Score: 1

      I did not try any AMD products until the K6-2 came out. I had an Acer that it came in, I had purchased it for the house. It was 350 mhz and I OCed it to a bit over 500 mhz and, while a bit warm, it never failed while I owned it. I could get it a bit higher but it was not very stable when I did so. The curious thing is that it came with ME on it and, honestly, ME ran like a champ. I had multi-month uptimes and ran an OpenNap server (then a hub) on it. I think it was one of six computers in the country that ran ME properly.

      --
      "So long and thanks for all the fish."
    56. Re:Compiler optimizer bugs by KGIII · · Score: 1

      I am not sure if that was intentional but it is practically begging for:

      Your coworkers are all cows. Moo says the coworkers. Moo! Mooooo! You coworker cows!

      --
      "So long and thanks for all the fish."
    57. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      Correct. I failed to mention that the race condition was not caught for several years because it required inserting enough data to force the datastructure to resize, and the resizing was not thread safe. I spent a few hours and wrote my own lock-free datastructure and haven't had a problem since. To keep it simple, it does not resize itself. I also wrote it in a way that a race condition should result in the datastructure falsely thinking it is full rather than overwriting or losing data.
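
      A minimal sketch of the "fixed size, fail as full" idea described above, assuming C11 atomics are available. This is not the poster's actual structure; the names fixed_log, fixed_log_init, fixed_log_append and CAPACITY are hypothetical. Each writer claims a slot with a single atomic increment, so two writers can never receive the same index, and once the counter passes the capacity every later caller is simply told the structure is full.

      ```c
      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stddef.h>

      #define CAPACITY 4   /* hypothetical fixed size; the structure never resizes */

      typedef struct {
          atomic_size_t next;      /* next unclaimed slot; only ever grows */
          int slots[CAPACITY];
      } fixed_log;

      void fixed_log_init(fixed_log *log)
      {
          atomic_init(&log->next, 0);
      }

      /* Claims a slot with one atomic increment, then writes into it.
         Returns false ("full") instead of overwriting when no slot is left. */
      bool fixed_log_append(fixed_log *log, int value)
      {
          size_t idx = atomic_fetch_add(&log->next, 1);
          if (idx >= CAPACITY)
              return false;        /* counter keeps growing, but nothing is written */
          log->slots[idx] = value;
          return true;
      }
      ```

      The sketch only covers concurrent writers. Publishing a finished slot to readers would need extra synchronization that is left out here.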

    58. Re:Compiler optimizer bugs by Anonymous Coward · · Score: 0

      I was on the receiving end of something like this, 250,000 lines of C, multiple processes talking to each other, each of which was multi threaded.
      Once I had gone through rafts of race conditions, I eventually found my bug - strcpy had been used instead of memcpy, so occasionally the authentication byte being used would fail.
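
      A small sketch of why that kind of bug only shows up occasionally, assuming the authentication data is binary and can contain a zero byte. The 4-byte token and both function names are made up for illustration. strcpy stops at the first zero byte, so any token with an embedded zero is silently truncated, while memcpy copies exactly the number of bytes it is told to.

      ```c
      #include <string.h>

      /* Hypothetical 4-byte token whose second byte happens to be zero. */
      static const unsigned char token[4] = { 0x41, 0x00, 0x7F, 0x10 };

      /* Copies the token as if it were a C string. Returns 1 if the copy
         matches the original byte for byte, 0 otherwise. */
      int string_copy_preserves_token(void)
      {
          unsigned char dst[sizeof token];
          memset(dst, 0xEE, sizeof dst);               /* poison so gaps are visible */
          strcpy((char *)dst, (const char *)token);    /* stops after the 0x00 byte */
          return memcmp(dst, token, sizeof token) == 0;
      }

      /* Same check, but copying a fixed number of bytes. */
      int memory_copy_preserves_token(void)
      {
          unsigned char dst[sizeof token];
          memset(dst, 0xEE, sizeof dst);
          memcpy(dst, token, sizeof token);            /* copies all four bytes */
          return memcmp(dst, token, sizeof token) == 0;
      }
      ```

      With a token that has no zero byte, a string copy would usually appear to work, which is how a bug like this can hide for a long time.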

    59. Re:Compiler optimizer bugs by stoatwblr · · Score: 1

      "After a few weeks, I realized that some bits in the RAM were dead"

      Do you think ECC is a bad idea? If not, why are you using a system without it?

    60. Re:Compiler optimizer bugs by stoatwblr · · Score: 1

      "They were particularly unhelpful"

      Ah. They're French. That explains a lot.

    61. Re:Compiler optimizer bugs by eulernet · · Score: 1

      It's my personal computer, but I am not the only one to get this kind of error:
      http://www.jpl.nasa.gov/news/n...

    62. Re:Compiler optimizer bugs by flargleblarg · · Score: 1

      Ever since I learned of it, I say “cow orker” whenever I get the chance. :)

  4. No time to post... by Anonymous Coward · · Score: 0

    ... can't stop now, I'm in the middle of a nasty debugging session.

  5. ...nobody told us.... by turkeydance · · Score: 1

    OMG....a Meme...

  6. Hardly devastating, but a waste of several hours by Rei · · Score: 5, Insightful

    Program crashing at startup? Okay, let's add debugging statements.

    Can't get the debugging statements to execute? Okay, let's try removing code.

    Doesn't fix the problem? Okay, let's keep removing more... and more...

    A couple hours later, so much code was removed that the entire program had become nothing more than an empty main function that still crashed. This led to the following rule which I try to follow to this day: Make sure that you're actually compiling and executing the same copy of the code that you're modifying. ;)
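
    One cheap way to follow that rule, sketched here under the assumption that the program can print something at startup: embed a build stamp using the standard __DATE__ and __TIME__ macros and check that it changes after a rebuild. The function names are hypothetical.

    ```c
    #include <stdio.h>

    /* __DATE__ and __TIME__ are expanded when this file is compiled, so the
       string identifies the build of this translation unit. */
    const char *build_stamp(void)
    {
        return __DATE__ " " __TIME__;
    }

    /* Call once at startup, before anything that might crash. */
    void print_build_stamp(FILE *out)
    {
        fprintf(out, "build: %s (%s)\n", build_stamp(), __FILE__);
        fflush(out);   /* make sure the line is written even if a crash follows */
    }
    ```

    If the stamp printed at startup is older than the last edit, the binary being run is not the one just built. The stamp only refreshes when the file containing it is recompiled.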

    --
    I'll never forget the last thing grandma said to me before she died: "What are you doing in here with that knife?!?"
  7. Passing Parameters with Side Effects by Etherwalk · · Score: 3, Interesting

    I had a bug once where red and blue values were swapping places across thousands of pixels, and it took quite a while to hunt down. It turns out there was a function doSomething called with parameters (pixel[i++],pixel[i++],pixel[i++]) while doing transformations. The compiled code pushed the third parameter onto the stack first, so it was using the red value from the array in the blue spot and vice versa across the entire image.

    1. Re:Passing Parameters with Side Effects by Tablizer · · Score: 1

      red and blue values were swapping places across thousands of pixels

      Just convince the customers that "the Avatar look" is in style. Who needs technicians when you have a good sales team.

    2. Re:Passing Parameters with Side Effects by Anonymous Coward · · Score: 0

      flexelint and other static code analysis tools will nowadays scream to high heaven if you try using subexpressions with side effects like that.. best to use pre-post increment only in simple statements on their own, if at all.

    3. Re:Passing Parameters with Side Effects by Anonymous Coward · · Score: 5, Informative

      Actually, what you're describing is formally defined as undefined behavior in the C and C++ standards.

      Undefined behavior:

      doSomething(pixel[i++],pixel[i++],pixel[i++]); /* function call commas are NOT sequence points, so the result is undefined */

      Refer to the Sequence point article. The [3] citation says

      Clause 6.5#2 of the C99 specification: "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored."

      Pay special attention to point #4 under "Sequence points in C and C++", because that talks about your exact problem. But beware that you'd still have a bug even if you hid the increment inside of a function, because order of argument evaluation is unspecified (as opposed to undefined behavior, which can cause nasal demons or format your hard drive).

      Fixed with least diff:

      int r=pixel[i++], g=pixel[i++], b=pixel[i++]; /* commas between declarators ARE sequence points */
      doSomething(r,g,b);

      See also: S.O. questions related to undefined behavior and sequence points in C and C++.
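
      The "hidden inside a function" variant mentioned above can be sketched like this. All names (next_channel, pack_rgb, read_rgb_in_order) and the pixel values are hypothetical. Each call is now well defined on its own, but the order in which the three argument calls run is still up to the compiler, so the same fix applies: sequence the reads into locals first.

      ```c
      #include <stddef.h>

      /* Hypothetical pixel source: one R, G, B triple. */
      static const unsigned char pixel[] = { 10, 20, 30 };
      static size_t cursor = 0;

      /* Side effect hidden in a function: each call advances the cursor. */
      static unsigned char next_channel(void)
      {
          return pixel[cursor++];
      }

      /* Packs channels as 0xRRGGBB so a swapped order is easy to spot. */
      static unsigned long pack_rgb(unsigned char r, unsigned char g, unsigned char b)
      {
          return ((unsigned long)r << 16) | ((unsigned long)g << 8) | b;
      }

      /* Unspecified order, may yield 0x0A141E or 0x1E140A depending on compiler:
         pack_rgb(next_channel(), next_channel(), next_channel());              */

      /* Portable: each declaration completes before the next one starts. */
      unsigned long read_rgb_in_order(void)
      {
          cursor = 0;
          unsigned char r = next_channel();
          unsigned char g = next_channel();
          unsigned char b = next_channel();
          return pack_rgb(r, g, b);
      }
      ```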

    4. Re:Passing Parameters with Side Effects by TheRaven64 · · Score: 4, Insightful

      The order of parameter evaluation is one that bites a lot of people because most compilers do it the expected way. When you're walking an AST to emit some intermediate representation, you're going to traverse the parameter nodes either left-to-right or right-to-left and most compiler IRs don't make it easy to express the idea that these can happen in any order depending on what later optimisations want. If they have side effects that generate dependencies between them (as these do) then they're likely to remain in the order of the AST walker. Most compilers will walk left-to-right (because a surprising amount of code breaks if they don't), but a few will do it the other way.

      To understand why this is in the spec, you have to understand the calling conventions. Pascal used a stack-based IR (p-code) and had a left-to-right order for parameter evaluation, which meant that the first parameter was evaluated and then pushed onto the stack, so the last parameter would be at the top of the stack. The natural thing when compiling Pascal (as opposed to interpreting the p-code) was to use the same calling convention, with parameters pushed onto the call stack left to right. Unfortunately, C can't do this and support variadic functions (note: some implementations wanted to do this, which is why the C spec says that variadic and non-variadic functions are allowed to use completely different calling conventions), because if the last variadic argument is the top of the stack then there's no way to find the non-variadic arguments unless you also do something like push the number / size of variadic arguments onto the stack.

      This meant that C implementations tended to push parameters onto the stack right to left. This is less of an issue now that modern architectures have enough registers for most function arguments, but is still an issue on i386. Because of the order of the calling convention, it's more efficient on some architectures to evaluate arguments right to left. Some compilers that are heavily performance-focussed (GPU and DSP ones in particular, where they don't have a large body of legacy code that they need to support) will do this, because it reduces register pressure (evaluate the rightmost argument using some temporaries, push it to the stack, move onto the next, reusing all of those temporary registers).
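
      To make the variadic point concrete, here is a small sketch using the standard <stdarg.h> interface. The function sum_ints is hypothetical. The callee starts from the last named parameter to locate the variadic arguments, and nothing in the language tells it how many there are, which is why the count is passed explicitly.

      ```c
      #include <stdarg.h>

      /* Sums `count` int arguments. The named parameter is the anchor that
         va_start uses to find the first variadic argument. */
      int sum_ints(int count, ...)
      {
          va_list args;
          int total = 0;

          va_start(args, count);
          for (int i = 0; i < count; i++)
              total += va_arg(args, int);   /* caller must really pass ints */
          va_end(args);

          return total;
      }
      ```

      Calling sum_ints(3, 1, 2, 3) adds the three trailing values. Passing a count larger than the number of arguments supplied is undefined behavior, since the callee has no way to detect it.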

      --
      I am TheRaven on Soylent News
    5. Re:Passing Parameters with Side Effects by david_thornley · · Score: 1

      That's a very standard case of undefined behavior, and anybody with any real knowledge of C or C++ would know that this doesn't work.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    6. Re:Passing Parameters with Side Effects by david_thornley · · Score: 1

      It's not just parameter evaluation, it's execution of side effects. If you write something like "int j = ++i;", then you are modifying the values of two variables, and the Standard has nothing to say about the order of operations except that everything's sorted out at the next sequence point.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    7. Re:Passing Parameters with Side Effects by Etherwalk · · Score: 1

      That's a very standard case of undefined behavior, and anybody with any real knowledge of C or C++ would know that this doesn't work.

      It turns out you can say that about pretty much any mistake. And that it is pretty useless to do so.

    8. Re:Passing Parameters with Side Effects by david_thornley · · Score: 1

      There's lots of things in C and C++ that are relatively obscure errors. Did you know that signed integer overflow is undefined behavior? How about the fact that negating a signed integer can cause undefined behavior? I wouldn't be surprised to find a C expert writing a bug by forgetting one of those things, or by not realizing the effects. They aren't immediately obvious errors. On the other hand, anybody competent at C should know to avoid obvious multiple increments in one statement with no intervening sequence point (although they might wind up doing that with aliasing).
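
      A sketch of guarding against the two cases mentioned above, assuming a two's complement int, where INT_MIN has no positive counterpart. The helper names safe_negate and safe_add are hypothetical. The checks are done before the operation, because testing the result afterwards would already have invoked the undefined behavior.

      ```c
      #include <limits.h>
      #include <stdbool.h>

      /* Negates value into *out. Fails for INT_MIN, whose negation does not
         fit in an int on two's complement machines. */
      bool safe_negate(int value, int *out)
      {
          if (value == INT_MIN)
              return false;
          *out = -value;
          return true;
      }

      /* Adds a and b into *out. Fails if the mathematical sum is out of range. */
      bool safe_add(int a, int b, int *out)
      {
          if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
              return false;
          *out = a + b;
          return true;
      }
      ```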

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  8. Compiler bugs are the worst by sectokia · · Score: 3, Interesting

    When ARM first came out on some Philips CPUs it had bugs in the C compiler. The IT department called us hardware engineers in after being stuck on a bug for months. The problem with programmers is to many of them work at a high level, and they hit a wall at some abstraction layer, usually at assembly code. The other problem with these compiler bugs was as you removed unrelated code, they went away, as the compiler had pointer corruption issues. So to get the vendor to fix it, you often had to submit an entire copy of your code project. Sometimes we had to submit images of entire machines because the compiler would interact with an IDE and with Windows. These days we use only open source compilers to ensure we aren't held up and can identify and fix problems quickly.

    1. Re:Compiler bugs are the worst by sectokia · · Score: 5, Interesting

      The absolute worst I've had was a soft CPU in an Altera FPGA. It shipped with a C compiler. A programmer came to me to explain how his program would crash if he changed the order in which subroutines were defined. After carefully checking the logic, there was nothing wrong with his code. So I then trawled through the assembly. Again I could find nothing wrong and thought I was losing my mind. I had to painstakingly check the CPU state after each instruction until I eventually found one instruction that did not set a flag as per the manual, and the assembler matched the manual. It was a fault that would only trigger if you did a certain conditional jump after a certain fetch increment then store sequence. It was a bug in the CPU pipeline logic. I learnt a valuable lesson never to trust anything. We wasted allot of time because we were convinced we must have been the source of the fault.

    2. Re:Compiler bugs are the worst by zennyboy · · Score: 1

      *too

    3. Re:Compiler bugs are the worst by zennyboy · · Score: 1

      a lot

    4. Re:Compiler bugs are the worst by Z00L00K · · Score: 1

      You found the bug - that's what counts! Never forget that!

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    5. Re:Compiler bugs are the worst by Anonymous Coward · · Score: 0

      thank you

  9. Why Version Control is Important by 14erCleaner · · Score: 4, Insightful

    Back in the 80's, I was working on a project with three other programmers. Nobody had heard of version control back then; we were using VAX/VMS and it would keep a few versions of a file around after you changed it, which seemed good enough (after all, we all trusted each other, right?)

    Well, I don't remember the exact bug(s), but one day I fixed something, and tested it. Fine. A few days later the bug came back. So I went back, fixed it again (wait, didn't I already make this change?). A few days later it came back again.

    It turned out that one of the other guys had fixed a different bug, which I had introduced with my fix. So, his fix was to change the code back the way it was. We went back and forth a few times un-doing each others' changes before we realized what was going on. Seeing a revision log with comments on the changes might have helped...

    --
    Have you read my blog lately?
    1. Re:Why Version Control is Important by Anonymous Coward · · Score: 0

      Did your company have a bug bounty? You and your coworker could've gotten new cars...

    2. Re:Why Version Control is Important by 14erCleaner · · Score: 1

      We had a bonus for delivery, that decreased by some amount per week (I think it was 2.5%). By the time we delivered, we were about 100% in the hole...

      --
      Have you read my blog lately?
    3. Re:Why Version Control is Important by Z00L00K · · Score: 1

      That's why you comment in the code why something is done there.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    4. Re:Why Version Control is Important by flargleblarg · · Score: 1

      Nobody had heard of version control back then [80's]

      I don't think that's correct. Wikipedia says that SCCS was first released in 1972.

    5. Re:Why Version Control is Important by tlolczyk · · Score: 1

      There were all sorts of SCM's available then.

  10. You are not qualified to debug your own code by myowntrueself · · Score: 5, Insightful

    I recall a proverb, something like

    "It takes twice as much intelligence to debug code as it took to write it.
    So if you code to the best of your ability you are, by definition,
    not qualified to debug it."

    --
    In the free world the media isn't government run; the government is media run.
    1. Re:You are not qualified to debug your own code by bloodhawk · · Score: 5, Informative

      The full quote is

      “Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.”
      - Brian Kernighan

      I used to use this as my signature a few years back to try and make devs think about what they are writing. It is nearly always better to make the code simple and readable than to try and produce the best possible code. No it isn't as fun, but it is a damn side better for those that have to try and decipher your clever coding tricks later.

    2. Re:You are not qualified to debug your own code by Anonymous Coward · · Score: 0

      This makes more sense than the grandparent.

    3. Re:You are not qualified to debug your own code by npetrov · · Score: 1

      My point too. It doesn't matter which frameworks are used. As long as something is 1) simple and 2) easy to figure out and debug in the future, it's fine.

    4. Re:You are not qualified to debug your own code by Z00L00K · · Score: 1

      I see one reason to write clever code - and that's when you try to optimize for performance. But the "hot spots" in code are rare, so in those cases you can get away with it if you put in a decent comment in the code describing why it's done in a particular way.

      Multithreading is also a special beast - looks simple, is simple to implement but you really have to watch out if you have shared variables/memory. It can lead to some not so obvious errors!

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    5. Re:You are not qualified to debug your own code by Rockoon · · Score: 1

      An issue is that often times when optimizing for performance that even though performance is only important in those "hot spots" that the optimization frequently involves large chunks of the code base. If you don't understand this then you probably don't understand real optimization efforts and have just been toying with the kind of optimizations that the compiler should already be doing for you.

      The professional optimizers, the guys called in because nobody on the team can get even close to acceptable performance, they weren't called in to tweak. They were called in to change the underlying structure of everything. They aren't looking to get 10% here or 20% there. They are looking to get 10000%. They can't do that by focusing on your silly "hot spot." They can only do that by changing the problem.

      For some insight into this, check out the classic book "Graphics Programming Black Book" which is available online in many places for free (such as ) Chapter 17 (on the well known "game of life") is good on this, but the entire book is a good read.

      --
      "His name was James Damore."
    6. Re:You are not qualified to debug your own code by dargaud · · Score: 1

      It is nearly always better to make the code simple and readable than to try and produce the best possible code.

      One of my former bosses told me upon starting the job: "Whenever you think of a clever programming trick: forget it !"

      --
      Non-Linux Penguins ?
    7. Re:You are not qualified to debug your own code by Walter+White · · Score: 1

      "When you write code like me you have to be good at debugging." -me

    8. Re:You are not qualified to debug your own code by TheRaven64 · · Score: 1

      For some insight into this, check out the classic book "Graphics Programming Black Book" which is available online in many places for free (such as ) Chapter 17 (on the well known "game of life") is good on this, but the entire book is a good read.

      Please don't. The advice in that book (or, that chapter, at least) is great, if you're targeting a 486 with VGA graphics and a compiler from around 1995. Some of the advice is still good, but it's interleaved with advice that will result in much slower code (though the advice to always profile first may save you there!) and so it would be very hard to read that book without already understanding optimisation and come away more informed.

      --
      I am TheRaven on Soylent News
    9. Re:You are not qualified to debug your own code by Anonymous Coward · · Score: 0

      That's a good quote.

      Also, just for the sake of passing on knowledge, the idiom is "damn sight better" not "damn side better".

    10. Re:You are not qualified to debug your own code by david_thornley · · Score: 1

      When micro-optimizing the code, you need to know what's good and what isn't. Old advice can screw up optimizations on more modern compilers.

      For much of my youth, I was told that unrolling loops was good for performance, and quite a few years ago I sped up a slow section by rolling the loops up as tight as I could. My best guess was that my version was much more cache-friendly, but what I did know was that my profiling showed a very large speedup.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  11. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    Hahaha, I see this happening so often at work.

    I think the most memorable one for me was when a specific order of load and store instructions caused the data cache to emit incorrect information.

  12. Debugging Gone Wrong by mlookaba · · Score: 4, Interesting

    Bug 1 (my fault) : Took over working on a financial application that took identifiers and enriched them with all sorts of useful data. The original programmer had left, and nobody at the company knew anything about how it worked. Soon after, we were troubleshooting an issue reported by a client that the output data wasn't consistent between runs. I grabbed a list of all the unique security IDs I could find (about 100k) and pushed them through a couple of times just to try and replicate the issue. HOWEVER... it turns out the application was actually using the Bloomberg "By Security" interface under the hood. That was a service where you drop a list of IDs onto Bloomberg's FTP server, and they would respond with data... for a fee of $1 per security. The client got an unexpected bill of nearly $200k that month, and I had the most awkward talk ever with my boss. Fortunately, Bloomberg forgave the charges, and it turns out they were actually responsible for the inconsistent data - which was fixed on their end shortly thereafter.

    Bug 2 (not my fault) : A client/server application is returning odd responses to a particular query. Developer (we'll call him "Jason") inserts a switch into the code that dumps this query out to a hardcoded folder on the server. The code then gets checked into production WITH THE SWITCH TURNED ON. It went undetected for nearly a year because the query wasn't terribly high volume. But slowly and steadily, the query files built up over time. Our IT had lots of money to play with, so server space was not an issue. Unfortunately, the number of files was. Server performance degraded steadily, until finally this query would make it crash every time. When we eventually tracked down the cause, there were millions of files sitting in the same folder of every single server in the group. It took nearly three days just to get the OSs to delete the files without falling over.

    1. Re:Debugging Gone Wrong by Anonymous Coward · · Score: 0

      Even today Windows Explorer still has problems with more than about 10,000 files in a single folder, occasionally crashing the desktop session after spending ages scanning the selected folder's contents. We updated our cache manager to limit files/folder to 1,000, which improved things a lot, but because computers like binary so much (and so do we) we eventually dropped it to 256 and haven't had any problems since.

    2. Re:Debugging Gone Wrong by Z00L00K · · Score: 2

      Bug #1 - not a bug really. Just an awkward mistake, but good that Bloomberg dropped it. But that also shows the need for documentation of how stuff works when someone quits.

      I once developed an SMS gateway and did a test run on it but forgot to change the list of phone numbers so my manager at the time got 50 text messages with the same content. Ooops! :)

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  13. One stray ; burned a week... by DamonHD · · Score: 2

    A stray ; 30 years ago in some C took me a week to find, replacing the intended body of a loop with an empty block IIRC. I have ever since tried always to { } statement blocks so that it is easy to tell what was intended...
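
    A minimal reconstruction of that kind of bug (not the original code; both function names are made up). The stray semicolon becomes the entire loop body, and the indented statement below it runs exactly once after the loop has finished.

    ```c
    /* Intended to sum 0..4, but the ';' ends the loop statement. */
    int sum_with_stray_semicolon(void)
    {
        int i, sum = 0;
        for (i = 0; i < 5; i++);   /* empty body: the loop only counts i up to 5 */
            sum += i;              /* runs once, after the loop, with i == 5 */
        return sum;
    }

    /* With braces, the intended body is unambiguous. */
    int sum_with_braces(void)
    {
        int i, sum = 0;
        for (i = 0; i < 5; i++) {
            sum += i;
        }
        return sum;
    }
    ```

    The first version returns 5 rather than 10. Many current compilers can warn about this pattern, but the code is valid C and will build.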

    Also I strongly echo the "make sure that you're editing what you're running/debugging" comment elsewhere. Still horribly easy to get that one wrong in lots of different ways...

    Rgds

    Damon

    --
    http://m.earth.org.uk/
    1. Re:One stray ; burned a week... by FrozenGeek · · Score: 2
      "Also I strongly echo the "make sure that you're editing what you're running/debugging" comment elsewhere. Still horribly easy to get that one wrong in lots of different ways..."

      Agreed, although a modern VCS really really helps avoid this. Wish I'd had Git back in the '80s.

      --
      linquendum tondere
    2. Re:One stray ; burned a week... by Darinbob · · Score: 1

      I stick with the ";" for loop bodies but it's always on a line by itself so that it's obvious.
      Another problem along the same lines is to not trust the indentation from other people's code. So you miss the ";" at the end of the line with the "while" because the indentation is fooling you. Some people just insist on their own indentation style even if the code above and below it use different styles. I even had a boss once who cut and pasted code without re-indenting afterwords.

    3. Re:One stray ; burned a week... by Darinbob · · Score: 1

      If you're paying attention that is. You can edit code, save it, pop up the window and type "make", see stuff actually build, then hit your debugger and it's loading the wrong code. Happens if you're forced to use some lame IDE or debugger for the chip while using better tools to develop with (because every damn chip maker thinks they should make some proprietary half assed IDE rather than make open debugging tools).

    4. Re:One stray ; burned a week... by Anonymous Coward · · Score: 0

      Came across a few cases where putting a tab into a #define macro seemed to make the MS compiler miscompile things.

      I now *only* use spaces. Never tabs.

      When I come across projects that do what you said ('dangling ifs') and mandate tabs, I run the other way.

    5. Re:One stray ; burned a week... by Z00L00K · · Score: 1

      Or just have the path variable set wrong - current directory last in the path can yield some interesting effects when coding.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    6. Re:One stray ; burned a week... by CSMoran · · Score: 2

      I even had a boss once who cut and pasted code without re-indenting afterwords.

      These go into a separate chapter usually, just like forewords, so they might be indented differently on purpose.

      --
      Every end has half a stick.
    7. Re:One stray ; burned a week... by Anonymous Coward · · Score: 1

      Today you can do:

      gcc -Werror=empty-body

      You're welcome.

    8. Re:One stray ; burned a week... by boomer_rehfield · · Score: 1

      Took over a large perl code project from my boss years ago where this happened. Apparently the two people that had worked on it after him each had their own indentation (and programming) styles and it was ridiculously hard to read or debug. I ended up going through all of the code over a few months to make it all conform just so I didn't go insane try to read through it.

      --
      Carpe Canem - Seize the Dog
    9. Re:One stray ; burned a week... by sconeu · · Score: 1

      My worst bug ever was a third order bug (A changes B, which later results in C being changed, which finally manifests as visible defect D).

      This occurred four hours into a full bore system integration test.

      This was using a Z8000 CPU. We wound up having to put an ICE on the thing, but because of all the radio signals, the ICE cable had to go into the case, the gap sealed with foil, and the ICE cable also wrapped in foil. Then we recorded (using analog cassette tape) all the FSK radio signals, and played them back into the boxen.

      The Z8000 compiler used jump tables and the CPIR (compare/increment/repeat) instruction to implement switch statements. I was the kernel guru, and the error appeared out of a kernel error message. We didn't have any memory protection.

      What happened was that someone ignored a return value, and wound up indexing something by -1, which eventually wound up modifying a switch table...
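
      A sketch of the first link in that chain and its fix, using hypothetical names (table, find_slot, replace_value). A lookup that signals "not found" with -1 is harmless only if every caller checks for it before using the result as an index.

      ```c
      #define TABLE_SIZE 4

      /* Hypothetical lookup table. */
      static int table[TABLE_SIZE] = { 11, 22, 33, 44 };

      /* Returns the index of key, or -1 if it is not present. */
      int find_slot(int key)
      {
          for (int i = 0; i < TABLE_SIZE; i++) {
              if (table[i] == key)
                  return i;
          }
          return -1;
      }

      /* Replaces key with new_value. Returns 1 on success, 0 if key is absent.
         Without the check, a miss would write to table[-1], i.e. into whatever
         happens to sit in memory just before the table. */
      int replace_value(int key, int new_value)
      {
          int slot = find_slot(key);
          if (slot < 0)
              return 0;
          table[slot] = new_value;
          return 1;
      }
      ```

      On a system without memory protection, the unchecked write would succeed silently, which matches how the damage only surfaced much later.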

      --
      General Relativity: Space-time tells matter where to go; Matter tells space-time what shape to be.
  14. Oracle desupported Rule Based Optimizer with 10g by garyisabusyguy · · Score: 1

    A trigger on a busy table was using a Rule Based Optimizer
    We had done a 'rough' system test for upgrading from 9i to 10g, but the system did not have a realistic production load put on it
    The DBA group placed the upgrade into production and suddenly the system drags to a crawl
    It took us very little time to figure out the problem, but a few hours to get through the existing change control process and to convince a DBA manager (who had failed to tell us that the database release included a major change, and who had been on the team that introduced the hint years earlier) that dropping the hint entirely would be the best way to go, since the new Cost Based Optimizer would recognize the query and make adjustments for it.

    --
    Wherever You Go, There You Are
  15. I still have a javascript bug by minstrelmike · · Score: 1

    I have a bug in javascript that I can't fix.
    I can't remember what it is now but it's documented in the code that if you remove the Are you sure? prompt (or remove the now-hidden debug statement), the code doesn't work. When you display the variable, or just wait and ask, then the code does work.

    Every couple years when someone scans thru the code, they'll spend a day or two trying to figure out what's really happening.

    1. Re:I still have a javascript bug by secretsquirel · · Score: 0

      let me guess, a setTimeout(function(){/*run code here*/}, 18) also fixes it?

    2. Re:I still have a javascript bug by eulernet · · Score: 1

      This has probably something to do with global/local scope.
      If your variable is declared with a "var", it's local, otherwise, it's global.
      You probably missed some var i, and your i variable is global, leading to random crashes if the loop is used at several locations.

  16. More of an update than a bug by coop247 · · Score: 3, Interesting

    First job out of college doing tech support for a big corp. One day thousands of Win2000 computers start taking multiple hours to boot up. Nobody can figure out what the problem is, got like 20 people working on it for almost two weeks.

    After digging through logs and error messages I discover that some idiot who had denied doing anything had sent out an update via our client management software to add a new local user for support purposes. He didn't do this via a script, rather "recorded" himself adding it to a machine and then sent out a copy of the files and registry entries that had changed. Unbeknownst to this genius, the local security database is a binary (pretty sure encrypted) file that you can't just go copying between machines.

    I put together a script that repaired the local database and fixed the problem in a couple minutes. But literally had thousands of workers sitting around doing nothing waiting for computers to boot for like 2 weeks.

    --
    //TODO: Insert catchy phrase
    1. Re:More of an update than a bug by Anonymous Coward · · Score: 0

      How on earth did the computer still boot? I know it's Win2k but jeez, you'd think that a boot disk would be the only option...

  17. Address dependent assembly by Anonymous Coward · · Score: 0

    I wrote some Assembly (Saturn processor in case anyone cares) once that ran differently when run on an even memory address, vs run on an odd memory address. That was no fun to figure out, since the debugger always ran it on an even memory address!

    1. Re:Address dependent assembly by Anonymous Coward · · Score: 0

      lol, that's what "align" directives are for.

  18. debugger by BradMajors · · Score: 1

    One of my toughest bugs didn't exist.

    My code was actually working correctly, but the debugger under certain conditions would display wrong values. I wasted a lot of time trying to find the bug in my code.

    1. Re:debugger by Jeremi · · Score: 4, Interesting

      Some people, when trying to analyze a buggy program, think "I know, I'll use a debugger". Now they have two buggy programs to analyze.

      -- a grumpy old programmer

      --
      I don't care if it's 90,000 hectares. That lake was not my doing.
    2. Re:debugger by Z00L00K · · Score: 1

      Programs that crash when running under a debugger are always fun. Sometimes it's better and easier to run the program normally and then do a post mortem on the core file generated. Hence "generating core dumps" is a standing joke in some development.

      Fortunately the number of cases where a debugger doesn't work has diminished greatly over the years compared to how it was under MS-DOS.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  19. Prayer can help your code life. by GoodNewsJimDotCom · · Score: 3, Funny

    I once had a heisenbug, which was a simple dereferenced pointer. The problem is that I had a couple thousand lines of code, and the bug wasn't where I was recently coding. Every coder knows to check for bugs in their most recent code, but a dereferenced pointer can be anywhere in the code. Anyway, I decided to break down and pray for help. Then within moments I read through a random line of code in some random file and debugged the problem. Since then, I often pray that I do well in general, that I don't get stuck on a brick wall of tech, that God helps me while I code, and a host of other cool stuff. I find things flow more smoothly since then and I don't fight with code. I know God is real, and I've come to discover prayer does help too. In addition to that, I've been more careful with pointer math, biasing array memory structures more.

    1. Re:Prayer can help your code life. by frank_adrian314159 · · Score: 3, Insightful

      I'm glad you found the truth - that being more careful with pointer math and biasing array memory structures more is truly a blessing. May you also discover the higher truth that coding in languages that need no such nonsense (as their automated memory allocation and deallocation routines have been far better debugged than yours) is even more blessed and may lead you more quickly to the communion with defect-free code you desire.

      --
      That is all.
    2. Re:Prayer can help your code life. by cheesybagel · · Score: 2

      Use valgrind. It helps. A lot.

    3. Re:Prayer can help your code life. by Jeremi · · Score: 1

      I know God is real, and I've come to discover prayer does help too.

      Interesting; I found just the opposite. When I was a programming n00b working on my C assignments in college, and it was the night before it was due and I couldn't figure out why it was crashing, I tried praying, hoping, wishing, random changes to the code, furrowing my brow at the screen, loud cursing, exhaustive special-case-logic, and a dozen other increasingly desperate non-methods to "make the code work" without actually understanding it.

      Just before the 4 AM deadline for submissions, the code would still be crashing, so I'd give up, email in the non-working code, and get a poor grade.

      Eventually I realized that the only way to get the code to work was to understand what I was doing, and that if I didn't understand something I needed to learn about it (through experimentation, or reading the man pages, or asking a fellow programmer for help, or simplifying the program to make it more manageable, or etc) until I did understand it. Once I understood what was really going on under the hood, the nature of the problem (and therefore its solution) usually became obvious and trivial.

      I think it was this more than anything else that cemented my atheism -- the repeated experience of prayer not making a bit of difference, followed by the realization that only the application of logic and observation would lead me to the correct solution.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    4. Re:Prayer can help your code life. by Anonymous Coward · · Score: 0

      You weren't praying properly. Also, at that point you didn't have the implicit knowledge he did. Meditating would do the same thing for you now.

    5. Re: Prayer can help your code life. by Anonymous Coward · · Score: 0

      Or, you failed to notice your prayers were answered by the realisation that applying reason and logic is a general requirement of being an averagely useful human being. But no, narcissism has become your position. I will now flee to a safe place, using reason and logic to realise this is a good course of action...

    6. Re:Prayer can help your code life. by kilroy_hau · · Score: 1

      God is Real... ...unless previously declared as integer

      -Old Fortran joke

      --


      Kilroy was here!
    7. Re:Prayer can help your code life. by hackwrench · · Score: 1

      Which God?

      And regarding your web page, what do you have against eating beautiful things? And also how could you tell it was God telling you "good news", and not some person some distance away from you that was talking to some other person or just a voice that you imagined? And as for, "while I was driving, I felt unable to turn my eyes to look at billboards or be distracted from the road." my mom drives like that too. How often do you not drive like that? If your dad had legal ownership of that Bible you've probably seen it before.
      If "The Bible is God's infallible word, and that he guided the translators perfectly to copy it," how come there's so many different translations in English alone?

      https://en.wikipedia.org/wiki/...
      The exact relationship between the Book of Isaiah and any such historical Isaiah is complicated. One widespread view sees parts of the first half of the book (chapters 1–39) as originating with the historical prophet, interspersed with prose commentaries written in the time of King Josiah a hundred years later; with the remainder of the book dating from immediately before and immediately after the end of the exile in Babylon, almost two centuries after the time of the original prophet

    8. Re:Prayer can help your code life. by 140Mandak262Jamuna · · Score: 1

      God is Real... ...unless previously declared as integer

      -Old Fortran joke

      Shouldn't that read, GOD is REAL ?

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
  20. Test code by gnu-sucks · · Score: 1

    Write code to test your code. Hit every edge case hard, every boundary condition.

    All too often we tend to test our code by just running the overall program, but this is not good enough. Running the overall program does not introduce a wide enough range of input parameters to every function.

    Write test code. Write code to log your inputs and outputs to files early in the development cycle. Don't get swamped down in the land of trying to debug code that was never written to be debugged.

    I had many many tough bugs back in the day before I learned this lesson. Once I got this behind me, it was a lot easier.
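
    A minimal sketch of what that can look like in C++. The function under test, `clamp_to_range`, and the test driver `run_boundary_tests` are made-up names for illustration, not anything from a real project. The point is that the checks call the function directly at and around each boundary, instead of relying on whatever values the full program happens to feed it.

```cpp
#include <climits>

// Hypothetical function under test: forces value into [lo, hi].
int clamp_to_range(int value, int lo, int hi) {
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

// Returns the number of failed checks; 0 means every edge case passed.
int run_boundary_tests() {
    int failures = 0;
    // Exactly on each boundary.
    if (clamp_to_range(0, 0, 10) != 0) ++failures;
    if (clamp_to_range(10, 0, 10) != 10) ++failures;
    // One step outside each boundary.
    if (clamp_to_range(-1, 0, 10) != 0) ++failures;
    if (clamp_to_range(11, 0, 10) != 10) ++failures;
    // Extremes of the input type.
    if (clamp_to_range(INT_MIN, 0, 10) != 0) ++failures;
    if (clamp_to_range(INT_MAX, 0, 10) != 10) ++failures;
    // Degenerate range where lo == hi.
    if (clamp_to_range(5, 7, 7) != 7) ++failures;
    return failures;
}
```

    Returning a failure count rather than stopping at the first problem makes it easy to log every broken edge case in one run.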

  21. C library sleep(x) caused code instabilities... by Anonymous Coward · · Score: 2, Interesting

    My favourite head scratcher - back using Motorola's version of Unix, we had a voice response (IVR) application that would poll for activity, and otherwise sit idle using the sleep() command. The code had interrupt handlers SIGUSR (iirc) that would perform "real-time" activities as necessary (handling call hang ups, touch tone digit receipt, etc). When running under a load test scenario during a quality cycle, we kept running into scenarios where 1 in a 1000 or so instances of our event handlers were NOT handling the activities such as call hangups, missing digits, etc.

    After MUCH digging, having witnessed our interrupt handling code, half way through a trace, simply stop executing, we did a reverse disassemble of the sleep command, and found this jewel: a SETJMP on invocation, and a LONGJMP back to the stack location when the SIGALRM timer that it set ran out. Assumption being that while in the sleep() call, no other code would be executing. In reality, if our event handlers were running when the SIGALRM timer ran out, the sleep call did a LONGJMP, restoring the stack back to its original state, wiping our interrupt handler off the stack.
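
    The failure mode can be imitated without any real signals. The sketch below is only a simulation under stated assumptions: `flawed_sleep`, `event_handler` and `simulated_alarm` are invented names, and ordinary function calls stand in for the SIGUSR and SIGALRM deliveries. It is not the vendor's code, just the same shape: the "sleep" saves a jump point, a handler starts running on top of it, and the timer expiry jumps back and discards the handler half way through.

```cpp
#include <csetjmp>

static std::jmp_buf sleep_env;          // jump point saved by the "sleep"
static int handler_steps_done = 0;      // how far the event handler got

// Stands in for the SIGALRM handler inside the flawed sleep().
void simulated_alarm() {
    std::longjmp(sleep_env, 1);         // unwinds everything above the setjmp
}

// Stands in for the application's event handler (hang-up, digit, ...).
void event_handler() {
    handler_steps_done = 1;             // first half of the work
    simulated_alarm();                  // timer expires while we are running
    handler_steps_done = 2;             // never reached: our frame is gone
}

// Returns 1 when it "woke up" through the longjmp path.
int flawed_sleep() {
    if (setjmp(sleep_env) == 0) {
        event_handler();                // an event arrives during the sleep
        return 0;                       // not reached in this sketch
    }
    return 1;
}
```

    After `flawed_sleep()` returns, `handler_steps_done` is still 1: the second half of the handler silently never ran, which matches the "simply stop executing" symptom in the trace.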

    When Motorola was confronted, the first reaction was "no, we didn't do that. We're looking at the code." Only when we showed them the disassembled output did they admit there was an issue with the release of software we were using.

    That one took 4 days for me to track down as a junior programmer at the time, some 25 years ago.

    1. Re:C library sleep(x) caused code instabilities... by Z00L00K · · Score: 1

      Only 4 days - that's good for a problem like that - especially if you are junior.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    2. Re:C library sleep(x) caused code instabilities... by TheRaven64 · · Score: 1

      longjmp from a signal handler to normal code is undefined behaviour anyway - there are lots of things that can break it (signal stacks are one). God created ucontext (unfortunately, while drunk) for a reason.

      --
      I am TheRaven on Soylent News
  22. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    This is why printf is (usually) self-debugging

  23. Agreed by justthinkit · · Score: 1

    There are few things more exhilarating than writing code.

    Controlling the machine with just a few commands? Cool.

    Debugging said program, with just a few hours of stress? Not so fun.

    But, recompiling and rerunning said program when one is sure it is now bug-free? Like hammering the gas pedal of a muscle car!

    Oops, missed one. Screech.

    Three of my favorite bugs or gaffes.

    (1) endlessly tweaking and commenting my autoexec.bat file. Only to eventually overwrite something with nothing. With absolutely no backup. Lesson learned? Don't waste time tweaking autoexec.bat

    (2) putting the computers at all our department's campus-spanning internet-connected locations into a tight "give me the code again" loop, and then going home for the night. Lesson learned? Normally polite campus sys admins spend much of their time counting the tens of gigabytes of data they reluctantly ship.

    (3) one can gain full-time summer employment from the pursuit of a single bug. Nerdy organic chem professor has custom chem. sim program made for the previous year's grad. students' thesis. Only it doesn't work. Lesson learned? Tiny variables need 32-bits of precision.

    --
    I come here for the love
    1. Re:Agreed by plopez · · Score: 2

      "recompiling and rerunning said program when one is sure it is now bug-free"

      That's a neat trick. Please let us know how you do it.

      --
      putting the 'B' in LGBTQ+
    2. Re:Agreed by sublayer · · Score: 2

      ... [did something] With absolutely no backup. ...

      Lesson learnt: Backup

    3. Re:Agreed by Anonymous Coward · · Score: 0


      #include

      int main(void)
      {
          printf("Hello World\n");
          return 0;
      }

      Seems like a good start if your intention is to check if a bunch of compilers on different platforms work.

    4. Re:Agreed by Zero__Kelvin · · Score: 1

      Amazingly, you managed to fail to learn the correct lesson from each of your gaffes. I can only imagine the fun that would ensue if I tried to explain to my boss that I wasn't going to make changes to any system configuration files because I was once a moron and overwrote my autoexec.bat file.

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
    5. Re: Agreed by Anonymous Coward · · Score: 0

      That code does not even compile.

    6. Re: Agreed by Muad'Dave · · Score: 1

      Slashdot ate his <.

      --
      Tiller's Rule: Never use a word in written form that you've only heard and never read. You will end up looking foolish.
    7. Re: Agreed by RabidReindeer · · Score: 1

      Must be a bug.

      Seriously. IBM has a program whose sole purpose in life is to do nothing. It went through 4 or 5 releases before they made it do nothing correctly. Then they had to do further maintenance on it when mainframes went 64-bit.

    8. Re: Agreed by RavenLrD20k · · Score: 1

      It's because the < and > symbols around the library name (presuming stdio.h) made the comment window think it was supposed to be an html tag instead of a library include statement. Preview button would work wonders. (Incidentally, my post would have had the same problems if I used the actual Shift-, or shift-. keys instead of using the ampersand codes for less than and greater than.)

    9. Re:Agreed by justthinkit · · Score: 1

      The humor in (1) and (2) was too hard to perceive, apparently. Let's try a more left-brained approach.

      Regarding overwriting Autoexec.bat, it is obvious that I should have been backing it up. Not stating that is me showing respect to slashdotters. Maybe you forgot the forum we are commenting on here?

      The more subtle issue is that the file only works with a single name, so there is a source code versioning problem right off the bat.

      Versioning it as, say, Autoexec_20150804.bat, would be a solution, except my story predates long filenames.

      Copying it to, say, dated folders of the form yyyymmdd would also work...unless the mistaken overwrite happened on the same day as a large number of earlier edits.

      The reality is that it is easy to open and edit Autoexec.bat.
      Easy to keep a single backup of it. And harder to keep it properly backed up.

      My tongue-in-cheek "lesson learned" was to not waste time documenting Autoexec.bat -- because it didn't matter. It was my Autoexec.bat only. There was no payback to messing with it further. And backup of it made it more problematic.

      Another lesson learned here is "beware of automated backup processes". They can be too good, propagating a problem (in this case a zip of all my .BAT files) to multiple other places. So offlining or alternating backups is part of a better solution, etc.

      The real point? That there are a lot of lessons to be learned from editing one simple file. I had tried to document everything, so that it would be useable later...and ended up with nothing despite my best efforts at the time.

      So humbleness is a big part of doing something well.

      Speaking of a lack of humbleness, 0 K, how do you figure I learned the wrong lesson on (3) above?

      --
      I come here for the love
    10. Re: Agreed by plopez · · Score: 1

      Win.

      --
      putting the 'B' in LGBTQ+
    11. Re: Agreed by plopez · · Score: 1

      Yep. Now prove to me that it will compile everywhere all the time for all time in the future.

      --
      putting the 'B' in LGBTQ+
    12. Re: Agreed by plopez · · Score: 1

      Win. I think you get the point.

      --
      putting the 'B' in LGBTQ+
    13. Re:Agreed by david_thornley · · Score: 1

      You may not have noticed, but there are software systems that have been available for a few years that can manage different versions of a file that will have the same name. You can even revert to an earlier version when you like.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
    14. Re:Agreed by justthinkit · · Score: 1

      Thanks. Known about them for years, if not tens of years.

      As my pre-long filenames reference suggests, my adventure in autoexec.bat land predates Win9x. Let's say it was 20+ years ago.

      It was my personal autoexec.bat file, not some major application. Of which I had several on the go at that time. Each of which had daily (& off site) back up procedures.

      --
      I come here for the love
    15. Re:Agreed by Zero__Kelvin · · Score: 1

      Just admit that you completely screwed the pooch and move on with your life.

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  24. Talk to the engineers who designed the chip by thinkwaitfast · · Score: 1

    about unpublished errata. According to the lead engineer (at a major cpu vendor) there are more hardware bugs than software bugs.

    1. Re:Talk to the engineers who designed the chip by Anonymous Coward · · Score: 0

      Don't divide, Intel Inside.

    2. Re:Talk to the engineers who designed the chip by PingSpike · · Score: 1

      At Intel quality is priority number 0.99987845!

  25. Great topic by FrozenGeek · · Score: 2
    Seriously, great topic.

    Two bugs come to mind, one that I wrote and fixed, one that I fixed but did not create. The one that I created was an assembler bug, code written in UYK-502 assembler (military computer). I screwed up one op code, specifying LK (load constant) instead of L (load from memory address). The difference in the code was one bit, but I had to single-step through the code to find the bug - took me hours for one stinking bit.

    The other bug, also on the UYK-502 computer, was a bug in the micro-code. The guy who wrote the micro-code for one particular instruction had ignored the user guide for the bit-slice processor and had implemented a read-modify-write operation in a single micro-code instruction. It worked for him because the timing hardware was slow enough. Unfortunately, a couple of years later, the manufacturer of one of the chips in the timing hardware improved the internal workings of the chip so that one of the lines dropped sooner than it did on older versions of the chip (NB: the chip still met the same specs - it was just faster). Debugging was a pain. The computer used a back-plane, and the timing hardware and the bit-slice processor were on different cards. When we put either card on an extender so we could connect a logic analyser, the delay added by the traces on the extender caused the problem to go away. It took two of us a week to find the problem. The fix was to update the microcode ROMs for every computer that received the new timer card.

    --
    linquendum tondere
  26. Best lesson I learned from bugs? by frank_adrian314159 · · Score: 1

    Stop writing so many of them?

    --
    That is all.
  27. My best bugs weren't mine by Snotnose · · Score: 2

    For about 10 years I was a troubleshooter, they'd assign me something to work on and then interrupt me for a big ass bug.

    First big bug? Linux system would crash after about a week. Diagnosis? When it crashed it was out of FDs. Turns out a kernel resource was opening a file, exiting, and never closing the fd. Time to find? About a week. Time to diagnose? About a minute. Time to fix? About 10 minutes.

    How did I find it? Waiting until it died, did some built in command to see WTF happened, looked at the source code, fixed.
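
    The bug above was in kernel code, which would not be written this way, but the same class of leak in user-space C++ is usually avoided by tying the descriptor to an object's lifetime. A rough sketch, assuming a POSIX system; `FdGuard` and `probe_once` are hypothetical names:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical RAII wrapper: the descriptor is closed on every exit path.
class FdGuard {
public:
    explicit FdGuard(int fd) : fd_(fd) {}
    ~FdGuard() { if (fd_ >= 0) ::close(fd_); }
    FdGuard(const FdGuard&) = delete;             // one owner per descriptor
    FdGuard& operator=(const FdGuard&) = delete;
    int get() const { return fd_; }
private:
    int fd_;
};

// Opens a file, reports which descriptor number it got, and returns.
// The number is only informational: the descriptor is already closed
// again by the time the caller sees it.
int probe_once(const char* path) {
    FdGuard guard(::open(path, O_RDONLY));
    return guard.get();
}
```

    Since `open` hands out the lowest free descriptor number, repeated calls to `probe_once` in a single-threaded program keep returning the same number. If that number creeps upward between calls, something is not being closed.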

    Second big bug. System would reboot randomly within an hour to a week due to a watchdog timer firing. Even had a "magic" laptop that made it crash more often. Diagnosis? When you read from a register the chip would sometimes hang. Time to diagnose? About a month, most of that waiting for the damned system to crash. Didn't help I only had 1 JTAG, I couldn't do anything else while waiting for the system to crash. I spent a lot of time looking for interesting websites during that month. Time to fix? For me, about 30 seconds. It was a system status register, nobody cared except the hardware folks, I quit reading it. For the hardware folks? Don't know, don't care.

    How did I find it? It was a cellphone. When it restarted JTAG was initialized at the reboot point. I found the point in software that initialized the memory controller. As the system never lost power memory was intact. Found the process crashing. Then I created an in-memory array. As the code progressed I updated this in-memory array, stuff like "code does something, I put 0x10 into my array. Code does something else, 0x20 into my array". After a couple days of "it's just reading a register, I messed up somewhere" I finally concluded "reading this register causes it to crash about 1 time in 10,000"
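
    The in-memory array trick might look roughly like this. Everything here (`Breadcrumbs`, `drop_crumb`, `last_crumb`, the buffer size) is made up for illustration. On the real phone the array would also have to live somewhere that survives the watchdog reset and be inspected over JTAG, which this sketch does not attempt.

```cpp
#include <cstddef>
#include <cstdint>

// Small circular trail of checkpoint codes kept in ordinary memory.
struct Breadcrumbs {
    static const std::size_t kSize = 8;
    std::uint8_t codes[kSize];
    std::size_t count;                  // total crumbs ever written
};

static Breadcrumbs trail = {};          // zero-initialized at startup

// Called at interesting points: drop_crumb(0x10), drop_crumb(0x20), ...
void drop_crumb(std::uint8_t code) {
    trail.codes[trail.count % Breadcrumbs::kSize] = code;
    ++trail.count;
}

// Most recent checkpoint reached, or 0 if none was recorded yet.
std::uint8_t last_crumb() {
    if (trail.count == 0) return 0;
    return trail.codes[(trail.count - 1) % Breadcrumbs::kSize];
}
```

    After a crash, the last code in the trail tells you which step the code had reached, and the one that never got written tells you which step it died in.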

    Third big bug? Cellphone base station. Card handled 3 T1 lines, did the analog/digital and digital/analog muxing for each call. Cells would randomly drop out after a day or so, they didn't come back until you rebooted the system. It's a base station, you never reboot the system. After about 3 months of this I got asked to look into it. I'm like, dafuq? It's a DSP issue, I don't know jack about DSP, I'm screwed. Honestly, I had no idea how to even approach this problem.

    The fix? I was telling myself how screwed I was, and I'd never get a raise, and generally killing time reading the docs. Found a library call that said "do not call this during an ISR". It was being called from an ISR. Sent email to the DSP folks asking them to comment out that line, they did and sent me the binary blob to load onto the card. I did, problem went away.

    1. Re:My best bugs weren't mine by flargleblarg · · Score: 1

      Diagnosis? When it crashed it was out of FDs.

      I usually just switch to using FEs and FFs when the FDs run out.

    2. Re:My best bugs weren't mine by tehlinux · · Score: 1

      Some government contracts require FDs though. Damn bureaucrats!

      --
      Most linux users don't know this, but the man pages were named after Chuck Norris. Chuck Norris fsck'ing hates noobs!
    3. Re:My best bugs weren't mine by Anonymous Coward · · Score: 0

      I was telling myself how screwed I was, and I'd never get a raise, and generally killing time reading the docs.

      Ha! That's me too. Don't feel bad, we've got to put *something* on our resumes, right?

  28. Incrementing by darkain · · Score: 3, Interesting

    One night while coding half asleep, I wrote the following to increment a variable in C++

    x = x++;

    The problem with this code is that it is undefined behavior. It looks okay at first glance, but when you consider the machine code that would be built from it, a bit of ambiguity arises. The problem is the = sign vs the ++ operator: both assign to the x variable, but it is not well defined which assignment should happen first/last. The code was actively being used in both MSVC and GCC environments, each producing opposite assignment ordering. This was awesome to debug, since the code "worked" on one platform but not the other!
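
    Whatever a given compiler does with that line, the intent can be written so that x is modified exactly once per statement. A small sketch; `increment`, `Step` and `step` are illustrative names only:

```cpp
// Plain increment: x is modified exactly once in the expression.
int increment(int x) {
    ++x;
    return x;
}

// When both the old and the new value are needed, separate statements
// keep the order explicit instead of leaving it to the compiler.
struct Step {
    int before;
    int after;
};

Step step(int x) {
    Step s;
    s.before = x;   // read the old value first
    x += 1;         // then modify, in its own statement
    s.after = x;
    return s;
}
```

    With one modification per statement there is nothing left for two compilers to disagree about.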

    1. Re:Incrementing by Z00L00K · · Score: 1

      It's also a good way to ensure your code only works on your favorite platform!

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    2. Re:Incrementing by flargleblarg · · Score: 3, Funny

      Well, next time write:
      x = ++x;

    3. Re:Incrementing by CSMoran · · Score: 1

      x = x++;
      It looks okay at first glance,

      TBH, it screams "@fixme, sequence points" even at a first glance.

      --
      Every end has half a stick.
    4. Re:Incrementing by epine · · Score: 1

      Well, next time write:
      x = ++x;

      I've pretty much trained myself to never use post-increment unless a statement is incorrect without it, and even then I'm unhappy if the statement has any other side effect at all (unless the entire idiom is lifted straight from K&R, and then I ponder why the code is rolling its own iterator loop.)

      Post-increment can fail in interesting ways (yes, those darn sequence points). In addition, when using a template metaprogramming library, post-increment can trigger a large state copy that an unwary programmer doesn't expect. It can be horrifically less efficient.
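
      To make the cost concrete, here is a sketch with a made-up iterator-like type, `HeavyCursor`, that counts its own copy-constructions. The type and its size are assumptions for illustration; real library iterators vary.

```cpp
#include <vector>

// Hypothetical iterator-like type that drags a lot of state around.
struct HeavyCursor {
    static int copies;                  // counts copy-constructions
    std::vector<int> state;
    int pos;

    HeavyCursor() : state(1000, 0), pos(0) {}
    HeavyCursor(const HeavyCursor& other)
        : state(other.state), pos(other.pos) { ++copies; }

    // Pre-increment: advance in place, no copy needed.
    HeavyCursor& operator++() { ++pos; return *this; }

    // Post-increment: has to save the old state so it can be returned.
    HeavyCursor operator++(int) {
        HeavyCursor old(*this);         // the full state copy happens here
        ++pos;
        return old;
    }
};

int HeavyCursor::copies = 0;
```

      `++c` leaves the copy counter untouched, while `c++` bumps it at least once even when the returned value is thrown away.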

      On the other hand, the ternary operator (even a compound ternary operator) has FAR FEWER semantic ass-bites than plain old post-increment.

      Post-increment: Visually familiar, but badly behaved.
      Ternary: Visually unfamiliar (to some), but well behaved.

      In the STL context, an important property of the ternary operator is that you don't have to declare the return type of the expression (whereas with an if/else assignment into an intermediate variable, you do). Maybe this is less important now with better "auto" support.

      A prudent ?: will also keep you on the straight and narrow with respect to the ODR. You can avoid re-typing shared sub-expressions. Anyone ever debugged a program where consecutive lines of code intended to contain an identical subexpression, but actually didn't? No, I didn't think so.

      Really, when someone complains about the ?: operator as some form of diabolical trickery, I flip the bozo bit. But you just can't get a programmer to embrace it for The Right Reasons who won't first master sequence points and the horror show of post-increment.

      Grasshopper, this is your debugger.

      Debugger, this is your new grasshopper. Enjoy your tasty meal.

    5. Re:Incrementing by GiganticLyingMouth · · Score: 1

      x = x++; It looks okay at first glance,

      TBH, it screams "@fixme, sequence points" even at a first glance.

      Whenever I see 'sequence points', I want to get all pedantic and point out that the term itself is deprecated as of C++11, in favor of using more precise terminology concerning memory ordering (and we actually can now, because of the C++11 memory model). But then I refrain.

    6. Re:Incrementing by CSMoran · · Score: 1

      Thanks, I didn't know that.

      --
      Every end has half a stick.
    7. Re:Incrementing by Anonymous Coward · · Score: 0

      lol what?

      x++;

    8. Re:Incrementing by Anonymous Coward · · Score: 0

      Reminds me how I one day wrote something like:

      x+=y;
      z-=a;
      b!=c;

  29. Mishandling handles by Tablizer · · Score: 1

    I once contracted with a shop that had a process that generated garbled output data rows. It appeared to be extra stuff that didn't affect (over-write) the intended rows. The shop had added an extra processing step to filter out the garbage rows and eventually just worked around the glitch.

    They had asked me to try to track it down, among other projects, because they were newbie programmers. I couldn't figure it out either because it never appeared in my intermediate trace statements. I put a trace (print) statement before every "write" in the program. None of the prints showed the garbage, yet garbage ended up in the output file. Head-scratcher galore. I was supposed to be "the expert", and thus feeling a bit deflated.

    On I think the last day of my contract, I was running a test copy of the code with some changes to perform speed tests. I went to try a certain speed tweak, and I suddenly spotted the error: the file handle variable was re-used for another non-handle purpose, something like this:

    fhandle = openFileForWrite(fileName);
    ...
    writeToFile(fhandle, someData);
    ...
    fhandle = countX + countY - 7;
    ...
    closeFile(fhandle);

    The actual handle name was something like "qhand". But a regular variable, "quantity on hand" ended up "qhand" also, the same name as the file handle.
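
    In a language with stricter types, this kind of mix-up can be turned into a compile error by giving the handle its own type. A sketch in C++, with hypothetical names (`FileHandle`, `quantity_on_hand`); the original program's language and API are not known, so this only illustrates the idea:

```cpp
#include <type_traits>

// Hypothetical wrapper: a file handle is its own type, not a bare int.
struct FileHandle {
    explicit FileHandle(int raw) : fd(raw) {}
    int fd;
};

// An int expression can no longer be assigned to a handle by accident:
//     FileHandle qhand(3);
//     qhand = countX + countY - 7;   // would not compile
static_assert(!std::is_assignable<FileHandle&, int>::value,
              "int must not silently become a FileHandle");

// The "quantity on hand" value gets its own name and stays a plain int.
int quantity_on_hand(int countX, int countY) {
    return countX + countY - 7;
}
```

    The `explicit` constructor is what blocks the silent conversion; without it the stray assignment would compile and the handle would be overwritten exactly as in the story.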

    When it dawned on me what happened, I started screaming like a wildman and the others popped out of their cubicles to see what was going down. They took my coffee away :-)

    As far as the link on goofy video game bugs, I remember somebody discovered that if you don't put a game cartridge in all the way, certain characters dance and spin randomly and rapidly in the sky.

    It created an Internet meme, and spoofs started appearing all over, typically using stop-motion with live actors. I forgot the nickname of the meme, but I found it hilarious. It took my mind off the handle bug.

    1. Re:Mishandling handles by Z00L00K · · Score: 1

      It's way too common that people re-use variables for different purposes. Especially in Visual Basic.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
    2. Re:Mishandling handles by david_thornley · · Score: 1

      One of my most destructive bugs was because we were using a single variable to hold one value that was used to mean two different things. I was the lucky guy who wrote the line of code that modified the value to be correct for one use but not the other. When one is writing code that winds up in CNC machines, the results can be spectacular.

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  30. Re:Hardly devastating, but a waste of several hour by JazzXP · · Score: 1

    Been there done that. Lol yep, lost half a day on it...

  31. Great cross-platform MS Office story by Cali+Thalen · · Score: 2

    http://blogs.msdn.com/b/rick_s...

    Read this years ago, and thought it was interesting at the time...I've saved the link for years. Really detailed story about finding a really complicated bug in MS Word way back in the day.

    --
    Chaos, panic, disorder...my work here is done.
  32. Re:With Republicans pushing... by Anonymous Coward · · Score: 0, Insightful

    Yeah.. That's why the Democrats are pro open immigration too.

  33. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    This has happened to me too. Which is why all of the code now includes a startup banner that displays the time and date on which it was built.
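
    One simple way to get such a banner in C or C++ is the standard `__DATE__` and `__TIME__` macros. A minimal sketch; `build_banner` and `print_banner` are illustrative names, and note that the macros record when this particular source file was compiled, so the file has to be rebuilt on every build for the banner to stay honest.

```cpp
#include <cstdio>

// Text fixed at compile time; __DATE__ and __TIME__ are standard macros
// that expand to string literals, so they concatenate with the others.
const char* build_banner() {
    return "Built on " __DATE__ " at " __TIME__;
}

// Call this first thing at startup.
void print_banner() {
    std::printf("%s\n", build_banner());
}
```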

  34. Self-Checking Code by Cassini2 · · Score: 3, Insightful

    I gave up on the concept that I would be able to write and debug programs correctly the first time. Now all the central data structures in any long-lived control system get error-checking code added to them. For example, the sorted-list code is built with a checker to ensure it stays in order. The communications code gets error-checking. The PID controllers get min/max testing, etc.

    Every once in a while I come across bugs that are not in the source code. Often they are compiler errors. Sometimes the bugs involve a rare C/C++ or operating system eccentricity. Sometimes the errors are caused by obscure library changes. Sometimes they are hardware errors.

    Especially with the embedded micro-controllers, I leave the consistency checking code in, because you just can't assume everything always works. The nature of software bugs changes with time, and it is not always in the way a programmer would expect. I am frequently surprised by how obscure some of the bugs are.
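
    As an illustration of the sorted-list case, here is a sketch of a list that can verify its own ordering. `CheckedSortedList` and its methods are invented for this example; `corrupt_for_test` exists only to imitate damage coming from outside the class (a stray pointer, a hardware glitch).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sorted list that can check its own invariant.
class CheckedSortedList {
public:
    // Inserts value at the position that keeps the list ordered.
    void insert(int value) {
        items_.insert(
            std::upper_bound(items_.begin(), items_.end(), value), value);
    }

    // Consistency check meant to stay in the shipped build.
    bool is_consistent() const {
        return std::is_sorted(items_.begin(), items_.end());
    }

    // Test-only hook: overwrites one element behind the list's back.
    void corrupt_for_test(std::size_t index, int value) {
        items_.at(index) = value;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;
};
```

    A control loop can call `is_consistent()` periodically and raise an alarm instead of acting on a list that is no longer in order.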

    1. Re:Self-Checking Code by shoor · · Score: 1

      Yeah, I remember working on a program for an embedded system (Motorola 68000, this was back in the 80s). In every loop I had #ifdef DEBUG range check the loop #endif /* DEBUG */ (Having all these range checks in production code would have slowed down the poor old 68000 too much.)
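
      Such a debug-only range check might be sketched like this. The macro name `RANGE_CHECK` and the function `sum_samples` are assumptions for illustration, not the original 68000 code:

```cpp
#include <cstdio>
#include <cstdlib>

#ifdef DEBUG
// Debug builds: report and stop when an index leaves [0, limit).
#define RANGE_CHECK(index, limit)                                      \
    do {                                                               \
        if ((index) < 0 || (index) >= (limit)) {                       \
            std::fprintf(stderr, "range check failed at %s:%d\n",      \
                         __FILE__, __LINE__);                          \
            std::abort();                                              \
        }                                                              \
    } while (0)
#else
// Production builds: the comparison and the report are compiled out.
#define RANGE_CHECK(index, limit)                                      \
    do { (void)(index); (void)(limit); } while (0)
#endif

// Adds up the first count samples of a buffer holding limit entries.
int sum_samples(const int* samples, int count, int limit) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        RANGE_CHECK(i, limit);          // active only when built with -DDEBUG
        total += samples[i];
    }
    return total;
}
```

      Built with `-DDEBUG`, a call such as `sum_samples(data, 5, 4)` stops with a file and line number instead of quietly reading past the buffer.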

      Finally, in testing, the thing would crash mysteriously and my boss finally compiled with DEBUG and one of my loops reported a problem. It turned out that my program was getting invalid data, but of course, in my boss's mind it was always 'my' bug.

      --
      In theory, theory and practice are the same; in practice they're different. (Yogi Berra & A. Einstein)
  35. Compiler by Anonymous Coward · · Score: 0

    Always check your compiler.

  36. Not a Republican by EzInKy · · Score: 0

    Not a Republican, but all us libertarians understand that in order to sell a thing there must exist people who can buy that thing. And, since the vast majority of things cost money, the more people who have money the more of those things you can sell. This isn't rocket science folks.

    --
    Time is what keeps everything from happening all at once.
  37. Runaway pointer shut down computer by davidwr · · Score: 2

    Back in my student days I had a runaway pointer. On one of the mid-1980s Motorola 68000 Macs, it would trigger the power-off function if it wasn't running under a debugger. Talk about frustrating.

    At least it was consistent.

    Remember, this was back in the days before protected memory. Also, if memory serves, the MacOS and applications always ran in "supervisor mode" (analogous to "ring 0" on Intel chips), so your program 0wned the machine while it was running.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  38. Re:Hardly devastating, but a waste of several hour by maugle · · Score: 2

    Oh man, that's happened to me twice, with several hours lost in each instance. I've sworn to never allow it to happen a third time.

  39. Someone wrote a novel about a computer bug by shoor · · Score: 3, Informative

    The novel is The Bug by Ellen Ullman.

    Here's a quote from one of the reviews (https://www.kirkusreviews.com/book-reviews/ellen-ullman/the-bug/):

    Her first fiction - which descends back into this realm of basement cafes and windowless break rooms, of buzzing fluorescents, whining computers, and cussing hackers - sustains a haunting tone of revulsion mingled with nostalgia. This artful tension distinguishes heroine Roberta Walton, who tells about the dramatic undoing in 1984 of Ethan Levin, a slightly odious but efficient programmer plagued by a highly odious but efficient computer bug.

    --
    In theory, theory and practice are the same; in practice they're different. (Yogi Berra & A. Einstein)
  40. Is return value optimisation a bug? by Anonymous Coward · · Score: 2, Interesting

    Because it stymied me for weeks years back when I first started in C++. I'd written some code that made assumptions about where variables were initialised and what happened when said variables were returned, using some custom stuff in operator= and the constructor. (irrelevant detail: I wanted to be able to return sub-matrices of a matrix that could be assigned to to overwrite the relevant parts of the full matrix. Think MATLAB A([1 2 3], [3 4 5]) = B overwrites part (but not all) of matrix A style. And I was fairly new to C++).

    Worked great without optimisation.
    Broke horribly when optimisation was turned on.

    It was a learning curve, but eventually Google turned up a little thing called return value optimisation (or something-or-other elision, it seems to have a few names). Basically, by design, how code executes (literally what it does) can be a direct function of your optimisation flags. Specifically, which assignment operators etc. get called, and in what order, when you start returning classes from functions.

    I know it's not technically a bug - after all, it's right there in page 5 billion point 2 of the spec - but still, it marked the end of my "my god C++ is amazeballs and can do no wrong" phase.

    1. Re:Is return value optimisation a bug? by TheRaven64 · · Score: 1

      It's called copy elision and it is part of the C++ spec. The spec specifically says that compilers may (but are not required to) implement it, meaning that the compiler is completely free to do different things within the abstract machine at different optimisation levels. It's definitely a bug, but unfortunately it's a bug in the C++ standard. Any sane spec would either require the compiler to implement it or prohibit it - either would be fine (with C++11, prohibiting it would make sense as returning an r-value reference and doing move construction ought to give the same benefit).
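
      A minimal sketch of the effect, using a hypothetical Tracked type that counts copy-constructor calls (nothing here comes from the original poster's matrix code). How many copies actually happen is up to the compiler and its flags:

```cpp
#include <cassert>

// Hypothetical type that counts how often its copy constructor runs.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
};
int Tracked::copies = 0;

// Returns a named local. The compiler may construct `result` directly in the
// caller's storage (named return value optimisation) or copy it out.
// Both behaviours are conforming, so side effects in the copy constructor
// may or may not be observed.
Tracked make_tracked(int v) {
    Tracked result(v);
    result.value += 1;
    return result;
}
```

      After "Tracked t = make_tracked(41);" the value is always 42, but Tracked::copies can legitimately be anywhere from 0 to 2 depending on the language version and flags. Since C++17 the copy from the returned temporary into t is always elided, but eliding the copy of the named local is still optional. With GCC or Clang, -fno-elide-constructors is one way to watch the count change.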

      --
      I am TheRaven on Soylent News
  41. Re:Hardly devastating, but a waste of several hour by Dutch+Gun · · Score: 3, Interesting

    Oh, damn... yeah, done that as well. Frustrating as hell, because it just doesn't make sense until you finally figure out you're not even debugging the code you're working with.

    Other variations of "the impossible is happening" include:

    * Syncing to new code, recompiling, and crashing. Crashes only go away once you force a full rebuild to update stale precompiled headers.
    * Program crashes mysteriously, and only is fixed after the machine is rebooted (likely some process in RAM has been corrupted).
    * When you get automated crash debug reports from hundreds of thousands of customers, you eventually realize that a staggering number of people simply have bad hardware, due to the impossible crashes that occur (e.g. a = b + c; // --- crashes here. all variables are integers).
    * Compiler or hardware bugs - thankfully much more rare than they used to be.

    --
    Irony: Agile development has too much inertia to be abandoned now.
  42. Logged off everyone at once. by Darinbob · · Score: 2

    I had a job with a group managing shared minicomputers. One program I was writing was to log someone off after being inactive for some time, to free up a port for other users. So my loop to check every 5 minutes involved incrementing the time to wake up by 5 minutes on each iteration. I.e., it woke up at a specific time. So it would theoretically wake up at 12:00, 12:05, 12:10, etc.

    The problem was that this operating system for some reason blocked when sending the alert message to someone's terminal. There was possibly some non-blocking way to do this with some extra effort, but it didn't seem like any additional effort was needed. However, some user typed Control-S on his terminal and then went off to lunch; he probably typed it by accident. So a warning message went to his terminal, but blocked because of the Control-S. So the program was stuck until he came back from lunch and typed Control-Q. At which point this unblocked my program, which then printed out one after the other on everyone's terminal in two buildings:
    "your terminal has been idle and you will be logged off in 15 minutes",
    "your terminal has been idle and you will be logged off in 10 minutes",
    "your terminal has been idle and you will be logged off in 5 minutes",
    "logging off due to inactivity."
    This was shortly followed by a line of people coming into the office to complain, including my boss.
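
    A minimal sketch of one way to avoid that burst, assuming it came from the loop racing through all the wake-up times it had missed while blocked. The function name and the plain "seconds" time values are made up for illustration:

```cpp
#include <cassert>

// Times are plain seconds since some epoch; all names here are hypothetical.
// Returns the next wake-up time strictly after `now`, staying on the original
// schedule grid (scheduled + period, scheduled + 2 * period, ...).
// Assumes period > 0.
long next_wakeup(long scheduled, long period, long now) {
    long next = scheduled + period;
    if (next <= now) {
        // We overslept (e.g. a blocking terminal write stalled the loop):
        // skip the missed ticks instead of running them back to back.
        long missed = (now - next) / period + 1;
        next += missed * period;
    }
    return next;
}
```

    With a 300 second period, a loop scheduled at 0 that wakes up late at 1000 would next run at 1200 rather than immediately "catching up" on 300, 600 and 900. It doesn't fix the blocking write itself, only the pile-up afterwards.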

  43. 4000 is greater than 5000 by wolf12886 · · Score: 3, Interesting

    I was working on an embedded system recently that had a 5-minute timer to shut off the machine. We had received customer complaints that the machine occasionally shut off early. The code was a simple while loop that ran some PID controls and on every pass checked "if (run_time > 5 minutes): exit;". I ran the machine in the lab for a while and sure enough, it shut off early once in a while.

    I looked through, and eventually SCOURED, the code, assuming there was a subtle bug such as clock corruption due to interrupts or some kind of type conversion mistake. I couldn't find anything. I eventually set up a serial printout from the machine so I could see what was happening. It would run and then print out "5 minutes elapsed, shutting down". No glitches or resets (which is what I was expecting). So now I'm staring at this one line, "if (run_time > 5 minutes): exit;", pulling my hair out.

    Finally, in a moment of insane desperation, I added another line to the while loop: "if (4000 > 5000): print("Something is very wrong!");". I carry the machine to the lab and set it up, and IT PRINTS. Every few minutes or so it pops up on the display. So now I'm just like "fuck everything", how can I possibly run code if I can't even trust the basic principle that the computer will do what I tell it to? So the first thing I do is add triple checks to all critical comparisons. That eliminates the symptoms for now, but I know it's going to cause weird problems forever if I leave it like that.

    OK, so the execution is buggy. I get out the scope and check the power line and various other things, and it looks OK, but I notice at this point that the problem never occurs when the machine is running empty, only when it's loaded. So I clip ferrites everywhere you can possibly fit one and spend half a day putting metal covers on everything. As I run the machine this time I'm practically holding my breath: 1 run good, 2, 3. I'm getting super excited at this point, then bam, "Something is very wrong!" prints and I die a little inside.

    After walking out to my car and screaming at the sky for a while, I get back to it. At least I know it has something to do with noise. Since the machine can't possibly be more shielded, I take a look at the schematic. It looks normal, but there's a bunch of funky stuff on the reset line. I ask around and nobody knows why it's there. It's got a regular pull-up resistor, but somebody added a diode in series, and a ferrite bead right before the pin. Due to the voltage drop, MCLR is only being pulled up to 3.9 V instead of 5 V, so that's not good. Then I take a look at the ferrite on the board, and it's sticking off the board with a coil of wire through it, not 2 inches from a brushed motor the size of my fist. It must be acting like a transformer secondary. I shorted the diode and the ferrite and the problem never happened again!

    1. Re:4000 is greater than 5000 by Anonymous Coward · · Score: 0

      I think you should debug your 8042 chip. Or hit the enter key once in a while.

    2. Re:4000 is greater than 5000 by Anonymous Coward · · Score: 2, Informative

      And this is why the world will not be taken over by 12 year olds creating "apps" using a toolkit and 40 hours of Khan Academy.

  44. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    This happens in Visual Studio all the time. Often you'll find that a unit test session doesn't properly release the SUT, especially when you have "CopyLocal = true" on the reference in the unit test project. There'll be a missing FreeLibrary() call on the .dll to balance out the LoadLibrary() calls and for whatever reason Windows doesn't catch it so Clean Solution/Rebuild Solution doesn't actually overwrite the previous version of the .dll. Usually the only way to fix it is to close Visual Studio and reopen it although occasionally we've had to reboot a developer workstation as well.

  45. Not my fault by Anonymous Coward · · Score: 0

    Reading TFS, it's clear that the lesson is to blame someone else. Be it the professor or the guys who built the financial instruments. Let's try to roll a few "MBA types" under the bus while we're at it. So long as it doesn't stick to me!

  46. mid-90's network card on linux by Anonymous Coward · · Score: 0

    Ugh. This one stumped me, and was deeply unsatisfying since it was never resolved. There was this model of NIC used in our school where I worked hw support. The card worked for years, dated to 1995 or so. We used Debian, I think this was back in the Potato days. We thought we were safe on Debian. The upgrade happened, and the NIC LED simply would not turn on with the upgraded system. We modified the network and driver config files, modified the kernel configurations, passed load-time boot parameters, checked old/new installation settings, hacked the driver, recompiled, re-installed old and new versions to re-verify that the old distribution worked while the new one didn't. My colleagues weren't slouches either. In retrospect, we should have abandoned the efforts, it was a waste of time and effort. Even then, the cards were nearly worthless. But we were young.

    1. Re:mid-90's network card on linux by Z00L00K · · Score: 1

      Well - I had a similar network card experience once, an 8-bit card was incorrectly identified as a 16-bit card. Some digging and I found the driver code that made the wrong assumption and patched it. I should have filed a bug correction on the kernel but I never did, and today there's pretty much no point in doing it. The card in question was an 8-bit Western Digital compatible card from Accton.

      --
      If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  47. while-while loop in C/C++ by Cassini2 · · Score: 4, Funny

    while (something) {
    // do_stuff
    } while (something_else);

    It compiles, is legal C, and loops endlessly if something_else is true.

    It can be done in a careless moment when switching a complex piece of code from a while () loop to a do-while () loop.

    1. Re:while-while loop in C/C++ by Anonymous Coward · · Score: 0

      This made me cringe, and then laugh so hard I almost choked.

    2. Re:while-while loop in C/C++ by Anonymous Coward · · Score: 0

      Thanks for that - now the boss's secretary wants me to share what's so funny - help!

  48. Re:Hardly devastating, but a waste of several hour by bl968 · · Score: 1

    Been there done that!

    --
    "GET / HTTP/1.0" 200 51230 "-" "Mozilla/4.0 (compatible; Setec Astronomy)"
  49. Parallel distributed race condition by Anonymous Coward · · Score: 1

    In the 90s I prototyped some communications code that was to bootstrap a supercomputing job on 6-10 sites with 64-128 nodes running the job at each site. We ran flawless rehearsals where the site operators reserved perhaps 10% of their machines for half a day so we could submit test jobs without much queue latency. Since there were other groups preparing jobs for an annual conference, machine time was scarce. We wouldn't get to run a full-scale test until the week of the conference, when networks and sites were reconfigured for the occasion. It kept failing with assertion failures suggesting deadlocks in the synchronization code.

    After hours of staring at code and debug logs, I finally submitted a full-scale test run where every node was configured to run their task under gdb inside an xterm with the DISPLAY set to a laptop I'd borrowed at the conference show floor. This way, I could wait for one node to crash out, leaving the rest hanging waiting for peer messages. Then I went through hundreds of gdb instances looking at stack traces and inspecting application state.

    I found that the hierarchical all-to-all message exchange was desynchronizing due to an unfortunate design blunder where I wrote some code as if it was an MPI task with a barrier-synchronization primitive, but no such primitive existed at this early bootstrapping phase and across multiple sites. My brute-force solution was to rescue this broken design by enforcing its naive implementation invariants... I added virtual clocks, counters, and reordering message buffers to make sure that all messages from one phase of the communication were consumed at a receiver before it would process messages from a later phase, even if those later phase messages arrived first due to skew among all the communicating nodes.

  50. Re:Hardly devastating, but a waste of several hour by Z00L00K · · Score: 1

    That's on par with rebooting the wrong machine.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  51. Failure to scale worse than crashing by tdelaney · · Score: 2

    We had a program that was doing session matching of RTP streams (via RTCP). We had to be able to handle a potentially very high load.

    Things had been going OK - development progressing, QA testing going well. And then one day our scaling tests took a nosedive. Whereas we had been handling tens of thousands of RTP sessions with decent CPU load, suddenly we were running at 100% CPU with an order of magnitude fewer sessions.

    I spent over a week inspecting recent commits, profiling, etc. I could see where it was happening in a general sense, but couldn't pin down the precise cause. And then a comment by one of the other developers connected up with everything I'd been looking at.

    Turns out that we had been using a single instance of an object to handle all sessions going through a particular server, but that resulted in incorrect matching - it was missing a vital identifier. So an additional field had been added to hold the conversation ID, and an instance was created for each conversation.

    Now, that in itself wasn't an issue - but the objects were stored in a hash table. Objects for the same server but different conversations compared non-equal ... but the conversation ID hadn't been included as part of the hashcode calculation. So all conversation objects for a particular server would hash the same (but compare different).

    We had 3 servers and tens of thousands of conversations between endpoints. Instead of the respective server objects being approximately evenly spread across the hash map, they were all stuck into a single bucket per server ... so instead of a nice amortised O(1) lookup, we instead effectively had an O(N) lookup for these objects - and they were being looked up a lot.

    The effect was completely invisible under low load and in unit tests. The hash codes weren't verified as being different in the unit tests as there was the theoretical possibility that the hashcodes being verified as different could end up the same with a new version of the compiler/library/etc.
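
    A minimal sketch of the failure mode, using a hypothetical SessionKey with a server ID and a conversation ID (the real classes and language aren't given above). The buggy hash ignores the conversation ID, so every key for a server lands in the same bucket:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical key: one entry per (server, conversation) pair.
struct SessionKey {
    int server_id;
    int conversation_id;
    bool operator==(const SessionKey& o) const {
        return server_id == o.server_id && conversation_id == o.conversation_id;
    }
};

// Buggy hash: ignores conversation_id, so all keys for one server collide.
struct ServerOnlyHash {
    std::size_t operator()(const SessionKey& k) const {
        return std::hash<int>()(k.server_id);
    }
};

// Fixed hash: mixes in every field that operator== compares.
struct FullHash {
    std::size_t operator()(const SessionKey& k) const {
        std::size_t h = std::hash<int>()(k.server_id);
        h ^= std::hash<int>()(k.conversation_id) + 0x9e3779b9u + (h << 6) + (h >> 2);
        return h;
    }
};

// Size of the fullest bucket after inserting `conversations` keys per server.
template <typename Hash>
std::size_t worst_bucket(int servers, int conversations) {
    std::unordered_map<SessionKey, int, Hash> table;
    for (int s = 0; s < servers; ++s) {
        for (int c = 0; c < conversations; ++c) {
            table[SessionKey{s, c}] = 0;
        }
    }
    std::size_t worst = 0;
    for (std::size_t b = 0; b < table.bucket_count(); ++b) {
        if (table.bucket_size(b) > worst) {
            worst = table.bucket_size(b);
        }
    }
    return worst;
}
```

    With ServerOnlyHash, worst_bucket(3, 1000) is at least 1000 by construction, since equal hashes always share a bucket. A check on bucket sizes like this is one way to test for the problem without asserting that any two specific hash codes differ.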

    1. Re:Failure to scale worse than crashing by Anonymous Coward · · Score: 0

      > the theoretical possibility that the hashcodes being verified as different could end up the same with a new version of the compiler/library/etc.

      Ah, the good old "let's not write a test because the test might fail/have a false positive". :)
      Lesson to learn: Usually even a really horrible and unreliable test is better than no test.

  52. Re:Oracle desupported Rule Based Optimizer with 10 by Z00L00K · · Score: 2

    On the level of someone changing order of columns in an indexing for no particular reason, possibly because it looked better to have the index column in alphabetical order.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  53. Heisenbug, never crashed with debugger open by raymorris · · Score: 1

    One I'll always remember was some Actionscript or Javascript which would never happen with the debugging console open, but would always halt the program if the debugging console was closed.

    It turned out to be a call to console.log, which is a fatal error in IE if the debug console isn't visible at the moment.

  54. Slashdot footer quote by Z00L00K · · Score: 1

    When viewing this I got the footer quote "%DCL-MEM-BAD, bad memory VMS-F-PDGERS, pudding between the ears".

    I find it very suitable to this article.

    --
    If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
  55. No news is good news by Anonymous Coward · · Score: 0

    If you do not put any new code in it you ain't gonna introduce any new bug in it

    1. Re:No news is good news by plopez · · Score: 1

      Fail

      --
      putting the 'B' in LGBTQ+
  56. Takeaway from the summary by goose-incarnated · · Score: 0

    "tough bugs weren't my fault"

    --
    I'm a minority race. Save your vitriol for white people.
  57. Schroedinger's softwares by RubberDogBone · · Score: 1

    Worst bug we ever ran across was a program that absolutely would not work as soon as anyone looked at it to see if it was working or just to observe the GUI. If you did that, it broke. So we spent a LOT of time trying to run it, debug it, rerun it, and no matter what we did it never worked right as long as someone was looking.

    But the moment you stopped looking, locked that PC and walked away, the program would run fine on files dropped into the appropriate input hot folder. It would happily do its thing and give you good results in an output folder.

    Look at it, however, and it blew up immediately.

    We spent a LOT of time trying to fix this, however it's not a product we made. So there's a limit.

    We did finally realize the program is licensed only for the PC where it is installed. If you remote into it, the program sees the remote client as if it's the installed machine which has no license and thus it runs in kind of a suicide mode. When you stop looking at it and drop the remote connection, the license assignment returns to normal and the program runs again.

    I documented and discovered all of this and solved a major headache for our company, because we had to have this software working. I unlocked the secret and got it working. My FUCKING BOSS stole credit for the discovery and promptly began parading around like she thought of the solution.

    Which is funny because guess who they called next time it broke? Me. I know it inside and out, now.

    --
    Sig for hire.
  58. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 1

    Let us now stop and give praise and thanks to gdb's tui, for it doth warn that thy source file's mtime art newer than thy executable's.

    I can't count how many times GDB has either saved me from this directly, or short-circuited what might've been a whole day's hair-pulling frustration.

  59. Re:That IS scary!! by DamonHD · · Score: 1

    Why the snark?

    Yes, I also designed the hardware target of that code, wire-wrapping the first unit, and writing the 'OS' in a mixture of C and asm.

    The asm equivalent was accidentally starting my NMI routine with "push hl; push de" and ending it with "pop hl, pop de". That anything worked at all was a minor miracle, and it did for months before I noticed.

    Rgds

    Damon

    --
    http://m.earth.org.uk/
  60. Bug that improved code by Anonymous Coward · · Score: 0

    In the 80's, I wrote a pursuit game where you had to run away from something chasing you. To give you a chance, the chaser was dumb and headed for your current position. When objects got to the edge of the screen, they bounced off. I got the signs in the logic of that wrong, which made the chaser head for where you were going instead, so it always got you quicker. Only time I wrote a bug that improved (in some sense) the code...

  61. As A Tester by Anonymous Coward · · Score: 0

    I was a tester on Desert Bus. Had to check if the score would roll over if it went too high. *sobs*

    Lesson learned: Mandating everything must be tested in a live environment is not always the best policy. Sometimes it's better to fiddle with the code.

  62. My worst ever bug by adhdengineer · · Score: 1

    I was writing an audio framework using ASIO. I had to create the device and get it all set up, then start it. On one soundcard everything was fine, worked like a charm. On a different card? Boom. Reboot. Every time. Even in the debugger. I spent weeks trying to track it down. Eventually I managed to get it to break in the debugger in release mode. As soon as I hit continue, it rebooted. I saw it was crashing as it was calling ADDR+12. I tracked it down (after three weeks of this) to the struct you pass in that gives ASIO the callbacks. I had been creating an instance of the struct on the stack and passing it by reference into the ASIO call. One driver copied the contents, one just stored the pointer. Driver calls callback with the now defunct stack address and over it goes. Still gives me shudders.
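
    A minimal sketch of the lifetime issue. Callbacks and PointerKeepingDriver are hypothetical stand-ins, not the real ASIO SDK types; the point is only that the struct has to outlive the driver's use of it:

```cpp
#include <cassert>

// Hypothetical stand-ins for a driver API; these are not the real ASIO types.
struct Callbacks {
    int (*on_buffer)(int frames);
};

// Models a driver that keeps the caller's pointer instead of copying the struct.
class PointerKeepingDriver {
public:
    void set_callbacks(const Callbacks* cb) { callbacks_ = cb; }
    int fire(int frames) const { return callbacks_->on_buffer(frames); }

private:
    const Callbacks* callbacks_ = nullptr;
};

// Example callback: just reports how many frames it was handed.
int count_frames(int frames) { return frames; }

// Safe pattern: the struct has static storage duration, so the pointer the
// driver kept is still valid after setup() returns. A local
// `Callbacks cb = {count_frames};` here would leave the driver holding a
// dangling pointer into a dead stack frame.
void setup(PointerKeepingDriver& driver) {
    static const Callbacks callbacks = {count_frames};
    driver.set_callbacks(&callbacks);
}
```

    After setup(driver), calling driver.fire(256) goes through the stored pointer and still works. Unless the API documents that it copies, it seems safest to assume a driver may keep the pointer.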

  63. Best bugs by m.dillon · · Score: 1

    Most time consuming bug - The AMD cpu stack corruption bug. Errata 721. It took me a year to track it down. Half that period I thought it was a software bug in the kernel, for a month I thought it was memory corruption in gcc. And most of the rest of the time was spent trying to reproduce it reliably and examine the cores from gcc to characterize the bug. Somewhere in there I realized it was a cpu bug. It took a while to reduce the cases enough to be able to reproduce the bug within 60 seconds. And the last week was putting the whole thing together into a bootable USB stick image to send to AMD so they could boot up the test environment and reproduce the bug themselves.

    Bug that was the most fun - The 6522 I/O chip was a wonderful multi-feature chip with a lot of capability. There was a hardware timer bug which could jam the timer interrupt if it timed out at just the wrong time.

    My general advice: Add assertions for complex pre-conditions instead of assuming that said complex pre-conditions are always properly in place. The more non-stupid assertions you have in your code, the earlier you detect the bug and the easier it is to fix.
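
    A minimal sketch of what that can look like, using a made-up lookup function whose precondition is that its input is sorted:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the index of `key` in `sorted`, or -1 if it is absent.
// Precondition (hypothetical example): `sorted` is in ascending order.
int find_sorted(const std::vector<int>& sorted, int key) {
    // Check the precondition up front instead of trusting the caller.
    // A violation stops here, not as a wrong answer much later.
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    std::vector<int>::const_iterator it =
        std::lower_bound(sorted.begin(), sorted.end(), key);
    if (it == sorted.end() || *it != key) {
        return -1;
    }
    return static_cast<int>(it - sorted.begin());
}
```

    The check is linear, so it costs more than the search it guards. That is usually fine in debug builds, and assert compiles away when NDEBUG is defined.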

    -Matt

  64. Lesson 1 by Anonymous Coward · · Score: 0

    Lesson 1: Read your own code and understand what it does.
    Lesson 2: Reread your own code and understand what it does.
    Lesson 3: If you don't understand what it does, that's a bug.

  65. A tough one by jandersen · · Score: 2

    Looking around, it seems that most people take 'tough' to mean 'spectacular'; I disagree with that. I think some of the most difficult bugs are the subtle ones that don't give many symptoms, or which masquerade as something else.

    Probably the hardest one to solve - or the one that required most insight - was in an application I worked with on Windows NT. The architecture was messy, to say the least, with anonymous pipes everywhere, but the real trouble came from the toolset, which tempted developers into doing stupid things. I think it was written using an IDE for C++ from Borland (I forget the name), and they had got this 'brilliant' idea of making a number of objects that you could drag onto your design surface to create a windowed application with automatically generated code behind. One class of objects was for things like FTP, etc, which was used in a central place. The problem, as it turned out, after I had thought deeply about it, was that network communication is asynchronous by its very nature, whereas the graphical toolset in Windows is non-reentrant, meaning that it is not a good idea to call functions that update the desktop before they have returned from a previous call. See what I mean: When a network packet arrives, you update your progress bar or whatever, which looks cool - but if the next packet arrives too soon, it tends to kill not just the application, but the whole desktop. The solution was to not use the network objects at all and instead rely on POSIX network calls running in a separate thread and communicating to the main loop via a pipe. Not quite synchronous, but much more robust.

    1. Re:A tough one by geggo98 · · Score: 1

      [...] I think it was written using an IDE for C++ from Borland (I forget the name), and they had got this 'brilliant' idea of making a number of objects that you could drag onto your design surface to create a Windowed application with automatically generated code behind. [...]

      You probably worked with Borland C++ Builder. The library you were working with was probably the Visual Component Library (VCL) that this tool shared with Delphi. These tools were nice for rapid prototyping of GUI applications but lacked when used for non-GUI things, like network communication or database access.

    2. Re:A tough one by Anonymous Coward · · Score: 0

      I had an issue with an O2 meter value drifting that ended up being due to the sun shining on an opto-coupler. We had (unbeknownst to me) recently switched vendors and the new couplers were white instead of black. Every day at a specific time (depending on where the device was in the lab) the sun would shine on the coupler and the value would drift. It took forever to find the issue.

    3. Re:A tough one by byteherder · · Score: 1

      You mention "Windows NT and bugs" and that brings back some bad memories.

      I was writing a program in MS Visual C++ on Windows NT, and encountered a place in my code where the program would always crash. No compiler errors. The syntax was all correct. I then started walking through the assembly code with a debugger. In the middle of my function some system call was erroneously throwing an exception and crashing my program. Thank you Microsoft. So in the middle of my code, I added this try/catch block, empty brackets and all....

      try
      {
      }
      catch (...)
      {
      }

      That caught the Windows NT system error and allowed my program to continue. Woe to the future programmer who removes it because it "obviously" does nothing.

    4. Re:A tough one by dotgain · · Score: 1

      Woe to the future programmer who removes it because it "obviously" does nothing.

      And boo to you for not commenting why that code is there.

    5. Re:A tough one by byteherder · · Score: 1

      Actually, it was commented so that the next programmer would know why this seemingly absurd code was there and not to remove it.

  66. Bug created by SVN after testing was complete by Anonymous Coward · · Score: 0

    I have a very memorable bug:
    I got a new work PC and set up my environment. Checked out the repo with the SVN command line, worked on a new feature and tested it.
    When it was ready, I got the TortoiseSVN client for Windows (ease of use) and submitted. After that the software didn't work anymore. It had really curious bugs I'd never seen before.

    After half a day of investigating I found out: TortoiseSVN installs a shell extension DLL which also gets loaded into your process if you use any Windows shell functions, for example GetOpenFileName. The shell extension had a bug that caused it to change the C locale of the whole process, which obviously my application didn't expect to get randomly changed.

    That was something marvelous to find out.

  67. Not Very Hard by speedplane · · Score: 2

    Many of the "hard" bugs discussed in the article do not seem very hard. Divide by zero errors and a +Inf in an input file are straightforward issues that should be caught using standard practice techniques (bounds checking and exception handling). Two of these three hard bugs would have been easy to catch with version control and continuous integration. It seems like the article is more about dealing with other people's crappy code and poor software development practice rather than debugging nasty bugs.

    The nastiest bugs are almost always race conditions, which are by their nature non-deterministic and may not be reproducible across time or certain hardware.

    --
    Fast Federal Court and I.T.C. updates
    1. Re:Not Very Hard by Buchenskjoll · · Score: 2

      The nastiest bugs are almost always race conditions,...

      I was just about to write that, but you beat me to it.

      --
      -- Make America hate again!
    2. Re:Not Very Hard by Walter+White · · Score: 1

      The nastiest bugs are almost always race conditions, which are by their nature non-deterministic and may not be reproducible across time or certain hardware.

      That is certainly the problem with one of the toughest bugs I faced. It boiled down to a flag and value being set in a main thread to pass information to code running in an interrupt routine. The only thing that revealed it was exhaustive testing. Once in thousands of tests it would screw up. I studied the symptoms and postulated that the only way this could happen is if the ISR operated based on the flag setting but the value it needed hadn't been set. I examined the code and found that the flag was being set and the value assigned in the next statement. (Doh!) The only time the bug bit was if the ISR fired between the two assignments. Reversing the assignments solved the problem.
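
      A minimal sketch of the corrected ordering, written with std::atomic and not whatever the original embedded toolchain used. Mailbox, publish and try_consume are made-up names, and it assumes a single producer and a single consumer:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical mailbox between a main thread and an interrupt-like consumer.
struct Mailbox {
    int value = 0;
    std::atomic<bool> ready{false};
};

// Producer: write the payload first, then raise the flag.
// The release store keeps the compiler and CPU from moving the payload
// write after the flag write.
void publish(Mailbox& box, int v) {
    box.value = v;
    box.ready.store(true, std::memory_order_release);
}

// Consumer: only touch the payload after observing the flag.
// Returns true and fills `out` if a value was available.
// The producer must not publish again until `ready` is false.
bool try_consume(Mailbox& box, int& out) {
    if (!box.ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = box.value;
    box.ready.store(false, std::memory_order_release);
    return true;
}
```

      Setting the flag before the value, as in the original bug, leaves a window where the consumer sees ready == true with a stale value. On a small microcontroller the same idea is often written with volatile variables or by briefly disabling interrupts around the update; the atomics here are just one portable way to express the ordering.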

    3. Re:Not Very Hard by speedplane · · Score: 1

      I had a race condition in my code, my product would crash randomly once every few days or weeks. I killed myself trying to reproduce it reliably. I wrote software that would instrument the code, adding random sleep timers between each line. That didn't work. I eventually went line-by-line trying to deduce the issue, and found two potential bugs by thinking through it. I never knew which of my two fixes fixed the issue (or if either did), but I never saw the bug again.

      --
      Fast Federal Court and I.T.C. updates
    4. Re:Not Very Hard by speedplane · · Score: 1

      Another hated class of bugs are library bugs. These may actually be quite easy to debug, but they force you to go into someone else's spaghetti code and spend countless hours becoming a master of some library you'll never use again.

      And the worst part is that often they aren't really bugs, but programming errors as a result of crappy documentation.

      --
      Fast Federal Court and I.T.C. updates
  68. Concurrency bugs must be the worst! by EmBeeDee · · Score: 1

    I do remember struggling with compiler bugs - in particular the C++ compiler we were using on OS/2 way back in the day suffered a few - but I was primarily a machine-code programmer in those days so an inspection of the compiled object code would tell me what was wrong pretty quickly; plus the compiler guys (I think it was Watford) were very responsive and would usually fix things up pretty quickly. So no, the bugs that have most challenged me have always been concurrency-related - deadlocks, race conditions and the like. My earliest experience of this was probably on the Atari ST. I was coding a game in 68000 assembler, one of the early 3D-rendered golf games. Whilst the golfer (a sprite) was taking his swing I needed to pre-render the first 3D frame, so hooked up a hardware interrupt to run the renderer whilst the golfer animation took place (or maybe it was the other way around; doesn't matter). Anyway it all worked nicely except, just occasionally, a rogue red pixel would appear in a random part of the screen. It took me about a month to figure out that there was one variable/memory location being read and written by both bits of code, with no mutex round it. I guess it was good to learn a hard lesson about concurrency early in my programming career.

  69. Soviet debugging by Anonymous Coward · · Score: 0

    http://jakepoz.com/soviet_debugging.html

  70. What's the chance... by Walter+White · · Score: 1

    Mine wasn't particularly hard but was particularly funny. I was working on "blocking" for a guided vehicle system. Vehicles followed a guidepath buried in the floor which was broken into segments. It was (mostly) sufficient to make sure that no vehicle was in a segment before another vehicle was allowed to enter it. While developing this code a developer on another project ran into a problem where a small circle in the guidepath could be filled with vehicles which would then deadlock because none had an empty segment in front of them.

    I realized my project had a similar configuration, a system with 5 vehicles and a circle with 5 segments. I thought "what is the possibility that all five vehicles will be in the circle at the same time" and did nothing about it. Within 15 minutes of getting all five vehicles working on site they were all sitting deadlocked in the circle. I manually moved one out of the circle to break the deadlock and they soon wound up back in the circle. It was comical, like they were drawn to that area so they could deadlock and take a break.

    What I hadn't realized was that the vehicles had to traverse some part of the circle to go between any two destinations on the guide path. I remind myself of this any time I'm tempted to ignore a problem just because I think it unlikely to happen.
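
    The deadlock condition is easy to model. The sketch below assumes a one-way loop where a vehicle may advance only into an empty segment; the function names are hypothetical, and mayEnterLoop is just one possible guard, not necessarily what was deployed on site:

    ```cpp
    #include <cstddef>
    #include <vector>

    // occupied[i] is true when a vehicle sits in segment i of a one-way loop.
    // A vehicle may advance only if the next segment is empty (the blocking rule).
    bool anyVehicleCanMove(const std::vector<bool>& occupied) {
        const std::size_t n = occupied.size();
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t next = (i + 1) % n;
            if (occupied[i] && !occupied[next]) {
                return true;
            }
        }
        return false;   // every vehicle is blocked, or the loop is empty
    }

    // Possible guard: admit a vehicle only if a free segment remains afterwards.
    bool mayEnterLoop(const std::vector<bool>& occupied) {
        std::size_t used = 0;
        for (bool o : occupied) {
            used += o ? 1 : 0;
        }
        return used + 1 < occupied.size();
    }
    ```

    With five segments and five vehicles, anyVehicleCanMove reports false as soon as the loop fills, which matches what happened within 15 minutes on site.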

  71. Essential debug statements by whimdot · · Score: 1

    Developing some embedded software we had a common issue when adding new features that the code would crash when outputting strings to the console, until we added some debug code to identify the problem, when the crash would stop happening. We were in a hurry and so the code generally got shipped with the debug code suitably disabled but still present. I had some extra time one day and decided to investigate this, but couldn't find any coding errors. I eventually got around to looking at the output of the linker/locater to discover that the problem was related to trying to print the last declared string to the console. It emerged that the build tools would fail to append the closing null to the last string stored in the initialised memory portion of the image. Stored to EPROM, some of those final strings ended up with a lot of FF characters appended.

    1. Re:Essential debug statements by CauseBy · · Score: 1

      Ha ha I had that happen. I wrote some code one day, it worked great, no problem, so I deleted my debug statements and turned it in. I got an angry email saying my code was broken, and when I ran it, sure enough it was broken. During debugging I realized that my debug statements were keeping a multi-threaded problem hidden. That was an early lesson in synchronization.

    2. Re:Essential debug statements by david_thornley · · Score: 1

      That should also be an early lesson in testing the exact version you're going to check in. (Mine was back in the 90s.)

      --
      "When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
  72. Re:Hardly devastating, but a waste of several hour by Rufty · · Score: 1

    Ah, me too.

    --
    Red to red, black to black. Switch it on, but stand well back.
  73. Re:Hardly devastating, but a waste of several hour by boomer_rehfield · · Score: 1

    "That's ...strange... why did that prod server just alert offline... ooohhh crap..."

    Nope, totally haven't done that.

    --
    Carpe Canem - Seize the Dog
  74. I have seen... by Anonymous Coward · · Score: 0

    I have seen... things you can't imagine. I feel like a replicant in the rain, explaining my life and death. ( https://www.youtube.com/watch?v=ZTzA_xesrL8 )

    My own bone-headed prize goes to when I was clearing away debris files left by typos from a previous admin, and accidentally deleted "/bin/[" on an old BSD machine. That's a symlink to "/bin/test" and is part of the logic of most shell scripts. There were other typos there, like "/bin/-r" from old mistyped "rm" commands, so there was in fact debris to clean. But it took me 3 days to bring that system back up: the bootstrap TK-50 tapes hadn't been tested in years and didn't work anymore, and the mag-tape drive was on the wrong VMEbus to restore from tape with. There were compelling reasons we didn't lightly touch system files on systems running BSD on a VAX.

    My boneheaded prize for others goes to the kernel idiots who insisted on building their kernels on their desktop machines and sending them to me to install, rather than publishing their code changes and using the build system I wrote. Unfortunately, they failed to merge the patches that had been in place for 2 years. So when their "new kernel" was installed, it didn't have the hardware support for the disk drives that the manufacturer had upgraded to in new hardware and had never made it into my rack of test hardware. So when we kernel updated the network, an entire continent went mostly offline: only the old servers were up.

    Because I'm an absolute paranoid weasel, I'd learned the hard way to make failed kernel upgrades recoverable: Set the boot loader to use the new kernel *once*, and only once, and activate it as the default only if the reboot succeeds running the new kernel. If it's not running the new kernel, keep the old kernel as the default and revert to it on the next reboot. But somebody had to go and powercycle *all* the new servers.

  75. Tiberio's Law of Mutually Cancelling Bugs by mtiberio · · Score: 1

    Don't forget Tiberio's Law of Mutually Cancelling Bugs... While doing a code review of otherwise "working code" you fix an obvious flaw, and as it turns out it was preventing another latent bug from manifesting. Now go find it...

  76. New financial instrument being traded .. by nickweller · · Score: 1

    Is trading a 'financial instrument' the same as making a bet?

  77. Web Development: Desperately fixing on staging ... by Qbertino · · Score: 1

    ... and testing for the fix on live at the same time. Noticing after an hour of desperation. .... Arrrrrgh!!!!

    --
    We suffer more in our imagination than in reality. - Seneca
  78. 2 of the biggest WTF by Snotnose · · Score: 1

    Neither of these was hard to diagnose. First was back in the 80s, when automated circuit board assembly was new. Got a batch of boards that didn't work. Turns out somebody had loaded capacitors where resistors should have gone, all our RAM lines had capacitors instead of pullups on them. Whoops.

    Then about 10 years ago we get an ASIC from the fab. The clock was all over the place, you could hook a scope up to it and watch it vary from, say, 10 MHz to 500 MHz. Turns out that, after running a suite of tests on the VHDL before sending the VHDL to the fab, one of the hardware guys forgot to turn his DEBUG switch to OFF. This left a diode in the phase locked loop that prevented the loop from locking. That was a million dollar mistake that also caused a 6 week schedule slip.

  79. Turn them all off, see who screams. by TapeCutter · · Score: 1

    Yes, early days of MSVC (v1.52 on win 3.1 IIRC) was one of my most memorable bugs. It appeared in a new release of our app where a counter was incrementing by 2's and severely screwing up a job dispatch system servicing 6000 telco workers. Running the code in the debugger we watched as the counter jumped by two as we stepped thru a single line i++ statement. Sure enough when we opened it up in assembly we found an extra INC op. I rebuilt the binary using the same build tag and environment, and the bug disappeared. It wasn't a particularly difficult bug to fix, but the fact that we couldn't reproduce it from source and never found a better explanation than "cosmic ray" or "Microsoft, pfft", is why it has stuck in my mind for 20yrs.

    Disclaimer: I currently manage a large and ancient cvs repository, over the last decade or so I have constructed and maintained an automated build system for about a dozen active projects and a couple of dozen legacy versions that services a team of 25-30 devs plus offshore subcontractors. I have had similar head banging moments wrt compiler optimizations. What I have learned from those experiences is that optimisation often has no noticeable impact on the end customer, so unless a developer can convince me that a specific optimization is critical to an application's performance, I always have them turned off and ask our devs to do likewise.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    1. Re: Turn them all off, see who screams. by Anonymous Coward · · Score: 0

      My experience: if you're doing incredibly complex, processor intensive simulations (the sort of things that take days or weeks to run) or similar then optimisation is worth it. For everything else it's a pointless headache source to be avoided.

  80. no win situation by Anonymous Coward · · Score: 0

    The original release of Skeleton+ for the Atari 2600 had a bug which caused the "You Win" screen to generate garbage rather than a stable display. The bug turned out to be in Skeleton (hand coded in 6502 ASM): I'd used the label for the code which followed the "You Win" graphics data as the end-of-data marker. When I modified Skeleton to create Skeleton+ that chunk of code got moved, but I'd forgotten I'd reused the label.

  81. Hardware bugs by TapeCutter · · Score: 1

    A colleague and I once found a hardware bug that affected ~2000 Motorola modems that we were using for a (1990's) mobile app. The problem was the modem became "emotionally attached" to the first tower it found and refused to talk to any other tower even when its original partner was well out of range and other towers were within easy reach. Tough one to crack for a couple of software guys, took a couple of weeks and a trip to Queensland.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  82. BumbleBee by Anonymous Coward · · Score: 0

    It is not my bug but in a list of epic bugs this one should not be missing.
    https://github.com/MrMEEE/bumblebee-Old-and-abbandoned/issues/123
    A spurious space caused the install script to delete your root mount point...

  83. Re:Hardly devastating, but a waste of several hour by TapeCutter · · Score: 1

    Sounds familiar :)
    I notice I do it more often than I did 20yrs ago. Some say it's old age but I think it's probably due to regularly working on multiple VM's via a laptop as opposed to the old days of a stand alone dev box sitting under the desk.

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  84. The bugs that seem to get me by Anonymous Coward · · Score: 0

    Are very non-obvious like a typo or a whitespace issue. Though I do recall one particularly annoying bug that dealt with encryption/compression and only manifested itself in rare cases. It drove us nuts for months.

  85. "the code was correct" by Anonymous Coward · · Score: 0

    'The code was correct, but the exception happened because a new financial instrument being traded had a zero value for "number of days," and nobody had told us,' he writes.

    Sounds like it wasn't correct after all...

  86. When you have been around for a while by TapeCutter · · Score: 1

    crash..debug..crash..debug
    Mumble, mumble,...@#$!...what moron wrote this code!
    Scroll...mumble...scroll..
    Oh, I did,...let me read that again.

    A friend of mine once described the above phenomenon as "source code is like shit, you can't smell your own"

    --
    And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  87. First big bug by Cro+Magnon · · Score: 1

    This happened when I was still in school. One of my COBOL programs worked fine at first, but then gave results that didn't make any sense at all. I looked at it until my eyes crossed, the code looked fine, but the results didn't. Finally, I was taking a crap, and something occurred to me. After I completed my important business, I took another look and realized that one of the table subscripts was getting set to 0, and writing outside the table.

    I learned an important lesson that day. If I get stuck on a problem, a good long crap will fix it.

    --
    Slow down, cowboy! It has been 4 hours since you last posted. You must wait another few hours.
  88. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    I ran into this in college on a Solaris machine. One of my classmates had created a program called "test". Running test always resulted in the program returning immediately with no output. A bunch of us were puzzled as to why his program wasn't working until one bright fellow asked, "What happens if you run ./test?" We all learned a valuable lesson that day about not including the current working directory in $PATH.

  89. My most memorable by CauseBy · · Score: 1

    Almost all bugs turn out to be my bugs, but the one that still stays with me to this day was when I tried to implement drag-and-drop in a Java application on a Mac back in 2003. I spent a solid month trying to get it to work and it just didn't behave the way the APIs said it should. Finally I mentioned it to another programmer, a friend, and he said oh yeah he'd noticed the same thing.

    Apple's impl of the drag-and-drop library had a bug in it. A user *must* support String type DnD in order for other data types to work. Even though I didn't need to support String for my app, even though the String support did nothing, as soon as I added String to the list of types I claimed to support, all the rest of my code immediately worked as expected.

    Fuck you, Apple!

  90. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    Tip: add shell aliases that print a warning. Whoever actually wants to reboot a server needs to use the full path to reboot/shutdown/halt.

  91. Re: Hardly devastating, but a waste of several hou by Chewbacon · · Score: 1

    Also, when you fix a bug, make sure you're pushing to the right repo. I suppose that's more of a bug in the programmer. Humanity is a bug.

    --
    Chewbacon
    The Bible is like Wikipedia: written by a bunch of people and verifiable by questionable sources.
  92. His top of list... by Anonymous Coward · · Score: 0

    ...is my top of list: GLARINGLY OBVIOUS yet so buried in code that it's easily overlooked/missed.

  93. Javascript bug by trybywrench · · Score: 1

    Back before I really got to know my good friend Javascript I encountered the ol' truthy vs truth thing. If I remember right it was a single element array with the value 0 that compared equal to false. Something like var x = [0]; and then x == false evaluates to true, even though if ( x ) still takes the truthy branch. That one can really unnerve someone not familiar with the psycho-gf that is Javascript.

    --
    I came to the datacenter drunk with a fake ID, don't you want to be just like me?
  94. array boundaries by Anonymous Coward · · Score: 0

    One bug I remember quite well.

    As quite often, "all works without debug, but not if I enable it." (although didn't notice that at that point)

    I was receiving data through MIDI, and had built-in (optional) debug functionality that would print the incoming (raw) data to the screen. It worked like this: it would print values for one command on the first line, the next ones on the second, etc. until the screen was full, after which the used part was cleared and the first line printed again.

    The core problem was, I was accidentally printing one line too many, which resulted in data being written past the end of the used screen memory. It just so happened that immediately after that was the array of least significant bytes for the MIDI command callback routines, some of which were modified.

    So after that happened, one command whose handler address was changed arrived (a while after the data was corrupted), and the code jumped to wrong position. And not "too wrong" either (actually called keyboard number key handler) but of course didn't call anymore what it was supposed to call.

    So how I found & fixed it:

    See where the code execution ends up. Number key handler. "But I didn't press any keys?"

    At that point I was rather lucky. I just added a breakpoint at the number key handler, and noticed two things:
    1)The number key callback was still clearly executed, as before.
    2)Breakpoint was never triggered.
    The corrupted address was actually a few bytes after the first command of the function, which clearly explained this. At that point I became really suspicious, considering the MIDI command callback wasn't called anymore, I checked the callback array, and noticed quite a few wrong bytes. Set watchpoint there, run code, and it was clearly done by the debug print code. After decrementing the compare value, all was fine again.
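
    For illustration, here is a sketch of the wrap-around check with the corrected compare value. It is C++ rather than the original assembly, and kLines, DebugScreen and print are hypothetical names:

    ```cpp
    #include <array>
    #include <cstddef>

    constexpr std::size_t kLines = 4;   // hypothetical number of debug lines on screen

    struct DebugScreen {
        std::array<int, kLines> line{};   // stand-in for the used screen memory
        std::size_t next = 0;             // index of the next line to write

        void print(int raw) {
            // Wrap as soon as next reaches kLines. A compare value one higher
            // would allow a write at index kLines, one line past the end.
            if (next >= kLines) {
                line.fill(0);
                next = 0;
            }
            line[next++] = raw;
        }
    };
    ```

    In the real program nothing guarded that extra line, so the write landed in whatever the assembler had placed next, which here was the callback table.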

    And before anyone mentions it, stack contents were rather useless, as there's exactly one jsr call during this all.

    And that was, what... Two weeks ago?

  95. Still haunts me by byteherder · · Score: 1

    I was a young programmer working at my first startup company back in 1999. We had a communication app that talked using CORBA much as today you would use Web Services. We knew we had a memory leak and back then you only had a few megabytes of RAM so a memory leak could chew through all your memory pretty quickly. This bug was causing our servers to crash at least once a day and was in danger of taking down the whole company.

    I didn't write the code, it was written by one of our senior engineers. He insisted his code was right but I found the memory leak in his code using a debugger.


    /*code snippet*/

    returncode = CallCORBA( new CORBAConnect( /*Connection parameters*/), &header, &message);

    if (returncode != ERROR)
    {
        /* continue processing other event */
    }
    else
    {
        /* handle error */
    }

  96. I found a $250K bug as an intern... by __aaclcg7560 · · Score: 1

    As a software testing intern, I found a crash bug on the test server. I could reproduce it 100%. My boss couldn't reproduce it at all, and subsequently approved the patch for the production server despite my dire warnings. The production server crashed within 24 hours, knocking it offline for three days and costing the company $250K in lost revenues. My internship wasn't renewed and 1/3 of the division got laid off the following month to make up for the lost revenue. As for my boss, he got promoted.

  97. Beware long uptimes by HarmlessScenery · · Score: 1

    I inherited a 'creaky' legacy system - and the server needed to be rebooted. It hadn't been rebooted in 5+ years.

    I did all of the sensible things: I checked that the backups were up to date - and then I manually copied the entire codebase and local data and database to another server before touching anything, just to be sure.

    Reboot ... dead.

    Restore everything ... still dead.

    After a lot of tracking down I discovered that a previous developer had placed critical config in /tmp, and /tmp was purged on reboot.

    I then discovered that the backup system was configured to ignore /tmp - because ... /tmp

    That took a lot of effort and guesswork to rebuild.

    Now I always copy /tmp before rebooting anything with a long uptime ;)

  98. Fuck Dice and Fuck Nerval Lobster by Anonymous Coward · · Score: 0

    Am I the only one sick of the articles that get submitted by this crustacean?

  99. ** Spoiler ** by byteherder · · Score: 2

    It is the anonymous CORBAConnect object that is created in the function call. Programmers create these anonymous variables all the time and never think that it will bite them in the ass. Well, this one did and nearly took down the company too. Here is the explanation behind it.

    CORBA communication is asynchronous, and thus the CORBA connect object lives past the function that created it. By the time the communication thread that was using the connection is finished, the original calling function that created it has passed out of scope, so no destructor is called implicitly. And since there is no explicit variable, we cannot call the destructor explicitly either. With no way to call a destructor, there is no way to reclaim the memory used, thus the memory leak.

    The solution was to explicitly declare a variable for the CORBA connection object and then call the destructor when it finished.
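
    A rough modern sketch of the same idea, with loud caveats: it uses std::shared_ptr (which did not exist in 1999) rather than the explicit variable and destructor call described above, and Connection, callAsync, runPending and pending are hypothetical stand-ins, not the real CORBA API. The point is only that something must own the object until the asynchronous work is done:

    ```cpp
    #include <functional>
    #include <memory>
    #include <vector>

    // Hypothetical connection type that counts live objects so a leak is visible.
    struct Connection {
        static int live;
        Connection() { ++live; }
        ~Connection() { --live; }
        void send() {}   // placeholder for the real communication
    };
    int Connection::live = 0;

    // Stand-in for the communication thread's queue of unfinished work.
    std::vector<std::function<void()>> pending;

    // The queued task holds a shared owner, so the connection outlives the caller.
    void callAsync(std::shared_ptr<Connection> conn) {
        pending.push_back([conn] { conn->send(); });
    }

    // Running and then discarding the tasks drops the last owner of each connection.
    void runPending() {
        for (auto& task : pending) {
            task();
        }
        pending.clear();
    }
    ```

    Calling callAsync(std::make_shared<Connection>()) looks just as anonymous as the original call, but the object is destroyed once runPending() has cleared the queue instead of lingering forever.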

  100. Celibacy and poverty by Anonymous Coward · · Score: 0

    Along with prayer, you should take a vow of celibacy and poverty. More for the rest of us.

  101. Lesson is clear: HIRE QA PEOPLE. by Anonymous Coward · · Score: 0

    Stop cheaping out and expecting your end users to report the bugs. You need to actually HIRE QA people and test the software BEFORE delivering it.

  102. Worthy Bugs by wevets · · Score: 1

    I worked with a guy many years ago who coined the term "worthy bugs" that we used when we had a really good one. Two, in particular, I remember decades later.

    1. This turned out to be a hardware bug that showed up in our software very intermittently. In the 1980's, National Semiconductor offered the NSC 880, a clone of the 8080: Same instruction set, mostly the same specs. This processor is spec'ed such that on the enable interrupts instruction, interrupts are not actually enabled for one instruction cycle so as to allow for a bit of cleanup (pop or whatever) without interruption. Without this, stacks could become confused, and did. Well, the guys at the fab across the street from where we were doing software development did not implement this one-instruction delay, but kept knowledge of it as a secret errata. When confronted about it after we had traces that proved the error in hardware, their response was "well, we didn't think it would ever come up." Bastards.

    2. It turned out that one of the early Intel chipsets implementing PCI would, when doing 64K data transfers that fell exactly on 64K boundaries, deliver the first byte of the range in place of the last byte. I was working on Ethernet device drivers at the time, so this just looked like data corruption in the driver or the network controller to us. It took a while and many logic analyzer traces to root cause this one to the chip set. Once we knew what was happening, the software work-around was easy, but it did slow down the driver just a bit. At least the chip set guys were unaware of this bug, and it never appeared in the many subsequent chip set implementations of PCI.

  103. Re: Hardly devastating, but a waste of several hou by Half-pint+HAL · · Score: 1

    Not just a bug; it's a virus, Misterr Anderrson.

    --
    Got them moderator blues I blieve I walk out the do', With these mod-points I been gettin', I 'most never post no mo'
  104. Re:Hardly devastating, but a waste of several hour by Anonymous Coward · · Score: 0

    This is why I never develop in any language that doesn't have an IDE that lets me step through the code with a debugger.

  105. My favorite... by TemporalBeing · · Score: 1

    Well, my favorite bug of my own was in a batch file I put on a Win95 system to clean out the Temporary Internet directory; only it didn't change its working directory and ended up deleting stuff from the Windows directory instead...learned that Win95 can be installed over itself at run-time, thus saving the system before I rebooted it.

    That said, I'm a strong believer in defensive programming practices. Not only do they make the software more secure, but they also help catch your own bugs. As the one article in TFS says:

    Unless out-and-out performance is vital, checking inputs is always a good idea.

    Hint: Performance is only vital in very few locations, namely interrupt handlers deep in the Operating System. So it's not likely that performance is vital enough to skip checking the inputs to your functions.

    Hint: Checking the inputs to your functions will almost always help you catch logic errors, prevent memory overflows, etc. IOW, they'll save you many man months of debugging by making many things obvious. You just have to be disciplined enough to do them.
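
    As a small illustration, here is the kind of check that would have flagged the zero "number of days" case from the summary before it reached any arithmetic. The function perDay and its parameters are hypothetical, not taken from the article:

    ```cpp
    #include <stdexcept>

    // Hypothetical helper: spreads an amount evenly over a number of days.
    // Rejecting bad input up front turns a confusing downstream failure
    // into an immediate, clearly labelled error.
    double perDay(double amount, int days) {
        if (days <= 0) {
            throw std::invalid_argument("days must be positive");
        }
        return amount / days;
    }
    ```

    The cost is one comparison per call, which is negligible almost everywhere outside the hot paths mentioned above.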

    --
    Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
  106. Fat fingers by Anonymous Coward · · Score: 0

    The default shortcut to submit code for the IDE I work in is F3. This language/IDE has some legacy support features, one of which is assigned to F4 by default. Pressing F4 will take the last piece of code that you submitted to run, and insert it into your current editor window wherever your cursor is.

    One day I was working on a 1k odd LOC program. I selected the code from top to bottom, pressed F3 to submit it, and then somehow pressed F4, which copied the 1000 LOC and pasted them into the bottom of the doc (unbeknownst to me).

    There were some errors in the log (as the code was still in development), so I would start at the top of the log, find the first error and go and debug it, then resubmit the program, and repeat. Though I fixed the program, the errors kept occurring. Took me about half a day to figure out that I was debugging the top half of a duplicated piece of code. Because I was using CTRL-HOME, CTRL-END, and CTRL-F to navigate around, I never noticed the duplication.

    So mad. My first action now is always to disable that damn F4 shortcut.

  107. Memory corruption: Double free() calls by Anonymous Coward · · Score: 0

    Late 90s, multi-developer team. One day, my code was crashing, and all my attempts to step thru the code and examine the stacktrace left me scratching my head over why it was crashing in my code.

    The problem was someone else's code freed the same memory block twice, causing the heap to get corrupted. The program would still run a bit until it crashed in my code.

    I would never have figured it out if I hadn't decided to run Purify on the code, which detected the double free() call.

  108. Browser Testing by SixMinutes · · Score: 1

    The various browser DOM implementations are an endless fountain of weird bugs, even today. One of the most bizarre I've tackled was an issue that cropped up in a headless browser test. One input element for currency would intermittently report the reverse of the value entered into it by the automated test. 234.00 became $432.00. Try as I might, I couldn't reproduce the problem in an actual browser, which meant using the slightly less than awesome tools for inspecting state in a headless browser. Suspecting that the jQuery plugin for currency formatting was responsible, I debugged its handling of the input end to end, and found nothing. But disabling the plugin suppressed the problem. Identical code on other pages worked without issue. And then, I discovered that commenting out the input element *after* my currency element made the problem go away. So what followed was a few hours of trying to find any way in which the code for managing these two inputs could possibly be interacting: nothing. Finally, I got to bisecting the HTML itself, and found a styling related HTML class on the second input that would suppress the problem if it was removed. Some combination of unsupported CSS in the headless browser squashing my form elements, the jQuery plugin reformatting input, and the way the test runner entered input into the form resulted in the string getting reversed, sometimes. To this day I still don't know exactly what was happening. The fix ended up being a tiny CSS tweak completely unrelated to my poor currency input.

  109. The Glass House and the Drawbridge by Anonymous Coward · · Score: 0

    Hi fellow /.'ers,

    The timing is fortuitous; I just started a blog of my own experiences on this very topic, drawn from 35+ years of software development experience: www.geekcrumbs.com

    The Glass House and the Drawbridge (most recent post) is on the front page, and will no doubt give some of you a chuckle. Please forgive the advertising down the left side. I'm hoping for my first tropical vacation in fifteen years :-)

    Cheers,

    ws

  110. Re:Hardly devastating, but a waste of several hour by Keybounce · · Score: 1

    Oh, I just have to chime in on this one.

    When you fork a copy of someone's git repository? Do not assume that the code you've just inherited matches the binary that you have been using. Make a test compile before you do anything.

    Sometimes, what's in the public repository doesn't match what was compiled for the binary. Sheesh.

  111. Notes by cwsumner · · Score: 1

    After 50 years of coding:

    1. Beware of doing things the easy way, even if you are in a "crunch". The "easy" way is often the hard way.

    2. Beware of clever coding. Clever is a "bug farm" for the next person, even if -you- are the next person.

    3. Learn to read at least a little Assembler code (and the Debugger). If the compiler or linker has a bug, it is the only way to figure out what is wrong. And it can be useful for regular bugs, too.

  112. Use after free bugs by godamntheman · · Score: 1

    I had a hell of a bug about 8 years ago. Clearly it was a use after free in a kernel process running in VxWorks that zero'ed out memory it no longer owned. The way it was observed was when a memory 'free size node' pointer was set to 0, corrupting the memory AVL tree. We couldn't reliably hit it; it had to happen only if the corrupted memory happened to be appended to the free size nodes, which meant it was a discontiguous free'd memory region, and then you wouldn't see the problem until someone allocated memory that had the matching requested size of the corrupted node, which meant we never got the same stack trace twice. To test, we ran a simulation of the environment constantly destroying and re-instantiating the object structure, and would get about 1 hit every 12 hours. This program instantiated tens of thousands of objects from ~250 different classes. The bug was a misunderstood order in a class hierarchy destructor: one class's destructor cleared memory an inheriting class had already freed. Not a big deal to fix, but incredibly difficult to find. We invented this to find it: http://www.google.com/patents/... While I worked on this problem longer than anyone else, I sadly was not included on the patent. :(
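
    For anyone unsure about the order in question, here is a tiny sketch in standard C++. Base, Derived, destroyOne and destructionLog are hypothetical names, not the classes from the real system:

    ```cpp
    #include <string>

    std::string destructionLog;   // records the order in which destructors run

    struct Base {
        virtual ~Base() { destructionLog += "Base;"; }
    };

    struct Derived : Base {
        int* buffer = new int[4];   // memory owned by the inheriting class
        ~Derived() override {
            delete[] buffer;        // the derived part is torn down first
            buffer = nullptr;
            destructionLog += "Derived;";
        }
    };

    // Deleting through a base pointer runs ~Derived and then ~Base.
    void destroyOne() {
        Base* p = new Derived();
        delete p;
    }
    ```

    By the time the base destructor runs, anything the inheriting class owned is already gone, so a base destructor that clears that memory is writing to a region it no longer owns.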