Slashdot Mirror


The Exact Cause of the Zune Meltdown

An anonymous reader writes "The Zune 30 failure became national news when it happened just three days ago. The source code for the bad driver leaked soon after, and now, someone has come up with a very detailed explanation for where the code was bad as well as a number of solutions to deal with it. From a coding/QA standpoint, one has to wonder how this bug was missed if the quality assurance team wasn't slacking off. Worse yet: this bug affects every Windows CE device carrying this driver."

62 of 465 comments

  1. Wow. by LeadLine · · Score: 4, Funny

    It wasn't a bug! It was an unexpected feature!

    Microsoft is taking a stance against teenagers blowing their ears out with loud music.

    1. Re:Wow. by Vadatajs · · Score: 3, Funny

      Teenagers these days are too young to remember metallica.

    2. Re:Wow. by Yvan256 · · Score: 4, Funny

      They're too busy listening to NiMHica.

    3. Re:Wow. by Anonymous Coward · · Score: 5, Funny

      No they didn't no they didn't lalalala I can't hear you That piece of shit was definitely not any Metallica I know.

    4. Re:Wow. by Barny · · Score: 3, Informative

      Metallica?

      Damn you young kids these days. /me cranks motorhead back up while yelling to "git off his lawn!"

      --
      ...
      /me sighs
  2. Warning, Y2.1K bug. by LostCluster · · Score: 3, Informative

    Just before anybody claims to have a foolproof solution to leap years, make sure you test against the year 2100. It's a multiple of four, but also a multiple of 100 that's not a multiple of 400... and therefore NOT a leap year.

    1. Re:Warning, Y2.1K bug. by LostCluster · · Score: 5, Informative

      Here's your 500 year plan:

      1900 - multiple of 100, not a multiple of 400, no leap day.
      2000 - was a multiple of 100, but also a multiple of 400 so we still had a leap day.
      2100 - see above
      2200 - not a multiple of 400, no leap day.
      2300 - not a multiple of 400, no leap day
      2400 - multiple of 400, so have the leap day anyway.
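The century-year table above makes a compact regression test. A minimal sketch, assuming a hypothetical `naive_is_leap` that only checks divisibility by four, compared against Python's standard `calendar.isleap`:

```python
import calendar


def naive_is_leap(year):
    # Hypothetical "multiple of four" shortcut; wrong for most century years.
    return year % 4 == 0


def gregorian_is_leap(year):
    # Full Gregorian rule: every 4th year, except centuries not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


century_years = [1900, 2000, 2100, 2200, 2300, 2400]
expected = {year: calendar.isleap(year) for year in century_years}

# Collect the years on which each rule disagrees with the standard library.
naive_failures = [y for y in century_years if naive_is_leap(y) != expected[y]]
full_failures = [y for y in century_years if gregorian_is_leap(y) != expected[y]]

print("naive rule fails on:", naive_failures)
print("full rule fails on:", full_failures)
```

The shortcut only goes wrong on the years listed above as having no leap day, which is why a test suite that stops at 2099 never notices.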

    2. Re:Warning, Y2.1K bug. by narcberry · · Score: 3, Insightful

      I think you missed the point.

      --
      Modding me -1 troll doesn't make me wrong.
    3. Re:Warning, Y2.1K bug. by Anonymous Coward · · Score: 5, Funny

      For Slashdotters you lot seem pretty confident the Zune is going to be around for a while.

    4. Re:Warning, Y2.1K bug. by PIBM · · Score: 5, Insightful

      Actually, it's far from being good. In 99% of the cases you will do 3 modulo operations, in 0.75% you will do 2 modulos and in 0.25% you will do 1 modulo, for an average modulo cost of 2.9875 per run.

      With the initial solution, you have 1 modulo in 75% of the cases, 2 modulo in 24% of the cases, and 3 modulo in 1% of the cases, for a total average modulo cost of 1.26 per run.
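The two averages can be checked by counting modulo operations over one full 400-year Gregorian cycle. This is a sketch that assumes the two variants differ only in test order (400 first versus 4 first); the function names are made up for illustration:

```python
def modulos_400_first(year):
    # Tests divisibility by 400, then 100, then 4, stopping at the first hit.
    if year % 400 == 0:
        return 1
    if year % 100 == 0:
        return 2
    return 3  # the % 4 test is always reached for the remaining years


def modulos_4_first(year):
    # Tests divisibility by 4, then 100, then 400, stopping as early as possible.
    if year % 4 != 0:
        return 1
    if year % 100 != 0:
        return 2
    return 3


cycle = range(2000, 2400)  # any 400 consecutive years give the same averages
avg_400_first = sum(modulos_400_first(y) for y in cycle) / len(cycle)
avg_4_first = sum(modulos_4_first(y) for y in cycle) / len(cycle)

print(avg_400_first, avg_4_first)
```

Under that assumption the counts come out to 1195/400 and 504/400, which matches the 2.9875 and 1.26 figures quoted above.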

    5. Re:Warning, Y2.1K bug. by kybred · · Score: 5, Informative

      No need to hard-code, there's an established algorithm for computing this.

      Why not call it by its name: Zeller's Congruence.

    6. Re:Warning, Y2.1K bug. by Anonymous Coward · · Score: 5, Funny

      I can't help but imagine how I would be directed by work to "solve" this problem.

      First, they would tell me that it's too difficult, expensive, and complicated to implement the correct solution. Even if I gave them a working prototype, they wouldn't change their minds.

      Then they would tell me "just assume every 100th year is not a leap year." So I would do that instead. In the time from 2100 to 2400, they would say that "a better solution is due to come out next quarter." They would say this every quarter for 299 years.

      In 2399, they would finally give me permission to fix the problem. But the leap year-calculating code works, and they don't want me to mess with it. Instead, they'd tell me to add a test when the program starts to see what year it is. If it's 2400, then it will refuse to run. (We'll definitely have a better solution in place by Q1. Definitely.)

      But the program often runs for an extended period of time without being restarted, so it's possible that someone will start it in December 2399 and it will still be running in February/March 2400. Management has a simple fix for this one: calculate the average run time for the program, add a margin of error, and use that to determine the actual "upper limit" on when the program is allowed to start. My boss would be really excited about this, because it would allow us to refine our earlier not-after-January-1st estimate to be "completely accurate."

      Unfortunately, we don't know the average run time for this program. So I'm told to add code to it to track when it starts and ends and store the results in a file. When the program starts, it examines that file (in addition to recording its own start time), calculates the average run time, adds 10% (there are still director-level meetings about whether we should round up to the nearest hour or day), and subtracts that value from February 28th, 2400. If the current timestamp is greater than or equal to the result we got from that, the program won't start.

      That's pretty good, but my boss would be worried about the program crashing. If that happens, after all, we won't know the program's end time -- never mind that it's November by now and there's no chance of getting useful data no matter what -- so instead of logging an end time, the program logs a heartbeat every minute. Now, you can determine when the program ended -- to within a minute! -- simply by looking at the heartbeat timestamps. When you encounter a gap of more than 1 minute (plus a small margin of error), you know the program ended. This has the bonus, my boss tells me, of simplifying the design by only requiring you to log one type of message to the file. He also assures me that this "telemetry data" has the potential to be "really useful for data mining." He talks about adding information on CPU time consumed, memory in use, I/Os, all sorts of stuff, then putting it in a database to be retrieved later. I manage to talk him out of it by pointing out that "the better solution [with which I am completely uninvolved] will be out in just a few months, so you should just make sure it makes it into that instead."

      Not that I'm bitter.

    7. Re:Warning, Y2.1K bug. by Bozdune · · Score: 4, Insightful

      If my code's still running in 2100, our society has got way bigger problems than me not figuring leap years correctly.

    8. Re:Warning, Y2.1K bug. by lyml · · Score: 3, Insightful

      That's just silly, readability trumps avoiding a few modulo operations. Always, with no exceptions.

    9. Re:Warning, Y2.1K bug. by rilian4 · · Score: 3, Funny

      Tested your algorithm. It breaks where y%100 is not 0... at least in Python 2.5.x using IDLE on Windows.

      I found the following more accurate:
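      def leapyear(y):
          # Each flag starts True and is cleared when y IS divisible,
          # so "not y4" below means "divisible by 4".
          y4=True
          y100=True
          y400=True
          if y%4==0:
              y4=False
          if y%100==0:
              y100=False
          if y%400==0:
              y400=False
          # (divisible by 4 and not by 100) or divisible by 400
          ly=(not y4) and (y100) or (not y400)
          return ly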

      Might not be the most efficient but it works as far as I can see.

      --

      ...quicker, easier, more seductive the darkside is...but more powerful, it is not.
    10. Re:Warning, Y2.1K bug. by Anonymous Coward · · Score: 3, Insightful

      Yuk. Unreadable tripe. It's basically the same algorithm, implemented very poorly. Try:

      def isLeapYear(year):
              return (year % 4 == 0 and (not year % 100 == 0)) or (year % 400 == 0)

    11. Re:Warning, Y2.1K bug. by jcr · · Score: 3, Insightful

      COBOL really shouldn't have been allowed to survive past 1990.

      I disagree. I wouldn't want to write a 3D CAD program in it, but COBOL is still a fine choice when the task at hand is to account for a million monthly utility bills. The built-in BCD arithmetic is very well suited for implementing financial applications, and COBOL isn't susceptible to buffer-overflow bugs like the C-based languages are.

      -jcr

      --
      The only title of honor that a tyrant can grant is "Enemy of the State."
  3. Import calendar? by TurtleBlue · · Score: 5, Insightful

    "From a coding/QA standpoint, one has to wonder how this bug was missed if the quality assurance team wasn't slacking off."

    I can't remember the last time a QA department was asked to test date functions... but then again, I can't remember the last time anyone wrote their own Leap Year calendaring calculator from scratch.

    I'm sure there are a hundred reasons to do it (licensing being one of them) but really, when was the last time you didn't just import calendaring from another library and call it a day?

    Please clarify to me if this is something at the hardware driver level: I honestly don't know. If this were me, my own bosses wouldn't ask "Why didn't QA catch this", as much as "why are you wasting time writing your own calendar code? And then why didn't you flag it as functionality that needed to be tested?"

    1. Re:Import calendar? by Anonymous Coward · · Score: 5, Informative

      It is driver code supplied by the manufacturer of the hardware platform on which the Zune and a couple of other devices are built. This platform includes a real-time clock which counts seconds since midnight and days since 1/1/1980. Considering that hardware component prices are cut-throat, there is probably no quality management for the software whatsoever. If it appears to work, it ships.

    2. Re:Import calendar? by TurtleBlue · · Score: 5, Insightful

      Thanks - that makes a tad more sense. I see everyone running around blaming Microsoft for the code since their name is on the product, even if it was a 3rd party vendor. They certainly are still liable for all the busted Zunes, but I couldn't imagine Microsoft didn't have *some* C leap-year code sitting around that actually worked, and could be compiled for any chip they wanted.

      Microsoft still has to take the hit up front, but then they'll sue or "renegotiate contracts" with the vendor that supplied the bad driver code, based on what it costs them.

      I'm still shocked that the manufacturer couldn't dig up *some* free/open calendaring code that was around pre-2004. But hey, at least we know they were honest about not ripping off some other source code and calling it their own.

    3. Re:Import calendar? by nato10 · · Score: 5, Informative

      This is kernel-level code -- part of the OEM Abstraction Layer -- that is used to read the current time from the RTC, hence it is hardware-specific. RTCs on other processors, or Freescale-based devices using external RTCs, may implement the OemGetRealTime () function differently than Freescale has done here (the buggy ConvertDays () function is just a helper function).

    4. Re:Import calendar? by AuMatar · · Score: 3, Informative

      It was found in driver code. Part of the goal of driver code is to be as lean and mean as possible; most embedded devices do not have a lot of ROM space, and what they have is measured in MBs, not GBs. Remember not all the world is cell phones and mp3 players. In that case writing your own leap year function is the correct answer: existing calendar libraries likely have far more functionality than you need and would blow out your size. Given a choice between statically linking an entire calendaring library and writing a simple IsLeapYear function, writing the leap year function is the correct choice for that environment.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    5. Re:Import calendar? by profplump · · Score: 4, Informative

      It's really probably not. Most of the basic calendar functions in libc (or glibc or dietlibc or uClibc) were written for 8 MHz machines running with 1 MB of system memory -- they'd do just fine on your embedded system.

    6. Re:Import calendar? by TrekkieGod · · Score: 3, Informative

      It was found in driver code. Part of the goal of driver code is to be as lean and mean as possible

      He failed. In the function in question he had the number of days since Jan 1, 1980. At the end of the loop, he was supposed to have the number of years since 1980 + the number of days since the beginning of the current year. His solution was to iterate the year beginning from 1980, check to see if it's a leap year, then subtract 365 or 366 days accordingly. The loop would supposedly continue until the desired state is achieved but, because of the bug, became an infinite loop at the end of leap years.

      Not only was his function not "lean and mean" but it actually gets more expensive to run every year that passes :)
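The failure mode described above can be reproduced in a few lines. This is a Python sketch reconstructed from that description, not the driver source; `convert_days_buggy` and its `max_iterations` guard are illustrative additions so that the hang shows up as a `None` result instead of a real infinite loop:

```python
def is_leap_year(year):
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0


def convert_days_buggy(days, max_iterations=1000):
    """Split a 1-based day count since 1 Jan 1980 into (year, day_of_year).

    Returns None when the loop stops making progress, which is where the
    unguarded version would spin forever.
    """
    year = 1980
    for _ in range(max_iterations):
        if days <= 365:
            return year, days
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # days == 366 falls through here: nothing changes, so the loop
            # condition (days > 365) stays true on every later pass.
        else:
            days -= 365
            year += 1
    return None


print(convert_days_buggy(10592))  # 30 Dec 2008
print(convert_days_buggy(10593))  # 31 Dec 2008, day 366 of a leap year
print(convert_days_buggy(10594))  # 1 Jan 2009
```

Day 10593 is 31 December 2008 when 1 January 1980 is counted as day 1. The loop also needs one pass per elapsed year, which is the growing cost mentioned above.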

      I'm also curious as to why 1980 is the epoch, but that's not as important.

      --

      Warning: Opinions known to be heavily biased.

    7. Re:Import calendar? by jellomizer · · Score: 4, Informative

      Or from a more basic standpoint...
      People make mistakes.

      When testing leap year for a data set you like to see if you have a February 29th and a March 1st, as well as that the days of the week are updated after the leap day. December 31st isn't at the top of the list of days to check for leap year code.

      Secondly, coding for dates and times, even with good prebuilt libraries, is a pain. Unfortunately time and date are not really good mathematical functions. There are 365 days a year, except for every 4th year, where there are 366. The year is split into months, each with a different length of 28, 29, 30, or 31 days. Then we have 7-day weeks, which do not divide nicely into any larger time unit (except for the 28-day month, which only happens once a year... except in a leap year). Now each day has 24 hours, split into two 12-hour segments; each hour is split into 60 minutes, and then 60 seconds per minute. Then finally, below the second, we can start using the metric nicety in programming. Oh! Oh! Don't forget about time zones and daylight saving time (which differs per country and state, and follows political lines more than geographic lines). And if you are going at high speeds for aerospace applications, those crazy Einstein theories come into play.
      Now no one really takes the same approach to following all these crazy rules, and having a common library is still tricky because we all do different math calculations. Also, when you do a time++, do you want one more second, like in Unix/Linux OS development, or one extra day, like in Microsoft SQL? Then you may need to sort these values or do a quick search/filter. American time format doesn't do a good job at this, so we need to switch to European formats. All in all it is a lot of tough coding, and all of it is tough to QA, because you need to test all the times to truly know that there are no bugs in it.

      --
      If something is so important that you feel the need to post it on the internet... It probably isn't that important.
    8. Re:Import calendar? by jc42 · · Score: 4, Insightful

      [Microsoft] designed it and sell it as a unit even if parts are from other places. They have a responsibility to test it.

      No, they don't. They have several decades of success selling untested software. Their customers have given them a very strong message: "We don't care about quality or reliability. We only want to buy whatever Microsoft sells. Don't tell us about something from another vendor that's higher quality; we aren't interested."

      They've become a huge success. They're obviously doing what their customers want. They'd be stupid to waste good money on silly things like testing that won't increase their sales.

      Or, as others here keep pointing out, Microsoft's only "responsibility" is to its shareholders. Their ongoing success has hugely rewarded their shareholders. They are obviously Doing It Right. They have no other responsibilities.

      If there are signs that this fuss has a lasting effect on their profitability, they'll do something about it. But it'll probably die off in a short time, and only a few geeks (non-customers) will remember it. How many people can name even one piece of software that died on Jan 1, 2000? Customers won't remember this one, either. Most of them will never even hear about it. So Microsoft's management isn't worried about it.

      And anyway, the problem is "fixed" now (for another 3.99 years). So why even bother discussing it?

      (What, me cynical? ;-)

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  4. "Leaked"...? by Anonymous Coward · · Score: 5, Informative

    It's an open source driver from Freescale.

    1. Re:"Leaked"...? by Anonymous Coward · · Score: 4, Informative

      Whose job is worth leaking a driver for a dumb Microsoft player?

      The code is not specific to the Zune. It is specific to the MC13783 PMIC RTC that is used in many different pieces of hardware.

      Do we know how this ended up on the net?

      The authors (Freescale Semiconductor, Inc.) released the source under the terms of the GPL.

      Also, has anybody else noticed that the source code seems to be nicely written (bar the bug)... somewhat surprising for Microsoft; I always assumed their code was written by a bunch of children.

      Microsoft didn't write the code. It was written by Freescale Semiconductor, Inc.

  5. Re:If this interests you by Creepy Crawler · · Score: 3, Informative

    Amazon link eh? meh.

    Try this link for your "sampling" : Deep C Secrets.

    Took only 15 seconds for that link. Enjoy.

  6. Re:Old by LostCluster · · Score: 3, Interesting

    Yep, but it deserves to be covered so that everybody hears it. It's not just a laugh-at-Microsoft story, but also a lesson to aspiring programmers to watch their step when it comes to timekeeping. Gotta get a mention to the people who look at /. at work, gotta get a mention to the people who visit weeknights, gotta mention it for the weekend crowd.

  7. QA team slacking off... by feepness · · Score: 4, Funny

    From a coding/QA standpoint, one has to wonder how this bug was missed if the quality assurance team wasn't slacking off.

    MSFT's QA team hasn't been slacking off. They haven't slacked on since about the mid 90s.

  8. Bigger bugs have gotten through on Windows CE by msgmonkey · · Score: 5, Interesting

    For example, I had some code I developed on Windows CE 4.2 .NET which kept on hanging when calling the FindWindow() function.

    Turns out that trying to find a window by class name will hang (this version of) CE every time, even though you would have thought it's a very much used function call and would be caught by CE.

    So no I'm not surprised at all that this bug got through.

  9. Re:Why is this a surprise? by panoptical2 · · Score: 3, Informative
    As Wikipedia would have it here...

    Windows Mobile is best described as a subset of platforms based on a Windows CE underpinning. Currently, Pocket PC (now called Windows Mobile Classic), SmartPhone (Windows Mobile Standard), and PocketPC Phone Edition (Windows Mobile Professional) are the three main platforms under the Windows Mobile umbrella. Each platform utilizes different components of Windows CE, as well as supplemental features and applications suited for their respective devices.

    So, every smartphone/PDA that currently uses Windows Mobile uses some form of CE.

  10. Regardless of whatever code in it is faulty by scourfish · · Score: 5, Funny

    Lines 122, 521, 690, 710, and 748 scare me; gotos in C code...

    1. Re:Regardless of whatever code in it is faulty by KiltedKnight · · Score: 3, Insightful
      Ever written code for an OS or device driver? You use them there... frequently... as "get me the frack out of here because of a fatal error"...

      Never mind that if done properly, there is nothing wrong with using a goto statement... just make sure that you only move in one direction... ideally "down" towards the end of the function, not somewhere else in the whole program.

      --
      OCO is Loco
    2. Re:Regardless of whatever code in it is faulty by concernedadmin · · Score: 5, Interesting

      Lines 122, 521, 690, 710, and 748 scare me; gotos in C code...

      They've used one form of a goto that's actually quite readable and useful. Would you rather have:

      if (condition1 && condition2) {
          /* boilerplate code with a return */
      }

      if (issue1 || issue2) {
          /* same repeated boilerplate code with a return */
      }

      or

      if (condition1 && condition2) {
          goto cleanup;
      }

      if (issue1 || issue2) {
          goto cleanup;
      }

      cleanup:
          /* just one instance of this code,
             no need for duplication of efforts */

      Believe it or not, there are useful reasons to use goto, and Microsoft happened to use goto for the right reason here. The Linux kernel also happens to use this practice to boost the readability of the code.

    3. Re:Regardless of whatever code in it is faulty by AuMatar · · Score: 4, Informative

      Because cleanup doesn't have access to the local variables of the calling function. This means they need to be passed in. The result is a very obscure function that takes in half a dozen or more variables and gets difficult to maintain, since its purpose makes absolutely no sense without the context in the calling function (not to mention easy to have bugs: forget to check just one pointer for null before using it and you're into undefined behavior, which may only occur in rare error conditions, making it difficult to test for). Using a cleanup function like that just isn't practical.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    4. Re:Regardless of whatever code in it is faulty by QRDeNameland · · Score: 4, Interesting

      The addition of a single bool avoids both the specialized cleanup() function and the goto:

      bool needs_cleanup = false;

      if (condition1 && condition2) {
          needs_cleanup = true;
      }

      if (issue1 || issue2) {
          needs_cleanup = true;
      }

      if (needs_cleanup) {
          // clean up local vars exactly as you would
          // have done under the cleanup: label with the goto
      }

      --
      Momentarily, the need for the construction of new light will no longer exist.
  11. Re:Why write any date/time code? by p0tat03 · · Score: 5, Informative

    This was written by the Freescale guys, not MS, where it would make sense for the device manufacturer to ship their own date/time code.

  12. Sad code, sad article by gnasher719 · · Score: 3, Interesting

    Both the original code and the various corrections in the article don't catch what the algorithm is supposed to do, and therefore create code that is too complicated.

    The essence of the algorithm is this: We start with number of days since 1/Jan/1980, with the first day having the number one. We want to end up with the correct year, with a day number relative to the first day of that year, with the first day again having the number one. So we set year = 1980. And as long as day is greater than the number of days in that year, we can't have the right value yet, so we change day and year accordingly. This produces a very simple loop:

    for (;;) {
        int daysInYear = IsLeapYear (year) ? 366 : 365;
        if (day = daysInYear) break;
        day -= daysInYear; year += 1;
    }

    This is what Knuth called an "N + 1/2" loop: A loop pattern where a more or less substantial bit of code has to be executed at the beginning of the loop before we can decide whether the loop needs exiting or continuing. By following the "N+1/2 loop" pattern we avoid repeating the same code (with possible small changes) completely. And that exactly was the problem here: The same code was used twice but slightly differently (one set number of days = 365, the other made it dependent on whether the year was a leap year or not). The solutions given in the article all contain repeated code; either two loop exits, or a duplicated calculation of the number of days in a year.
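For comparison, here is a Python sketch of the same "N + 1/2" loop, assuming the exit test was meant to read `day <= daysInYear` (the comparison operator did not survive in the posted C). The names are illustrative, and the day count is 1-based from 1 January 1980 as described above:

```python
def is_leap_year(year):
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0


def year_and_day(day):
    """Convert a 1-based day count since 1 Jan 1980 to (year, day_of_year)."""
    year = 1980
    while True:
        # The "half" iteration: this must run before the exit test can be made.
        days_in_year = 366 if is_leap_year(year) else 365
        if day <= days_in_year:
            break
        day -= days_in_year
        year += 1
    return year, day


print(year_and_day(10593))  # 31 Dec 2008 when 1 Jan 1980 is day 1
```

The year length is computed in exactly one place, so there is no second copy of the 365/366 decision to drift out of step with the first.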

    1. Re:Sad code, sad article by chalkyj · · Score: 5, Informative

      I think slashdot ate your < in the breaking line.

    2. Re:Sad code, sad article by xlv · · Score: 5, Funny

      for (;;) {
              int daysInYear = IsLeapYear (year) ? 366 : 365;
              if (day = daysInYear) break;
              day -= daysInYear; year += 1;
      }

      This is what Knuth called an "N + 1/2" loop

      No, this is what Knuth would call an infinite loop as there's no way to terminate the loop except on the last day of each year...

    3. Re:Sad code, sad article by gnasher719 · · Score: 4, Informative

      It's a bug in the Slashdot software, eating "less than" and "greater than" characters in "Plain Old Text" mode.

  13. Re:Old by larry bagina · · Score: 5, Insightful

    Comments in the last zune slashdot story (yesterday?) were just as detailed as this "story". Maybe slashdot editors should read their own site. Or maybe I should start submitting all +5 comments for their own story.

    --
    Do you even lift?

    These aren't the 'roids you're looking for.

  14. MOD PARENT UP Re:Why write any date/time code? by exphose · · Score: 5, Insightful

    Exactly, just goes to show the dangers in not QA'ing the whole codebase, including supplied drivers. You can't trust your own code so you QA it; why should you trust your partner's code?

    1. Re:MOD PARENT UP Re:Why write any date/time code? by JoeMerchant · · Score: 3, Insightful

      But really, no one can be expected to QA every single line of code that's shipped through their device

      All depends on the level of concern - for a music player, what the hell, it's only the company's reputation that's riding on it... (now, if the company isn't already a laughingstock, maybe this might matter.) If this were code on a Mars Surveyor mission whose failure would set back an entire program by 2 years or more - I'd be checking every line of code, everywhere, three times.

  15. Probably Not A Widespread Issue by nato10 · · Score: 5, Informative

    This code is actually from the Windows CE OAL (OEM Abstraction Layer), part of the code that reads the current time from the RTC. As such, the implementation is hardware-dependent, which is why there isn't a standard implementation of this function for Windows CE.

    In addition, this code is in a portion of Windows CE source code provided by a device's BSP developer, not by Microsoft. In most cases, Windows CE BSP developers start with sample BSPs written by a processor's manufacturer -- in this case, Freescale -- and then improve it.

    It turns out that this bug is specific to Freescale's BSP -- sample Windows CE BSPs for other processors don't have it -- and other Freescale devices using Windows CE will only have this issue if their developers used this code verbatim. Since sample BSPs provided by processor manufacturers are often of poor quality, many Windows CE developers typically rewrite such functions. In other words, the impact of this particular bug may be quite limited, which may be why there haven't been reports of this issue on other devices.

    In this particular case, though, Microsoft (or a contractor) was the Zune's BSP developer, so they certainly should have caught this.

    1. Re:Probably Not A Widespread Issue by TheSunborn · · Score: 3, Funny

      I still wonder: Why is code that translates a number of days into a year hardware dependent?

      Getting the number of seconds since the epoch is hardware dependent, but translating this to other time measurements should not be, unless they are building a time machine.

  16. Free Book web site. by Futurepower(R) · · Score: 3, Informative

    Wow. That link is to a book from a good web site: Free eBooks.

    Other free books about C and C++: Free C and C++ books

  17. Re:Modified Julian Day by rcw-home · · Score: 4, Funny

    I highly recommend that in cases like this, programmers be good Catholics and abide by the decree of Pope Gregory XIII. Software written to work with modern dates should use Gregorian, not Julian. Or did you mean ordinal?

    From the article you linked to: The use of Julian date to refer to the day-of-year (ordinal date) is usually considered to be incorrect, however it is widely used that way in the earth sciences and computer programming.

  18. Obligatory XKCD by Failed Physicist · · Score: 3, Funny
  19. Re:Why write any date/time code? by cbhacking · · Score: 3, Insightful

    Ah, thank you. This explains better why the 2nd-gen and 3rd-gen Zunes didn't suffer this problem; they were completely designed and developed in-house.

    --
    There's no place I could be, since I've found Serenity...
  20. Calendrical Calculations by kabloom · · Score: 4, Interesting

    The proper way to do this would be with division and modulus, which gives you a nice constant time solution even if you're still using your Zune in 2108. They ought to read Calendrical Calculations by Nachum Dershowitz and Ed Reingold and learn how to do this properly.
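A sketch of what a division-and-modulus conversion could look like. It is not taken from the book; it uses the common decomposition into 400-, 100-, 4- and 1-year cycles (the same idea CPython's `datetime` module uses internally). `year_and_day` is an illustrative name, and day 1 is assumed to be 1 January 1980:

```python
from datetime import date

# Ordinal of 1 Jan 1980 in the proleptic Gregorian calendar (1 Jan of year 1 is 1).
EPOCH_ORDINAL = date(1980, 1, 1).toordinal()

DAYS_PER_400_YEARS = 146097  # 400 * 365 + 97 leap days
DAYS_PER_100_YEARS = 36524   # 100 * 365 + 24 leap days
DAYS_PER_4_YEARS = 1461      # 4 * 365 + 1 leap day


def year_and_day(day_number):
    """Convert a 1-based day count since 1 Jan 1980 to (year, day_of_year)."""
    # Zero-based number of days since 1 Jan of year 1.
    n = EPOCH_ORDINAL + day_number - 2
    n400, n = divmod(n, DAYS_PER_400_YEARS)
    n100, n = divmod(n, DAYS_PER_100_YEARS)
    n4, n = divmod(n, DAYS_PER_4_YEARS)
    n1, n = divmod(n, 365)
    year = 400 * n400 + 100 * n100 + 4 * n4 + n1 + 1
    if n100 == 4 or n1 == 4:
        # The final day of a 400-year or 4-year cycle is 31 Dec of a leap year.
        return year - 1, 366
    return year, n + 1


print(year_and_day(10593))  # 31 Dec 2008 when 1 Jan 1980 is day 1
```

The work is four `divmod` calls regardless of the date, so the cost does not grow with the number of years since the epoch.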

  21. Re:Old by Goldberg's Pants · · Score: 4, Funny

    First to finish is not always a good thing. Just ask your girlfriend.

  22. Not only a zune bug toshiba gigabeat affected too by bigbigbison · · Score: 3, Informative
    --
    http://www.popularculturegaming.com -- my blog about the culture of videogame players
  23. Not QA's fault by Sleepy · · Score: 3, Insightful

    "evidence of QA.. slacking off"

    These comments routinely come from two groups:

    1) Software Developers
    2) Joe the Plumber

    Or put another way: elitism or ignorance.

    If a software division is letting QA "test" all on their own, that's a recipe for disaster... and it's the head of engineering at fault.

    See, software testing does not occur in a vacuum, no more than developers code without a list of requirements from Sales or Marketing.

    Engineering takes the requirements and uses them to produce an agreed-upon set of specifications.

    QA follows the same model... they take the software specs and derive a set of effective tests.... tests which are agreed upon by Engineering, and signed off on.

    When I did QA, it was mostly for startups who lacked this kind of process. The result was QA was always 2 steps behind software that continually morphed: hardware changed, or the customer changed their mind. I'm not placing the blame on any 1 group here... I come from Support, then QA, and now develop. Startups can be rough.

    But at the end of the day, not documenting and agreeing on what the product and tests should be will cost you big time.. maybe 7 out of 10 times.

  24. Re:Let's make sure this gets installed everywhere by neokushan · · Score: 4, Insightful

    When has Microsoft ever actually done that? Apple has released updates that DELIBERATELY bricked devices (jailbroken iphones for one), but that's ok, yet when a Microsoft device breaks due to a very obvious bug (obvious in that it's obvious it IS a bug, not obvious in that it really should have been noticed - bugs do happen in pretty much ALL software) that has a stupidly simple fix (Let it drain the battery then turn it on again), suddenly the Conspiracy theories are out in full force and they're once again branded as the most Evil Corporation on the planet? Please.

    There's so much you can bash Microsoft for (legitimately), why do you feel the need to actually make shit up?

    Besides, from all the reports I've read so far, Windows 7 is actually looking to be a worthy Upgrade (if you're a windows user, that is - for anyone else, your mileage may vary) and I don't just mean from Vista, I mean from XP as well.

    But no, it's easier to just hate the large, monolithic, rich company than accept that sometimes shit just happens.

    --
    +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
  25. Re:Let's make sure this gets installed everywhere by neokushan · · Score: 4, Informative

    Ok, I called your bluff. I actually went and searched for it.

    The VERY top link is this slashdot article which states:

    "We've all heard the story of Microsoft's battle cry of "DOS ain't done till Lotus won't run". Adam Barr investigates the myth, interviewing various Microsoft and Lotus old-timers (including Mitch Kapor), and finds no basis for its legitimacy or any case of 1-2-3 actually not running. Whom to blame for Lotus Notes is not discussed."

    I checked the next few links and they pretty much all pointed to the same article, namely this one. One site even described it as a "complete and utter annihilation of the myth".

    I actually thought you were disagreeing with me, but now I see you were pointing out that people have been claiming the same thing for years and it was just as unfounded then as it is now. Thank you, I couldn't have said it better myself.

    --
    +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
  26. Re:Why is this a surprise? by ozphx · · Score: 4, Insightful

    Lots of things use Windows CE, which is fine.

    The problem is with the Freescale Semiconductor's* RTC driver. So if you aren't using that specific chip and driver then CE is unaffected.

    * No, this doesn't excuse MS from proper QA.

    --
    3laws: No freebies, no backsies, GTFO.
  27. Re:Let's make sure this gets installed everywhere by gutter · · Score: 4, Insightful

    Ok, I'm getting sick of this claim. There is no proof that Apple has ever deliberately bricked devices. This is completely unfounded.

    In fact, go back and look at the reports of iPhones breaking, and you'll see that most of them started working again with a later OS release. About the only thing that happens on upgrade with jailbroken phones these days is that they are locked again.

    --
    Check out DRM-free movies at http://www.bside.com
  28. Dos ain't done till Netware won't run by Terje+Mathisen · · Score: 3, Informative

    This should be the proper version of the quote:

    I know from the actual Novell developers (I worked for Novell in 1991-92) that on multiple occasions, Microsoft modified a new Dos version between the last beta and the actual release, in such a way that Novell's Netware client drivers stopped working.

    Terje

    --
    "almost all programming can be viewed as an exercise in caching"
  29. Re:Let's make sure this gets installed everywhere by db32 · · Score: 4, Insightful

    First of all, this was a braindead stupid bug. Unbelievably poor implementation of what should have been a fairly simple thing leads to an infinite loop on special days. Just looking at the damned loop without actually tracing through every possibility reveals an infinite loop at first glance. This was mindbogglingly stupid.
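
    For anyone who wants to see it stall, here is a rough Python sketch of the loop shape as it is usually described (the real driver is C, and I'm not reproducing it verbatim). The function names, the 1980 origin year, and the iteration cap that stands in for the device hanging are all my own assumptions for illustration. Days are counted from 1, so day 1 is January 1 of the origin year.

```python
def is_leap_year(year):
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def buggy_year_from_days(days, origin_year=1980, max_iterations=10_000):
    """Convert a 1-based day count into (year, day_of_year).

    Mirrors the flawed loop shape. Returns None when the loop stops making
    progress; the cap is a hypothetical stand-in for the device hanging.
    """
    year = origin_year
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            return None
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # days == 366 in a leap year: nothing changes, so the loop spins
        else:
            days -= 365
            year += 1
    return year, days


def fixed_year_from_days(days, origin_year=1980):
    """One possible fix: compare against the actual length of the current year."""
    year = origin_year
    while True:
        year_length = 366 if is_leap_year(year) else 365
        if days <= year_length:
            return year, days
        days -= year_length
        year += 1


# 28 years from 1980 through 2007 (7 of them leap) plus day 366 of 2008
DEC_31_2008 = 28 * 365 + 7 + 366

print(buggy_year_from_days(DEC_31_2008))
print(fixed_year_from_days(DEC_31_2008))
```

    The day before and the day after both come out fine in the buggy version, which is why it only bites on the last day of a leap year.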

    Second...Apple didn't "deliberately brick" devices. Your bias here is unbelievable. What Apple did was fix a bug that was allowing people to jailbreak and that caused problems for jailbroken phones. They fixed a security flaw that caused something that took advantage of that security flaw to cease to function correctly. Now, personally I would like it if the iPhone didn't require jailbreaking to open it up, but fixing the flaw that allows people to break your security model is not "deliberately bricking". WGA is deliberately bricking, where it arbitrarily decides that you are invalid and shuts you off. In both cases it is incorrect usage of the word "brick" since either device can be easily recovered. So...to recap. Apple fixed a security flaw that caused bad news for people jailbreaking. Microsoft told your computer to call home every day so they could arbitrarily decide if you were valid or not and then shut you off if you weren't.

    It is easier to hate the large monolithic rich company that uses illegal business practices, breaks the standards, and buys off the DoJ to avoid punishment (Go look at MS political contributions to either party before the trial...virtually nil...then the year they get busted...they contribute big bucks to both sides and walk away with a wrist slap). Trust me...big time criminals don't need cheerleaders like you to help them out. People like you are like the wife that gets her ass kicked and says "no, but he really loves me, he really is a good guy".

    --
    The only change I can believe in is what I find in my couch cushions.