Slashdot Mirror


Software Error Likely Killed MGS Spacecraft

Aglassis writes "NASA investigators have determined that a software update performed in June of 2006 may have doomed the 10-year-old spacecraft. Apparently the software error caused the solar arrays to drive against a mechanical stop which then forced the spacecraft into safe mode. Unfortunately, after that the spacecraft's radiator was pointed at the sun which overheated the battery and destroyed it. Contact was lost with the Mars Global Surveyor spacecraft in November 2006. NASA will form an internal review board to determine formally the cause of the loss of the spacecraft and what remedial actions are needed for future missions."

199 comments

  1. Don't believe it by LiquidCoooled · · Score: 5, Funny

    I don't believe it.
    It's most likely the Martian automated defense system set up just before we sent a probe and destroyed their civilisation.

    --
    liqbase :: faster than paper
    1. Re:Don't believe it by orasio · · Score: 1

      Martians were previously killed by all the MSG in the spacecraft

  2. Battery by Anonymous Coward · · Score: 5, Funny

    overheated the battery and destroyed it

    Have NASA been using Dell batteries?
    1. Re:Battery by Anonymous Coward · · Score: 1, Interesting

      s/Dell/Sony/g

      The worst part of it all was that Sony stopped using their own batteries because they knew they were defective. Boycott Sony.

    2. Re:Battery by Anonymous Coward · · Score: 1, Insightful

      Parent is not offtopic. The batteries in those Dell laptops were produced by Sony Corporation, not Dell. That's why the recall extended to nearly every major laptop manufacturer.

  3. a Technical solution I see: by pilgrim23 · · Score: 2, Insightful

    Typical response to a problem: form a committee!

    --
    - Minutus cantorum, minutus balorum, minutus carborata descendum pantorum.
  4. What is Microsoft wrote it? by quadelirus · · Score: 5, Interesting

    One crash in ten years? Why don't the NASA guys write consumer operating systems?

    1. Re:What is Microsoft wrote it? by the_humeister · · Score: 2, Informative

      Because it'd be even less user friendly than Linux. Plus they'd also require people to run 80386 processors with 4 MB memory, if that.

    2. Re:What is Microsoft wrote it? by Calinous · · Score: 3, Insightful

      Why don't computers use NASA-quality hardware, ready for space?
      Why don't all computers use just a single configuration (peripherals, cards, interfaces)?

            The purpose of an operating system is so much wider than what the Mars Global Surveyor had to do.

    3. Re:What is Microsoft wrote it? by h2g2bob · · Score: 1

      Well, 4 MB should be enough for anybody

    4. Re:What is Microsoft wrote it? by edremy · · Score: 5, Insightful
      Actually, they buy their OS's off the shelf. (VxWorks for the rovers, for example)

      That said, you could get software written to this level of perfection if you wanted. It's easy- follow the space shuttle's team's example. You have a stable team of mature developers who work reasonable hours. You test the hell out of the software to the point a single bug in a test is reason to redo the software. You run the software on four identical computers and make sure they all agree.

      Then you hire another entire team to write code that does the same thing, but otherwise has no contact with the first team. That software runs on a fifth computer that takes over if something happens to the other four.

      Willing to pay for that?
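      For anyone who wants to see the shape of that scheme, here is a toy sketch of the "four computers must agree, fifth takes over" idea. It is an illustration only, not how the shuttle software is actually built: the names `vote` and `command`, the quorum of three, and the integer commands are all assumptions made up for the example.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

Computer = Callable[[int], int]  # hypothetical: sensor reading in, command out


def vote(outputs: Sequence[int], quorum: int) -> Optional[int]:
    """Return the output that at least `quorum` computers agree on, else None."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= quorum else None


def command(primaries: Sequence[Computer], backup: Computer, sensor: int) -> int:
    """Use the primaries' agreed answer; fall back to the independent backup."""
    agreed = vote([compute(sensor) for compute in primaries], quorum=3)
    return agreed if agreed is not None else backup(sensor)
```

      The expensive part is not the voting logic, which is tiny. It is getting a `backup` written by a second team that shares none of the first team's mistakes.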

      --
      "Seven Deadly Sins? I thought it was a to-do list!"
    5. Re:What is Microsoft wrote it? by quadelirus · · Score: 1

      Seriously, what do we need all these fancy shmancy graphics for anyway?

    6. Re:What is Microsoft wrote it? by quadelirus · · Score: 1

      I totally agree. All an OS does is let you set a desktop background, and for the trouble they seem to have, who needs one? I mean, if I could only run firefox, and no OS wouldn't that be better?

      (I'm joking)

    7. Re:What is Microsoft wrote it? by the_humeister · · Score: 4, Funny

      I don't know. And people with their "keyboard" and "mouse." Idiots I say. The only true way to interact with a computer is by plugging wires into the serial port and generating the necessary electrical pulses myself.

    8. Re:What is Microsoft wrote it? by quadelirus · · Score: 1

      My previous joking aside, that is a good testament to the work being done by the VxWorks and other real-time OS folks-I just figured it was all written in-house, but thinking about it now, as you pointed out, would be next to impossible to fund. It seems that these days most things requiring some sort of OS, from PDAs to Cellphones, to your car's chip, to NASA spacecraft are using off the shelf components. It's just too hard a problem to start from scratch, especially when there are good alternatives out there.

    9. Re:What is Microsoft wrote it? by jsupreston · · Score: 1

      This is the article that supports what the parent says. It's old, but still a good read. http://www.fastcompany.com/online/06/writestuff.html

      --
      "It's a dog eat dog world out there, and I'm wearing Milk-Bone underwear."- Norm (from Cheers)
    10. Re:What is Microsoft wrote it? by quadelirus · · Score: 1

      Just try not to have a SEGFAULT in the serial controller. :-p

    11. Re:What is Microsoft wrote it? by Jerrry · · Score: 1
      Willing to pay for that?

      Yes I am. Spread the cost over all the servers in the world and the cost would still be far less than the cost of all the crashes, infections, and data corruptions that are due to the sloppy way Microsoft writes and tests operating systems.

    12. Re:What is Microsoft wrote it? by camperdave · · Score: 1

      One crash in ten years? Why don't the NASA guys write consumer operating systems?

      "Honey, Is it Verb 37, Noun 40 to start Solitaire, or Verb 40, Noun 37?"

      --
      When our name is on the back of your car, we're behind you all the way!
    13. Re:What is Microsoft wrote it? by markov_chain · · Score: 1

      You have a serial port? What luxury! In my country we learned if we keep our heads real close to the motherboard and think really hard, we can modulate appropriate electrical signal changes on the memory bus and thus fake the inputs.

      --
      Tsunami -- You can't bring a good wave down!
    14. Re:What is Microsoft wrote it? by markov_chain · · Score: 1

      On Mars, you already have Mars as the desktop background. On Earth, we don't have that luxury and hence the need for a GUI.

      --
      Tsunami -- You can't bring a good wave down!
    15. Re:What is Microsoft wrote it? by Anonymous Coward · · Score: 0

      Willing to pay for that?

      Virtualization could cut the hardware costs.

      If you have multiple different virtual OS's running the same tasks on each machine, (RAEOS = Redundant Array of Expensive OS's ;) ), then you should get the same data-integrity checking at a reduced hardware cost, but the [old] hardware is probably cheap as dirt anyway compared to the software development price.

    16. Re:What is Microsoft wrote it? by alienmole · · Score: 1

      Yeah, but I hate the headache I get when the machine crashes.

    17. Re:What is Microsoft wrote it? by timeOday · · Score: 1

      Anyway, the shuttle flight control is only 420,000 lines of code (plus another 1.5M of support code). Nothing to sneeze at, but Vista and Linux are said to have 50 and 30 million lines of code, respectively. So that's about two orders of magnitude! I'm also willing to bet the flight control software for the Shuttle hasn't changed much over the past 25 years, yet 275 people support it.

    18. Re:What is Microsoft wrote it? by quadelirus · · Score: 1

      Those extra lines of code in Vista are features, not bugs.

    19. Re:What is Microsoft wrote it? by quadelirus · · Score: 1

      Wow! Now that is what I call user friendly.

    20. Re:What is Microsoft wrote it? by Spikeles · · Score: 1
      --
      I don't need to test my programs.. I have an error correcting modem.
    21. Re:What is Microsoft wrote it? by bill_mcgonigle · · Score: 1

      Yes I am. Spread the cost over all the servers in the world and the cost would still be far less than the cost of all the crashes, infections, and data corruptions that are due to the sloppy way Microsoft writes and tests operating systems.

      But why actually do it when Microsoft can just pocket the money instead?

      It's an interesting idea if you could sell a linux distro with that goal in mind (you get nothing for your money but a disc and promise of future R&D). It might be too clever for most to accept.

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    22. Re:What is Microsoft wrote it? by StikyPad · · Score: 1

      Virtual OS's are virtually useless for redundancy because they don't protect against hardware malfunctions such as the unscheduled combustion of a component.

    23. Re:What is Microsoft wrote it? by 6th+time+lucky · · Score: 1

      Just don't try using the thing over the new year...

    24. Re:What is Microsoft wrote it? by RockDoctor · · Score: 1
      You run the software on four identical computers and make sure they all agree.


      Four dissimilar computers built by four different teams from four different chains of suppliers of hardware, with only the specification in common. If I recall correctly.
      That may be slight overkill, but the general point is plain - the interfaces are defined ; people implement the hardware to the specification separately ; if software works (including the OS) from separate teams, then it's more likely that both software and OS and hardware are correctly implemented to the specification.
      --
      Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
  5. *phew* by Daetrin · · Score: 4, Funny
    NASA investigators have determined that a software update performed in June of 2006 may have doomed the 10-year-old spacecraft. Apparently the software error caused the solar arrays to drive against a mechanical stop which then forced the spacecraft into safe mode.

    Glad I'm not the programmer who came up with that bit of code! Their next performance review is going to be _lots_ of fun!

    --
    This Space Intentionally Left Blank
    1. Re:*phew* by Intron · · Score: 1

      There goes the SEI level 5 certification...

      --
      Intron: the portion of DNA which expresses nothing useful.
    2. Re:*phew* by Anonymous Coward · · Score: 0

      That's going to be my performance review you insensitive clod!

    3. Re:*phew* by __aaclcg7560 · · Score: 1

      Actually, a subcontractor will blame another subcontractor for the fault and fighting will break out. NASA will keep peace among the subcontractors by blaming a hacker for mistaking the update as a patch for the Metal Gear Solid video game, and vows not to create any acronyms that could be misconstrued as a video game.

    4. Re:*phew* by timeOday · · Score: 1

      Not at all. SEI has no problem with bugs, so long as you follow an elaborate process to fix them, track them, and reconsider the process that led to them. It's very process- rather than outcome-oriented.

    5. Re:*phew* by Mister+Whirly · · Score: 1

      "a subcontractor will blame another subcontractor for the fault"
      And Slashdot will blame Microsoft, even though they had zero to do with it...

      --
      "But this one goes to 11!"
  6. the software bug was by Anonymous Coward · · Score: 0

    uri = windowsupdate instead of nasaupdate...

    that's what happens when an ex-microsoft employee works for you

  7. "Safe" mode? by Bazman · · Score: 5, Funny

    Funny definition of 'safe mode'. I'd get the main antenna pointing at the earth, the battery radiator pointing away from the sun, and the computer going 'what do I do know, smarty earthlings?' and waiting for a command.

    Maybe NASA's 'safe mode' just put 'safe mode' in the corners of all the returned images and did them in 8-bit colour...

    1. Re:"Safe" mode? by Anonymous Coward · · Score: 0

      I think that's what it did. It asked the question "what do I do know, smarty earthlings?" and got stuck in an infinite loop trying to understand two words and why the question was not capitalized? Maybe it will eventually understand it should have been,

          What do I do now, smartly Earthlings?

      ??? Maybe not.

    2. Re:"Safe" mode? by joeljkp · · Score: 1

      Yeah, that's what I was thinking too. When the MESSENGER spacecraft enters safe mode, it'll turn itself so the antenna's toward Earth and its heat shield is toward the sun. It'll even rotate itself during its orbit to keep itself in that position.

      Different design decisions, I guess, but it still sounds kind of fishy...

      --
      WeRelate.org - wiki-based genealogy
  8. YACCS -Yet Another Computer Corkup in Space by Ancient_Hacker · · Score: 4, Informative
    Just one more example of how Computer Science isn't quite up to the reliability requirements of Space:
    • A missing comma in a Do-loop statement caused the first mission to Mars rocket to go off course and blow up.
    • The space-shuttle programs had a race condition that caused the first launch to be scrubbed.
    • The space-shuttle re-entry program had one important variable off by a factor of -4, causing the first re-entry to be a bit wobbly.
    • An Ariane guidance program had multiple basic design glitches that caused the first launch to blow up.
    • The F-16 autopilot worked very well, until the plane was deployed to Australia, where on its way there it bounced off the equator.
    • The LEM landing program didn't protect itself from spurious radar data, causing the computer to get behind.

    Aero and space are very unforgiving of human coding errors.

    1. Re:YACCS -Yet Another Computer Corkup in Space by zyl0x · · Score: 2, Interesting

      Be careful not to place too much of the blame on us programmers. Most of these crazy "business logic" equations were created by some math genius in another department. Since most of these equations mean nothing to programmers, we make sure we're typing them in correctly, since there's no way we would ever recognize any type of mistake. Most of the time the problem lies with the math guy, who was too lazy to carry a remainder, or who thought the equation was good enough being precise to four decimal places.

      --
      Blerg.
    2. Re:YACCS -Yet Another Computer Corkup in Space by shawn(at)fsu · · Score: 1

      It's not like the only problems with air and space vehicles have been caused by coding errors, I'm sure engineering has done fairly well for itself too.

      --
      500 dollar reward for tip(s) leading to the arrest of the person(s) who stole my sig.
    3. Re:YACCS -Yet Another Computer Corkup in Space by Anonymous Coward · · Score: 0

      A few failures every now and then promotes competition in the selfdestruct-technology business so it's not all bad news ;)

    4. Re:YACCS -Yet Another Computer Corkup in Space by MBCook · · Score: 1

      Like the F16 thing. Let's not forget that the shuttle has NEVER been in space during a new-years. It is untested (at least in space) and they are not positive what will happen. That's why they were worried in December, they didn't want bad weather to force the shuttle to stay in space during the transition.

      --
      Comment forecast: Bits of genius surrounded by a sea of mediocrity.
    5. Re:YACCS -Yet Another Computer Corkup in Space by spun · · Score: 4, Insightful

      In other disciplines, the engineers ARE math guys. Face it, compared to other engineering types, software engineers and programmers are SLOPPY. This is because engineering has thousands of years worth of spectacular cork-ups with enormous death tolls to look back on, and engineering students are (I'm guessing, IANAE) shown horrific, traffic-safetyesque movies like Blood on the Protractor, Slide Rule Massacre, and London Bridge is Falling Down, Killing Little Johnny's Entire Family.

      Maybe we CS types need our own safety movies, perhaps When Buffers Attack!, Threads: Your Parallel Friends or Quagmires of Debugging DOOM?, or maybe Metric or Imperial: You Mean there's a Difference? Or maybe we need to recognize that many of us have the same awesome responsibility that other engineers do of protecting human lives from the consequences of our mistakes. I'm told that this point is hammered home in engineering schools, why not in CS departments?

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    6. Re:YACCS -Yet Another Computer Corkup in Space by Arbitor+Elegantorum · · Score: 1

      According to NASA, MGS outlived its design parameters by 400%, and relayed important information right up to the end. Further, the Mars Rovers have outlived their warranty by 2 years. I think we're doing something right.

    7. Re:YACCS -Yet Another Computer Corkup in Space by unix_core · · Score: 4, Funny

      I think I've seen some of those, starring Troy McLure right?

    8. Re:YACCS -Yet Another Computer Corkup in Space by januth · · Score: 3, Insightful

      I wouldn't call it a failure of Computer Science; it's a QA failure without a doubt.

      Mistakes happen when you code. Sure, you try to minimize them but even the most carefully designed code can't be guaranteed to be 100% error free. That's why you employ, presumably, a top-notch QA team to check and recheck, testing your "perfect" code in ways that perhaps you never even considered.

      This is what you would expect in a terrestrial application. When the platform that your code is going to run on isn't bound to the same gravitational source that you are, you would think...you would *hope*...that the QA team might do an even more thorough job.

      If this event is at all indicative of the QA efforts that NASA will be making for our return to the moon, perhaps we'd be better off staying at home.

    9. Re:YACCS -Yet Another Computer Corkup in Space by Mayhem178 · · Score: 4, Insightful

      For the uninformed, QA = Quality Assurance. A must-have for any self-respecting software model.

      NASA has got it rough, has since the mid 70s. Their wildest successes are regarded as routine and hardly noticed by the public eye. Their failures, on the other hand, are spun to be the worst disasters in human history. Granted, when shuttles explode and people die, it's reasonable that the public be concerned. But it seems to me that for every 20 great things that NASA accomplishes, the media picks 1 failure (and sometimes blows that failure out of proportion) to rile the masses into a furious frenzy calling for the dissolution of NASA.

      --

      "You will pay for your lack of vision..." - Emperor Palpatine to Ray Charles

    10. Re:YACCS -Yet Another Computer Corkup in Space by Fishbulb · · Score: 5, Informative

      The F-16 didn't "bounce off the equator". Before it ever flew, in simulation the computer flipped the plane over when it crossed the equator due to a bug that incorrectly handled southern latitudes. Additionally, since the computer "flip" happened instantaneously, and the F-16 can roll at much higher G forces than the pilot can take, the flip would have killed the pilot (and the F-16 would have happily continued on its way).

      http://portal.acm.org/ft_gateway.cfm?id=163293&type=pdf&coll=GUIDE&dl=GUIDE&CFID=11154656&CFTOKEN=19136062

    11. Re:YACCS -Yet Another Computer Corkup in Space by kfg · · Score: 1

      Aero and space are very unforgiving of human coding errors.

      The sea is no pussycat either.

      KFG

    12. Re:YACCS -Yet Another Computer Corkup in Space by caerwyn · · Score: 2, Insightful

      CS people are math guys too, at least many of us are. That doesn't mean we necessarily have the expertise to validate aerospace control algorithms on the fly- that's why there's an entire discipline of aerospace engineers, because you can't expect all the *other* engineers to have sufficient knowledge.

      Things like this are built as teams- and team members have to make certain assumptions about the accuracy of the other team members' work. Those algorithms should have been validated before even being handed off to the programmers, and then validated *again* as part of integrated testing.

      --
      The ringing of the division bell has begun... -PF
    13. Re:YACCS -Yet Another Computer Corkup in Space by HangingChad · · Score: 1

      And don't forget the Mars Climate Orbiter "Dirt Dart" mission (http://en.wikipedia.org/wiki/Mars_Climate_Orbiter). Okay the operators helped by plugging in the wrong units but neither did the software catch the discrepancy in the values.

      The systems aboard the spacecraft were not able to reconcile the two systems of measurement, resulting in the navigation error.

      Operator error but it would be interesting to figure in the number of accidents that the software could have prevented the operator from entering the wrong values, or at least prompted them that the values don't match.
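      One way software can at least refuse to mix units silently is to carry the unit along with the number. The sketch below is a made-up illustration, not anything from the actual mission software: the `Impulse` class, the unit strings, and `total_impulse_ns` are all hypothetical names, and impulse in pound-force seconds versus newton-seconds is just the example I picked.

```python
from dataclasses import dataclass
from typing import Iterable

LBF_S_TO_N_S = 4.4482216152605  # one pound-force second in newton-seconds


@dataclass(frozen=True)
class Impulse:
    """A number that cannot be used without saying what unit it is in."""
    value: float
    unit: str  # "N*s" or "lbf*s" (hypothetical labels)

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        # Anything unrecognised is rejected instead of being trusted.
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


def total_impulse_ns(readings: Iterable[Impulse]) -> float:
    """Sum readings in newton-seconds, converting each one explicitly."""
    return sum(r.to_newton_seconds() for r in readings)
```

      It doesn't stop someone from labelling a value with the wrong unit, but a bare float with no label at all never gets into the sum.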

      I'm not blaming the programmers. It's amazing how well things work considering the distance, temperature extremes, radiation and it's not exactly like you can bring it into the shop if something goes wrong.

      --
      That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
    14. Re:YACCS -Yet Another Computer Corkup in Space by spun · · Score: 1

      When I came up with those names, I pictured Troy saying them. Dammit, Phil Hartman, why'd you have to marry a crazy murdering alcoholic bitch?

      *Sigh*

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    15. Re:YACCS -Yet Another Computer Corkup in Space by Minwee · · Score: 2, Insightful

      Okay the operators helped by plugging in the wrong units but neither did the software catch the discrepancy in the values.

      "On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

      Plus ça change, plus c'est la même chose.

    16. Re:YACCS -Yet Another Computer Corkup in Space by ChrisA90278 · · Score: 1
      The problem is the inability to test the software in a realistic environment. In fact you CAN'T fully test software. For example, let's say you write a program to add two numbers and print the sum. Very simple program, but all you can do is "spot check" it with a few test numbers. For example, I doubt testing would catch the bug in the following program:
      get a value for "A"
      get a value for "B"
      if (A == 3248532346863247) add 3 to A
      print (A+B)

      What are the chances you would use 3248532346863247 as a test value? You could run the above program for 100 years and no one would likely ever find the bug. The only way to find it would be to read the code. In this case it is only four lines of code and anyone could find the error. But what if it were 1,000,000 lines? No human could ever read it, and yet having a human read it is the only way to find errors. So you break it up and have 100 humans each read 10,000 lines. What if the bug is in the subtle interaction between the parts? The ONLY solution is to design systems that are tolerant of software bugs. Lots of ways to do this: put a human pilot inside the airplane or lunar lander, build a computer to watch the computer, or simply fly your tests out over the ocean where, if they blow up, no one is harmed. You just have to assume there will be bugs and that you will not be able to detect them.
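      The four-line program above translates directly into something you can run, along with the kind of random spot check that will almost certainly never stumble on the bad input. The function names and the range of test values are my own choices for the sketch.

```python
import random

MAGIC = 3248532346863247  # the one input that triggers the bug


def buggy_add(a: int, b: int) -> int:
    """Add two numbers, except for one poisoned value of `a`."""
    if a == MAGIC:
        a += 3
    return a + b


def spot_check(trials: int, seed: int = 0) -> bool:
    """Return True if every randomly chosen pair gave the correct sum."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randrange(10**16)
        b = rng.randrange(10**16)
        if buggy_add(a, b) != a + b:
            return False
    return True
```

      Each trial has about a 1 in 10^16 chance of picking the magic number, so `spot_check` keeps reporting success while `buggy_add(MAGIC, 1)` is off by three.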
    17. Re:YACCS -Yet Another Computer Corkup in Space by Flavio · · Score: 3, Insightful

      In other disciplines, the engineers ARE math guys. Face it, compared to other engineering types, software engineers and programmers are SLOPPY. This is because engineering has thousands of years worth of spectacular cork-ups with enormous death tolls to look back on, and engineering students are (I'm guessing, IANAE) shown horrific, traffic-safetyesque movies like Blood on the Protractor, Slide Rule Massacre, and London Bridge is Falling Down, Killing Little Johnny's Entire Family.

      Engineering and applied mathematics are much more demanding than computer programming. Sure, one could argue that "computer science is math too", but my experience is that CS majors don't graduate with a strong math background. And even if they did once know some calculus and linear algebra, they were never required to apply it like an EE or Applied Math person would.

      So while you could find a rigorous programmer or software engineer (and I use the term "software engineer" very loosely, because few individuals actually fit that description), it's often a lot easier to look for an engineer or applied mathematician with good programming skills. Their math and physics is usually significantly stronger, and they actually understand what they're programming.

    18. Re:YACCS -Yet Another Computer Corkup in Space by camperdave · · Score: 1

      Let's also not forget that the shuttle was DESIGNED FROM DAY ONE NOT TO BE IN SPACE DURING NEW YEARS. Too many support staff are on holidays during that time to make it safe.

      --
      When our name is on the back of your car, we're behind you all the way!
    19. Re:YACCS -Yet Another Computer Corkup in Space by mkw87 · · Score: 1

      As a Mechanical Engineering undergraduate senior, I can confirm that we are shown horror stories, but not in a way that actually makes a difference. Yes, we know that what we design will often be used in situations that can kill people, but look at the design procedures for both and you'll see that they really aren't TOO different on a vague level. One of the things that helps mechanical design is all the history we have with it, and the use of safety factors. There are not many things with code that you can do to save your ass if you aren't sure.

      --
      Arguing with an engineer is like wrestling a pig in mud. Soon, you realize the pig is dirty, and he likes it.
    20. Re:YACCS -Yet Another Computer Corkup in Space by Rei · · Score: 2, Insightful

      To put the shoe on the other foot, have you ever seen software written by people who aren't programmers? Uck. The code is usually a nightmare. Things like:

      "Well, here we're using the global "qzv" as a loop variable, but over here we'll use it to mean how many widgets we're looking at, and over here, it's our exit condition. Oh, and we'll set it to '5' over here for no discernable reason. Now, here's where we've cut and pasted the code 15 times so that we could change one variable's type (instead of using templates), but naturally, all of the bugfixes we've applied since then haven't all migrated into all of the versions. Ah, here's the core of the code, where we cast structs and function pointers to void pointers, and then pass those around, with a jury-rigged method to figure out what they actually contain duplicated in every piece of code that uses them. If you scroll up in this 23,000 line file, you'll see eighteen pages of commented out code. Scroll down, and you can see the famous Sea of TODO Notes -- the only place in the file in which comments are actually associated with descriptive text. Unfortunately, most of them contain only the word 'Fixme'. Now, on to the diverse species of macros you'll find scattered about, defined and redefined throughout the code..."

      --
      "Now," she thought, watching the dolphins adjust their bowties, "might be a good time to up my medication."
    21. Re:YACCS -Yet Another Computer Corkup in Space by camperdave · · Score: 1

      Actually, the "warranty period" was only 90 days. The rovers are now entering the fourth year of operation. Kudos to the design and build teams (JPL, I think).

      --
      When our name is on the back of your car, we're behind you all the way!
    22. Re:YACCS -Yet Another Computer Corkup in Space by Anonymous Coward · · Score: 0

      Darn. I wish I hadn't just spent my mod points in the nuclear power thread. Excellent post.

    23. Re:YACCS -Yet Another Computer Corkup in Space by DerekLyons · · Score: 1
      I wouldn't call it a failure of Computer Science; it's a QA failure without a doubt.
       
      Mistakes happen when you code. Sure, you try to minimize them but even the most carefully designed code can't be guaranteed to be 100% error free. That's why you employ, presumably, a top-notch QA team to check and recheck, testing your "perfect" code in ways that perhaps you never even considered.

      And even then - there exists the non-trivial possibility that something might slip through. No QA is ever going to reach absolute 1.00 perfection.
       
       
      This is what you would expect in a terrestrial application. When the platform that your code is going to run on isn't bound to the same gravitational source that you are, you would think...you would *hope*...that the QA team might do an even more thorough job.

      They do as thorough a job as can be expected in the real world - where issues like budget and manpower arise, as well as the inevitable (howsoever small) chance that a bug will slip through. On top of this, MGS is over a decade old - which adds in the chance of a tiny 'gotcha' that the original programmers knew about, but the current ones don't.
       
       
      If this event is at all indicative of the QA efforts that NASA will be making for our return to the moon, perhaps we'd be better off staying at home.

      Baseless fearmongering based on a strawman of your own creation. The reality is that NASA's (flight) software record is orders of magnitude better than any other organization.
    24. Re:YACCS -Yet Another Computer Corkup in Space by Mike1024 · · Score: 1

      Don't forget:

      Mars lander Spirit started randomly rebooting due to a flash memory access problem.

      Mars Polar Lander was lost - there's no definitive proof but it's thought it was a software error (sensing leg deployment ready for landing as actual landing, and hence deactivating thrust too early).

      Mars climate orbiter? Chalk that one up to a metric/imperial conversion error... in software.

      Of course, the argument could be made that there's no real alternative to doing it in software...
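      On the Polar Lander theory mentioned above (leg deployment read as touchdown), a rough sketch of the kind of guard that would ignore an early sensor blip looks like this. It is purely illustrative and not the lander's real logic: the class name, the enable altitude, and the "N consecutive readings" rule are all assumptions.

```python
class TouchdownDetector:
    """Only believe the leg sensor near the ground, and only if it persists."""

    def __init__(self, enable_altitude_m: float, required_consecutive: int):
        self.enable_altitude_m = enable_altitude_m
        self.required_consecutive = required_consecutive
        self.consecutive = 0

    def update(self, altitude_m: float, leg_sensor_on: bool) -> bool:
        """Return True when the engines should be cut."""
        if altitude_m > self.enable_altitude_m:
            # Too high to have landed: discard anything the sensor says,
            # so a jolt from leg deployment cannot be remembered later.
            self.consecutive = 0
            return False
        self.consecutive = self.consecutive + 1 if leg_sensor_on else 0
        return self.consecutive >= self.required_consecutive
```

      With made-up settings such as `TouchdownDetector(40.0, 2)`, a sensor hit at 1500 m is thrown away, and the engines are only cut after two readings in a row below 40 m.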

      Michael

      --
      "Goodness me, how unlike the FBI to abuse the trust of the American public." -- The Onion
    25. Re:YACCS -Yet Another Computer Corkup in Space by spun · · Score: 1

      This is a very important point, one I allude to in my comment. Engineers have tradition to go on, thousands of years of experience with what works and what doesn't. CS is too new, we're in a stage like that of Egyptian engineers when they first started putting up the pyramids. "Okay, so a 45 degree angle didn't work, let's try 37..." Give us a few thousand years like you guys have had and I think we'll do okay ;)

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    26. Re:YACCS -Yet Another Computer Corkup in Space by wik · · Score: 1

      Houston, we have the solution.

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
    27. Re:YACCS -Yet Another Computer Corkup in Space by Chris+Burke · · Score: 1

      NASA has got it rough, has since the mid 70s. Their wildest successes are regarded as routine and hardly noticed by the public eye. Their failures, on the other hand, are spun to be the worst disasters in human history. Granted, when shuttles explode and people die, it's reasonable that the public be concerned. But it seems to me that for every 20 great things that NASA accomplishes, the media picks 1 failure (and sometimes blows that failure out of proportion) to rile the masses into a furious frenzy calling for the dissolution of NASA.

      Maybe it's just me, maybe it's just people who appreciate some of the engineering difficulties that go into something like a space probe, but I'm constantly astounded by what NASA is able to pull off. I'm as astounded by their near-failures as their flawless successes. When some piece of fairly important hardware fails, a subsequent software glitch causes the main antenna to fail, and they somehow jury-rig their way back into getting in touch with the probe, load new software onto the probe to work around the failed hardware, and then reboot the thing and it all works so the mission can continue as if nothing went wrong, my jaw just drops. The fact that every once in a while something happens that they can't debug from a million miles away and they lose a probe I hardly see as a failing of NASA.

      Not that getting your units wrong is smart, I'm just saying they're at the top of the field and throwing tomatoes at them seems misguided.

      --

      The enemies of Democracy are
    28. Re:YACCS -Yet Another Computer Corkup in Space by lysergic.acid · · Score: 1

      just don't be spun when you write that shuttle life-support system code, that's all... ;-)

    29. Re:YACCS -Yet Another Computer Corkup in Space by Have+Brain+Will+Rent · · Score: 1

      Trust me, that isn't what you want. I have seen a lot of code written by physicists, chemists, mathematicians and various and sundry other people who are experts in their own fields. For the most part it's all crap that ends up costing them an enormous lot of time (theirs, when they have to try and use what they've written for the next N years) and money (theirs, when and if they decide to have it fixed).

      --
      The tyrant will always find a pretext for his tyranny - Aesop
    30. Re:YACCS -Yet Another Computer Corkup in Space by ralphdaugherty · · Score: 1

      Aero and space are very unforgiving of human coding errors.

            I don't see the Mars lander or whatever problem where something was in metric but the program coded for English measurements, IIRC.

        rd

    31. Re:YACCS -Yet Another Computer Corkup in Space by Flavio · · Score: 1

      I've also seen a fair sample of code written by people without a strong CS background and I agree -- most of the time it's anywhere from bad to horrible. But there are engineers, physicists and mathematicians with good programming (and computer science) skills, and I was referring to those. The kind of people who wouldn't know what the dragon book is and don't read RFCs, but have a solid understanding of data structures, algorithms, modelling and good programming practice.

      There are some areas where the typical CS majors excel. Business programming is one of those. They're also fine for a lot of the industrial code out there. But given the choice, I prefer to leave mission critical scientific code to scientists (engineers, applied mathematicians and physicists). CS guys would be our world's general purpose CPUs, and engineers would be the DSPs -- it's another instance of choosing the right tool for the right job.

    32. Re:YACCS -Yet Another Computer Corkup in Space by multicsfan · · Score: 1

      I think I can reply to this having both an Engineering degree (Computer and Systems Engineering (I started as electrical)) and a CS degree (both from www.rpi.edu). Comments apply to my experience and knowledge in the US and specifically New York State.
      There is a significant difference in how engineers and CS approach problems. As has been pointed out, engineers (depending on the field) have hundreds of years of past experience. In some fields, the default method is that someone does the design and at least one other engineer reviews the design before it is implemented, in some cases several engineers. With civil engineering things like buildings and bridges, at least one PE (Professional Engineer) has approved the plans. In general this is at a minimum standard practice, and for public buildings/bridges/etc. often mandatory by law. In addition, most projects have some form of government inspection done while the project is underway.
      CS is still so new that the favorite language is usually the favorite language of the decade. In the 60's/70's the big ones were Fortran, Cobol, and PL/1. The 80's/90's had C as the big new language, C++/Perl in the 90's and now Java. Along the way lots of other languages came and went or are used more in special purpose areas, like SNOBOL, LISP, Pascal, Ada, and many more. Computer Science professors tend to push their favorite language of the semester, usually for some esoteric pure science type reason having nothing to do with being practical.
      Fortran, Cobol, and PL/1 were in general developed/designed by either engineers or business people to solve practical problems and ensure that problems could either be avoided or readily found. C, C++, and from what I've seen Java all have a lot of built-in gotchas that can be very hard to track down, particularly C++/Java, as so much work is left to the compiler that the programmer may not know if a bug is in his program or the compiler or the documentation for the compiler/language. With Fortran, Cobol, and PL/1 there is an easy to follow one-to-one correspondence between the lines of code in the language and the machine code produced.
      These 3 languages are designed to solve real world problems. Cobol is designed to handle financial data, Fortran to handle scientific/engineering calculations, PL/1 is designed to handle both with Cobol equivalent data types for financial data and Fortran data types for Engineering/Scientific plus some additional types to make programming easier to both read as well as document. PL/1 also added programmer control over many errors (events) and allowed the programmer to override the system default when/as needed.
      The addition of the STL for C++ will hopefully be a big help there, so there is a lot less rolling your own for things done well in the STL. I haven't studied C++ recently so I've only heard of the STL.
      C, Perl, C++, and I assume Java lend themselves very readily to obfuscated programming contests. This can be done in any language, but those languages tend to encourage many programmers to do this type of coding on a more regular basis.
      Another difference is there is a lot more emphasis in engineering on documenting things, including the 'chain' of authority, than in programming. Other than at NASA and DOD there is a lot less code review and debugging done than there should be, and a lot more of the programming is sloppy. In some cases this is pressure from management to get the job done quickly. Some know they are asking for a lower quality product, others don't.
      Another problem with many software projects is management can't see the work progress, as with some projects there is nothing that will work until you get to the 70% to 90% completed part. Well, we've got 20MB of source, 50MB of documentation, but we've still got another 2MB of code to write before we can start running any demos or testing. With most engineering projects there is some type of progress you can see: The architect shows concepts, the customer selects one or works with the architect until they like the concep

    33. Re:YACCS -Yet Another Computer Corkup in Space by Vintermann · · Score: 1

      "Engineering and applied mathematics are much more demanding than computer programming."

      Than actual programming, yes, but as soon as you want to prove stuff about your code, like construction engineers prove that a bridge will hold given certain conditions, the mathematics are immensely more difficult. Also, unfortunately, there are many truly hard problems in software that are hard to pin down, and not directly connected to mathematics. For better or worse, most developer education focuses on those "soft" issues. How do we write code that can be reused? How do we prepare for changing requirements? How can we keep complexity down? How do we write so that other people can read our code efficiently? Many of these issues either aren't there for other disciplines or have been solved generations ago.

      --
      xkcd is not in the sudoers file. This incident will be reported.
    34. Re:YACCS -Yet Another Computer Corkup in Space by Vintermann · · Score: 1

      "There are some areas where the typical CS majors excel. Business programming is one of those."

      I have a bachelor's in software engineering, mostly directed at business programming, but no one there told me to never store currency in a float (I learned that from being curious about Ada). Actually, ints aren't always sufficient either; there are reasons why many good modern business programs have a Binary Coded Decimal library linked in.
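
      A quick sketch of why floats and currency don't mix, using Python's standard decimal module as a stand-in for the BCD library mentioned above (it does decimal arithmetic but is not literally BCD). The add_tax helper, the price and the tax rate are invented for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent 0.10 or 0.20 exactly, so the sum drifts.
float_total = 0.10 + 0.20
float_matches = (float_total == 0.30)

# Decimal keeps the base-10 digits exactly as written.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_matches = (decimal_total == Decimal("0.30"))


def add_tax(amount, rate):
    """Hypothetical helper: apply a tax rate and round to whole cents."""
    return (amount * (1 + rate)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


price_with_tax = add_tax(Decimal("19.99"), Decimal("0.0825"))
```

      Note that the rounding rule is stated explicitly in the code; which rule is correct depends on the accounting rules in play, which is exactly the kind of domain knowledge at issue here.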

      Engineer/Physicist/Mathematician-type programmers may think that to write business programs you only need "general" programming skills and little specialised knowledge, but that is very wrong. Getting accounting right is difficult, getting supply chain management right is tricky. And don't even think about working at banking or insurance without some serious domain-specific specialisation.

      --
      xkcd is not in the sudoers file. This incident will be reported.
    35. Re:YACCS -Yet Another Computer Corkup in Space by Ancient_Hacker · · Score: 2, Informative
      >Additionally, since the computer "flip" happened instantaneously, and the f-16 can roll at much higher G forces than the pilot can take, the flip would have killed the pilot.

      Well your whole post is called into question due to quite a few questionable items:

      • It seems unlikely that the latitude would enter at all into any calculation of roll attitude. If so, it's more than a "bug", it's a basic design mistake.
      • The F-16 does have a high roll rate, about 320 degrees per second, but since the pilot is very close to the roll axis, there's very little acceleration at the pilot's position during your basic aileron-roll. Pilots routinely apply maximum roll without dying, or even passing-out.
      • Nobody dies instantly from excess G-s... Fighter pilots overdo it all the time. Usually they let off the stick as they feel the early effects, such as a narrowing or darkening field of vision. If they keep on commanding too many G's, they'll pass out and that will let pressure off the controls, which quickly reduces the G forces. Good fail-safe system.
      • Flipping upside down will quickly send blood to the head, which is exactly what's needed to recover from too many positive G's.

    36. Re:YACCS -Yet Another Computer Corkup in Space by Flavio · · Score: 1

      Engineer/Physicist/Mathematician-type programmers may think that to write business programs you only need "general" programming skills and little specialised knowledge, but that is very wrong. Getting accounting right is difficult, getting supply chain management right is tricky. And don't even think about working at banking or insurance without some serious domain-specific specialisation.

      I once interned at a consultancy developing business software. Our product was in some ways an analogue to Siebel, and the people I worked with also gave Siebel consultancy. So even though I don't claim to have some specialization in business software, I understand what you mean.

      I think everyone would rather have software engineering people writing business software because these apps tend to get really big really fast, and are constantly under modification. Developing this sort of code without a good software model is a recipe for failure, and anyone with experience has seen this in practice. Scientific applications tend to be very specialized, and you can get stable products with good (instead of excellent) programming practice. I can't speak for all engineers/physicists/etc., but this is why I tend to associate software engineers with business code.

    37. Re:YACCS -Yet Another Computer Corkup in Space by spun · · Score: 1

      You know, I think you are the first person on /. who gets the meaning behind my handle.

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    38. Re:YACCS -Yet Another Computer Corkup in Space by spun · · Score: 1

      You have laid out the differences between the fields very clearly. CS has always seemed to be winging it compared to engineering. Given enough time, CS will become like any other engineering field.

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    39. Re:YACCS -Yet Another Computer Corkup in Space by Have+Brain+Will+Rent · · Score: 1

      But there are engineers, physicists and mathematicians with good programming (and computer science) skills, and I was referring to those. The kind of people who wouldn't know what the dragon book is and don't read RFCs, but have a solid understanding of data structures, algorithms, modelling and good programming practice.

      Well if they haven't seen the dragon book hopefully they won't ever have to design (or implement compilers/translators/interpreters for) any control languages, scripting languages or indeed anything that requires parsing. And of course without knowledge of AI let's hope they never have to implement anything like, oh say, heuristics or fuzzy logic. Of course some instruction in assembler/machine language and firmware development wouldn't hurt either.

      Add in the things you did mention and, assuming you meant to include computational theory when you said algorithms, you have something approximating a full CS degree.

      Now it is possible for say a mathematician to pick that up on his own - I've known one or two that have but they essentially stopped being practising mathematicians and became fulltime CS people. BTW this comment is coming from someone who started out as a physics student and ended up with advanced degrees in both math and CS.

      There are some areas where the typical CS majors excel. Business programming is one of those. They're also fine for a lot of the industrial code out there.

      I'm sure the CS degree holders of the world appreciate your generosity.

      But given the choice, I prefer to leave mission critical scientific code to scientists (engineers, applied mathematicians and physicists).

      These would be the same people who crashed a probe into Mars because they didn't successfully convert between metric and imperial units? Etcetera, etcetera.

      CS guys would be our world's general purpose CPUs, and engineers would be the DSPs -- it's another instance of choosing the right tool for the right job.[What happened to the scientists? oh well.]

      And the right tool for the job is someone trained for the job not someone who picked it up on the fly. Given a need for software/firmware in a mission critical science application you have three choices:

      1. Hire a science person trained in the area and hire a CS person and have them work together.
      2. Hire a science person who has somehow picked up some level of programming ability to do everything.
      3. Hire a CS person who has somehow picked up some level of science ability to do everything.

      The right answer is 1 and you have chosen one of the other two.

      --
      The tyrant will always find a pretext for his tyranny - Aesop
  9. It must have been by wumpus188 · · Score: 1

    .. a Sony battery.

  10. Windows Software? by StumpMan · · Score: 0, Offtopic

    Microsoft Validation required. Please click the Continue button to begin Windows validation.

    1. Re:Windows Software? by Anonymous Coward · · Score: 0

      What type of computer is the Pathfinder utilizing? Is the CPU from Intel or Motorola or custom made? How fast does it run and how much memory does it contain? Is there more than one computer on board? What programming language was utilized in the software?

      The computer is a Radiation Hardened IBM Risc 6000 Single Chip (Rad6000 SC) CPU. It is the same as the IBM R6000 workstation. Lockheed Martin Federal Systems in Manassas, VA, is responsible for doing the radiation hardening of the Rad6000 SC as well as developing the complete Mars Pathfinder Flight Computer (MFC).

      The MFC contains 128 MBytes of DRAM memory and runs at speeds of 2.5, 5, 10 and 20 MegaHertz. This translates to approximately 2.7, 5.5, 11, and 22 MIPS (this does vary, depending on which benchmark is being used). The code was developed using VxWorks as the real-time OS and "C" and assembly languages. It utilizes object-oriented constructs.

      On the system there is only one computer to control the spacecraft throughout all phases of the mission. The Rover has a very small CPU that it uses once we have landed and the rover is released. All communications to Earth from the spacecraft and rover come through the Rad6000 SC.

    2. Re:Windows Software? by StumpMan · · Score: 1

      Yeah, but, what are the framerates in Doom3?

  11. Dam it by VEGETA_GT · · Score: 0, Redundant

    I told you that letting a Microsoft Programmer onto the team was a bad idea.

    1. Re:Dam it by Anonymous Coward · · Score: 0

      it was an IBM'er. On an R6000.

  12. super tuesday by dcskier · · Score: 1

    they should've waited until super tuesday before issuing the patch. everyone knows not to patch out of cycle.

  13. MGS spaceship? by DrXym · · Score: 1

    Perhaps Big Boss killed it

  14. MGS? by Rob+T+Firefly · · Score: 1

    Everyone knows, it was Solid Snake that destroyed Metal Gear Solid.

    1. Re:MGS? by Bendy+Chief · · Score: 1

      Spacecraft? Spacecraft?! SPACECRAFT!!!

  15. We hardware types always blame software by Quiet_Desperation · · Score: 1

    It's just the way of the world. :)

    1. Re:We hardware types always blame software by daniel23 · · Score: 1


      reminds me of that old sig:

      The 3 most dangerous situations:

      A hardware guy with a software patch.
      A user with an idea.
      A coder with an electric iron.

      --
      605413? Yes, it's a prime.
  16. Movie re-write? by sherms · · Score: 1

    So does this mean they will have to re-write "Red Planet"? Wasn't there a scene where they used components from that machine?

    1. Re:Movie re-write? by Anonymous Coward · · Score: 0

      MGS is an orbiter. They used Pathfinder, iirc, which is a surface vehicle.

    2. Re:Movie re-write? by sherms · · Score: 1

      thanks, it's been a while. I'll have to watch it again.

  17. Pilot said.... by isieo · · Score: 2, Funny

    Houston, I B.S.O.Ded

  18. The Daily WTF by shadowcode · · Score: 1

    That'd be one hell of a submission to The Daily Wtf.

  19. Is this a sign? by Billosaur · · Score: 4, Insightful

    Some expert is always trumpeting the fact that "Johnny can't program," to which many of us roll our eyes and go back to coding. But could this be a sign that the quality of the help NASA is hiring is such that these kinds of mistakes are now rampant? I mean, this could have been avoided if the code had been tested out on a full-scale mock-up of the machine, to verify that it did what it was supposed to do, before ever sending the commands to the actual machine. If anything, it's a QA failure.

    --
    GetOuttaMySpace - The Anti-Social Network
    1. Re:Is this a sign? by Anonymous Coward · · Score: 0

      Chances are they did run it in a full-scale mock-up before sending to the spacecraft. NASA tends to be very picky about this sort of thing. The trouble is, you always have a human in the loop at some point.

    2. Re:Is this a sign? by benevixit · · Score: 5, Insightful

      In all fairness, writing code for a spacecraft is a lot harder than most of our Earthbound coding projects. These are custom-built machines running one-of-a-kind hardware; one can simulate components independently but it's very difficult to figure out how the hardware is going to behave up there in the vacuum. For example, consider the one function of maintaining orientation. Most spacecraft use telescopes that look for star reference points. They look for particular star configurations and use microthrusters or gyroscopes to adjust their orientation. Imagine what it would take to simulate this: a zero-gravity vacuum with a realistic star-field at focus=infinity. Any laboratory mock up is going to cost a lot more than launching a new spacecraft. And that's just one subsystem. Software upgrades at NASA go through a really rigorous quality control regimen, often requiring programmers to justify _individual_lines_ of their code to a review committee. Even then they usually won't patch noncritical bugs until the primary mission is completed. I think your point is a good one. And the key lesson is not that NASA QA sucks, it's that programming for spacecraft is _tough_. I know they are constantly investigating new ways (like more standardization, code re-use, and formal verification procedures) of improving software reliability.

    3. Re:Is this a sign? by benevixit · · Score: 1

      Yeah, some line breaks would have been welcome. Should have tested my post with a mock-up before submitting I guess.

    4. Re:Is this a sign? by Zontar_Thing_From_Ve · · Score: 1

      Some expert is always trumpeting the fact that "Johnny can't program," to which many of us roll our eyes and go back to coding. But could this be a sign that the quality of the help NASA is hiring is such that these kinds of mistakes are now rampant? I mean, this could have been avoided if the code had been tested out on a full-scale mock-up of the machine, to verify that it did what it was supposed to do, before ever sending the commands to the actual machine. If anything, it's a QA failure.

      I used to work for the US government on a job I thankfully left a long time ago. I can't speak for NASA in particular as I worked for the Department of Defense. Keep in mind that things might be different at NASA. Typically, working for Uncle Sam is not as lucrative as working in private industry. There are compensating benefits though. It's just about impossible to get fired. Uncle Sam gives better vacation benefits than most American employers. Early retirement is a very realistic opportunity when working for Uncle Sam. At least where I worked, we tended to attract people who wanted to live in a small town (government salaries go further there) and people who were not very motivated for the most part. You get what you pay for. We would get the guys who graduated at the very bottom of their engineering classes because the guys above them wouldn't work for government salaries.

      To be fair, NASA has cut an awful lot of corners in recent years and had some really bad management make a lot of really bad decisions. I'm still unconvinced that NASA management knows what it is doing. When I worked for the DoD, QA was a joke. It was up to the programmers to test their own code. QA is significantly better in private industry than when I worked for the DoD. It could also be that the programmer's code did exactly what he wanted it to do, but he misunderstood what he was supposed to do.

    5. Re:Is this a sign? by Anonymous Coward · · Score: 0

      Rampant? Clearly you and I have a different definition of rampant. 1 shutdown in 10 years? How many personal computers can claim that...with software that's tested on hundreds or thousands of machines before being released and running in production on millions (*cough... XP Service Pack 1*)?

      And how about testing on a full scale mockup? The MGS cost somewhere around $200 million. Building a second one would cost less because that figure includes a lot of design work, but it's not like scraping together a $500 test PC. It also still won't be in orbit, receiving real inputs.

      Yes it's largely a Q/A failure, but you can bet your rear end that it was hardly obvious. They do test these things on simulators, but nobody can dream up every test case. It took 4 months from the time of the software update for the problem to manifest itself, and even after the failure it's taken over a month to find the likely culprit. Even then, the bug wasn't directly fatal, but left the spacecraft in an orientation where its radiator was broadside to the sun, gaining heat instead of dissipating it. So even running additional, lengthy, expensive, full-scale tests of the software update might not have brought the bug to light.

      Of course, all this assumes that it actually was the wrong memory address being overwritten that caused the failure. They still haven't ruled out a failure of the 10 year old solar panel motors or gyros or meteoroid damage.

    6. Re:Is this a sign? by ivan256 · · Score: 1

      You also need to take into account that the error could be something like failing to deal with a broken component. They may not have known something was broken, and things don't always break in a predictable way. It's still a software bug for not properly handling an error condition, but some error conditions are unlikely to be predicted.

    7. Re:Is this a sign? by Spikeles · · Score: 1
      Some expert is always trumpeting the fact that "Johnny can't program,"
      His name isn't "Johnny", it's N. Pence - Flight Software http://mars.jpl.nasa.gov/mgs/people/
      --
      I don't need to test my programs.. I have an error correcting modem.
    8. Re:Is this a sign? by AmberBlackCat · · Score: 1

      The article summary makes it seem like the device failed because of one part moving too close to another part. That seems like something that could be tested reliably in a simulation, to me. But I personally think it's more of an engineering failure, for producing the device in a way that makes it even possible for one of its parts to hit its own self-destruct button.
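
      For what a simulated check of that kind might look like, here is a minimal sketch: a software "soft limit" that keeps a commanded array angle away from the mechanical hard stop, plus a sanity check that the soft limits really sit inside the hard stops. All angles, margins and function names are invented for illustration and are not MGS values.

```python
def clamp_array_angle(commanded_deg, soft_min_deg, soft_max_deg):
    """Limit a commanded angle to the software travel range."""
    if soft_min_deg > soft_max_deg:
        raise ValueError("soft limits are inverted")
    return max(soft_min_deg, min(commanded_deg, soft_max_deg))


def soft_limits_are_safe(soft_min_deg, soft_max_deg,
                         hard_min_deg, hard_max_deg, margin_deg):
    """True if both soft limits sit at least margin_deg inside the hard stops."""
    return (soft_min_deg - hard_min_deg >= margin_deg
            and hard_max_deg - soft_max_deg >= margin_deg)


# Hypothetical numbers: hard stops at +/-180 degrees, 5 degree safety margin.
clamped = clamp_array_angle(200.0, -170.0, 170.0)
good_table = soft_limits_are_safe(-170.0, 170.0, -180.0, 180.0, 5.0)
bad_table = soft_limits_are_safe(-179.0, 179.0, -180.0, 180.0, 5.0)
```

      A check like this only helps if the limit values it reads are themselves correct, which is where a bad parameter upload could still defeat it.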

  20. Should have used Gentoo!! by Marcion · · Score: 1

    The updates would have been added in a sandbox and then only moved to the main system if they passed all the tests.

    1. Re:Should have used Gentoo!! by zootm · · Score: 4, Insightful

      No sandbox can avoid the fact that one test was missing.

    2. Re:Should have used Gentoo!! by bhsurfer · · Score: 1

      Man, I'd feel really super important if I wrote a bug that destructive! I feel so inadequate... I need a hug.

      --
      Those are my principles, and if you don't like them... well, I have others.
      Groucho Marx
    3. Re:Should have used Gentoo!! by zootm · · Score: 1

      What you need to do is hold back on producing all those "fun" bugs that we all introduce into systems until you've the reputation as one of the best coders in the world, then go work for NASA and just go wild on some system that won't be used until it's in deep space and you're off working for Google, having destroyed the paper trail.

    4. Re:Should have used Gentoo!! by the_tsi · · Score: 1

      ...But if they installed the update on a gentoo sandbox before installing it on the MGS itself, it wouldn't be compiled for EXACTLY that machine, and as we all know, it's the precise compiling that results in gentoo's 20% performance increase (that and funrolling loops and putting flashy stripes on the computer, along with maybe a 8" exhaust).

    5. Re:Should have used Gentoo!! by Hatta · · Score: 1

      Isn't Mars one big sandbox?

      --
      Give me Classic Slashdot or give me death!
    6. Re:Should have used Gentoo!! by Anonymous Coward · · Score: 0

      You're the second gentoo supporter that I've seen here - you guys are like ants swarming around here...

    7. Re:Should have used Gentoo!! by bhsurfer · · Score: 1
      That's it! Great idea! I'm hiring you as my personal manager.

      [rubs hands together in childlike glee, picturing large & spectacular catastrophes to come]

      --
      Those are my principles, and if you don't like them... well, I have others.
      Groucho Marx
    8. Re:Should have used Gentoo!! by lysergic.acid · · Score: 1

      As far as I'm concerned, the largest space disaster to date is still Leprechaun 4.

  21. Better than a metric-English conversion error by ccmay · · Score: 3, Insightful
    I guess those things happen. But at least it wasn't an error converting units, like the other Mars spacecraft that was lost. That is just incredibly stupid. Glad I'm not the "engineer" who wasted thousands of man-years and hundreds of millions of taxpayers' dollars because I was too stupid or lazy to convert between meters and feet.

    On a positive note, it has provided me an instructive example for when I help my teenagers with their math homework. If they say it's "almost" correct, I tell them that the guy who screwed up the Mars mission probably said the same thing.

    -ccm

    --
    Too much Law; not enough Order.
    1. Re:Better than a metric-English conversion error by Anonymous Coward · · Score: 0

      Stupidity is in the assumptions you've made.

      The conversion issue is down to similar assumptions - the engineer who wrote the height module did it to return feet, and we can assume (eek!) that it was perfect in every way, verifiably correct at all times.

      The engineer who used that module assumed that it returned values in metres, and his code could be proved to be correct in every way.

      No-one needed to convert between the units, they needed to *know* the values returned would be in different units. Unfortunately, each of the teams that wrote their modules always used a particular unit, just not the same ones as each other.

      an example: If your teenagers give you maths homework, that they did using a different base, and you mark it as wrong, it would be you that was incorrect because (you are, what was it, "too stupid or lazy") to convert to the base you assume everyone uses.
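
      The mismatch described above can be sketched in a few lines, following the feet/metres framing of this comment. Everything here is hypothetical (the Length type, height_module, descent_is_safe and the numbers): the idea is just that a value which carries its unit cannot be silently misread by the caller.

```python
from dataclasses import dataclass

METRES_PER_FOOT = 0.3048


@dataclass(frozen=True)
class Length:
    """A length stored internally in metres, whatever unit it was created from."""
    metres: float

    @classmethod
    def from_feet(cls, feet):
        return cls(feet * METRES_PER_FOOT)

    @property
    def feet(self):
        return self.metres / METRES_PER_FOOT


def height_module():
    """Hypothetical module written by the team that thinks in feet."""
    return Length.from_feet(1000.0)


def descent_is_safe(height, minimum_metres=500.0):
    """Hypothetical consumer written by the team that thinks in metres."""
    return height.metres >= minimum_metres


# Passing a bare number: 1000 "somethings" looks comfortably above 500.
naive_check = 1000.0 >= 500.0
# Passing a Length: 1000 ft is only 304.8 m, so the check fails as it should.
typed_check = descent_is_safe(height_module())
```

      Both modules stay "correct in every way" on their own; the type just forces the assumption about units to be written down at the boundary.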

    2. Re:Better than a metric-English conversion error by kfg · · Score: 2, Insightful

      If you wish them to grow up to be good little engineers; ask them to define how "almost" correct it is.

      KFG

    3. Re:Better than a metric-English conversion error by iamlucky13 · · Score: 4, Informative

      It wasn't one engineer. It was a team effort. And it wasn't a very simple matter of "forgetting". Several factors combined, including re-use of code from the MGS mission (a conversion factor was in the old code, but not recognized when the code was adapted for the doomed MCO) and budget constraints that limited pre-flight testing (so the bug was missed... and in fact might have still been missed even with more testing). The effects of the bug were also subtle enough that 3 minor main engine firings were conducted without enough error showing up to reveal the problem. It wasn't until the long orbital insertion firing that the error in the trajectory became noticeable, and by then it was too late. The team's first clue something was wrong was when the spacecraft didn't radio home after the engine burn.

      The details are really convoluted, but the Wikipedia page on the mission has a decent write up explaining how the mistake was made, with additional resources cited. The PDF paper giving a perspective from the MCO team is particularly revealing, if you've got some time on your hands.

  22. Legend has it by BillGatesLoveChild · · Score: 1
    > Apparently the software error caused ... overheated the battery and destroyed it.

    Legend has it at Microsoft that if you introduce a bug that breaks the nightly build you have a stupid mascot that perches on your desk the next day. Wonder what the other NASA programmers will do to this guy?

  23. Auto Update... by MetaKey · · Score: 1
    And that is why you shut off that damn auto update thing on your PC.

    Pathetic Earthlings...

  24. So what if the battery is dead? by Viol8 · · Score: 1

    Surely it can still function on its solar arrays when it's on the daylight side of the planet? Or would it drift too much out of alignment when in the dark? Or is there some other issue?

    1. Re:So what if the battery is dead? by iggymanz · · Score: 0, Flamebait

      You're our hero, as a slashdotter you've transcended to the next higher level. Not only did you not RTFA, you didn't even read the summary, that little bit about the heat radiators being pointed at the sun so the craft was cooked. I'm hoping you don't read the last part of the last sentence, maybe you'll achieve slash-enlightenment and be our Slashbuddha.

    2. Re:So what if the battery is dead? by Viol8 · · Score: 1

      Actually it clearly states it cooked the battery not the whole craft. So why don't you go RTFA instead of attempting some lame karma whoring you cretin.

    3. Re:So what if the battery is dead? by smoker2 · · Score: 2, Insightful

      I expect the electronics run off the battery, and the solar just charges the battery. If the battery's dead, nothing will run.

    4. Re:So what if the battery is dead? by Beryllium+Sphere(tm) · · Score: 1

      The battery's failure mode matters. If it has an internal short, nothing will help.

    5. Re:So what if the battery is dead? by iggymanz · · Score: 1

      my definition of cooked is having vital innards destroyed so machine can't function. not entirely unlike your head.

  25. Obligatory by 8ball629 · · Score: 0, Troll

    Sounds like a Microsoft OS update to me.

  26. zing! by steak · · Score: 2, Funny

    that was the sound of me hitting the bullseye.

    [quote]at least if something went wrong some guy at nasa could tell his grand kids that he bricked something from ~140 million miles away.[/quote]

    http://slashdot.org/comments.pl?sid=214508&cid=17427542

  27. Where's K'Breel? by Amazing+Quantum+Man · · Score: 2, Insightful

    We need his report! Tripmaster Monkey, where are you?

    --
    Fascism starts when the efficiency of the government becomes more important than the rights of the people.
  28. Reliability compared to what? by Vellmont · · Score: 1


    Just one more example of how Computer Science isn't quite up to the reliability requirements of Space

    And how many failures have happened because of an engineering mistake?

    You seem to assume that there's zero failure in space for everything else, and 6 problems in.. 30 years? is some horrible record.

    All information only makes sense in context. What's the failure rate of other components of the system?

    --
    AccountKiller
  29. Bits by michaelmalak · · Score: 1
    Maybe NASA's 'safe mode' just put 'safe mode' in the corners of all the returned images and did them in 8-bit colour...
    I think you meant to say 4-bit color.
  30. Rocket Science by Anonymous Coward · · Score: 0

    I'm glad to hear that rocket scientists make mistakes also.

  31. Microsoft Rules/Ruins !! by mgpandey · · Score: 0, Redundant

    Might be they upgraded it to Vista !!!

  32. Time for a recall of bad parts by Fry-kun · · Score: 1

    Does anyone else think it's about time to make a small satellite with a few "claws" to fly around our existing satellites and replace their various parts?
    It could probably do repairs to the ISS as well (spacewalks should be for fun, not for work).

    --
    Did you know that "FTW" ("for the win") is a direct translation of "Sieg Heil"?
  33. Safe Mode by cadeon · · Score: 1
    . . . which then forced the spacecraft into safe mode.

    We all know a machine Safe Mode doesn't allow remote management.

  34. Vampire Hackers by Doc+Ruby · · Score: 1

    No, everyone knows it's the Martian vampires. That SW glitch pointed the solar collectors at the Martian surface, overpowering the thin layer of blood that protects the biters from the weak rays of the Sun. We need to find out how the vampires reached the MGS to destroy it. Probably they have moles at NASA or a contractor with access to the controllers. We have to fund deployment of my SOLASER Space Debt Inc (SDI) weapon to fry them before they fry us.

    --
    make install -not war

  35. Yeah right by Anonymous Coward · · Score: 0

    Face it, they bricked it in a firmware update while trying to circumvent the built in DRM and they are trying to blame the software manufacturer.

    I can see the ebay item now "10 y/o excellent condition spacecraft, dead battery quick fix, bricked"

  36. An easy fix... by Autonomous+Crowhard · · Score: 1
    Just have a nearby human replace the dead battery and restart the machine.

    Oh... right... manned exploration is a waste of money and robots are all we ever need.

  37. Re:"almost" correct by Migraineman · · Score: 1

    Funny, I have this conversation with my wife all the time. She's an elementary school teacher, and we regularly butt heads about how to deal with this. She's willing to grade a math problem as "correct" if the student demonstrated the correct process, but made a simple clerical error resulting in the wrong answer. She argues that the method is more important than a single result. Uh huhhh. So if I botch the balance in my checkbook, the bank will pat me on the head, say "that's okay," and front me the money I shouldn't have? I think not.

    There aren't many "absolute truths" in this existence, but math is one of them. Your calculations are either "correct" or "not correct." "Almost correct" is someone being spineless. I'd much rather know that I botched a calculation so I can perform it correctly the next time, rather than exist in blissful ignorance. Telling me that I'm stooopid is a personal attack; telling me my calculation is incorrect is a statement of fact. Folks need to learn that the latter statement isn't necessarily a bad thing. You learn by making mistakes.

  38. Lack of QA strikes Nasa AGAIN. by Banner · · Score: 1

    I tell you, you see all these ridiculous failures at NASA, it's pretty obvious that they either don't do QA, or that the QA teams are literally hamstrung. These things are the stuff that good QA and Test programs find; making people check bolts on a tilt table before ruining a 50 million dollar satellite is what process and checklists are all about.

    These aren't 'normal workplace errors' that you have to live with, they're -stupid- errors, made because of stupid managers.

    1. Re:Lack of QA strikes Nasa AGAIN. by Anonymous Coward · · Score: 0

      Wow. You really are just asking for it, aren't you?

      1.) The spacecraft had already exceeded its mission life by 3 or 4 times. At that point in a project, it's probably fair to get more ambitious with the new things one tries to achieve through software updates, and spend less time and money on QA with marginal returns. They don't even know for sure that it was the software bug. Given its age, it could very easily have been a hardware failure, such as the attitude control gyros that have been spinning continuously at high RPMs for 10 years...

      2.) You'd better believe it was tested. NASA generally only gets one try at a given mission. But nobody can think of every case, and no simulator can duplicate every possible failure, much less minor issues that lead indirectly to a failure, which this is, assuming the software was the root cause. Over the course of four months, it seems the radiator may have ended up pointing in the wrong direction.

      3.) Yes QA is hamstrung. It's always hamstrung if you have a limited budget, and you always have a limited budget. Even if this weren't the case, point number 2 still applies. QA is also affected by the lack of consistency in NASA's product. The same checks applied to a hypersonic test jet don't necessarily apply to a wheeled surface rover or to a solar powered orbiter.

      4.) It was good of you to point out that these aren't everyday errors. Specifically, they're once in a project (10 years in this case) errors. They just happen to have serious consequences. Of course, no software project outside of NASA has ever had bugs with an update (certainly not Windows), and no program, especially not one with an open beta participated in by hundreds of thousands of users such as Firefox, has ever been released with undiscovered security holes or even known memory leaks.

      5.) All this said, people will still make mistakes, even with processes and checklists in place. Sometimes it's bad judgement, like Challenger, sometimes it's a careless oversight, like tightening bolts on a tilt table, sometimes it's simple clumsiness, like breaking a worklight bulb in Discovery's cargo bay during launch prep. For more examples of mistakes by professionals, watch a baseball game. Millions of dollars are at stake there. Or watch the paper for medical lawsuits. Lives may have been lost there. Mistakes are bad. They're usually preventable. They're also inevitable any time people are involved.

    2. Re:Lack of QA strikes Nasa AGAIN. by JhohannaVH · · Score: 1

      And this is different from any other US Corporation how?

      Never forget, what we use, abuse, and refuse, all is created by human hands. Which by design are imperfect.

      --
      Sorry man... the Internet pooped on me.
    3. Re:Lack of QA strikes Nasa AGAIN. by Banner · · Score: 1

      Ah yes, good old rationalization. You're a non-technical manager, aren't you? By your logic then when I worked on those strategic weapon systems years back, if I made a mistake and a lot of people died, well, who cares? They outlived their original planned lifetimes anyway.

      Other industries do better than this, stop making excuses for poor management at NASA.

    4. Re:Lack of QA strikes Nasa AGAIN. by Anonymous Coward · · Score: 0

      Columbia was a big deal. People died, and in hindsight, it definitely was preventable. There's no denying that, and I certainly never said "who cares," or implied that astronauts in their 30's or 40's had "outlived their planned lifetimes."

      We're talking about a probe here. A successful probe. No one died and no promised returns were compromised (it exceeded the promises). I admit part of my response was knee-jerk, because I get sick of people holding NASA up like an example of all things wrong in the world, when the critic isn't even really sure what happened in the first place. NASA is far from perfect, it's true, but it also has a lot of very capable engineers and scientists who put together fantastic and very ambitious missions that succeed. Frankly, I don't know what much of this criticism is expected to achieve, since it's seldom constructive. Ok, yes, improving QA is a reasonable suggestion, but there is also a point where the cost of further QA exceeds the potential benefits, and you seemed to assume QA was a mere formality for NASA.

      The fact is, no other entity has shown an ability to perform NASA's job better than them. You could point to successes in the software and auto industry, but neither of them go into space, and if one looks it's easy to find thousands of instances of mistakes, bad decisions, and compromises in their products, too. For those that do go into space; the ESA, the Russians, JAXA, etc; all have experienced as many or more problems as NASA. Read about Hayabusa or Beagle or the history of Mars exploration if you don't believe me.

      And no, I'm not a non-techie manager. I'm an engineer. I know the sorts of challenges a lot of projects face...even defense projects. The sort of things that lead to nuclear missiles failing to launch, SAM's missing their targets because of software bugs, V-22's crashing due to underwing vortices, and M-16's jamming after relatively minor mud exposure.

  39. one serious error in ten years rampant? by peter303 · · Score: 1

    My PC doesn't last ten days before crashing.

  40. Luxury! by avronius · · Score: 4, Funny

    We used to live in a vacuum tube. When the computer was running, and your bit was accessed, you almost had enough light to read by. Mother would disconnect the tube when she went to bed, causing floating point errors for almost eight clock-cycles...

    Or at least, that's how I remember it...

  41. We need Computer Engineering, not Scientists. by Banner · · Score: 1

    What is really needed is to get RID of Computer SCIENCE and move it over to the Engineering department and give us Computer ENGINEERING. Scientists don't build stuff, they investigate things, they don't -care- about better ways to build things, better ways to avoid mistakes, it's not their job. Engineers however are all about building the same damn bridge 100 times and making it better and safer each time.

    There is no discipline in Computer Programming these days, because Computer Programmers don't know how to engineer stuff. The simplest program is done differently by every programmer where if engineers were doing it they'd all be taught to do it the exact same way. Standardization is how you get rid of most errors. You'll notice that nobody is making new bolts or nails anymore, they're all standardized.

    1. Re:We need Computer Engineering, not Scientists. by spun · · Score: 1

      Don't get rid of computer science. Theorists are needed in all fields. Breakthrough advances rarely come from the applied sciences. But I really do wish we CS types could take some lessons from the engineering guys, who have a much longer history.

      One reason I love OSS is that its goal is to standardize and reuse code. I hate reinventing the wheel over and over again just because of some dumb non-disclosure clause.

      I think part of the problem is that computer programming takes a special blend of language, mathematical and logical ability. The type of person who is good at all three and thus a great programmer is different from the type of person who is drawn to engineering.

      Now it sounds like I'm dissing engineers but I don't mean to disrespect them or IT people. What I mean is that engineering requires a certain kind of very methodical personality. Computer science requires more of an ability to think outside the box, to see problems in new and unexpected ways.

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
    2. Re:We need Computer Engineering, not Scientists. by radish · · Score: 1

      What is really needed is to get RID of Computer SCIENCE and move it over to the Engineering department and give us Computer ENGINEERING

      And in a lot of the top institutions that's exactly what has happened. My degree is an engineering one, not a science.

      The simplest program is done differently by every programmer where if engineers were doing it they'd all be taught to do it the exact same way.

      If you re-write the exact same code over and over you're an idiot. The problem is that (at an application level) nothing is ever quite the same, so you can't just reuse 100% of an existing solution, there are always tweaks needed. It's actually the same with engineering - taking the mythical "identical bridge" example it's obvious that different bridges have different dimensions, foundations, environmental constraints, etc. So the engineer creates something new by assembling standard parts together and tweaking where needed.

      Standardization is how you get rid of most errors. You'll notice that nobody is making new bolts or nails anymore, they're all standardized.

      Agreed, which is why we have standard components. No-one should be writing sorts, or collections, or string manipulation routines from scratch, you should use an accepted standard. Good developers take that a step further and where possible use standard versions of much larger components, such as XML parsers, or application servers, or what have you. An engineer's bag of nails is my Java Standard Class Library.

      The real difference between the engineer building a suspension bridge and a large scale software project is one of complexity and resourcing. Software developers are building much more complex systems in a much shorter period of time and with much less in the way of resources. That's why corners are cut, because our clients demand it.

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

    3. Re:We need Computer Engineering, not Scientists. by alienmole · · Score: 2, Insightful

      We're never going to improve as long as people insist on comparing software development to building bridges, i.e. a more sophisticated understanding of the problem is needed. In software, once you have a program for a bridge you can make a billion bridges, all alike or customized by certain parameters, just by running the program. So being "able to build the same damn bridge 100 times" doesn't get you anywhere. Making it better and safer each time? That's another story, and once again, the comparison to bridge building doesn't hold up, because you're talking about improving the design, not the building practices or materials.

      If there was any merit in this canard, don't you think that before now, you'd have had some engineers who also knew software come along and revolutionize the software industry?

      Standardization is how you get rid of most errors. You'll notice that nobody is making new bolts or nails anymore, they're all standardized.

      You haven't written a line of code in your life, have you? If you have, tell me what level of standardization you're even talking about, in the software context.

    4. Re:We need Computer Engineering, not Scientists. by Banner · · Score: 1
      The real difference between the engineer building a suspension bridge and a large scale software project is one of complexity and resourcing. Software developers are building much more complex systems in a much shorter period of time and with much less in the way of resources. That's why corners are cut, because our clients demand it.


      Building said bridge is a lot more difficult than coding anything you can think of. Because you can always rewrite your code and patch it. You only get -one- chance to build your bridge. People are much more lackadaisical about writing code because they can make changes whenever they want, and even then most coders rarely ever test any of their stuff. It's the old joke of the Engineer and the Programmer riding in a car all over again.
    5. Re:We need Computer Engineering, not Scientists. by Banner · · Score: 1

      I've been coding since 1970, I think I've probably written more than my fair share of code. I've also designed circuits and other hardware devices (I started in Robotics).

      The problem is something like 80 percent of all software programmers have never been exposed to any kind of engineering course in their lives, know nothing about how to design or build stuff, and probably 50 percent of them don't even have a degree in a related field! (I work at a Fortune 500 company and I think only 20 percent of the programmers here have college degrees in CS, and this is pretty much the standard in Silicon Valley).

      My comparison is apt because whether you are building a bridge, a circuit board, a chip, or anything, the same rules apply. You design it first, you look at your design, you test your design, then you build it, then you check your finished product. Unlike 99 percent of programmers out there who just start writing code without a clear design in their head. Any monkey can write code, writing code is -simple-. It's the design of the program and the system that is the hard part, and that design should be -completely- fleshed out before the first line of code is even generated.

      I've worked on tactical nuclear bombers. Haven't seen us accidentally NUKE anyone in the last twenty years have you? No I know a lot about writing code, because I'm an expert in the field. People put up with crappy code and crappy programmers because they don't know any better. But as in all fields this will change with time, once the people keeping the books realize you do in fact get what you pay for.

    6. Re:We need Computer Engineering, not Scientists. by Anonymous Coward · · Score: 0

      Reminds me of an intro to a book which attempted to debunk the comparison between engineering and computer science... "Building software is exactly like building a bridge... To a boat." You laughed because it's true. It's not about programmer discipline at all, it's about perception.

      To an executive, a computer program looks exactly like the TPS report sitting on his desk. Something that can be easily fixed with a little bit of white out and yelling at his subordinates. They see resistance to their ideas as laziness or stupidity on the part of the developer (not that that doesn't exist...) rather than as an attempt to create robust, secure code. Executives make decisions about how software is to be developed all the time. How often do you see an executive say "Replace all this schedule 60 pipe with schedule 40. I don't like how the schedule 60 looks. What? I don't care that you've installed all of the fittings already, just re-do it!"

    7. Re:We need Computer Engineering, not Scientists. by Pastis · · Score: 1

      > OSS is that its goal is to standardize and reuse code.

      You can make it your goal to reuse and standardize on OSS code, but there's no such goal with OSS.

      Actually OSS is the software discipline that encourages the most forking, which is almost the opposite of reuse. It's quite common to find a piece of software under an OSS license that has been reused then forked in various projects.

    8. Re:We need Computer Engineering, not Scientists. by spun · · Score: 1

      Forking is reuse. You're reusing the code you forked, right?

      --
      - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  42. Re:"almost" correct by kfg · · Score: 2, Insightful

    So if I botch the balance in my checkbook, the bank will pat me on the head. . .

    Why should the bank even care? I don't even remember the last time I balanced my checkbook.

    "Almost correct" is someone being spineless.

    I just measured the height of a tree with a meter long chunk of 2x4 and a bubble protractor. I get a figure of 10 meters. How many feet is that? 32.808399 is not the right answer. Using it is likely to result in your shell missing the top of the tree. 30 is the right answer. Why?

    Neither you nor your wife is correct, or incorrect either. Define what "correct" means and define the degree of incorrectness and precisely why it is incorrect.

    Arithmetic is exact, the things you use it to model often are not. Modeling states and calculation of figures are two separate acts and skills. They both need to be taught and understood.
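
    A small sketch of the tree example, assuming the 10 meter figure carries only one significant figure (which is what the "30" answer implies). The helper name `round_to_sig_figs` is made up for illustration.

```python
import math

FEET_PER_METER = 1 / 0.3048  # exact: one foot is defined as 0.3048 m


def round_to_sig_figs(value: float, sig_figs: int) -> float:
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 1 for 32.8, 3 for 1234.5
    magnitude = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - magnitude)


height_m = 10  # crude measurement: assumed good to one significant figure
height_ft = height_m * FEET_PER_METER

print(height_ft)                        # roughly 32.8084, false precision
print(round_to_sig_figs(height_ft, 1))  # 30.0
```

    The conversion itself is exact arithmetic; the rounding step is where the quality of the measurement gets carried through to the result.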

    Telling me that I'm stooopid is a personal attack; telling me my calculation is incorrect is a statement of fact. Folks need to learn that the latter statement isn't necessarily a bad thing.

    Here I am with you 100%.

    KFG

  43. MGS was currently a low priority for NASA by jespley · · Score: 2, Interesting

    I'm a scientist that works with the MGS data so I don't know the engineering side well. However, I do know that last year NASA was strongly considering dropping all support for MGS in order to spend the limited Mars program money on newer missions (the idea being that we had gotten 90% of the useful science from MGS). Instead they decided to keep MGS funded with a bare minimum of money and hence a bare minimum number of personnel. I imagine that the poor overworked engineers running the operational show at JPL just didn't have the time to doublecheck everything as they would in an ideal world. As their end user, I'm just grateful for all the work they did over the years to keep the thing running.

  44. I had a class like that. by Nanoda · · Score: 1

    The name of it escapes me right now, but I did take a class where we reviewed certain classic software failures. (A good class for me since I'd already read about them :).

    If you'd like to read a few, check out:
    Therac-25 (Race conditions, software lockouts in lieu of hardware)
    London Ambulance Service (Poor software design and design process)
    Ariane 5 (Cutbacks on testing procedures, inappropriate software re-use, variable overflows, flight hardware allowed to generate error output)

    then there's the Denver airport baggage system, the Mars Climate Orbiter, etc.

    In general, you may want to read the Risks Digest, where stuff like this happens every month!
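
    To make the "variable overflows" item concrete, here is a rough sketch of a narrowing conversion from a floating point value to a signed 16-bit integer. It is a generic illustration, not the Ariane 5 flight code, and both function names are made up. One version wraps silently, the other refuses values that do not fit.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Mimic a narrowing conversion that silently wraps around."""
    return (int(value) + 32768) % 65536 - 32768


def to_int16_checked(value: float) -> int:
    """Refuse to convert values that do not fit in 16 bits."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return result


print(to_int16_unchecked(40000.0))  # -25536: plausible-looking garbage
print(to_int16_checked(1234.9))     # 1234
```

    Neither behaviour is safe by itself: the checked version only helps if something sensible handles the error it raises.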

  45. Milestone by Hugonz · · Score: 1

    So now we can milestone the first paperweight in space...

  46. MGS Spacecraft? by Anonymous Coward · · Score: 0

    Snake? SNNNAAAAAAAAAAAAKE!

    Bum bum ba ba-dum, BUM DA DUM!
    . //GAME OVER//
    Continue?

  47. Emulation? Testing? by SanityInAnarchy · · Score: 1

    Subject says it all... When developing for, say, the OLPC, or handheld computers (or PDAs, or smartphones, hell, even the iPhone), you either actually run everything on the device before shipping it to consumers, or (more likely) you emulate the embedded device on your desktop, so you can dig into the guts of it with a debugger, and then you test it on the device anyway.

    Why is it that the iPod, hell, even my Java phone is more reliable than these aerospace things?

    --
    Don't thank God, thank a doctor!
  48. Nope. by Anonymous Coward · · Score: 2, Informative

    Additionally, since the computer "flip" happened instantaneously, and the f-16 can roll at much higher G forces than the pilot can take, the flip would have killed the pilot

    A single, half-roll to inverted in the Falcon wouldn't have exerted enough Gs on the pilot to do anything worse than to exclaim WTF!, and disengage the a/p. A roll in and of itself in an aircraft doesn't really induce much Gs.... a "bank-and-yank" turn does, and that's what the F16 can do at higher Gs than the pilot can take... not the roll.

  49. Metric by Anonymous Coward · · Score: 0

    I knew I forgot a
        double distanceInMeters = feetToMeters(distanceInFeet);

  50. Re:"almost" correct by zippthorne · · Score: 1

    Which is why you don't necessarily assign points at the per-question level of granularity. If she gave partial credit for partially correct problems her students would still feel the burn of missing part of the problem (the actually doing the math part correctly).

    And before you say it's all wrong if part of it's wrong, think about applying that standard to the entire assignment and you'll realize how specious it is.

    As a lab instructor, I've even had to mark things wrong which have the answer correct: there are many wrong ways of setting up calculations that happen to arrive at the "correct" answer. In my case it usually involved carelessness with units. It wasn't a very high-level lab.

    Give credit where credit is due, no more and no less. The level of granularity should depend on how much time you want to spend grading it and how important the assignment is among other things.

    --
    Can you be Even More Awesome?!
  51. Stupid Errors? by WK1 · · Score: 0
    These aren't 'normal workplace errors' that you have to live with, they're -stupid- errors, made because of stupid managers.

    What's the difference?

  52. Name The Next Meteor NASA..... by IHC+Navistar · · Score: 1

    NASA used to occupy the technological equivalent of the "top seat of the totem pole", way up in space. But recently, half-assed engineers, lazy technicians, bureaucratic posturing, and elitism have turned the star in the sky into a brilliant meteor crashing toward Earth.

    NASA used to be at the forefront of technological innovation and development. A demise this rapid is only explainable by the aforementioned reasons.

    The solution:

    Fire EVERYONE, REGARDLESS OF SENIORITY, and hire people who care more about technological innovation and development and national pride rather than egocentric self-glorification. When you work for a company like NASA, one that is SUPPOSED to be essentially a publicly beneficial scientific/technological R&D entity, you should be putting science and technology, and most importantly knowledge and national achievement (including global achievement) before personal glory. Just as a police officer or firefighter puts those that they are protecting at a much higher priority than his own safety.

    NASA has been corrupted and polluted by 'scientists' and technicians who are in the business for personal gain, rather than technological gain.

    --
    Knowing Google's lust for data collection, the Soviet Union is still alive and well inside the psyche of Sergey Brin....
  53. This is one of the joys of embedded systems by EmbeddedJanitor · · Score: 1
    In a desktop app you can (generally) hit some sort of assert or exception or whatever and halt the software. The user might get annoyed but nobody gets killed etc.

    In a realtime control system, a fault is a system failure. If there is no backup/recovery procedure then there is no such thing as a "safe mode".

    --
    Engineering is the art of compromise.
  54. O_O by Vacardo · · Score: 2, Funny

    Well, that tops my list on "Worst Times to Get the Blue Screen of Death".

  55. Re:An easy fix... Not. by JayBat · · Score: 1
    You could launch 100 MGS replacements for what a single manned mission would cost.

    There might be good reasons for a manned spaceflight, but popping into Mars orbit to do repairs ain't one of 'em.

    -Jay-

  56. SW isn't the most basic element of failsafe design by vitya404 · · Score: 1

    I think, this time, the hardware failed. If you can actually drive something permanently against a mechanical stop, it's not well designed. Or, if it provides an interrupt that actually switches to degraded mode (and not failsafe: that means nothing can go wrong), then that is a system design problem.
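
    As a rough illustration of one common layer of defence, here is a sketch of clamping a commanded angle to software limits set short of the hard stops. The function name and the limit values are invented for the example and have nothing to do with the real MGS hardware.

```python
def clamp_gimbal_command(angle_deg: float,
                         soft_min_deg: float = -170.0,
                         soft_max_deg: float = 170.0) -> float:
    """Keep a commanded angle inside software limits (hypothetical values)."""
    # Anything beyond a soft limit is pulled back to that limit,
    # so the drive never pushes against the mechanical stop.
    return max(soft_min_deg, min(soft_max_deg, angle_deg))


print(clamp_gimbal_command(45.0))   # 45.0
print(clamp_gimbal_command(200.0))  # 170.0
```

    A clamp like this is only as good as the limit values loaded into it, so it complements a hardware interlock and does not replace one.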

  57. Engineering competence by Have+Brain+Will+Rent · · Score: 1

    In Canada all traditional engineers wear an iron ring - can't remember which finger. The story I was told was that in the very early 20th century a bridge was being built in Quebec and, long story short, due to engineering errors it collapsed killing many people. The townspeople took some of the metal from the collapsed bridge and made rings out of it for the engineers responsible. After that it was adopted as a tradition that all engineering grads get iron rings to remind them of the responsibility they carry.

    --
    The tyrant will always find a pretext for his tyranny - Aesop
    1. Re:Engineering competence by innocent_white_lamb · · Score: 1
      --
      If you're a zombie and you know it, bite your friend!
    2. Re:Engineering competence by Have+Brain+Will+Rent · · Score: 1

      The website you linked to is a somewhat different story than what I was told years ago (by someone who had already been a P.Eng. for decades) but is not necessarily in conflict with the version I heard as there is no description on the website as to why the particular symbol chosen was an iron ring.

      --
      The tyrant will always find a pretext for his tyranny - Aesop
  58. For the record.. NOT MY CODE! by davido42 · · Score: 1, Funny
    Just sayin.

    Not like I've ever worked for NASA.

    --

    BitWorksMusic.com -- odd tunes for odd times

  59. The radiator pointing was unintended by Anonymous Coward · · Score: 0

    Broadsiding the radiator to the sun was one of the symptoms of the fault, not an intent of safe mode. Presumably, over a period of 4 months after the software update that had the bug, the orientation drifted until the radiator rotators ran out of travel. At that point, I couldn't guess whether it immediately went into safe mode because it couldn't rotate the radiators, or if it first overheated, but it doesn't matter much. The end result is cooked electronics.

    As I understand it, when a NASA probe goes into safe mode, it stops performing active processes (like firing thrusters, changing orientation, or taking pictures), broadcasts a status back to earth, then waits for instructions. The idea is to avoid doing anything that would make a problem worse.

    While NASA's computer scientists are probably smart enough not to turn the spacecraft in a bad orientation if it encounters a glitch, they may not have considered that it may enter safe mode in a bad orientation. Most likely though, they decided the potential for the spacecraft to find itself stopped with the radiators facing the sun is lower than the potential for making things worse if they try to identify and mitigate that condition autonomously with the computer on the fritz.
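
    A minimal sketch of the safe-mode behaviour as described above, not of the actual MGS flight software. Every name here (Spacecraft, enter_safe_mode, radiator_sun_angle_deg) is hypothetical. It only illustrates the point that a "stop everything and wait" handler leaves a bad orientation in place:

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Mode(Enum):
    NOMINAL = auto()
    SAFE = auto()


@dataclass
class Spacecraft:
    # All fields are hypothetical and exist only for illustration.
    mode: Mode = Mode.NOMINAL
    active_tasks: list = field(
        default_factory=lambda: ["thrusters", "slew", "imaging"]
    )
    radiator_sun_angle_deg: float = 90.0  # 0 means the radiator faces the sun
    outbox: list = field(default_factory=list)

    def enter_safe_mode(self, reason: str) -> None:
        # Stop active processes, report status, then wait for the ground.
        self.active_tasks.clear()
        self.mode = Mode.SAFE
        self.outbox.append(
            f"SAFE MODE: {reason}; radiator sun angle "
            f"{self.radiator_sun_angle_deg:.0f} deg"
        )
        # Nothing here changes the attitude, so whatever orientation the
        # craft had on entry simply persists until someone intervenes.


craft = Spacecraft(radiator_sun_angle_deg=5.0)
craft.enter_safe_mode("solar array hit mechanical stop")
```

    In this toy model the radiator is still 5 degrees off the sun after safe mode is entered, which is the trade-off described above: doing nothing avoids making things worse, but it also does nothing about a hazard that already exists.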

  60. Turns, not rolls induce G-Forces by Anonymous Coward · · Score: 0

    The A/C is correct, banked turns in an airplane induce G-Forces, rolls do not unless you try to climb or turn while rolling the aircraft. A roll will induce some radially extending centrifugal force but not enough to harm a pilot.

  61. Re:"almost" correct by alienmole · · Score: 1

    Your wife is correct that understanding the process is much more important than not making clerical errors. Clerical errors can always be caught by cross-checking, but if you don't understand the process, you can't get anywhere. Math tests are artificial: you're not trying to build a spaceship, you're trying to test whether someone has learned something. In a more realistic situation, cross-checks would occur, and you'd have time to correct errors.

    In an ideal situation, a large majority of a grade ought to be awarded based on a demonstrated understanding of the process. Clerical errors might be worth a minor deduction, but if the choice is to grade a question in a binary way, as correct or wrong, grading it as correct makes sense. The real problem in that situation is that the grading system is too coarse.

  62. /. is Populated by self-righteous ignorami by Anonymous Coward · · Score: 1, Insightful

    I am soooooo sick of this attitude on Slashdot. Every time something goes wrong, especially if it involves NASA, it's assumed to be a colossal blunder rooted in incompetence and greed. Next time you feel like making a comment like this, either do some background research, or stuff a sock in it. I'm normally not a jerk on Slashdot, but I think I just snapped.

    First of all, what was the move for personal gain that caused this? A software bug? Typically those are accidental, and they are most certainly not limited to NASA. Do you have some evidence of underhanded action happening in this case? No? Didn't think so.

    Secondly, what is the rapid demise you're referring to? Do you realize that the last 10 years have seen a brilliant upswing for NASA? With the exception of the unfortunate Columbia tragedy, which itself opened a lot of eyes and spurred many improvements within NASA and especially the manned space program, successful missions have been practically hand-delivered to the American people. Let me name a few: Stardust, Mars Rovers, Mars Reconnaissance Orbiter, Odyssey, Pathfinder, Deep Space 1, Deep Impact, Spitzer, Cassini, and Clementine. There are quite a few excellent missions coming up soon or en route, too: New Horizons, Messenger, Mars Phoenix Lander, James Webb Space Telescope, Mars Science Laboratory, and the Lunar Reconnaissance Orbiter.

    Third, did you have any clue when you opened your trap that the Mars Global Surveyor completed its mission 5 years ago? Every orbit it made after that and picture it returned was a bonus to the American taxpayers and the global scientific community. MGS mapped the entire planet, much of it twice. NASA had been considering finally shutting down the project to free up resources for newer, higher priority missions, like MRO.

    Fourth, what is your brilliant plan of firing everyone going to achieve? It will leave an organization with no one who has a clue what's going on. The people who know how all the missions currently in operation work will be gone. There will be no one to train any replacements. People who know the ins and outs of spacecraft design will be sitting at home jobless watching as people who have no experience in space exploration try to start back up from the 1950s. The best you can do is to identify those people who are genuine problems or true underachievers and fire them. Then you get rid of specific problems and motivate everyone else to be straight shooters, without eliminating key talent.

    Any questions?

  63. More propaganda by dreemernj · · Score: 1

    When are they going to admit the truth of how this was destroyed? Oh well, we'll all know once Megatron lays siege to the earth for our delicious oil and rubies and everything else that can be made into energon cubes.

    --
    1 (short ton / firkin) = 89.1432354 slugs / keg
  64. Re:MGS? How about some fucking clearer headlines by Anonymous Coward · · Score: 0

    TH1S FLAMEBA1T SH1T 1S
    WHY SH1TD0T NEEDS TO BE FUCK1NG CRASHED!!!!!!!!!!!!!!!!!

  65. just proof that... by botkiller · · Score: 1

    Updating your software can always make more issues even though it fixes others. I guess they'll think twice before applying that next winXP hotfix.

    --
    brian botkiller "Condensing fact from the vapor of nuance" - Neal Stephenson, Snow Crash
  66. Re:"almost" correct by Migraineman · · Score: 1

    I've got no problem with partial credit in cases with clerical errors. I have a ton of problem when it's graded as correct when it isn't. There's a certain amount of discipline required to solve problems. If you're sloppy, you'll make simple errors, but the result is still wrong.

    Calculation errors, process errors, logistics errors ... they're all part of the real world. The teacher who doesn't mark the simple errors is doing two things - providing an oversight function that won't always be available, and providing positive reinforcement that being sloppy is okay (in fact, it's being rewarded.) There's an indirect punishment applied to the students who do all the details, because they expended "unnecessary" effort to obtain the same reward as the sloppy kids. Supplying the positive punishment of a partial/full credit deduction restores the balance to where it should be. I hate to say this, but many teachers are more concerned about lawsuits than teaching. That goes all the way up into the school administration too. I'm sure you're familiar with the term "promote them up and out." Schools are less likely to get sucked into a lawsuit if they just graduate everybody.

    When you turn the kids loose into the real world where there is no one double checking everything they do, all your Mars probes end up somewhere near Jupiter.

    This has the potential to turn into a huge rant, so I'll stop here.

  67. Re:"almost" correct by Migraineman · · Score: 1
    I hate to do this, but I'm block-copying from another reply I made because I think it's relevant:
    I've got no problem with partial credit in cases with clerical errors. I have a ton of problem when it's graded as correct when it isn't. There's a certain amount of discipline required to solve problems. If you're sloppy, you'll make simple errors, but the result is still wrong.

    Calculation errors, process errors, logistics errors ... they're all part of the real world. The teacher who doesn't mark the simple errors is doing two things - providing an oversight function that won't always be available, and providing positive reinforcement that being sloppy is okay (in fact, it's being rewarded.) There's an indirect punishment applied to the students who do all the details, because they expended "unnecessary" effort to obtain the same reward as the sloppy kids. Supplying the positive punishment of a partial/full credit deduction restores the balance to where it should be.

    Understanding the process is important, but all the book-learning in the world is completely useless if you can't apply it. If you have to hand-wave your way around every answer you come up with, how is that a good thing? I'm an employer, and there's no way in hell I'm going to assign someone to double check all of your work. I expect you to do your job, and do it correctly. If your job is to calculate the orbits of spacecraft, I expect you to be able to handle the calculations and get them right. Similarly, if you're scheduling trucks for deliveries at the loading dock, I expect you to have a certain skill set in logistics. In either case, if I'm getting coin-toss accuracy out of your work, you're not going to be working here very long. I don't care how well you understand the process, if you can't perform it properly and accurately, you're costing me money, and that makes you fired for non-performance of job duties.
  68. Re:"almost" correct by ralphdaugherty · · Score: 1

    Your calculations are either "correct" or "not correct."

          I agree with the wife. Partial credit for incorrect answer but correct method. And also only partial credit for correct answer achieved with incorrect method, such as counting fingers instead of memorizing multiplication tables. :)

      rd

  69. Re:"almost" correct by ralphdaugherty · · Score: 1

    Why should the bank even care? I don't even remember the last time I balanced my checkbook.

          The bank will care when one uses their incorrect balance in writing a check and writes a check for money they don't have. The bank responds with an overdraft charge.

          I don't pay for overdraft protection. I balance my checkbook instead, but a lot more frequently and up to date with an online account than I used to with a monthly statement.

      rd

  70. Re:"almost" correct by alienmole · · Score: 1

    First, as I think I mentioned, the problem with grading it correct when it isn't is a problem with the testing procedure. I'm guessing that these are situations when for whatever reason, it's considered necessary to make the choice a binary one, when it sounds as though it should allow for partial credit. If so, that may be a problem with the testing procedure at some level, and it wouldn't be very fair to take that out on a child who understands correctly how to do the work.

    Beyond that, keep in mind that an elementary school math test is not an employee screening test. It has a completely different purpose, and it's applied to people at entirely different levels of development. Extrapolating from jobs involving spacecraft orbits and truck scheduling to elementary school children, and talking about firing people, seems ludicrous to me. You may think there's a connection, but unless you've studied child educational development and have some basis for that thought beyond your own unrelated work experience, you're almost certainly wrong.

  71. Re:"almost" correct by kfg · · Score: 1

    The bank will care when one uses their incorrect balance in writing a check and writes a check for money they don't have.

    Ahhhhhhhhhhhh! What the bank cares about is overdrafts. I don't do that. Because I have a sense of number without having to perform a calculation. This also allows me to know that anyone who claims the average ocean level rose 2mm last year is a numeric moron, no matter how many or what particular letters he writes after his name; again without performing any calculations.

    I don't pay for overdraft protection.

    Neither do I. I don't need it. I don't write out checks for more than I have. Even though I haven't balanced my checkbook in at least five years and often go a few months between looking at statements.

    There are only two reasons to balance your checkbook; because you have to know to the penny how much you have to avoid an overdraft; and to check the bank's calculations for errors to the penny.

    But making good approximations is one of the most valuable numeric skills you could possess. To have a good sense of how much you've got to spend. I'm sorry if this cuts a bit close to home, but anyone who can't look at a derived number and have an innate sense of how correct/incorrect it is isn't very good at math. They simply know arithmetic, which is a valuable, but limited, numeric skill. Especially for an engineer, who is working with the messy and imprecise real world and not merely a man-defined abstraction.

    The arithmetic performed by the failed Mars lander was as perfect as a computer could perform it. There was a failure of method. Misuse of units, not numbers. A human being looking at the mathematical model in toto would have picked it up without being given a single number to calculate.
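
    To make the "units, not numbers" point concrete, here is a small sketch. It does not reproduce any real flight code: the Impulse class and the burn values are invented for illustration. The only real fact it relies on is that 1 lbf is 4.4482216152605 N.

```python
from dataclasses import dataclass

# 1 pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    # Hypothetical type: a number that carries its unit along with it.
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown unit: {self.unit}")


def total_impulse(burns):
    """Sum a list of Impulse values after converting each to N*s."""
    return sum(b.to_newton_seconds() for b in burns)


# Made-up numbers: one burn reported in SI, one in imperial units.
burns = [Impulse(10.0, "N*s"), Impulse(10.0, "lbf*s")]

# Adding the bare numbers is flawless arithmetic with a meaningless result.
naive = sum(b.value for b in burns)

# Converting first gives a total that actually means something.
checked = total_impulse(burns)
```

    Both sums are computed without a single arithmetic mistake. Only the second one is right, and the difference lies entirely in the method.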

    And relying on perfect arithmetical skills to determine how many feet tall the tree is may well result in your shell sailing over the top of it, instead of knocking off the top few feet as you intended, because your perfect arithmetic gave the wrong answer.

    And you didn't have a sense that it was wrong, because your arithmetic all checked out.

    If your checking account bears interest, do you balance your checkbook to the fraction of a penny?

    When I write physics examples on the blackboard I use the number "10" for gravitational acceleration. It makes performing quick calculations in front of students easy. But is this number correct or incorrect?
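
    One way to put a number on that question, as a rough sketch: compare the blackboard value with standard gravity (9.80665 m/s^2) in a simple free-fall calculation. The function name and the 2-second drop are arbitrary choices for illustration.

```python
G_STANDARD = 9.80665   # standard gravity in m/s^2
G_BLACKBOARD = 10.0    # the convenient classroom value


def fall_distance(t_seconds, g):
    """Distance fallen from rest after t_seconds, ignoring air resistance."""
    return 0.5 * g * t_seconds ** 2


# Relative error introduced by rounding g up to 10.
relative_error = (G_BLACKBOARD - G_STANDARD) / G_STANDARD

# Because distance is linear in g, the same relative error carries over
# to every result computed with fall_distance.
d_quick = fall_distance(2.0, G_BLACKBOARD)
d_exact = fall_distance(2.0, G_STANDARD)
```

    The rounding costs a little under 2 percent, so whether "10" is correct depends on how much precision the problem actually calls for.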

    KFG

  72. Re:I believe it was running a version of Linux by pakar · · Score: 1

    so... do you really think that windows 95 would have been any better? :o

    "Damn, we got a BSOD. Who's up for a spacewalk?" =)