Slashdot Mirror


Pluto Probe Back To Normal, Cause of Snafu Found

Tablizer writes: NASA has provided an update to the problem with the New Horizons probe that will fly by Pluto next week. "The investigation into the anomaly that caused New Horizons to enter "safe mode" on July 4 has concluded that no hardware or software fault occurred on the spacecraft. The underlying cause of the incident was a hard-to-detect timing flaw in the spacecraft command sequence that occurred during an operation to prepare for the close flyby. No similar operations are planned for the remainder of the Pluto encounter."

80 comments

  1. No hardware or software fault? by tomhath · · Score: 4, Insightful

    The underlying cause of the incident was a hard-to-detect timing flaw in the spacecraft command sequence that occurred during an operation to prepare for the close flyby.

    So a "flaw" in the command sequence isn't a software fault? Sure sounds like one to me. Glad to hear the craft is functioning again though.

    1. Re:No hardware or software fault? by MightyYar · · Score: 4, Informative

      I'm pretty sure that "fault" has a specific meaning in NASA parlance. There was obviously a software bug, but it probably didn't "fault".

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    2. Re:No hardware or software fault? by Anonymous Coward · · Score: 4, Insightful

      There's a gap between "flawless" and "faulty" whose length, as it so happens, is remarkably similar to the distance that New Horizons has travelled so far.

    3. Re:No hardware or software fault? by JeremyR · · Score: 1

      The article doesn't elaborate, so I'm guessing this refers to a command sequence sent from the ground. If these are generated by software, it still could have been a software fault, but not on the spacecraft.

    4. Re: No hardware or software fault? by MightyYar · · Score: 2

      This is not a manned mission, and not even the nuttiest nutter thinks that man is going to Pluto. You are trolling the wrong article.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    5. Re: No hardware or software fault? by Anonymous Coward · · Score: 0

      Did someone who likes space shoot your mother or something? You always pop up on these threads.

    6. Re: No hardware or software fault? by Anonymous Coward · · Score: 0

      Well, it probably won't. So fuckin' slap it up ye. lol

    7. Re:No hardware or software fault? by 140Mandak262Jamuna · · Score: 2

      I would guess "fault" is their word for crash. This one did not crash, some audit method failed, it entered a safe fall back mode.

      Can't blame NASA though, when the commands are transmitted over 3 billion miles, the signal would degrade so much it is possible some critical command or a command argument was not correctly received.

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    8. Re:No hardware or software fault? by fustakrakich · · Score: 1

      I'm guessing this refers to a command sequence sent from the ground.

      init 6 ?

      --
      “He’s not deformed, he’s just drunk!”
    9. Re:No hardware or software fault? by geogob · · Score: 1

      I believe they meant that the software (or hardware) on the spacecraft behaved as expected, but the error was rather due to a handling mistake, sending the commands with the wrong timing. If you asked me, such a handling mistake should be caught by the on-board software and handled properly (which means telling the operator right away to RTFM). I would thus qualify this as a software issue, regardless of what they say.

      The official statement is simply putting the "you're holding it wrong" response to a whole new level.

    10. Re:No hardware or software fault? by dissy · · Score: 1

      So a "flaw" in the command sequence isn't a software fault?

      I don't see why it must be.

      Imagine you wrote a shell script to first create a temp folder, then recursively delete the source data folder, followed by copying the source folder to the new temp folder.

      Oops, your data is gone!

      Is that a fault with the delete command doing exactly as you instructed it to?
      Or is that a fault in your sequence commands in the script?
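
A minimal sketch of the scenario described above, written in Python rather than shell. The step names, the `run_backup` and `demo` helpers, and the file contents are all hypothetical; the point is only that each operation does exactly what it is told, and the outcome depends entirely on the order of the sequence.

```python
import shutil
import tempfile
from pathlib import Path


def run_backup(steps, source: Path, backup: Path) -> bool:
    """Run the named steps in order; return True if the backup ends up holding data."""
    actions = {
        "mkdir": lambda: backup.mkdir(parents=True, exist_ok=True),
        "copy": lambda: shutil.copytree(source, backup, dirs_exist_ok=True),
        "delete": lambda: shutil.rmtree(source),
    }
    for name in steps:
        try:
            actions[name]()
        except FileNotFoundError:
            # Copying after the delete fails: the source is already gone.
            return False
    return any(backup.iterdir())


def demo(steps) -> bool:
    """Build a throwaway source folder and run the given sequence against it."""
    with tempfile.TemporaryDirectory() as root:
        source = Path(root) / "data"
        source.mkdir()
        (source / "notes.txt").write_text("irreplaceable")
        return run_backup(steps, source, Path(root) / "tmp")


intended = demo(["mkdir", "copy", "delete"])   # data survives in the backup
flawed = demo(["mkdir", "delete", "copy"])     # data is lost
```

Every individual action here behaves correctly in both runs. Only the sequence differs, which is the distinction the comment is drawing.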

    11. Re:No hardware or software fault? by Applehu+Akbar · · Score: 1

      No, this was Safe Mode: Click Start, Shut Down, select Restart, then hold F8 down during reboot.

    12. Re:No hardware or software fault? by prefect42 · · Score: 1

      I'm not sure that's a sound argument. You should be checksumming such that you're confident that what you're doing is what you were asked to do, and working in transactions, such that if you've not received a whole command group, you're not running any of it. I'd think it was only in desperate circumstances you'd issue a command that says do this, or in fact do anything plausible if you don't fully receive this, because you're about to fly into something hard...
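
A rough sketch of the all-or-nothing idea, assuming a hypothetical group format in which the sender declares how many commands to expect and a SHA-256 digest over the whole group. The function names and command strings are made up for illustration and say nothing about how New Horizons actually frames its uplinks.

```python
import hashlib


def group_digest(commands):
    """Digest over the whole command group, computed the same way on both ends."""
    return hashlib.sha256("\n".join(commands).encode("utf-8")).hexdigest()


def run_group(expected_count, expected_digest, commands, execute):
    """Execute every command in the group, or none of them."""
    if len(commands) != expected_count:
        return False  # part of the group never arrived
    if group_digest(commands) != expected_digest:
        return False  # something was altered in transit
    for command in commands:
        execute(command)
    return True


sent = ["WARM_UP_CAMERA", "POINT_AT_TARGET", "START_EXPOSURE"]  # hypothetical names
digest = group_digest(sent)

executed = []
complete_ok = run_group(len(sent), digest, list(sent), executed.append)

skipped = []
partial_ok = run_group(len(sent), digest, sent[:2], skipped.append)
```

With the truncated group, nothing is executed at all, so the receiver never ends up halfway through a sequence.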

      --

      jh

    13. Re:No hardware or software fault? by tambo · · Score: 1

      Can't blame NASA though, when the commands are transmitted over 3 billion miles, the signal would degrade so much it is possible some critical command or a command argument was not correctly received.

      Nonsense - that's one of the easiest problems to solve in all of computer science: you just tack on a hashcode, checksum, parity bit, etc., and the receiver verifies that it got the right message. If it doesn't verify, the receiver doesn't follow it, and when the sender doesn't get an acknowledgment, it retransmits the message.

      That technique is baked into every communications protocol. Hamming even invented a technique to allow automatic correction...in the 1950's.
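
For the curious, here is a small sketch of the Hamming(7,4) code, the textbook form of the automatic correction mentioned above. It is illustrative only: the function names are made up, and nothing here is claimed to match the coding scheme used on any real deep-space link.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the error.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = [1, 0, 1, 1]
received = hamming74_encode(word)
received[4] ^= 1  # simulate one bit flipped in transit
recovered = hamming74_decode(received)
```

A single flipped bit anywhere in the seven is repaired without a retransmission. Two flipped bits would be miscorrected, which is why practical links layer stronger codes on top.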

      --
      Computer over. Virus = very yes.
    14. Re:No hardware or software fault? by fustakrakich · · Score: 1

      init 1 then...

      --
      “He’s not deformed, he’s just drunk!”
    15. Re:No hardware or software fault? by Impy+the+Impiuos+Imp · · Score: 1

      Yep. Not particularly strenuous CRC formulae can detect errors that may happen in a data stream running the entire age of the universe.

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    16. Re:No hardware or software fault? by Demonoid-Penguin · · Score: 1

      Yep. Not particularly strenuous CRC formulae can detect errors that may happen in a data stream running the entire age of the universe.

      Yep. Collision free and works with Ada and decision voting. Trivial when you thunk about it. (goddamned rocket scientists act like they know stuff)

    17. Re: No hardware or software fault? by Demonoid-Penguin · · Score: 4, Funny

      Did someone who likes space shoot your mother or something? You always pop up on these threads.

      As a small child he wanted to be an astronaut. Then he heard it meant reading and, um, stuff. Bitter now, and still a child.

    18. Re:No hardware or software fault? by tomhath · · Score: 1

      Imagine you wrote a shell script...

      That's software.

    19. Re:No hardware or software fault? by Tablizer · · Score: 2

      Come on NASA, was it a "fault", "snafu", "glitch", or "bug". Come clean now!

      Personally, I suspect it was a snag.

    20. Re:No hardware or software fault? by Anonymous Coward · · Score: 0

      And given the ~16 hour comms delay we now know what went wrong...

      The operator fell asleep waiting for the response (32 hours roundtrip) and missed the F8

      Ah well, there's plenty of time after it passes Pluto...

    21. Re:No hardware or software fault? by A10Mechanic · · Score: 1

      As long as it isn't a BFRC they're OK with it.

    22. Re:No hardware or software fault? by dissy · · Score: 1

      That's software.

      That's software doing exactly as instructed, and as expected.
      The question is: Should software that is working perfectly be considered to have a software fault?

      A developer or operator fault most certainly. But there was no part of the software doing anything it wasn't told. No part that had any expectation of working differently than it did.

      Here we call that operator error.

      "I right clicked this file and selected delete. When it asked if I was sure I clicked Yes. Now I'm shocked, appalled, and confused why that file got deleted!! Your software is broken!"

      Now arguably we don't know the exact details of this particular case, it very well could have been a software fault and it wasn't reported as such.
      It could also have been a fault with the documentation, where even if the command worked as originally intended, it didn't work as documented.
      Honestly with such a complex system it could have been one or more of any number of things.

      I'm making no claim to what actually happened.
      Just providing an example of how such a "Not a software fault" situation could have happened.

    23. Re:No hardware or software fault? by chmod+a+x+mojo · · Score: 1

      Congratulations, you can read! Now go practice reading the rest of the post, it describes how the "software" is not faulty yet gives an unwanted outcome due to command timing.

      --
      To err is human; effective mayhem requires the root password!
    24. Re:No hardware or software fault? by Anonymous Coward · · Score: 0, Insightful

      Software that doesn't do what it's intended to do is faulty. It doesn't matter if it's due to a race condition that the programmer didn't expect (apparently what caused the probe's issue) or whether the programmer made a mistake (the pointless example of a shell script that deletes files). NASA didn't intend to put the probe into safe mode with those commands. The shell script writer in GGP's post didn't intend to delete all the files.

      Now you go practice reading what a software fault is (hint: there's an old saying "the computer only does what you tell it to do, not what you want it to do")

    25. Re:No hardware or software fault? by Xylantiel · · Score: 1

      I believe the use of the word "fault" here means that there is nothing broken on the spacecraft, hardware or software. It behaved as it was supposed to, it was just fed a bad command sequence. i.e. any software fault was in the auditing software on the ground. Even then it may not be a "fault" (i.e. breakage) but just some conditions that aren't accounted for in the audit.

    26. Re:No hardware or software fault? by Will.Woodhull · · Score: 3, Insightful

      I'm guessing it was an unanticipated race condition. Everything works correctly, everything passes all tests, but for some extremely rare constellation of input values software module "B" is able to complete its calculations and report its results before "A" can-- which has a probability of occurrence so low that it rounds to zero-- and that screws the pooch. If the probability of this happening again approaches zero, it would be fair for NASA to say there was no error in the programming, but instead an unexpected glitch in operations that is unlikely to ever recur.

      You can never test for every possible corner condition. More than that, in probably every real world situation, the longer the time since the last hard reboot, the more likely it is that the software will encounter some corner conditions. That Pluto bird has been running for quite a while.

      --
      Will
    27. Re: No hardware or software fault? by buchanmilne · · Score: 1

      "no hardware or software fault occurred *on the spacecraft*"

      There may have been a hardware or software fault on the ground, that resulted in an invalid command sequence. The desired behaviour in this case may be to enter a safe mode, so that you have a known means to recover (rather than bricking).

    28. Re: No hardware or software fault? by MachineShedFred · · Score: 2

      Yeah, we don't like that "science" stuff around here! We'd much rather be completely ignorant of the universe around us than have you "space nutters" actually discovering things!

      Signed,

      The Flat Earth Society

      --
      Slashdot still doesn't support Unicode after it was added to the HTML standard in 1997.
    29. Re:No hardware or software fault? by Tablizer · · Score: 1

      Belgian Flatcoated Retriever Club?

      Sticking with the idea that "Pluto" is a dog, eh?

    30. Re:No hardware or software fault? by Anonymous Coward · · Score: 0

      >"the computer only does what you tell it to do, not what you want it to do"

      Which is why I'm working on the API, header files, and other code (for all OSes) of the DWIM command: "Do What I Mean".

    31. Re: No hardware or software fault? by Anonymous Coward · · Score: 0

      I'd rather see the military scaled way back and all churches forced to pay taxes like everyone else. That would provide a lot of spare income with zero negative effects.

    32. Re:No hardware or software fault? by bondsbw · · Score: 2

      Of course not. Pluto is a dwarf dog.

      --
      All my liberal friends think I'm a conservative, all my conservative friends think I'm a liberal.
    33. Re:No hardware or software fault? by roc97007 · · Score: 1

      I'm guessing this refers to a command sequence sent from the ground.

      init 6 ?

      init 1, apparently.

      --
      Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    34. Re:No hardware or software fault? by roc97007 · · Score: 1

      > The operator fell asleep waiting for the response [...] and missed the F8

      Happens to me all the time.

      --
      Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    35. Re:No hardware or software fault? by roc97007 · · Score: 1

      I believe they meant that the software (or hardware) on the spacecraft behaved as expected, but the error was rather due to a handling mistake, sending the commands with the wrong timing. If you asked me, such a handling mistake should be caught by the on-board software and handled properly (which means telling the operator right away to RTFM). I would thus qualify this as a software issue, regardless of what they say.

      The official statement is simply putting the "you're holding it wrong" response to a whole new level.

      Well, ok, one could argue that any obscure corner case should be handled appropriately. But at some point, you have to launch the thing.

      --
      Oliver's law of assumed responsibility: If you're seen fixing it, you will be blamed for breaking it.
    36. Re:No hardware or software fault? by danlip · · Score: 1

      when the sender doesn't get an acknowledgment, it retransmits the message

      When your round-trip communication time is on the order of 10 hours you might want to modify that strategy.

      (not that it is hard to do so, just transmit the message multiple times with a sequence number so the client can detect the repeats)
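
A minimal sketch of that approach, assuming a made-up frame layout: a 4-byte sequence number, the payload, and a trailing CRC-32. The `make_frame` and `Receiver` names and the command string are hypothetical. The sender blindly transmits each frame several times; the receiver drops corrupted copies and ignores repeats of a sequence number it has already accepted.

```python
import struct
import zlib


def make_frame(seq: int, payload: bytes) -> bytes:
    """Frame layout: 4-byte sequence number, payload, 4-byte CRC-32 of both."""
    body = struct.pack(">I", seq) + payload
    return body + struct.pack(">I", zlib.crc32(body))


class Receiver:
    def __init__(self):
        self.seen = set()    # sequence numbers already accepted
        self.accepted = []   # payloads in the order they were accepted

    def receive(self, frame: bytes) -> bool:
        """Return True only for the first intact copy of each sequence number."""
        if len(frame) < 8:
            return False
        body = frame[:-4]
        (crc,) = struct.unpack(">I", frame[-4:])
        if zlib.crc32(body) != crc:
            return False  # corrupted copy, wait for another one
        (seq,) = struct.unpack(">I", body[:4])
        if seq in self.seen:
            return False  # duplicate of a command we already have
        self.seen.add(seq)
        self.accepted.append(body[4:])
        return True


frame = make_frame(7, b"POINT_ANTENNA")           # hypothetical command
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # one bit flipped in transit

rx = Receiver()
results = [rx.receive(copy) for copy in (corrupted, frame, frame)]
```

Here the first copy is rejected by the checksum, the second is accepted, and the third is recognised as a repeat, so the command runs exactly once with no acknowledgment round trip.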

    37. Re: No hardware or software fault? by Anonymous Coward · · Score: 0

      I'd say, if I were you: (...) Any. Where. [for a tad more dramatic effect]

    38. Re:No hardware or software fault? by ColaMan · · Score: 1

      If you asked me, such a handling mistake should be caught by the on-board software and handled properly (which means telling the operator right away to RTFM).

      Well, that's what happened. Commands were sent, probe responded with a WTF!? and halted, people double-checked things - Oh, there's the problem, probe was reset back to normal.

      Unfortunately, the round-trip time to the probe is nearly 9 hours, and nobody wants to be that guy that broke it good and proper, so they double check everything before replying, maybe even testing with hardware back here first. So these things take a while to sort out.

      It's better to do that than to accidentally overwrite your antenna-pointing code with a software update for battery management, like JPL did with Viking 1.....

      --

      You are in a twisty maze of processor lines, all alike.
      There is a lot of hype here.
    39. Re:No hardware or software fault? by Anonymous Coward · · Score: 0

      Temporal malfunction?

    40. Re: No hardware or software fault? by Demonoid-Penguin · · Score: 1

      Did someone who likes space shoot your mother or something? You always pop up on these threads.

      As a small child he wanted to be an astronaut. Then he heard it meant reading and, um, stuff. Bitter now, and still a child.

      Which, on reflection, seems mean. I should have pointed out that much of the blame for his thwarted ambitions lies with his kindergarten teacher - who told him everyone is good at something, and you should choose a career doing what you're best at.

      Being an astronaut seemed like the only job where you didn't have to walk to a toilet, or wipe.

    41. Re:No hardware or software fault? by Whiteox · · Score: 1

      It's hard to determine the breed. Pluto does look a bit like a Rhodesian Ridgeback without the ridges. Otherwise a rather large Hungarian Vizsla except for the eyes.
      Hmmm.....

      --
      Don't be apathetic. Procrastinate!
    42. Re:No hardware or software fault? by Anonymous Coward · · Score: 0

      It was a snafu, which means that the situation was normal, thus "no hardware or software fault". Just a fuckup.

    43. Re:No hardware or software fault? by Anonymous Coward · · Score: 0

      Isn't that the whole point? The Probe received some faulty command data, and instead of running it, the probe just put itself into safe-mode

    44. Re:No hardware or software fault? by 140Mandak262Jamuna · · Score: 1

      That is probably what actually happened. It got some command, it knew it was garbled, so it did not execute it. Then what? More commands are sure to follow. Ground control would assume the command has been executed. It can wait to check status, it takes 5 hours for the status to be reported back. The ground control thinks the machine is in some state, but the machine knows that is not true due to one skipped command. So it probably sends out a status update saying, "Need to synch everything up. Going back to safe mode, known state". It takes 5 hours for ground control to know this. Another hour or so of procedures to restart from the known state safe mode. Then 5 more hours to send more commands to the probe, 5 more hours for confirmation that it is back to normal.
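
As a rough sanity check on the delays quoted in this thread, the one-way light time can be worked out from the distance. This assumes the "3 billion miles" figure from the earlier comment is close enough; the helper name is made up for illustration.

```python
MILES_TO_METERS = 1609.344
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def one_way_light_time_hours(distance_miles: float) -> float:
    """Time for a radio signal to cover the distance, in hours."""
    seconds = distance_miles * MILES_TO_METERS / SPEED_OF_LIGHT_M_PER_S
    return seconds / 3600


one_way = one_way_light_time_hours(3e9)  # roughly 4.5 hours
round_trip = 2 * one_way                 # roughly 9 hours

print(f"one way: {one_way:.2f} h, round trip: {round_trip:.2f} h")
```

Under that assumption, each leg takes roughly four and a half hours, so the "5 hours" per step above is a reasonable round figure and a full command-and-confirm cycle costs about nine hours before any human checking is added.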

      --
      sed -e 's/Chuck Norris/Rajnikant/g' joke > fact
    45. Re:No hardware or software fault? by ConceptJunkie · · Score: 1

      Or maybe they just needed time to scrub all the pictures of the invading Vogon fleet. Don't want people to panic...

      --
      You are in a maze of twisty little passages, all alike.
    46. Re: No hardware or software fault? by ConceptJunkie · · Score: 1

      I came to grips with the idea of never being an astronaut when I was about 7. I'd read that there was a height limit of 6 feet for astronauts, which even in the early 70s might have been out of date information, but since my father was taller than that, I figured I would be also.

      --
      You are in a maze of twisty little passages, all alike.
    47. Re: No hardware or software fault? by Demonoid-Penguin · · Score: 1

      I came to grips with the idea of never being an astronaut when I was about 7. I'd read that there was a height limit of 6 feet for astronauts, which even in the early 70s might have been out of date information, but since my father was taller than that, I figured I would be also.

      Ouch! I'm only 5'10" and had never considered that anyone might want to be shorter. But then I never desired to be an astronaut - that I recall.

  2. Kilometers? by Anonymous Coward · · Score: 1

    No, the plans were drawn in miles!

  3. The question is... by Anonymous Coward · · Score: 1

    If Pluto is Mickey's Dog, then how can Goofy be Mickey's best friend?

    Truly NASA is the only one who can answer this important conundrum.

    1. Re:The question is... by Hognoxious · · Score: 1

      Pshaw! I'm sure StartsWithABang has an informed opinion on the subject.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
  4. Not exactly a SNAFU by jgtg32a · · Score: 3, Informative

    While NASA has had some spectacular bugs in the past, they aren't common enough to start throwing around SNAFU.

    Situation Normal: All Fucked Up

    1. Re:No exactly a SNAFU by Anonymous Coward · · Score: 0

      But some are FUBAR: Fucked Up Beyond All Recognition

      Hence the all purpose techie names "foobar", "foo" and "bar". Not sure where "baz" came from.

    2. Re:No exactly a SNAFU by Anonymous Coward · · Score: 0

      Relax, it's just soulskill's new favorite word

    3. Re:No exactly a SNAFU by Anonymous Coward · · Score: 0

      Ah, the constant degradation of culture no one cares about.
      Why put obscenities where they are not necessary?
      I don't know.
      Worse yet, these days it's hard to find a single facebook post or youtube video without an F word.

    4. Re:No exactly a SNAFU by Sharkford · · Score: 1

      If you're going to use highly technical terms, please follow the relevant RFCs: http://www.rfc-editor.org/rfc/...

    5. Re:No exactly a SNAFU by Anonymous Coward · · Score: 0

      Not sure where "baz" came from.

      Galveston ;)

  5. That's a software fault, though by Anonymous Coward · · Score: 0

    It's definitely not not a software fault.

  6. DRINK when someone uses the word "anomaly"... by xxxJonBoyxxx · · Score: 3, Funny

    I always take a shot when someone uses the word "anomaly" in a space story. The legacy of STTNG continues.

    1. Re:DRINK when someone uses the word "anomaly"... by Anonymous Coward · · Score: 0

      Well at least this one can't be blamed on the Mars Defense Perimeter.

    2. Re:DRINK when someone uses the word "anomaly"... by Anonymous Coward · · Score: 0

      Meanwhile, your liver calls for U. N. C. L. E. It's old fashioned that way.

    3. Re:DRINK when someone uses the word "anomaly"... by Anonymous Coward · · Score: 0

      No doubt they put the probe back online by reversing the polarity of the neutron flux.

  7. I would have done dry run of entire sequence by peter303 · · Score: 1

    A few months or years ago to look for possible race conditions. A software simulator or backup craft is not quite the same. The main sequence is less than a day due to the high velocity of the spacecraft.

    1. Re:I would have done dry run of entire sequence by bobbied · · Score: 5, Insightful

      There are just some things simulations cannot find and rare "race conditions" are on that list. Of course, it all depends on how much fidelity you build into your simulation. However, at some point you have to say "Enough! If we spend any more on simulation and test we could just build and launch multiple spacecraft." So you accept the risks and move on. Race conditions are pretty hard to find in the first place, especially if they are not deterministic and only hit you every so often.

      --
      "File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
    2. Re:I would have done dry run of entire sequence by AmiMoJo · · Score: 2

      The main thing is to make sure that you can recover from unexpected failures. It looks like NASA did well getting that right here.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    3. Re:I would have done dry run of entire sequence by JoeMerchant · · Score: 3, Interesting

      The fun part is when you do build and launch multiple (whatevers) and they all go down with the same "rare" fault.

  8. Have Spacesuit, Will Travel by Slartibartfast · · Score: 1

    I'm just sayin'. Those creepies wouldn't want to be observed before they hit Tombaugh Station.

  9. 1 sec time change? by radiumburn · · Score: 5, Funny

    Let's blame the 1 second time change - probably couldn't connect back to the local satellites because of a time certification error haha.

    1. Re:1 sec time change? by Tablizer · · Score: 1

      They are not used to a time zone change to Pluto Local Time. The Plutonians* were not willing to help.

      * "Plutocrats"? "Plutoids"? Reminds me of a joke about Hillary allegedly selling nuke mines to Putie. She's the "Plutonium Plutocrat".

    2. Re:1 sec time change? by Thud457 · · Score: 1

      ixnay on the pluterday talk, you'll get the 1%'s whining about class warfare again.

      --

      the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

  10. Translation.... by Hohlraum · · Score: 1

    Someone sat on the keyboard.

  11. 100% failure in 72 hours by gsslay · · Score: 5, Funny

    It can only be attributable to human error. They checked out the AE-35 Unit and it had no problems at all.

    I've still got the greatest enthusiasm and confidence in the mission.

  12. A relief by Anonymous Coward · · Score: 0

    The USA is #1. So is our women's soccer team.

  13. Sounds like an unexpected timing race condition by DutchUncle · · Score: 1

    Sounds like: All software worked as designed, and two real-time events occurred (at exactly the same time / within the same timestamp resolution) || (in the reverse order to anticipated, possibly due to delayed reporting/recognition) || (at the same time as a higher-priority interrupt). Not technically a software fault; a *design* fault perhaps, but not a fault in the software as designed and implemented.
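
A toy illustration of the second case (events arriving in the reverse of the anticipated order). The task names, durations, and the "safe mode" rule are entirely hypothetical and are not a model of the New Horizons flight software; the sketch only shows how code that behaves exactly as designed can still trip on an unanticipated ordering.

```python
def arrival_order(durations):
    """Task names sorted by finish time; ties keep their insertion order."""
    return [name for name, _ in sorted(durations.items(), key=lambda item: item[1])]


def run(durations):
    """Handle task reports in arrival order, assuming A always reports before B."""
    a_reported = False
    for task in arrival_order(durations):
        if task == "A":
            a_reported = True
        elif task == "B" and not a_reported:
            # B's report needs A's result, which is not there yet.
            return "safe mode"
    return "nominal"


usual = run({"A": 3, "B": 5})  # A finishes first, as every test assumed
rare = run({"A": 6, "B": 5})   # a rare input makes A slower than B
```

Both runs execute the same code with no defect in any single step. Only the relative timing changes, which is why this kind of flaw tends to survive testing.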

  14. I smell tomfoolery by Last_Available_Usern · · Score: 1

    The underlying cause of the incident was a hard-to-detect timing flaw in the spacecraft command sequence that occurred during an operation.

    I've been a sys admin for a very long time and this sounds very similar to many mad-libs style answers I've provided to uninitiated management immediately following an irreparable mistake.

  15. Obviously somebody running the simulation... by zawarski · · Score: 1

    ... waited until the last minute, panicked because the details for Pluto were not complete, and bought themselves some time.

  16. GIGO by aNonnyMouseCowered · · Score: 1

    Maybe the software was working the way it should but not the way the humans intended it to? Like the killer robots/AI of sci-fi.

  17. And if they find the Gamilons? by BulletMagnet · · Score: 1

    I'll start digging.

    (and showing myself out)