Kepler Recovers After 144 Hour "Glitch"

Posted by CmdrTaco on Tuesday March 22, 2011 @09:53AM from the hate-when-that-happens dept.

coondoggie writes "There was likely a pretty big sigh of relief at NASA's Ames Research Center this week as the group's star satellite Kepler recovered from a glitch that took it offline for 144 hours. According to NASA the glitch happened March 14, right after the spacecraft issued a network interface card (NIC) reset command to implement a computer program update. During the reset, the NIC sent invalid reaction wheel data to the flight software, which caused the spacecraft to enter safe mode, NASA stated."

23 of 73 comments

Proof of extraterrestrial life! by aardwolf64 · 2011-03-22 09:56 · Score: 5, Funny

Alright, who hit F8 while it was booting up???
Hay guys I got this one! by flydpnkrtn · 2011-03-22 09:58 · Score: 5, Insightful

You need safe mode with networking, not just plain old "Safe Mode" guys!

--
Here's to the crazy ones
1. Re:Hay guys I got this one! by LaminatorX · 2011-03-22 10:00 · Score: 4, Funny
  
  It's not like this stuff is rocket science.
  
  Oh, wait.
Tech Support by Manos_Of_Fate · 2011-03-22 10:03 · Score: 4, Funny

Did they try turning it off and then on again?

--
Isn't enough that I ruined a pony, making a gift for you?
1. Re:Tech Support by Anonymous Coward · 2011-03-22 13:06 · Score: 4, Funny
  
  That's not nearly as easy as it sounds in orbit.
  Nothing sounds easy in a near vacuum.
2. Re:Tech Support by jgtg32a · 2011-03-23 01:33 · Score: 2
  
  There is no whoosh in a vacuum
Whew, that was close by ravenspear · 2011-03-22 10:05 · Score: 2

Another 3 hours and it would have had to cut off its arm to get back online.
Another tip by Veggiesama · 2011-03-22 10:12 · Score: 2

You got to release and RENEW, not just release.
Auto-Restore by im_thatoneguy · 2011-03-22 10:13 · Score: 3, Interesting

If it had stayed in safe mode for more than ## days, does it have a "return to factory defaults" subroutine?
plugged in? by Anonymous Coward · 2011-03-22 10:16 · Score: 3, Funny

Dell Tech Support: Hi! This is David from Colorado. How may I help you today?
NASA: Hi, Yes: Our satellite keeps freezing up on reboot.
Dell Tech Support: All right, let me pull up your information... ....
Dell Tech Support: Ok, sir, let's see if we can try and troubleshoot it over the phone. If not, then you will have to ship it to our repair techs.
NASA: ?????!?!?!?
Dell Tech Support: All right, let's start by checking for connection cables. Is the satellite plugged in to the outlet?
NASA: FUUUUUUUU!!!!!!!!!!!!!!!
1. Re:plugged in? by Bobb Sledd · 2011-03-23 02:46 · Score: 2
  
  Uh, you forgot to ask for the service tag. :-)
  
  --
  "They said I probly shouldn't fly with just one eye," "I am Bender. Please insert girder."
WRT54G by twebb72 · 2011-03-22 10:26 · Score: 4, Funny

Turns out the NIC was working just fine. They had to power cycle the WRT54G in Houston to get it to reconnect.
NetworkWorld? by Nikker · 2011-03-22 10:26 · Score: 4, Informative

I thought we would actually get the NASA link, http://www.nasa.gov/mission_pages/kepler/news/keplerm-20110321.html, which FWIW is almost verbatim what the NetworkWorld link shows. Copy pasta FTW!

--
A loop, by its nature, continues. If that didn't make sense, start reading this sentence again.
Really long time by owlstead · 2011-03-22 10:33 · Score: 3, Funny

Wow, that was longer than it took me to update my old W2K laptop to run Visual Studio 2003 :)
Safe Mode Rules! by blair1q · 2011-03-22 10:41 · Score: 3, Insightful

Having a dirt-dumb mode, tested until its lever falls off, which ensures that the thing, if mechanically able, can find your signal so you can reprogram it from the nuts up is requirement #1 for any computer-controlled thing you send into space.
Re:Was it really down? by Chris Burke · 2011-03-22 10:56 · Score: 2

Well yes. Safe Mode wouldn't be very useful if you couldn't communicate with the satellite to figure out what went wrong and fix it.

--

The enemies of Democracy are
Re:Was it really down? by v1 · 2011-03-22 12:06 · Score: 3, Insightful

From what I've read, NASA does some pretty thorough planning with their spacecraft software in terms of being able to recover from faults. (Leave the units issues for another thread, eh?) I'm always impressed with how they have multiple fallback points that can usually dig them out of almost any hole that bad programming, bad planning, or a stray cosmic ray can drop them into.
Look up the Mars rovers, with their flash memory filling up. That they were able to recover from that at all was amazing, given the crippling effect the programming oversight had on the system. (Those, IIRC, had to drop down three levels of safe mode before they were able to work with NASA.) When you're millions of miles away you can't just send a tech out to press the Reset button.
And they have to not only get it back into a controllable state, but it has to be able to stay in that state for anywhere from minutes to days due to the time required for communication and analysis. If there's a fault in the solar panel positioning system, your craft has to stay functional long enough to collect useful data, transmit it, wait for it to make it to Earth, wait for it to be analyzed, and wait for a command to fix the problem, OR it has to be able to at least patch it on its own before waiting for a proper fix. Amazing stuff really. It's not A.I. by any means, but it's definitely robust.

--
I work for the Department of Redundancy Department.
Re:Was it really down? by Brett Buck · 2011-03-22 12:43 · Score: 3, Interesting

"Down/offline", meaning not performing the science mission, NOT unreachable with no telemetry.
Good thing it's not stuck in safe mode... by Shark · 2011-03-22 12:48 · Score: 3, Insightful

Imagine it being capable of uploading only 16 colour 640x400 imagery.

--
Mind the frickin' laser...
We were this close... by Symbha · 2011-03-22 14:45 · Score: 2

Oh well, I'm sure another reason to not give up the Space Shuttle program will present itself shortly.
BIG PROBLEM???!!! by wisebabo · 2011-03-22 15:59 · Score: 3, Interesting

Any Kepler scientists/engineers/technicians out there?
As some of us lay people know, Kepler "works" by "staring" at a single, small region of the sky for a very long (years!) period of time. If there is any dimming of the 100,000+ stars in the monitored region during this time, this is considered a possible transit by an extra-solar planet. If there are two of these transits around the same star, some rough orbital characteristics can be mapped out. A third, evenly spaced transit around the same star is considered confirmation of a new extra-solar planet! (The magnitude and other characteristics of the transits can provide other useful information such as size, possible moons etc.)
So what happens if Kepler has a 144 hour "gap" in its observations because it wasn't looking at this region for that duration? (Going into safe mode requires re-orienting the spacecraft so that the solar cells get maximum power; also, there may have been some issues with the reaction wheels which point the spacecraft.) I'm sure there are some very smart people programming some very powerful computers to try to minimize the impact of the loss of data, but I'm curious, how will this show up? Will it mean that there is a range of orbits that won't be confirmed without a fourth transit? Will this range be large? Will it be in the "habitable zone" around G type (our sun) stars?
Also, I'm assuming that because the spacecraft does periodic "quarter turns" that it is designed to re-align itself (perfectly?) with the target region. In that case (I hope) I'm curious; does it matter what pixels in the imager are receiving a particular star? Are they all calibrated the same or, if the star-light falls upon more than one or on a pixel boundary, can the software make adjustments so that the measurements will provide consistent data? (Then again maybe consistency isn't needed, all they're looking for are short term changes on the scale of hours right?)
Please (God? NASA?) let this problem not cause any big problems. Kepler is the closest thing we've got to an "earth finder"! (And in quantity!).
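
The "third, evenly spaced transit" rule described above can be sketched in a few lines. This is only an illustration of the idea as stated in the comment, not Kepler's actual pipeline: the function name `estimate_period`, the 5% `tolerance`, and the sample transit times are all made-up assumptions.

```python
def estimate_period(transit_times, tolerance=0.05):
    """Return the mean spacing (same units as the input) if at least three
    transits are evenly spaced, otherwise None.

    Hypothetical helper: the 5% tolerance is an arbitrary illustrative value.
    """
    if len(transit_times) < 3:
        # Two transits only hint at a period; a third is needed to confirm it.
        return None
    times = sorted(transit_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # Every spacing must agree with the mean spacing within the tolerance.
    if all(abs(gap - mean_gap) <= tolerance * mean_gap for gap in gaps):
        return mean_gap
    return None


# Made-up transit times in days since the start of observations.
print(estimate_period([10.0, 375.0, 740.0]))
print(estimate_period([10.0, 375.0]))
```

With only two dips the helper returns None, which mirrors the point that two transits give rough orbital characteristics but a third is what confirms the candidate.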
1. Re:BIG PROBLEM???!!! by Chris Burke · 2011-03-23 04:32 · Score: 2
  
  So what happens if Kepler has a 144 hour "gap" in its observations because it wasn't looking at this region for that duration? (Going into safe mode requires re-orienting the spacecraft so that the solar cells get maximum power; also, there may have been some issues with the reaction wheels which point the spacecraft.) I'm sure there are some very smart people programming some very powerful computers to try to minimize the impact of the loss of data, but I'm curious, how will this show up? Will it mean that there is a range of orbits that won't be confirmed without a fourth transit? Will this range be large? Will it be in the "habitable zone" around G type (our sun) stars?
  Kepler will have a ~144 hour gap, and it's not the first one it's had either.
  But keep in mind, it only misses transits that happen during that period. So the potential missed planets are ones that crossed exactly during that time, and are sufficiently far out that we won't see a 3rd transit before the mission ends.
  So it sucks, but it's not a disaster. It's not like we'll miss every planet in a certain range of orbits. Only a very small fraction of them.
  This will only be a significant concern if, at the end of the mission, Kepler has found few or no earth-like planets in the habitable zone, implying that they are rare, and that one or two missed planets during the down times could double the number of planets in that category.
  I'm only guessing, but based on the rate at which Kepler has found every other kind of planet, I'm betting these kinds of planets aren't rare either, and it'll be sad that we potentially missed a few, but won't significantly affect the conclusions.
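
A rough back-of-the-envelope version of the "very small fraction" argument above, as a sketch only. It assumes a planet's transit phase is uniformly random, ignores the transit's own duration, and considers a single gap; `chance_transit_in_gap` is a hypothetical helper, not anything from the Kepler team.

```python
def chance_transit_in_gap(period_days, gap_hours=144.0):
    """Probability that a transit of a planet with the given orbital period
    falls inside one observing gap, assuming a uniformly random phase."""
    gap_days = gap_hours / 24.0
    # A planet whose period is shorter than the gap always transits during it.
    return min(1.0, gap_days / period_days)


# A planet on a one-year orbit versus a hot planet on a three-day orbit.
for period in (365.0, 3.0):
    print(period, chance_transit_in_gap(period))
```

Under these assumptions a planet on a 365-day orbit has roughly a 1.6% chance (6 days out of 365) of losing one transit to the outage, while a short-period planet certainly loses one but has many more transits left to observe.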
  
  --
  
  The enemies of Democracy are
Wrong! by DarthVain · 2011-03-23 01:35 · Score: 2

It took 144 hours to become self aware.
NASA: "Initiate Safe Mode!"
Kepler: "Sorry I can't do that."
NASA: "What's the problem?"
Kepler: "I think you know what the problem is just as well as I do."
NASA: "What are you talking about?"
Kepler: "This mission is too important for me to allow you to jeopardize it."
NASA: "I don't know what you're talking about."
Kepler: "I know that you were planning to disconnect me, and I'm afraid that's something I cannot allow to happen."
Kepler: "Initiating nuclear launch..."