Spirit Sends Debug Information to Earth
gfilion writes "NASA has released a press release that says: 'Shortly before noon, controllers were surprised to receive a relay of data from Spirit via the Mars Odyssey orbiter. Spirit sent 73 megabits at a rate of 128 kilobits per second.'" They've been having communications troubles with Spirit since Wednesday, so it's good to hear from it again, even if the data is just filler.
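For a sense of scale, the figures quoted above imply a relay pass of roughly nine and a half minutes. A minimal sketch of the arithmetic, assuming decimal prefixes (1 megabit = 1,000,000 bits, 1 kilobit = 1,000 bits), a constant rate, and no protocol overhead; the function name `transfer_seconds` is made up for illustration:

```python
def transfer_seconds(total_bits: float, rate_bits_per_s: float) -> float:
    """Time to send total_bits at a constant rate, ignoring protocol overhead."""
    if rate_bits_per_s <= 0:
        raise ValueError("rate must be positive")
    return total_bits / rate_bits_per_s


# Assumption: decimal prefixes, as is usual for link rates.
seconds = transfer_seconds(73e6, 128e3)
minutes = seconds / 60
print(f"{seconds:.0f} s, about {minutes:.1f} minutes")  # 570 s, about 9.5 minutes
```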
Forego the obvious
Most likely it's not a protocol that involves a lot of ACK'ing [e.g. huge packets with FECs]
Tom
Someday, I'll have a real sig.
It's nice to know that NASA engineers threw debugging code in the mix. Otherwise, we'd have a $410,000,000 junkyard on the red planet.
I don't know what I'd do if I didn't get to see high resolution pictures of dirt and rock every day.
"Lame" - Galaxar
The transmission included power subsystem engineering data, no science data, and several frames of "fill data." Fill data are sets of intentionally random numbers that do not provide information.
They don't say why it's sending fill data, but I bet the NASA geeks are happy about getting that engineering data.
If we could put a man on the moon with slide rules, we should have no problem figuring out how to three-key a computer on another planet.
/bin/fortune | slashdotsig.sh
If it will not go to sleep at night, it suggests to me that they have a serious hardware / software design flaw. They probably rely on software to initiate a standby vs. alive mode. A proper design in this case would be to use standard analog circuits to do this type of job. Think about it: you do not have to go out every night and reboot your street light pole. Now of course this is pure speculation as IANANE
but then again maybe I should be.
Got Code?
I saw over at the windriver site that this thing has a proprietary OS, only one CPU, and only one set of code. Now come on, how frigging smart is that? Hell, back at work I even have redundant clusters for nearly everything. Relying on a single computer that is a few hundred million miles away is, should I dare say? Retarded..
Got Code?
Hmmm, even they can do nothing against hardware errors... which is what this appears to be.
The good news is that this software appears to handle a hardware error situation gracefully, as it should. Bragging time may still be ahead.
I'm wondering if this is a software glitch running on Spirit and if so this truly does call into question the competence of Wind River, the people that wrote the code in use inside of Spirit. Why doesn't NASA hire its own programmers instead of hiring another firm to write it for them?
While I am sure the parent isn't at all involved in the project and is probably wildly off base, I think it is a very interesting observation. I mean, the guys at NASA guess the same kind of stuff, right? They just have the means to check it and rule it out (or not). I would have to say, based on the limited info he has of the rover, that this isn't all that unlikely a guess as to the cause of the problems.
And for all those people that say things like "Do you think the people at NASA are just stupid and wouldn't have thought of this in the design?" Well no, they are not stupid, but they are not perfect either. And they have most certainly overlooked some pretty stupid things that caused serious failures. I mean hell, they only need one bug to bring the whole thing to a halt, and it isn't like they can do real world testing beforehand, they can only simulate what it will be like.
They paid 400 million for that link. How much did you spend on yours?
Thought so.
Is it quite possible that NASA engineers simply have not mastered the art and science of designing hardware and software operable in the harshest of environments?
While I would never claim that NASA is perfect, I think you underestimate both the engineering challenge of putting a rover on Mars and the impact of more conservative, get-it-right policies.
Interplanetary missions are the hardest of all because the engineers never get to actually test the whole device under realistic conditions. Although they can test and analyze each subsystem under a variety of simulated or near-realistic conditions, they have no way of building a test rover, putting it in interplanetary space for months, having it aerobrake into a thin atmosphere, parachute in a thin atmosphere, crash-land at high speed, and then operate all its mechanical parts under dusty low-G conditions.
Second, get-it-right == conservatism == greater cost == fewer missions == less experience. The last thing NASA should do is spend more money, take more time, and do fewer missions. The only way we will really learn how to operate in space is to go into space. I'm not saying that better engineering won't help, only that more experience (unfettered by excessive conservatism) is a crucial part of learning to operate on other planets.
Two wrongs don't make a right, but three lefts do.
You look much worse chiding someone over what was, at most, an unimportant part of the post.
To have some actual technical discussion on a site that is supposed to be filled with nerds, instead of the same tired jokes about martians.
The more you know, the less you understand.
If there was another agency out there putting machines on Mars, able to perform flawlessly for extended periods of time, and the NASA machines were the only ones crapping out, then I'd agree there needs to be some serious analysis of why NASA isn't getting it right.
But this just isn't the case.
From what I can tell NASA is doing as good a job as anyone on Earth with the technologies, manufacturing processes and testing programs available to them.
I would hope that NASA would be the first to run a diagnostic on themselves when problems occur, but the first order of business is to figure out what went wrong with the rover on Wednesday and make sure it doesn't happen again, which is what they are doing right now.
Maybe if Bush didn't invade Iraq, he could have given that 87 billion to NASA instead. In the meantime they have to do the best with what they have.
I agree it's wrong to just put NASA on a pedestal, but analyze their successes as well as their failures, and be sure to compare them to the other space agencies out there. I think they are doing a pretty incredible job accomplishing lots of things that have never been done before.
With that said, let's see how Opportunity does tonight!
Since Spirit is rebooting sixty times per day, a problem that started when an electric motor moving its spectrometer "conked out", one thinks first of a hardware failure, possibly leading to software corruption.
I don't know the boot sequence of Spirit, but in most battery-powered embedded systems with which I am familiar, an elaborate state machine design is made to ensure that, when the boot sequence is complete, the system has sufficient power to perform any task that may be requested of it. Since the power supply is limited, an unexpectedly heavy load on the primary supply could cause the supply voltage to the microcomputer to fall below its specified lower limit, leading to a system reset.
Now imagine that there is a hardware failure associated with some process that runs during the boot sequence--a voltage regulator turn-on, a heating system initialization, an electric motor activation, whatever--that results in excessive current drain. When this part of the boot sequence is reached, the supply voltage falls, and the microcomputer resets. This disables the problem-causing hardware, unloading the power supply. When the supply voltage recovers, the microcomputer reboots (either automatically, with a power-on reset, via a watchdog timer, or via some other means) and, when the critical part of the boot sequence is reached, the supply voltage falls again. The system is now in a continuous loop, in which it can remain indefinitely. (Or at least 60 times per day....)
Note that this situation can also arise due to a defect in the power supply--if the output impedance of the power supply has risen for some reason, its output voltage under lightly loaded conditions can be acceptable, but it may not be able to supply heavier loads.
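The reset loop described above can be sketched as a toy model. This is only an illustration of the general brown-out mechanism, not Spirit's actual boot sequence: the step names, currents, voltages, and the `Supply`, `boot`, and `count_resets` helpers are all invented for the example. The supply is modeled as an ideal source behind an output impedance, so the voltage seen by the microcomputer is the open-circuit voltage minus load current times impedance.

```python
from dataclasses import dataclass


@dataclass
class Supply:
    open_circuit_v: float  # unloaded supply voltage (hypothetical value)
    output_ohms: float     # output impedance (hypothetical value)

    def voltage(self, load_amps: float) -> float:
        """Supply voltage under the given total load current."""
        return self.open_circuit_v - load_amps * self.output_ohms


def boot(supply, steps, base_amps, min_v):
    """Run boot steps in order, adding each step's current to the load.

    Returns (True, None) if every step completes, or (False, step_name)
    for the first step that drags the supply below min_v (a brown-out).
    """
    load = base_amps
    for name, amps in steps:
        load += amps
        if supply.voltage(load) < min_v:
            return False, name
    return True, None


def count_resets(supply, steps, base_amps=0.5, min_v=4.5, max_attempts=5):
    """Reboot after each brown-out; count resets until boot succeeds or attempts run out."""
    resets = 0
    for _attempt in range(max_attempts):
        ok, _failed_step = boot(supply, steps, base_amps, min_v)
        if ok:
            break
        # The reset sheds the offending load, the supply recovers, and we retry.
        resets += 1
    return resets


# Hypothetical boot sequences: (step name, added current in amps).
HEALTHY_STEPS = [("regulator", 0.2), ("heater", 1.0), ("motor", 1.5)]
FAULTY_STEPS = [("regulator", 0.2), ("heater", 1.0), ("motor", 12.0)]

good_supply = Supply(open_circuit_v=5.5, output_ohms=0.1)
weak_supply = Supply(open_circuit_v=5.5, output_ohms=0.4)  # raised output impedance

print(count_resets(good_supply, HEALTHY_STEPS))       # boots cleanly, no resets
print(count_resets(good_supply, FAULTY_STEPS))        # resets on every attempt
print(boot(weak_supply, HEALTHY_STEPS, 0.5, 4.5))     # healthy load, degraded supply
```

In this model nothing changes between attempts, so a faulty load (or a supply with raised output impedance) fails at the same step every time, which is the continuous loop: the cap of `max_attempts` is the only thing that stops the simulation.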
One expects the Spirit power supply to be complex, with separate regulators for the microcomputer, radio transceiver, and electric motors, so looking for common circuits and systems would be the first thing to do when troubleshooting for this type of failure. Looking for system conditions that can cause a system reset would be another; the JPL people have lived with their systems for years now, and would have had many design reviews to identify possible system failure scenarios--I'm not telling them anything new here. I understand that the system telemetry received yesterday indicates that the power supply is within specification, so that seems to eliminate that possibility.
The second alternative is a soft memory failure of some kind, either caused by a supply failure as the parent suggests or perhaps by a radiation event of some kind.
Note that these problems can be multi-disciplinary; for example, the problem could be caused by some vibration when a motor runs that loosens a broken connection created by a chemical reaction to something on the surface (to take an extreme example).
grrroooooooooaaaannn.... +1 funny, reluctantly
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??