Spirit Sends Debug Information to Earth
gfilion writes "NASA has released a
press release that says: 'Shortly before noon, controllers were surprised to receive a relay of data from Spirit via the Mars Odyssey orbiter. Spirit sent 73 megabits at a rate of 128 kilobits per second.'" They've been having communications troubles with Spirit since Wednesday, so it's good to hear from it again, even if the data is just filler.
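For a sense of scale, here is a quick back-of-the-envelope check of how long that relay pass would have lasted. It is only a sketch: it assumes decimal units (1 megabit = 10^6 bits, 1 kilobit = 10^3 bits) and ignores any framing or protocol overhead, and the function name is made up for illustration.

```python
def transfer_seconds(payload_bits: float, rate_bits_per_s: float) -> float:
    """Time needed to move payload_bits at a constant rate, ignoring overhead."""
    return payload_bits / rate_bits_per_s

payload = 73e6    # 73 megabits, assuming decimal megabits
rate = 128e3      # 128 kilobits per second, assuming decimal kilobits

seconds = transfer_seconds(payload, rate)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
```

Under those assumptions the whole transmission fits in a single pass of under ten minutes.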
Forego the obvious
128 kbit/sec! Quite a bit up from the earlier 100 bit/sec. Too bad Mars is too far from the next CO to qualify for DSL.
(first post?)
---- join dshield.org Distributed Intrusion Detec
A diagnostic is what runs when nothing else will.
...but the ping times suck. Can you imagine playing Quake over that kind of link?
Honey, I shrunk the Cygwin
Spirit sent 73 megabits at a rate of 128 kilobits per second.
:)
Pretty damn scary that that's faster than most pr0n downloads via Kazaa...
I watched the press conference on NASA-TV and they talked about how the thing wouldn't go to sleep at night and so it got me to wondering about the low power question. Obviously they have the rover power off when power gets to a certain level, but what if that level is slightly off?
In other words, if the onboard CPU has enough power and continues to run but the memory doesn't have enough power, doesn't that cause all kinds of wackiness?
They keep talking about the data pointing to simultaneous faults... well, as programmers we know these are the very worst kinds of bugs to deal with, but with something as (I'm assuming) well written as their code, doesn't that point to a memory problem? I mean, the thing is working flat-out beautifully one moment, and then the next moment it goes tits up.
The other question I had concerned this motor they had turned on but which didn't complete its sequence. When they command the motor to do something, do they tell it to run for some interval of time, or do they tell it to achieve a specific position? I was thinking that if it's the latter, and then if it gets stuck somehow, this could create the low power situation as the motor just grinds away.
Is this truly the only Earth I can live on?
CNN is reporting that Spirit is self-rebooting 60 times a day. NASA suspects a hardware fault that is causing the processor to detect trouble and automatically reboot.
Two wrongs don't make a right, but three lefts do.
&^@%$@ WJS&&# D&@#&&# DD
im sorry dave i can't do that
&*A^S^ DJHDHSHA ASHHASD&@^%@@ DD&D^^@
CNN has an article with some updates. Apparently the engineers have been having all sorts of fun with the thing. Here's a quick excerpt: "Cautioning that they will need more time to understand what went wrong, project engineers said they have determined that Spirit has rebooted or tried to reboot itself more than 60 times a day since the failure."
30% Troll, 50% Underrated, 10% Interesting
Score:5, Troll
Only a couple of frames were fillers of random values. Most of the frames were engineering data. No actual scientific data came down, though.
Still, it's a good sign that it's still able to talk.
It's nice to know that NASA engineers threw debugging code in the mix. Otherwise, we'd have a $410,000,000 junkyard on the red planet.
I don't know what I'd do if I didn't get to see high resolution pictures of dirt and rock every day.
"Lame" - Galaxar
You might want to check your facts before you spew. While the ground system is heavy on Linux according to the article you referenced, the actual OS on the rover itself is VxWorks from Wind River.
http://www.windriver.com/news/press/20040105.html
This space for rent.
Spirit Sends Debug Information to Earth
A Fatal Exception 0E has occurred at 0028:C0231810 in VXD VMM(0D) + 00001810
Cool!
I've noticed that everyone who is for abortion has already been born - Ronald Reagan
The little green men finally got their hands on it... and haven't quite figured out how to put it back together again.
128 kbps over 35 million miles... looks like we'll need another benchmark to replace the station wagon full of DAT tapes
one better than mcleodeight
IIRC... The 'Spirit' rover runs VxWorks.
Why is it that you do not know this? mmm?
kill elrond
take elrond
put elrond in cupboard
The transmission included power subsystem engineering data, no science data, and several frames of "fill data." Fill data are sets of intentionally random numbers that do not provide information.
They don't say why it's sending fill data, but I bet the NASA geeks are happy about getting that engineering data.
If we could put a man on the moon with slide rulers, we should have no problem figuring out how to three-key a computer on another planet
/bin/fortune | slashdotsig.sh
At 73 megabits, that's a lot of BSOD. Oops. Sorry. Red Screens of Death with Spirit being on Mars and all.
Well, there's spam egg sausage and spam, that's not got much spam in it.
Maybe Wind River will not be so quick to brag now :)
Remember, it takes 42 muscles to frown and only 4 to pull the trigger of a sniper rifle.
Did the "filler data" look anything like this?
The Slashdot Paradox: "100% Overrated"
Doesn't Spirit's twin, Opportunity, start its landing tomorrow?
It's probably some bizarre licensing issue for the OS causing it to shut down as it's detected that NASA are trying to run two copies at the same time.
Kind of like Beagle 2's problems caused by the transmissions being intercepted by the RIAA as they file a lawsuit against Colin Pillinger for offering illegal music downloads from Mars.
Fortunately, the cause of the blackout has been located and will be corrected soon.
It's actually 128 kbps to Mars Odyssey (its max throughput, incidentally)... the MO's high gainer tops out at 110 kbps. Still, not too shabby, too bad it seems to be 95% crapola.
Pesky Martians! :-)
-------
Warning: Slashdot may contain traces of nuts.
It appears that while editing the crontab of the rover to send spam, the script-kiddie accidentally added a shutdown -r 24m . "Having the rover send spam was a great idea! When people ping the X-Originating IP, they'll surely timeout!!"
Has anyone cracked this yet?
-bk.
"I can't swim.. I CaN'T sWim ... I cannot swim... I can't swim.. I can't swim.. I can't swim.. I can't swim.. I can't swim.. I can't swim.. sdf@#$@#$@#$
This is my sig.
rover: 128kbps
most mp3's: 128kbps
COINCIDENCE?
i think not.
It appears that every time Spirit tries to load the software it encounters a problem and then tries to re-boot
From the windriver site....
Power and versatility was delivered via the advanced applications developed for each of the robotic functions of the Rover devices, plus their communications links with the landing craft. VxWorks not only served as the ideal development platform for the engineers, it also had to be sufficiently robust itself to ensure it would perform according to plan under the extreme conditions on Mars and during the journey from Earth.
Too bad it never makes it to runlevel three; sounds like init is dying...
Got Code?
...European, constantly rebooting, battery draining overlords. Now we know Beagle 2 was not lost but was in transit to Gusev crater. It took a little time to silently creep up behind spirit. If we had a high-enough resolution camera we would see that damn dog continuously poking at the rover, pressing our reset button.
Cheers to the European engineers who caught us with our pants down and jeers to the American engineers who thought our little rover needed an external reset button for some reason.
Some of us engineers work with RTOSes all the time, not just for fun-and-dandy projects, but for multi-million dollar outcomes. Consensus is that Linux is not good enough. QNX, VRTX, VxWorks etc. are still the preferred choices, but everyone admits that Linux is getting there. Most of us don't hang out on Slashdot, yet many Linux zealots do: you don't get a good opinion here.
Extreme Remote Debugging
Sigs are bad for your health.
How many times do I have to say it? Robots just don't work for shit. Why don't we just send up some of those hyper-intelligent monkeys that we sent to the moon? I mean, seriously, it would cost a lot less. And then they'd make movies. How cool would it be to see another movie about a chimp doing what a human could do a billion times better?
too bad it seems to be 95% crapola.
Sounds like the internet to me....
Supporting World Peace Through Nuclear Pacification
Great, the probe has a faster connection than I have. Now I've got to go live on Mars.
Rus
CPanel + Root from $35/mo - 10% off with discount code SLASHDOT
I would love to see the length of pringles cans used to make the WiFi antenna to get that signal back to earth.
You remember Apollo 13, the one with Tom Hanks? Where the astronauts believe that their transmission is being watched by viewers on Earth, but in fact all the TV networks refused the transmission, stating that NASA made flights to the Moon as exciting as trips to Pittsburgh (or something of this kind)?
This is what is happening, people: the newest thing in reality TV, our own Mars Rover, The Ultimate Survivor. Opportunity will be landing today, so the audience should be able to vote for which rover is going to be kicked off the show.
The Drama, The Excitement, The Unknown, The Sex... oh, wait!
You can't handle the truth.
NASA systems that involve human life are highly redundant. I remember a lecture by a NASA engineer about systems on the Shuttle. There are *seven* redundant computers which calculate data. That data requires identical answers from four to be accepted.
:-)
On Spirit, power is an issue. More CPUs == more power drain.
Furthermore, I remember the folks initially speculating that something was wrong with the power system. I stopped following it, but it said that this transmission was composed of power subsystem diagnostic data. Could be it's a response requested earlier that it didn't have enough juice to send, in which case more CPUs would have actually exacerbated the problem.
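The "identical answers from four" idea mentioned above can be sketched in a few lines. This is only an illustration of quorum voting in general: the seven-computer, four-agree figures are the poster's recollection, and the function name and sample readings are invented here.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence

def vote(results: Sequence[Hashable], quorum: int) -> Optional[Hashable]:
    """Return the answer reported by at least `quorum` computers, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None

# Seven hypothetical computers; two of them disagree with the rest.
readings = [42, 42, 42, 42, 41, 42, 7]
accepted = vote(readings, quorum=4)

# No single answer reaches the quorum, so nothing is accepted.
split = vote([1, 1, 1, 2, 2, 2, 3], quorum=4)
print(accepted, split)
```

Every extra voter is another CPU drawing power, which is exactly the tradeoff being argued about here.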
May we never see th
that it is not Java serialized objects they are receiving to debug...
Is it just me, or has anyone else been very puzzled by the pics that NASA released of Spirit's landing site? These were supposedly taken by the Mars Orbiter Camera on the Mars Global Surveyor.
I thought that the best cameras in orbit around Mars were those on the European Mars Express, with a top resolution of 12 metres/pixel, and yet here the Spirit lander, about 2 metres across, is spread across about 10 pixels.
Something's not right...
Well, you know, what's interesting about that is:
1. you'd have to increase the complexity of the device even more, exposing it to a higher risk of failure statistically
2. you'd need more complicated software and hardware that would require more time and effort (money & delays)
3. the hardware would need more power (limited batteries and solar panel capacity)
4. the system would be heavier and bigger (costs are measured in grams, iirc).
While you have a valid point, the constraints of this design give very strong tradeoffs among safety, feasibility, and cash flow (and I'm sure there are others, but I'm not a rocket scientist). I'd imagine that some time was spent on redundant systems, but the adage of "Why have one when you can have two at twice the price?" only works when your budget can support the extra price of man-hours and cash.
I'd argue that where you work has unlimited available power, and if you need more, you can ask your power company for more. You have the money to spend on an X-thousand-dollar server that's been pre-fabbed by whatever company you like. If you need more, you get more drop-shipped to you within days. NASA had to build these little buggers from the ground up.
<RANT>
You know, if you take your philosophy of simply duplicating the entire machine, there is a backup. It's called "Opportunity." It lands tomorrow.
I highly resent the fact that you've called some of the greatest engineers of our time "retarded." If you can't understand the problem (I certainly don't, but I do understand the concept of tradeoffs in design) you have no right to speak on the issue. Of course, this is slashdot. Everyone can mouth off about everything. Nevermind.
</RANT>
~MCH.
Michael C. Hollinger
How often is a bug the fault of an RTOS, and how often introduced by the coders working on a particular project?
May we never see th
Argh! I thought I added the link (Preview is for wimps!)
One line blog. I hear that they're called Twitters now.
All the way out at Mars, they get 4 times the bandwidth I can get here in New Jersey... But the content isn't any better :-)
When the last lightminute is no problem but the last mile is?
One line blog. I hear that they're called Twitters now.
I saw over at the windriver site that this thing has a proprietary os
I'm not sure how this is a disadvantage. The people at NASA can't be experts at everything, and in this case, it looks like they decided to hire an outside company to write the rover software. Just because it is a proprietary OS doesn't mean that the code is any buggier, that NASA can't review the software, or that there is any less ability to debug the thing when problems occur.
and only one cpu and only one set of code
A second CPU (or an extra anything for that matter) would add to the weight and energy consumption of the device. I think one reliable system beats out two redundant (but necessarily more complex) systems in this case.
Come test your mettle in the world of Alter Aeon!
NASA's Spirit rover did not go to sleep today even after ground controllers sent commands twice for it to do so.
It looks like NASA is experiencing a common parenting problem, I suggest something like this for the rocket scientists
Most offices now have to have redundant computers because the reliability of the machines is so low. This makes economic sense for the office. Space travel is not the office. With space travel you buy the highly reliable machines and test the hell out of them to make sure they work. Even with all that they don't always work. But when you are doing something new not everything works.
Unfortunately kids today think Newton made his formulations the instant he got hit in the head. Exploration is about hard work and risks. And some guy in an office who has never done it has no idea of how complex it is.
"She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
Fill data is typically transmitted when the telemetry multiplexer does not have any engineering or science data to send. Due to the way synchronous communications links work, something is always being transmitted, even if there is no "real data" available.
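A toy sketch of that behaviour, for anyone who wants to see it spelled out. Everything here is assumed for illustration: the frame size, the queue names, and the rule that engineering data goes out before science data. It is not a description of the real telemetry system.

```python
import os
from collections import deque
from typing import Deque, Tuple

FRAME_BYTES = 8  # hypothetical frame size, chosen only for the example

def next_frame(engineering: Deque[bytes], science: Deque[bytes]) -> Tuple[str, bytes]:
    """Pick the next frame to send; a synchronous link must always send something."""
    if engineering:
        return "engineering", engineering.popleft()
    if science:
        return "science", science.popleft()
    # Nothing queued: pad the link with random fill that carries no information.
    return "fill", os.urandom(FRAME_BYTES)

eng = deque([b"PWR-0001", b"PWR-0002"])  # made-up engineering frames
sci: Deque[bytes] = deque()              # no science data queued

sent = [next_frame(eng, sci)[0] for _ in range(4)]
print(sent)
```

Once the two queued engineering frames are gone, every further slot on the link is fill.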
Mea navis aericumbens anguillis abundat
I have compiled some important quotes regarding the issue.
* NASA's Spirit rover communicated with Earth in a signal detected by NASA's Deep Space Network antenna complex near Madrid, Spain, at 12:34 Universal Time (4:34 a.m. PST) this morning. The transmissions came during a communication window about 90 minutes after Spirit woke up for the morning on Mars. The signal lasted for 10 minutes at a data rate of 10 bits per second. Mission controllers at NASA's Jet Propulsion Laboratory, Pasadena, Calif., plan to send commands to Spirit seeking additional data from the spacecraft during the subsequent few hours. [11]
* The flight team for NASA's Spirit received actual data from the rover in another communication session that began at 13:26 Universal Time (5:26 a.m. PST) and lasted 20 minutes at a data rate of 120 bits per second. [12]
* Shortly before noon, controllers were surprised to receive a relay of data from Spirit via the Mars Odyssey orbiter. Spirit sent 73 megabits at a rate of 128 kilobits per second.
* At a news briefing, Pete Theisinger said, "The software is in X-band fault mode. We surmise it got there because of some problem with the high-gain antenna pointing, and that is why the second high-gain antenna pass on Wednesday did not work. It gives us a little bit of a tale-tell for what is going on with the processor now. But as I pointed out to you, the flight software is not functioning normally. The two times we have gone and communicated with the system, we have gotten different flight software behaviors. Therefore we do not have assurance the next time we go and ask for it we will get either one of those two behaviors or perhaps a third behavior." Later Theisinger said that the Spirit is in "critical condition" and stated that "We do not know to what extent we can restore functionality to the system because we don't know what's broke. We don't know what started this chain of events. I think, personally, that is a sequence of things. And we don't know, therefore, the consequences of that. I think it is difficult, at this very preliminary stage, to assume that we did not have some type of hardware event that caused this to start. Therefore, we don't know to what extent we can work around that hardware event and to what extent we can get the software to ignore that hardware event, if that is what we eventually have to do."
* An anomaly team has been formed, completely separate from the Opportunity team. They will be working a schedule that will look like 0500 Mars Time to about 1500 Mars Time.
* At the press conference, Theisinger said that Spirit "has been in a processor reset loop of some type, mostly since Wednesday, we believe, where the processor wakes up, loads the flight software, uncovers a condition that would cause it to reset. But the processor doesn't do that immediately. It waits for a period of time - at the beginning of the day it waits for 15 minutes twice and then for the rest of the day it waits for an hour - and then it resets and comes back up." He added that Spirit's central computer has rebooted itself more than 60 times over the past two days. Theisinger also noted that "The indications we have on two occasions is that the thing that causes the reset is not always perceived to be the same."
* At the press conference, two computer animations of Spirit's landing were released. Also released was an image of Spirit's landing site taken by the Mars Orbiter Camera on the Mars Global Surveyor.
The Television Wiki
If there is a COTS (commercial off-the-shelf) real-time operating system available that meets the system requirements, why go to the risk and expense of writing your own from scratch? Do you expect NASA to fabricate every component in the spacecraft?
Mea navis aericumbens anguillis abundat
Everything will be ok once low tide in the Gusev Sea occurs.
Is it quite possible that NASA engineers simply have not mastered the art and science of designing hardware and software operable in the harshest of environments?
While I would never claim that NASA is perfect, I think you underestimate both the engineering challenge of putting a rover on Mars and the impact of more conservative, get-it-right, policies.
Interplanetary missions are the hardest of all because the engineers never get to actually test the whole device under realistic conditions. Although they can test and analyze each subsystem under a variety of simulated or near-realistic conditions, they have no way of building a test rover, putting it in interplanetary space for months, having it aerobrake into a thin atmosphere, parachute in a thin atmosphere, crash-land at high speed, and then operate all its mechanical parts under dusty low-G conditions.
Second, get-it-right == conservatism == greater cost == fewer missions == less experience. The last thing NASA should do is spend more money, take more time, and do fewer missions. The only way we will really learn how to operate in space is to go into space. I'm not saying that better engineering won't help, only that more experience (unfettered by excessive conservatism) is a crucial part of learning to operate on other planets.
Two wrongs don't make a right, but three lefts do.
60 reboots is nothing, the engineers just forgot to turn off Automatic Updates...
OK, which one of you posted the URL to Spirit's onboard webserver on Slashdot???
EvilCON - Made Famous by
Do you really think Spirit and Beagle II were sent to Mars to gather scientific data?
WRONG!
It's about being the interstellar robot fighting champion, you fools! Geeks from NASA and ESA are just sending battle bots to Mars in an absurd attempt to waste European and American taxpayers' money.
I mean, think about it! First, Spirit kicked Beagle II's ass, and the guys from NASA already celebrated their victory. But now, it seems that a domestic contestant gave Spirit a heavy beating on Wednesday; ergo its problems since then.
To have some actual technical discussion on a site that is supposed to be filled with nerds, instead of the same tired jokes about martians.
The more you know, the less you understand.
"Do you expect NASA to fabricate every component in the spacecraft?"
If we gave them a budget? Yes.
NASA's fiscal year 2003 budget: $15.1 Billion.
DoD's fiscal year 2003 budget: $396.1 Billion.
The DoD's budget does not include emergency supplementals, such as the $40 billion supplemental in '02, or the $87 billion supplemental requested in '03.
-- "Government is the great fiction through which everybody endeavors to live at the expense of everybody else."
Maybe if Bush didn't invade Iraq, he could have given that 87 Billion to NASA instead. In the meantime they have to do the best with what they have.
I agree it's wrong to just put NASA on a pedestal, but analyze their successes as well as their failures, and be sure to compare them to the other space agencies out there. I think they are doing a pretty incredible job accomplishing lots of things that have never been done before.
With that said, let's see how Opportunity does tonight!
For news, status, updates, scientific info, images, video, and more, check out:
(AXCH) 2004 Mars Exploration Rovers - News, Status, Technical Info, History.
Since Spirit is rebooting sixty times per day, a problem that started when an electric motor moving its spectrometer "conked out", one thinks first of a hardware failure, possibly leading to software corruption.
I don't know the boot sequence of Spirit, but in most battery-powered embedded systems with which I am familiar, an elaborate state machine design is made to ensure that, when the boot sequence is complete, the system has sufficient power to perform any task that may be requested of it. Since the power supply is limited, an unexpectedly heavy load on the primary supply could cause the supply voltage to the microcomputer to fall below its specified lower limit, leading to a system reset.
Now imagine that there is a hardware failure associated with some process that runs during the boot sequence--a voltage regulator turn-on, a heating system initialization, an electric motor activation, whatever--that results in excessive current drain. When this part of the boot sequence is reached, the supply voltage falls, and the microcomputer resets. This disables the problem-causing hardware, unloading the power supply. When the supply voltage recovers, the microcomputer reboots (either automatically, with a power-on reset, via a watchdog timer, or via some other means) and, when the critical part of the boot sequence is reached, the supply voltage falls again. The system is now in a continuous loop, in which it can remain indefinitely. (Or at least 60 times per day....)
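To make the loop concrete, here is a toy simulation of the scenario just described. It is a sketch under loudly stated assumptions: the voltages, the brownout threshold, the boot step names and their sags are all invented, each step's sag is treated independently, and none of it reflects Spirit's actual design.

```python
BROWNOUT_VOLTS = 4.5   # hypothetical reset threshold
NOMINAL_VOLTS = 5.0    # hypothetical lightly loaded supply voltage

# Hypothetical boot steps and the supply sag (in volts) each one causes.
BOOT_STEPS = [("load_software", 0.1), ("enable_heater", 0.2), ("start_motor", 0.9)]

def boot_once(steps):
    """Walk the boot steps; return the step that trips the brownout, or None."""
    for name, sag in steps:
        if NOMINAL_VOLTS - sag < BROWNOUT_VOLTS:
            return name  # supply fell below the threshold: processor resets
    return None          # boot sequence completed

def count_resets(steps, attempts):
    """Count resets over repeated boots; the reset unloads the supply each time."""
    resets = 0
    for _ in range(attempts):
        if boot_once(steps) is None:
            break        # a clean boot ends the loop
        resets += 1      # same faulty step trips again on the next attempt
    return resets

resets = count_resets(BOOT_STEPS, attempts=60)
healthy = count_resets(BOOT_STEPS[:2], attempts=60)  # same boot minus the bad load
print(resets, healthy)
```

With the faulty load in the sequence every attempt resets; take it out and the very first boot completes.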
Note that this situation can also arise due to a defect in the power supply--if the output impedance of the power supply has risen for some reason, its output voltage under lightly loaded conditions can be acceptable, but it may not be able to supply heavier loads.
One expects the Spirit power supply to be complex, with separate regulators for the microcomputer, radio transceiver, and electric motors, so looking for common circuits and systems would be the first thing to do when troubleshooting for this type of failure. Looking for system conditions that can cause a system reset would be another; the JPL people have lived with their systems for years now, and would have had many design reviews to identify possible system failure scenarios--I'm not telling them anything new here. I understand that the system telemetry received yesterday indicates that the power supply is within specification, so that seems to eliminate that possibility.
The second alternative is a soft memory failure of some kind, either caused by a supply failure as the parent suggests or perhaps by a radiation event of some kind.
Note that these problems can be multi-disciplinary; for example, the problem could be caused by some vibration when a motor runs that loosens a broken connection created by a chemical reaction to something on the surface (to take an extreme example).
Your rover Spirit is in trouble, do you want to:
a) Ford the crater
b) Suck the poison
c) Reduce rations to meager
d) Go hunting
Is there any reason the code, schematics and CAD designs aren't available for public viewing? It's a publicly funded project, and I don't think JPL has to worry about trade secrets.
If JPL would give us more information, I bet they'd have 50% of the entire engineering brainpower on the planet checking for races, inversions, memory leaks, hardware design flaws, etc.
If there was ever a project that could benefit from so many eyeballs, it's space exploration. There are thousands of some of the most talented engineers on the planet who would jump at the chance to contribute to something like this.
http://www.masturbateforpeace.com/
Despite what seems to have become a widely held belief that we can learn as much from automated probes as from manned missions, it doesn't seem to have worked out that well in practice. Viking had a set of experiments that was supposed to definitively detect whether life was present. But when some of the experiments came out positive, they ended up being rejected, because researchers at home came up with nonbiological explanations. Unfortunately, there was nobody on site to do a follow-up experiment to really answer the question. Now we've had a long string of failed probes.
Perhaps all Spirit really needs is somebody to give it a little kick.
Of course not. I bet Bob's Electronics Boutique has got just the right parts. Piece of PCB, a blue LED, duct tape, a goat... You mention it, they got it.
Hate me!
Series: If I add two components in series to a system, with reliability of R_1 and R_2, respectively, the overall system reliability is:
R_series = R_1 * R_2
To demonstrate this with real numbers, let's assume the values of R_1 = 0.95 and R_2 = 0.90. R_series would equal 0.95 * 0.90 = 0.855, or 85.5%. So, adding components in series makes reliability worse than the original reliability of either of the two components.
Parallel: On the other hand, if I add two components in parallel, with reliability of R_1 and R_2, respectively, the overall system reliability is:
R_parallel = 1 - (1 - R_1) * (1 - R_2)
Using the same values for R_1 and R_2 as above, the value of R_parallel would be 1-(1-0.95)*(1-0.90) = 0.995, or 99.5%. Redundant systems such as this are a good thing, because the overall chance of system failure can often be greatly reduced.
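The two worked examples above are easy to check, and to extend to more than two components, with a few lines of Python. This is a sketch that assumes component failures are independent; the function names are just for illustration.

```python
from math import prod

def series_reliability(*components: float) -> float:
    """All components must work: multiply the individual reliabilities."""
    return prod(components)

def parallel_reliability(*components: float) -> float:
    """Only one component must work: 1 minus the chance that all of them fail."""
    return 1 - prod(1 - r for r in components)

r_series = series_reliability(0.95, 0.90)
r_parallel = parallel_reliability(0.95, 0.90)
print(f"series:   {r_series:.3f}")
print(f"parallel: {r_parallel:.3f}")
```

The independence assumption matters: a shared fault, such as one power supply feeding both redundant units, breaks the parallel formula.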
Of course, the value of redundancy must be balanced with the overall cost of the system, which can be measured in money, man-hours, and weight... Most introductory courses in engineering management explain these tradeoffs in good detail, and help to understand how to maximize a project's reliability while minimizing the overall system cost.
One of the most fascinating engineering management issues with Spirit and Opportunity is that the number of man-hours dedicated to both rovers is very limited, and now that Spirit is failing, fewer people will be available to make sure that Opportunity is going to land and operate successfully. The extra cost of adding a second CPU or extra RAM to the rovers may well have already paid for itself, just for that very reason. A lack of man-hours devoted to Opportunity could spell as much doom to the project as a design flaw, but ultimately both cost money to fix. It all boils down to: "faster, cheaper, better -- pick any two."
Slashdot's first reaction to VMware
-downgraded from critical to serious
-3 types of memory: random (lost every night), flash (science data, etc.), EEPROM (program data, harder to write to).
-cripple mode: run without the flash RAM
-Sent commands to put Spirit into cripple mode. No more resets so far. It can also sleep now.
-Will relay entire contents of flash RAM to Mars Odyssey over the next day or so. Hopefully that'll give clues as to what happened.
-Since cripple mode means no permanent storage, Spirit forgets it's in cripple mode every time it goes to sleep.
-probably about 3 weeks until it's back up and running to any significant degree.
How brutal is it that the connection from Mars is faster than the dialup-only available in NW Ontario? :-(
What does it mean to wake out of a dream
and be wearing someone else's shorts?
BNL, Born on a Pirate Ship (1998)
...if a little more of the information was given to the public. There are a lot of very bright, very interested and very talented engineers that would love to contribute to the solution. Some aspects would need to be kept out of the public hands (lest, of course, some kid in the Bronx go joy-riding in Spirit using just some RadioShack spare parts). But the lion's share of the problem could be posted up for the best (dare I say it?) open-source solution to an engineering problem.
Bugzilla for NASA. I guess that's the best way to describe what I'm thinking.
---- Please be nice in case my Slashdot karma ~= my real life karma.
Spirit is on fucking Mars and I'm stuck here in the boondocks of Earth and I still only get 56k.
Damn it.
"The flash ram went bad."
Why does this not surprise me? I'd guess that SanDisk put in the low bid for that part.