Slashdot Mirror


Mars Rover Spirit Back Online

Skyshadow writes "Just in time for the arrival of its twin, the Spirit Mars Rover is back in working order. Programmers at the JPL have traced the problem to the rover's flash RAM, which it uses to maintain its filesystems. They are using a ramdisk in the rover's RAM to bypass the bad flash memory, and are working on a workaround for the bad flash. Good news, but the rover is still potentially weeks away from full operational status."

386 comments

  1. They found the problem by Anonymous Coward · · Score: 5, Funny

    They signed up for Mars Online with 3000 free hours. What they didn't realize was that the free 3000 hours only applied to the first month of service. Once they paid their MOL bill, they got hooked back up. All the probe's friends on Mars use MOL!

    1. Re:They found the problem by Anonymous Coward · · Score: 0
      You need to read this one.

      Now, that one was funny!

    2. Re:They found the problem by meiocyte · · Score: 1

      Is that a Phobos month (8 earth hours) or a Deimos month (30 earth hours)? Either way, that sounds like false advertising..=)

      --
      The thing in the box has no place in the language-game at all; not even as a something; for the box might even be empty.
  2. Weeks away? by adrianbaugh · · Score: 5, Funny

    They should boot faster, using linux. Then they'd only be ten seconds away :-)

    --
    "'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
    - JRR Tolkien.
    1. Re:Weeks away? by anakin357 · · Score: 0, Offtopic

      I found this rather funny.

      Just because it's humor from another story doesn't mean it's a troll.

      --
      http://www.fsckin.com/
    2. Re:Weeks away? by Anonymous Coward · · Score: 0

      While they may have booted faster, they would have avoided the downtime had they gone with MicroSoft, whose Trusted Comput--oh, wait...

    3. Re:Weeks away? by Anonymous Coward · · Score: 0

      *sigh* YACSAR (Yet Another Cheap Shot At Redmond). Can't Mr. Softy get no love?

    4. Re:Weeks away? by Anonymous Coward · · Score: 0

      Maybe from this guy.

    5. Re:Weeks away? by adrianbaugh · · Score: 0, Offtopic

      I'm glad someone read it as intended :-) Clearly the moderators haven't been reading the front page carefully enough. Oh well, I have karma to burn...

      --
      "'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
      - JRR Tolkien.
    6. Re:Weeks away? by Anonymous Coward · · Score: 0

      Nope, the internet says Mr. Cx (a Czech midget??) could not be found. Not even an Out of Office reply. *sigh*

  3. Checksums by Anonymous Coward · · Score: 1, Insightful

    Sounds to me like they need to send back checksums of the contents of the Flash memory and figure out if part of it got corrupted somehow. Then re-flashing that section would probably fix the problem.
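
    A minimal sketch of the checksum idea above, assuming the image is split into fixed-size blocks and a known-good copy exists on the ground. The block size and function names are hypothetical and have nothing to do with the rover's actual software.

```python
import zlib
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_checksums(image: bytes, block_size: int = BLOCK_SIZE) -> List[int]:
    """Return a CRC32 for each fixed-size block of a memory image."""
    return [
        zlib.crc32(image[offset:offset + block_size])
        for offset in range(0, len(image), block_size)
    ]


def corrupted_blocks(reference: List[int], reported: List[int]) -> List[int]:
    """Return indices of blocks whose checksum differs from the reference."""
    if len(reference) != len(reported):
        raise ValueError("checksum lists cover different numbers of blocks")
    return [i for i, (ref, got) in enumerate(zip(reference, reported)) if ref != got]


if __name__ == "__main__":
    good = bytes(range(256)) * 64   # 16 KiB known-good image, i.e. 4 blocks
    bad = bytearray(good)
    bad[5000] ^= 0x01               # flip one bit; offset 5000 falls in block 1
    print(corrupted_blocks(block_checksums(good), block_checksums(bytes(bad))))
```

    Only the short list of checksums has to cross the link, and only the blocks that differ would need to be re-sent.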

    1. Re:Checksums by Anonymous Coward · · Score: 0

      Holy crap you should work for NASA dude! You're the only one that figured it out!@#

      w00t 4 u d00d!

    2. Re:Checksums by webtre · · Score: 0

      No, some punk alien just shoved a gum stick wrapper in the USB port. On different OSes this does different things, but it always does something. Most major MS-based OSes will either reboot or freeze the system. So it probably got a gum wrapper jammed in there by some bored alien who wanted a cheap laugh. Silly earthlings.

      --
      litigious bastards
      suck it sco!
    3. Re:Checksums by Anonymous Coward · · Score: 5, Informative

      I'm watching NASA tv at the moment and they're explaining possibilities now. At the moment, they only have a very broad explanation of what's going wrong. However the newest knowledge is;

      There are two separate flash memories on Spirit. At the moment, the software can evidently still read at least part of them, because some of the operational software that is kept in flash RAM seems to be coming up before the system reboots.

      The system is rebooting no matter which flash memory is being accessed, it has the same bug both ways, so the flash ram itself looks to be OK, but the interface between the flash ram and the software looks to be causing resets.

      Even if there were more backup flash RAMs, it looks like they'd still have this problem. Perhaps many of them, all on different controllers, or even an entire backup computer, would have prevented this. At 100 watts of total power available for the rover, an entire extra computer may be a bit much to fit. But then sending two rovers would also negate such problems, and that's just what they've done.

      It seems most likely at the moment, according to NASA, that the family of components that are involved with the hardware addressing of the flash memories looks to be where the problem is.

    4. Re:Checksums by NanoGator · · Score: 1

      Could the rebooting be caused by solar activity or radiation or something along those lines?

      Just wondering if something unexpected in the environment could be the cause. The thought of sending people to Mars is a little scary if random bits get flipped.

      --
      "Derp de derp."
    5. Re:Checksums by questamor · · Score: 1

      The problem causing the rebooting may have been triggered by radiation, but by avoiding certain parts of the flash ram the machine boots up fine (if somewhat crippled), so it's avoidable.

    6. Re:Checksums by webtre · · Score: 0
      The problem causing the rebooting may have been triggered by radiation

      No, like I said, the natives over there on Mars need to take whatever they shoved into the USB port out of the USB port. It's funny that all you have to do to make a computer freak out and reboot is simply shove something conductive into the USB port. Paperclip anyone?

      --
      litigious bastards
      suck it sco!
    7. Re:Checksums by uberdave · · Score: 1

      The rovers are no doubt using milspec chips which are radiation hardened. Don't forget, they have an onboard nuclear heater.

    8. Re:Checksums by pjotrb123 · · Score: 1

      FYI, the problem is bad Flash RAM (like a memory stick), and not bad Flash ROM (like a PC BIOS).

      --
      I liked my next sig a lot better
  4. You mite listen to Jimmy, But you can't hear Jimmy by niko9 · · Score: 5, Funny

    /riff/Move over Rover, let the ramdisk take over!/riff/

    Wonder where they got their flash RAM from?

    --

  5. Mars Defense System... by TheKidWho · · Score: 0, Redundant

    It's a Martian virus...

  6. Not "online" at all... by |>>? · · Score: 0, Informative
    The rover status has been updated from critical to serious. Peter Theisinger stated:
    "We made good progress overnight and the rover has been upgraded from critical to serious. We have a working hypothesis we are pursuing that is consistent with many of the observables and consistent with operations that we performed on the vehicle last night. It involves the flash memory on the vehicle and the software used to communicate with that memory."

    You can read all about it at: Spaceflight Now - where you can continue to follow the status of both spirit and opportunity (which currently is hours away from landing).
    --
    |>>? ..EBCDIC for Onno..
    1. Re:Not "online" at all... by Mr.+Darl+McBride · · Score: 3, Insightful
      You can read all about it at: Spaceflight Now - where you can continue to follow the status of both spirit and opportunity

      Nicely karma-whored. That's the link from the article. :)

    2. Re:Not "online" at all... by |>>? · · Score: 0, Redundant

      How this reply came to be:

      I'd just been reading Spaceflight Now, switched back to /., read the posting, thought to myself: "Hmm, that's not what I just read.", hovered my mouse over the last link - even clicked it - saw CNN, read the caption thought, "Yup, that's just what I read, but the posting is wrong.", bashed in my response, even hit Preview, then hit Submit, then looked again, then noticed that my link was the same as the one in the post, thought - "Hmm, ahh well, so much for that contribution.", then noticed that people were moderating it up....

      Thanks anyway...

      --
      |>>? ..EBCDIC for Onno..
    3. Re:Not "online" at all... by DaveAtFraud · · Score: 0, Offtopic

      And the funny thing, as you noted, is that your original post got modded as "Informative" which means our beloved moderators don't even RTFA before modding.

      Go ahead. Mod me down as "off topic". I've got karma to burn. But only if *YOU* RTFA.

      --
      They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
      Ben
  7. Warranty by DarkHelmet · · Score: 5, Funny
    They are using a ramdisk in the rover's RAM to bypass the bad flash memory, and are working on a workaround for the bad flash.

    I think they should return the bad flash part to where they got it and exchange it for a new part... although getting the memory back to the store by the 30 day warranty might be a little difficult.

    I hope they bought the extended warranty.

    --
    /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
    1. Re:Warranty by Albinoman · · Score: 5, Funny

      The real question is: Can they get their flash RAM supplier to pay for shipping?

    2. Re:Warranty by evilnissan · · Score: 3, Funny

      Now Nasa has just to wait for Tigerdirect.com to send a replacement, or get store credit..

      --
      This Sig for rent.
    3. Re:Warranty by questamor · · Score: 4, Insightful

      Curiously, is there any difference with flashram on Spirit, and the stuff we have here? I didn't know about any radiation hardened flash ram... or even if there's any difference between the physical chips themselves in CF, SD, MemorySticks etc.

      The nasa report mentioned the problem seems to be revolving around the software that accesses the flashram. It could be filesystem corruption, or a physical problem with the flash ram itself, or even a broken interface to the flash ram. It's about the equivalent of having a machine a thousand miles away and just seeing that a certain drive won't mount, at the moment. Finding out whether there's a problem with the SCSI card it's connected to, or the drive itself, or a filesystem corruption, or a head crash... that comes in the next few weeks

    4. Re:Warranty by Anonymous Coward · · Score: 1, Funny

      If they got it at Fry's electronics, they might take it back and put it on the shelf again for some other sucker to buy.

    5. Re:Warranty by Anonymous Coward · · Score: 0

      Of course - they use (Mars) Express Saver, and it's free!

    6. Re:Warranty by Anonymous Coward · · Score: 0

      If they got it at Fry's electronics, they might take it back and put it on the shelf again for some other sucker to buy.

      Maybe Beagle used Polar Lander's stuff.

    7. Re:Warranty by dnoyeb · · Score: 1

      I think without the protection of the earth's atmosphere there is lots of bombardment of electrical waves and other such phenomena.

    8. Re:Warranty by colmore · · Score: 1

      Probably went with the pricewatch.com low bidder. Last time I did that, I got some bad RAM too.

      --
      In Capitalist America, bank robs you!
    9. Re:Warranty by MrFreshly · · Score: 1

      I don't know...did they buy on-site service with that??

    10. Re:Warranty by mindstrm · · Score: 1

      Not the atmosphere.... that does little.

      It's the earth's magnetic field that saves us from turning into a giant bbq...

  8. Maybe just maybe... by TheMadPenguin · · Score: 2, Funny

    it was their AOL bill that wasn't paid? hmmmm...

    --
    Linux with kernel panic...
    MadPenguin.org
  9. heh... /. was right! by Smitty825 · · Score: 5, Interesting

    During all of the "Spirit is broken" columns, I kept reading /. comments saying that it was likely a memory error due to the non-consistent errors...I guess a million monkeys with a typewriter can be correct :-)

    --

    Doh!
    1. Re:heh... /. was right! by AndroidCat · · Score: 4, Funny

      I thought I got it rather spot on. :^P (I guess that makes me the millionth monkey?)

      --
      One line blog. I hear that they're called Twitters now.
    2. Re:heh... /. was right! by cubicledrone · · Score: 4, Insightful

      Amazing, isn't it? Writing comments correctly debugging an $800 million spacecraft on another planet without even looking at it, and most programmers still can't rent a fuckin' job.

      Now let's all sing the company song...

      --
      Business isn't willing to pay for products, innovation and careers, so we get brands, mortgage commercials and layoffs.
    3. Re:heh... /. was right! by iminplaya · · Score: 1

      And here I was thinking it was just one of those cheap Taiwanese capacitors that keep popping up (so to speak).

      --
      What?
    4. Re:heh... /. was right! by Anonymous Coward · · Score: 0
      most programmers still can't rent a fuckin' job.

      RENT a job? What the hell does that mean? Never heard anyone use the phrase "rent a job" in my life. Anyway...

      MOST programmers won't get a fucking job because there's simply too many of you even 3 years after the dotcom bust. Switch to plumbing, or nursing, or construction, or something bluecollar, because the productivity gains and offshoring will continue to get worse/better (depending on how you look at it).

    5. Re:heh... /. was right! by kfg · · Score: 1

      Your manner is offensive. (And yes, there are fields where paying to work is the norm, the word "rent" is used in referring to such, and more people are waiting to do so than there are positions.)

      Your conclusion, however, is correct.

      KFG

    6. Re:heh... /. was right! by DAldredge · · Score: 1

      Those jobs are earmarked for the new blue-card workers that Bush is creating. You know an unlimited H1B like program with no limits on wages or # allowed?

    7. Re:heh... /. was right! by anubi · · Score: 2, Interesting
      Yeh.. hats off to you guys!

      I was barking up the tree of spike on the power supply. For the exact same reasons.

      I have had power supplies or bypassing go bad due to the increasing ESR of aging capacitors, and by golly they come up with the damnedest intermittent failures you would ever want to see. They will have you debugging every process in the system until you put a storage oscilloscope on the power supply line and watch it like a hawk.

      --
      "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]

    8. Re:heh... /. was right! by More+Trouble · · Score: 4, Funny

      Now let's all sing the company song...

      "Oh, say can you see..."

      :w

    9. Re:heh... /. was right! by Anonymous Coward · · Score: 0

      MOST programmers won't get a fucking job because there's simply too many of you even 3 years after the dotcom bust.

      Horseshit. The entire economy depends on computers. The reason programmers can't get jobs is because management lacks the wisdom to reward competence and education, and instead seeks the all-important money-grab.

    10. Re:heh... /. was right! by Anonymous Coward · · Score: 0

      Mod insightful if I had the mod points to do it

    11. Re:heh... /. was right! by Myopic · · Score: 1

      damn right.

    12. Re:heh... /. was right! by argStyopa · · Score: 4, Funny

      How hard is that, really?

      Thousands of /. posters solve all the world's problems in a few snide lines of comment, despite rarely leaving their little veal-fattening pens or even RTFA. Fixing a software glitch a few million miles away is child's play in THIS neighborhood, my friend.

      --
      -Styopa
    13. Re:heh... /. was right! by elpapacito · · Score: 1

      Maybe because there are a million McDonald's-didn't-want-em-learn-C++-in-2-weeks programmers and only one hundred thousand programmers with more than x thousand hours of programming experience? You know, they shouldn't write "will code for food" on their signs, but rather "will rejoin cooking workforce".

      On a tangent: the market being driven by pointy haired bosses doesn't help, either.

    14. Re:heh... /. was right! by HermanAB · · Score: 2, Insightful

      Company song? "You load 16 tons, what do you get? A little bit older and deeper in debt..."?

      --
      Oh well, what the hell...
    15. Re:heh... /. was right! by benpark22 · · Score: 1

      NASA should make their rover software open source, so programmers around the world can debug for them.

    16. Re:heh... /. was right! by Old+Wolf · · Score: 1

      I always thought that was "Jose can you see?"

  10. The mission is not yet out of danger by Space_Soldier · · Score: 1, Insightful

    The status has been upgraded from critical to serious condition. Opportunity will most likely have the same problem since they are twin brothers and had an identical build process. They'd better figure out what is wrong with this rover before sending Opportunity to investigate its part of Mars.

    1. Re:The mission is not yet out of danger by endersdouble · · Score: 1, Insightful

      Y'know, I don't think you are right about that, actually. Defective flash RAM just happens sometimes...just because the same parts and process build Opportunity does not mean that the same part will have a flaw. What they should have done is tested the ram on earth, but even so, most likely it'll be fine (we hope.) And even if it isn't, what are you going to do now?

    2. Re:The mission is not yet out of danger by dnoyeb · · Score: 1

      What they should have done is have a backup system. Such as everything that flies in the US does. WTH else is all the expense for?

    3. Re:The mission is not yet out of danger by Aardpig · · Score: 3, Funny

      Opportunity will most likely have the same problem since they are twin brothers and had an identical build process.

      I quote from my post a couple of days ago:

      Parent: So even if Spirit gives up the ghost, her kin can carry on the flame (albeit in a less interesting location).

      Me: Not if the problem is due to a design fault. That's the drawback of sending multiple identical probes: if one is intrinsically fucked, they all are.

      I now bask, contented, in the glow of my own brilliance....

      --
      Tubal-Cain smokes the white owl.
    4. Re:The mission is not yet out of danger by damiam · · Score: 1

      I'm sure the ram had the shit tested out of it on Earth, but testing on Earth and operation on Mars are two different things.

      --
      It's hard to be religious when certain people are never incinerated by bolts of lightning.
    5. Re:The mission is not yet out of danger by Anonymous Coward · · Score: 0
      Are they dying the TWIN T-rOWERS Spirit and Opportunity? One of them is dying, and the other is begining to *#%#%@%!.

      open4free

  11. The epitome of remote administration by Faust7 · · Score: 4, Interesting

    Engineers guessed that Spirit's troubles were in its Flash memory and set about sending the rover a complex series of instructions to see if they could get it to bypass the corrupted memory. Theisinger said engineers sent Spirit a command just before its daily "waking up," telling it to shut down and restart in what is known as "cripple mode," using RAM instead of Flash for its start-up instructions.

    Some people may take this sort of thing for granted, but I for one find it remarkable that we can essentially reboot and perhaps even fix a system that is on a whole other planet.

    Just wait until we have Interplanetary, Interstellar, Intergalactic Remote Desktop. I'm only half-joking.

    1. Re:The epitome of remote administration by Daychilde · · Score: 5, Funny

      It's all good until tech support says, "So... Do you have a boot disk?" :-)

      --
      A cheerful little bird is sitting here singing.
    2. Re:The epitome of remote administration by Anonymous Coward · · Score: 0

      I for one find it remarkable that we can essentially reboot and perhaps even fix a system that is on a whole other planet.

      I second that - I just installed an Elmo educational program on my kid's PC and now Windows is TOTALLY fucked up; I don't think even JPL could save it.

    3. Re:The epitome of remote administration by Molander · · Score: 1

      I'm sorry to seem so negative, but most of the time the server room seems to be on a different planet. We butterfingered consultants never seem to be trusted enough to do real work^W^W^Wdamage....

      Nation of Whatever / Tm.

      --
      -Sig-
    4. Re:The epitome of remote administration by Skuld-Chan · · Score: 1

      It reminds me of when I was trying to get the automount daemon working on a Linux machine down in California (I was in Oregon) and inadvertently caused the machine to kernel panic.

      Ended up having to call someone who worked at the machine room to track down the crashed system and restart it :(.

    5. Re:The epitome of remote administration by vrmlknight · · Score: 1

      Or maybe we don't like to let some guy who is only going to be here for 30-90 days, whom we just met, get access to a room where, if someone trips over a wire or unplugs something, it could take down critical components of the company until it's brought back up.

      I build and support servers, and there is hardly ever a reason to be in the server room: only to upgrade hardware or completely rebuild a system, none of which some consultant should ever be doing.

      --
      This must be Thursday, I never could get the hang of Thursdays.
    6. Re:The epitome of remote administration by Anonymous Coward · · Score: 0

      Some people may take this sort of thing for granted, but I for one find it remarkable that we can essentially reboot and perhaps even fix a system that is on a whole other planet.

      Making a system that you can reboot remotely is easy. But making a system that you can power cycle is harder.

    7. Re:The epitome of remote administration by blincoln · · Score: 4, Interesting

      It's all good until tech support says, "So... Do you have a boot disk?" :-)

      You joke, but newer servers can do this remotely too.

      We have a bunch of Compaq servers at work, and one of the really cool features of the remote administration software is that you can send a virtual floppy image to the machine from anywhere in the world that can open a web browser connection to the server's remote administration board.

      A few months ago one of our servers in Denver died, and I had to boot it up in Windows 2000's command prompt only safe mode... but the local admin password had never been written down. I was able to make virtual floppy images of a tool that resets the local admin password, send them over the wire, and boot off of them from the remote administration system.

      Okay, it's not fixing a super-expensive robot on another planet, but I thought it was pretty cool.

      --
      "...always new atoms but always doing the same dance, remembering what the dance was yesterday." -Richard Feynman
    8. Re:The epitome of remote administration by Molander · · Score: 1

      I agree, as I am the guy that will only be there for 30 - 90 days.

      The problem is that sometimes (most of the time) the customer thinks that I know more than the sysadmins about their local environment and that I will somehow know how to fix all their problems using the magic of NEW software...

      What they (hopefully) have not realised is that new software means "more money on software". But then my customers have a lot of money to spend if they get what they want...
      --
      -Sig-
    9. Re:The epitome of remote administration by JumboMessiah · · Score: 2, Informative

      Put this in /etc/sysctl.conf

      kernel.panic = 120

      That will tell the kernel to reboot itself 2 minutes after a panic. It has saved me in the past before :).
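
      A small sketch for checking that setting from a script, assuming only the usual `key = value` layout of a sysctl.conf-style file with `#` or `;` comment lines. The function names are made up for illustration.

```python
def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and '#' or ';' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line; ignore it
        settings[key.strip()] = value.strip()
    return settings


def panic_reboot_delay(settings: dict) -> int:
    """Seconds the kernel waits before rebooting after a panic (0 = no reboot)."""
    return int(settings.get("kernel.panic", "0"))


if __name__ == "__main__":
    sample = """
    # reboot two minutes after a panic
    kernel.panic = 120
    """
    print(panic_reboot_delay(parse_sysctl(sample)))
```

      The file is only read at boot or when reloaded with `sysctl -p`, so a freshly edited value is not live until then.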

    10. Re:The epitome of remote administration by Daychilde · · Score: 2, Funny

      Of course I joke... heh. Mostly because of my past jobs working tech support, and the guy sitting next to me one time who tried to get a customer to type "a colon space setup dot exe" for about 5 minutes until I was off my call, heard what he was doing, and slapped him silly.

      Well, okay, I didn't slap him, but I wanted to. Badly. :-)

      But on your response -- that works. I mean, if you're doing something that you could just about do on another planet, it should count. Maybe not so glorious, but still. :-)

      --
      A cheerful little bird is sitting here singing.
    11. Re:The epitome of remote administration by foniksonik · · Score: 1

      Totally off-topic but isn't that the biggest security hole ever? I mean you just reset the local admin password remotely... was there any security before you were allowed to do this? Could anyone do this who knew the ip address?

      --
      A fool throws a stone into a well and a thousand sages can not remove it.
    12. Re:The epitome of remote administration by Psychotext · · Score: 3, Interesting

      Oh god... I really, really hope you have a superb firewall & username & password blocking that machine off from the world. I did just read you right, you didn't have the admin password, so you used a tool over the remote administration to hack past it?

      Mmmm... hackalicious. :-)

      (I've actually used a similar remote kvm system with lights out boards but until you write it down it just doesn't sound that risky!)

      --
      People that believe in their opinions don't post AC.
    13. Re:The epitome of remote administration by Catbeller · · Score: 2, Funny

      I can only smile as I recall the heady days of the PC revolution. All the ancient Big Iron sealed away in a hermetically sealed room, and all the expensive and unapproachable priesthood that tended and worshipped the Iron, would be sent packing. The whole shebang replaced by inexpensive PC's controlled by the user, O glorious day! All the expense and complexity, gone!

      Snicker. Meet the new iron, same as the old iron.

    14. Re:The epitome of remote administration by Anonymous Coward · · Score: 1, Informative

      I'm not the grandparent poster, but RILOE uses a separate admin system that's secured. We personally keep ours on a separate LAN that can only be reached by a VPN for even more security. It's a pretty neat system though.

    15. Re:The epitome of remote administration by vrmlknight · · Score: 1

      Have you been in a server room lately? There is still expensive hardware in there. But instead of 1 big huge box that was $1-5 mil, we now have 200+ 1U servers each worth $3,000-6,000, several DB boxes at $10,000-$30,000, the ever-present SAN at $250,000-$500,000, and other various network hardware. And the end users hardly have any more control than they used to. Your e-mail account sits on one of those systems, as do your important files and e-mail account permissions.

      --
      This must be Thursday, I never could get the hang of Thursdays.
    16. Re:The epitome of remote administration by IgnoramusMaximus · · Score: 2, Insightful
      All the expense and complexity, gone!

      I assume you are being really, really sarcastic. In reality the PCs multiplied complexity and expense by orders of magnitude. And only now, after decades of chaos and misery, all the turd-brained suckers/managers who were responsible for this snake-oil sales bonanza which created the likes of Microsoft are retreating to the only sane method of enterprise computing: centralized storage and processing. After billions of dollars wasted and con-men made very rich, the circle is now complete. Of course the managerdiots are now seeing this old idea as "new" after someone smart re-labeled those old concepts with new sales tags like "thin client" or "data warehousing" etc. They would never have allowed someone to hint that they have been taken for a ride and the "priesthood", unlike them, actually had a clue. It is a sad testament to the depths of human stupidity.

      PCs are the greatest thing ever for game players, home computer users and many other applications like engineering or science. They make no sense on administrative workers' desks, yet those presently constitute something like 80% of business PC deployment and come with hordes of MCSEs without whom they would come to a grinding halt within days.

    17. Re:The epitome of remote administration by Anonymous Coward · · Score: 0
      In some ways it's not surprising. It's the same thing we do with our satellites that are only a couple hundred miles away.

      But to be honest, I'm still impressed by it, even though sometimes it seems the only thing that happens on earth is making sure that part works so we can actually fix it for its real mission when it gets on orbit. :)

      I'd be more impressed if my home machine were that stable to firmware upgrades.

    18. Re:The epitome of remote administration by sjames · · Score: 1

      But making a system that you can power cycle is harder.

      :-) But actually I know of several ways. Some watchdog cards will cycle power if the CPU doesn't periodically tell them not to. Several Intel chipsets have a bit you can flip for a 3 second power cycle. Newer server boards have a separate embedded system manager that can cycle power for you. The manager runs on the 3.3 volt standby power along with the wake on LAN etc.
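
      The watchdog behaviour described above can be modelled in a few lines. This is a software toy rather than any real card's interface: the class and method names are hypothetical, and time is passed in explicitly so the logic is easy to follow.

```python
class Watchdog:
    """Toy model of a watchdog timer; the caller supplies the current time."""

    def __init__(self, timeout_s: float, now: float = 0.0):
        self.timeout_s = timeout_s
        self.last_kick = now
        self.power_cycles = 0

    def kick(self, now: float) -> None:
        """Called periodically by a healthy host to postpone the power cycle."""
        self.last_kick = now

    def poll(self, now: float) -> bool:
        """Return True if the timeout expired and a power cycle was triggered."""
        if now - self.last_kick >= self.timeout_s:
            self.power_cycles += 1
            self.last_kick = now  # restart the countdown after the cycle
            return True
        return False


if __name__ == "__main__":
    wd = Watchdog(timeout_s=10.0)
    wd.kick(now=5.0)             # host is alive at t=5
    print(wd.poll(now=14.0))     # 9 s since the last kick: no action
    print(wd.poll(now=15.0))     # 10 s since the last kick: power cycle
```

      The point of the design is that a hung host cannot prevent the reset, because the reset happens when the host fails to act.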

    19. Re:The epitome of remote administration by Old+Wolf · · Score: 1

      and Microsoft makes it controllable via unencrypted tcp/ip. I'm only half-joking too

    20. Re:The epitome of remote administration by Anonymous Coward · · Score: 0

      Never has a Slashdot monicker been ever so apt.

      Yes, Virginia, he was being sarcastic, as any 5 year old could guess.

  12. So basically... by cperciva · · Score: 4, Funny

    If I understand this properly, they've got a damaged filesystem on the flash RAM. Not really a big problem, you just have to send someone over to the console to boot it up in single-user mode and run fsck. ... oh yeah, sending someone over to the console is a little bit difficult here. :)

    1. Re:So basically... by AKAImBatman · · Score: 1

      They must be using Sun's OpenBOOT. No need to be anywhere near the machine. ;-)

    2. Re:So basically... by Anonymous Coward · · Score: 0

      They are using the VxWorks OS. It looks like they are also using the VxWorks flash/file system drivers, which are crap. There are dozens of TSRs logged for flash/file system problems. I had to write my own flash driver for VxWorks to fix many problems.

  13. Where is the redundancy? by MWChapel · · Score: 3, Interesting

    Shouldn't they have like 5 Flash RAMs? Really, they shouldn't have one of anything. In my computer if my BIOS fries, I pop open the box and replace it. If it fries on Mars, obviously I kiss my megamillion dollar project goodbye, all for a $5 Flash ROM.

    1. Re:Where is the redundancy? by cperciva · · Score: 4, Insightful

      It's not just a $5 flash ROM. If they wanted control redundancy, they would need extra flash RAM, RAM, ROM, CPU, motherboard, arbitration hardware, and arbitration software.

      Also keep in mind that this isn't a $5 flash ROM chip. When you consider the hostile environment, the testing, the power, and the fuel required to get everything to Mars, that flash ROM probably cost at least fifty thousand dollars.

    2. Re:Where is the redundancy? by MajorDick · · Score: 1

      Heck, even some current motherboards have dual BIOS in case of failure. You're right, they should have at least double, but really should have triple redundant systems.
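
      One common way to use triple redundancy is majority voting: read all three copies and accept the value at least two agree on. The sketch below is generic and is not a claim about how any spacecraft does it; `majority_vote` is a name made up for illustration.

```python
from collections import Counter


def majority_vote(copies):
    """Return the value held by a strict majority of the copies, else raise."""
    if not copies:
        raise ValueError("no copies to vote on")
    value, count = Counter(copies).most_common(1)[0]
    if count * 2 <= len(copies):
        raise ValueError("no majority among copies")
    return value


if __name__ == "__main__":
    # Three reads of the same word; the third copy has a flipped bit.
    print(majority_vote([b"\x2a", b"\x2a", b"\x2b"]))
```

      With only two copies a mismatch can be detected but not resolved, which is the usual argument for going from double to triple.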

    3. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      " In my computer if my BIOS fries, I pop open the box and replace it."

      Yes, because popping open the box and replacing it on Mars is so easy, isn't it?

    4. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      Where was the time? I totally agree with you, but since they were pressed for time I'm betting redundancy was satisfied (in their minds) by the second vehicle.

      Also consider weight. That was one of their biggest concerns, at least in their interviews on that PBS special aired a few weeks ago.

    5. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      My thoughts exactly.

      1) Why just 256 MB of flash RAM? It's not like 512 MB or 1 GB of flash is much heavier.
      2) Why not 2x flash RAM (mirrored)? Mirroring firmware isn't that complex, is it?
      3) Why only 128 MB of volatile memory?

      Why do I feel like my old Commodore 64 just got sent to Mars?

      They should put my boss on the next Mars project. Damn DB project needs 1GB memory and 40GB Disk space, he wants a Quad proc cluster, 4GB memory, and 500GB array! (And a disaster recovery site!)

    6. Re:Where is the redundancy? by mykepredko · · Score: 1

      Shouldn't they have like 5 Flash RAMs?

      My guess is that they have at least two with some kind of arbitration circuitry.

      I don't know anything about the architecture of the computers on the Rover(s), but I suspect when the term "Flash RAM" is used, they are talking about the redundant Flash memory, the mux/demux and arbitration circuitry. This means that if something on the Flash memory subsystem fails, it is simply described as a "Flash RAM" problem. I would suspect that the Flash memory would be considered to be a lot less reliable than the mux/demux/arbitration electronics.

      Modern spacecraft seem to use rad-hardened versions of commercial processors which do not have the completely separate dual memory channels of the AP101s used in the shuttle orbiters.

      myke

    7. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      As Pete mentioned today, they do have redundant flash memory modules, they are 256 Megabyte modules. In fact, the system was switching between each flash memory module each time it tried to boot up, and each time encountered the same fault during the boot sequence.

      Whatever the cause, the issue was written out to both modules. He went on to explain that this to some degree favors a software error, but it could still be an issue in a common interface between the modules (main board?) and the processor.

      Another possibility is that since a portion of the flight software also resides in the flash ram, it may simply be that when the system fully boots from flash ram, it is trying to access another totally unrelated rover component.

      Running via the 128 Megs of RAM is somewhat like using Windows in safe mode. It usually boots this way, but most of your external devices are not enabled since it doesn't bother trying them.
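
      The boot behaviour described above (alternate between the flash modules, and fall back to RAM when both fail) can be sketched roughly as follows. This is only an illustration: the bank names, the `mount` callable, and the returned strings are hypothetical and say nothing about the actual flight software.

```python
def boot(banks, mount):
    """Try each flash bank in turn; fall back to a RAM-only mode if all fail.

    `banks` is a list of bank names and `mount` is a callable that raises
    OSError when a bank cannot be brought up. Both are hypothetical stand-ins.
    """
    for bank in banks:
        try:
            mount(bank)
            return f"normal boot from {bank}"
        except OSError:
            continue  # same fault on this bank, try the next one
    return "degraded boot from RAM (flash skipped)"


def faulty_mount(bank):
    # Simulates the reported behaviour: every bank hits the same fault.
    raise OSError(f"fault while mounting {bank}")


print(boot(["flash_a", "flash_b"], faulty_mount))
```

      If the fault lives in software or in a shared interface, a second bank does not help, which matches what was reported here.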

    8. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      Hey, Mr. Know-it-all. When these things are designed YEARS in advance, you don't get the luxury of brand-new technology. This isn't an Unreal Tournament box they built to play games.

    9. Re:Where is the redundancy? by fermion · · Score: 2, Informative
      An increased number of components means increased complexity. Increased complexity means increased cost to maintain reliability. Cost increases much more than linearly. For non-human missions, extra components are not justified.

      Using redundant low reliability components is the cheap office solution, not the space exploration solution.

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    10. Re:Where is the redundancy? by Anonymous Coward · · Score: 1, Insightful

      I'm watching NASA tv at the moment and they're explaining possibilities now. At the moment, they only have a very broad explanation of what's going wrong. However the newest knowledge is:

      There are two separate flash memories on Spirit. At the moment, part of the problem is software which can read part of the flash memories as some of the operational software which is kept in flash ram seems to be coming up before the system reboots.

      The system is rebooting no matter which flash memory is being accessed, it has the same bug both ways, so the flash ram itself looks to be OK, but the interface between the flash ram and the software looks to be causing resets.

      Even if there were a dozen 1 GB flash RAMs, it looks like they'd still have this problem. Perhaps it would take many, all on different controllers, or even an entire backup computer. At 100 watts of total power available for the rover, an entire extra computer may be a bit much to fit.

      At the moment the family of components that are involved with the hardware addressing of the flash memories looks to be where the problem is.

    11. Re:Where is the redundancy? by tftp · · Score: 1
      that flash ROM probably cost at least fifty thousand dollars

      Talk about poor return on investment... Use 10 cheaper ones and have more forward error correction than you will ever need.

    12. Re:Where is the redundancy? by DuranDuran · · Score: 2, Funny

      > but really should have triple redundant systems.

      I laugh at your puny triple redundant systems!! They should have QUADRUPLE redundant systems!!!

      --
      "You can justify anything by putting it in quotes, adding a famous name and making it a sig" - Albert Einstein
    13. Re:Where is the redundancy? by ctr2sprt · · Score: 1
      If they didn't have redundancy, they wouldn't be able to do what they are now. For starters, the rover is still operating. It's waking up and going to sleep. Secondly, it was able to detect the error, although it couldn't correct it automatically. Thirdly, it was able to transmit to its bosses at NASA that it was having trouble. And finally, it's still able to receive signals and commands of sufficient complexity to re-bootstrap itself without using the defective hardware.

      Seriously, you just shouldn't assume the people at NASA are stupid. If it's obvious to you, it was obvious to them. For all you know, there are five mirrored flash disks on the rover. And if there aren't, I'm willing to bet there's an incredibly good reason.

    14. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      It's not unheard of to have 7 levels of redundancy on the Space Shuttle.

    15. Re:Where is the redundancy? by Anonymous Coward · · Score: 0

      "Your name is Rover and you sputter just like Janet
      just like yo' muthu twisting 'cross a dusty planet"
      --Richard Feynman.

      -8, 2^3 Redundant (systems)

    16. Re:Where is the redundancy? by Hawkxor · · Score: 2, Interesting

      First of all: obviously they've thought of that. Adding to what others have said in their responses tearing the parent apart, I'd like to mention that the problem is probably not due to a general defect in the RAM card. It probably has to do with the conditions on Mars, the landing, etc - in which case the same problem would be affecting all of the (even redundant) Flash RAM cards: so it really is amazing that they got this working at all.

    17. Re:Where is the redundancy? by BoldAC · · Score: 1

      Redundancy works well if you have "limitless" power. A completely redundant system here would use twice the power (give or take). Storage of power here is the weakest link.

      Of course, they do have the ultimate redundant system... and it will be landing in a couple of hours.

      AC

    18. Re:Where is the redundancy? by vt0asta · · Score: 1

      Talk about poor return on investment... Use 10 cheaper ones and have more forward error correction than you will ever need.

      All 10 of them worthless if the silicon cracks at sub-zero temps, intense temperature differentials, extreme vibration, extreme gravitational forces, etc. You don't sound like you understand or read what "hostile environment" means.

      --
      No.
    19. Re:Where is the redundancy? by DarthTaco · · Score: 1

      Redundancy works well if you have "limitless" power. A completely redundant system here would use twice the power

      Cold sparing allows redundancy without using much more power than a non-redundant system.

    20. Re:Where is the redundancy? by esanbock · · Score: 1

      Really? We're talking about the same people who lost a Mars orbiter due to a metric/English conversion problem! They're morons!

    21. Re:Where is the redundancy? by Anonymous Coward · · Score: 0
      " In my computer if my BIOS fries, I pop open the box and replace it."

      Yes, because popping open the box and replacing it on Mars is so easy, isn't it?

    22. Re:Where is the redundancy? by grozzie2 · · Score: 2, Insightful
      Really, they shouldn't have one of anything

      I think you folks all missed the point completely. They have full dual redundancy on EVERYTHING in the MER program. The computer systems are not the only concern; there are also little issues like landing in one piece, etc. To that end, they built 2 full systems, packaged them on 2 different rockets, and fired them off a month apart from each other. This gave full dual redundancy to every system and every component, from the initial launch igniters to every bit of hardware that landed on the surface. Then, to maximize the redundancy, they set them to land on different halves of the planet: serious physical isolation of component set A and component set B.

      If one complete set of hardware arrives on the surface and returns scientific data, the mission is considered a success. The real difference with this program is that they went with dual redundancy in everything, from launchers to arrival, not just separate systems mounted on the same physical hardware host.

      This type of full redundancy does make a lot of sense when you consider that the highest-risk portion of the mission is the entry and landing phase, followed closely by the launch phase. Dual redundant systems mounted on the same rover platform may well give better chances of success during the surface portion of the mission, but leave a huge single point of failure during launch, and another one during entry and landing.

      Take a peek over at NASA TV, and you will see what real mission redundancy is all about: the second lander is about to enter the Martian atmosphere.

    23. Re:Where is the redundancy? by Myopic · · Score: 1

      i definitely read a summary of the mission which stated that all the major systems on the rover had redundant parts, so i don't know what the problem with the memory is. in any case, NASA has always put backups of important parts into its spacecraft, and they have often been needed when primary units failed.

    24. Re:Where is the redundancy? by MurphyZero · · Score: 1
      Use 10 cheaper ones and have more forward error correction than you will ever need.

      Don't forget they are under power constraints (already mentioned) and also very serious weight constraints. It had to be launched in the first place. Start throwing 10-times redundancy on basic equipment and you're tossing the science equipment overboard. Before long you have a highly redundant brick, for all it is useful for. Or you have a humongous, heavily redundant, HEAVY piece of scientific equipment that requires two Saturn Vs strapped together to get it into low Earth orbit. These trade-offs are a necessity of life in designing a space mission.

      However, I will also say that NASA has an extreme history of misplaced optimism. Their risk analyses are often extensive, but their probabilities are frequently off by an order of magnitude or more. (NASA estimated the probability of catastrophic failure for early shuttle launches at somewhere between 400 to 1 and 10000 to 1; current estimates should be between 50 to 1 and 100 to 1. However, NASA will likely evaluate it closer to 200 to 1.) The other common NASA problem is to dismiss problems as negligible dangers. Foam strikes, anyone?

      So when planning this mission, I would guess that NASA did realize the potential problems with the memory and either went "the chance of it happening is 100,000 to 1" or "We have built-in redundancy, no problem" or even "If those first two statements fail us, we can fix it in less than an hour and be back on mission."

      --
      Our founding fathers removed the guys in charge. Be American. Vote incumbents out.
    25. Re:Where is the redundancy? by xiangpeng · · Score: 1

      Well, the "redundancy" measure is that they sent 2 instead of 1 of those rovers to Mars I guess :)

      --
      You must defeat Sheng Long to stand a chance.
    26. Re:Where is the redundancy? by paul248 · · Score: 1

      They do have redundancy. It's named "Opportunity"

    27. Re:Where is the redundancy? by johnburton · · Score: 1

      From what I can gather they did have redundancy here, in that they had two banks of flash memory. And writing to either of those banks caused the reboot problem.

      --
      Sig is taking a break!
    28. Re:Where is the redundancy? by tftp · · Score: 1
      I know a little about these things, but I left that company many Moons (or Marses?) ago. Most of the conditions that you refer to are not common on Mars.

      • Sub-zero temperatures are bad for any equipment, and satellites usually have their own thermal insulation, heat sinks and internal airflow to keep the temperature optimal. I don't think Spirit cools down to 150K at night. That would kill the batteries for sure. Temperature differentials are also eliminated this way.
      • The strongest vibration occurs on launch, and there are many successful designs that take care of that (pretty much any satellite which has any electronics :-)
      • The strongest G forces, as far as Spirit is concerned, are on Earth.
      • Radiation is a problem. Mars has no magnetic field or dense atmosphere, and so it is unprotected. I would suggest good shielding, besides redundant design and self-repair capabilities.

      In any case, 10 flash chips wouldn't occupy much more space than one. If you can get them there (rad hardened or not) then you benefit. Another issue is that Flash is a poor choice for this mission anyway...

    29. Re:Where is the redundancy? by vt0asta · · Score: 1

      Look, you are making a really weird point, and you didn't follow what the previous poster said.

      The cost of getting a flash chip to Mars is probably $50k, making each flash chip $50k. I believe he originally cited $1k as the actual cost to manufacture. Now add in the weight of the chip and related circuitry x distance x fuel and you come up with the flash chip's total cost. They probably have the flash nulled with zeroes; I think ones weigh more, but I'm not sure (*smirk*). If your plan is to add 10 of them, then multiply $50k x ~10.

      Anyway, I'll take your word on the sub-zero temp insulation, as for the launch, that's precisely what I was talking about, and as for G forces you are forgetting launch and landing both are greater than 1G. Radiation would probably be done by shielding as you stated, but now it has to protect the size of 9 more chips.

      Anyway, this whole thread is moot. There is redundancy. Where? A whole 'nother rover on the opposite side of the planet.

      --
      No.
  14. 2 years ago, back at NASA R&D... by Dark+Lord+Seth · · Score: 5, Funny

    Engineer 1: Ho-hum.. Little bit of ... whatever it is, 'ere... Hand me that thingamajig, will you?
    Engineer 2: Yah, sure... Hey, remember that employee last month who got laid off within a week?
    Engineer 1: Who? Vincent?
    Engineer 2: Yeah, Vinnie... With the Italian accent?
    Engineer 1: Yeah, him. What about the guy?
    Engineer 2: Well, he has this offer on cheap RAM we just CAN'T resist!
    Engineer 1: Really now? But-
    Engineer 2: Look, our budget is already comparable to social welfare. We need to save some loot.
    Engineer 1: Fair enough, buy the crap and hand me the other twisty-turny thingy over there? I need to screw on this name tag reading... "Spirit"?
    Engineer 2: Look, it's either that or my wife's name.

    1. Re:2 years ago, back at NASA R&D... by 110010001000 · · Score: 1

      What? That makes zero sense...

    2. Re:2 years ago, back at NASA R&D... by CrazyJoel · · Score: 2, Insightful

      isn't the budget for social welfare something like 300 billion dollars?

      If only we spent so much on NASA. They only get 12 billion.

      --

      Such is the infinite Grace of Popeye.
    3. Re:Re:2 years ago, back at NASA R&D... by why+cant+i+get+the+n · · Score: 1

      Just wait a bit, it will come to you...

    4. Re:2 years ago, back at NASA R&D... by Anonymous Coward · · Score: 0

      Think it's actually closer to 1.something trillion dollars.

    5. Re:2 years ago, back at NASA R&D... by the+pickle · · Score: 1

      Engineer 1: Fair enough, buy the crap and hand me the other twisty-turny thingy over there? I need to screw on this name tag reading... "Spirit"?
      Engineer 2: Look, it's either that or my wife's name.


      Which would be great, except the two probes were already assembled and ready to go before Sofi Collis named them "Spirit" and "Opportunity."

      p

  15. Monday morning quarterback by GGardner · · Score: 5, Insightful

    If I was sending an embedded control computer to another planet, I would have chosen an OS with memory protection, not VxWorks. VxWorks is like DOS, and early versions of Windows, where one pointer problem in one task can corrupt the whole system. Sure, we don't know that's the problem now, but it would be nice to know for sure that it wasn't.

    1. Re:Monday morning quarterback by moquist · · Score: 1

      Windows CE would definitely have been a better choice. That way, MS could have an automatic monopoly on Mars, too.

    2. Re:Monday morning quarterback by mnmn · · Score: 0, Flamebait

      I will never understand why Linux and NetBSD are currently looked down upon in the embedded corporations. QNX does a great job too, but of these four, why VxWorks?

      I'm sure the NASA engineers have computers at home, mostly running either Windows 98, 2000 or XP. They should know the way a crash in Windows 98 brings the whole system down and Windows 2000 doesn't always do that. I wouldn't count on them to have had lots of experience with BSD or Linux because they didn't use that.

      I think, just like NASA's Pathfinder sailplane, these rovers should have at least 2 computers on board with each CPU+OS constantly checking the state of the other and being able to take over. Perhaps just building in a really smart BIOS like they did is a cleaner solution.

      --
      "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    3. Re:Monday morning quarterback by Dun+Malg · · Score: 1
      these rovers should have at least 2 computers on board with each CPU+OS constantly checking the state of the other and being able to take over.

      Two computers is never a good idea in cases like this. Either one computer, or 3+ computers, but not 2. Three computers lets you know which computer is bad via consensus: "unit 1 says unit 3 is bad; unit 2 and 3 say unit 3 is ok, and unit 1 is the one that's bad; logically, it's most likely unit 1 is bad". Two computers you don't know whether the reporting computer is erroneously diagnosing a problem in the other, or if the other is actually bad. One computer, well, it just works or doesn't!
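
      A minimal sketch of the consensus idea, assuming each unit simply reports the value it computed. The unit names and the `vote` helper are made up for illustration and are not how any real flight computer does it.

```python
from collections import Counter


def vote(reports):
    """Return (majority_value, outvoted_units) from a dict of unit -> value.

    Raises ValueError when no strict majority exists, which is exactly the
    two-computer case: they disagree and nobody can break the tie.
    """
    counts = Counter(reports.values())
    value, n = counts.most_common(1)[0]
    if n * 2 <= len(reports):
        raise ValueError("no majority: cannot tell which unit is wrong")
    suspects = sorted(u for u, v in reports.items() if v != value)
    return value, suspects


print(vote({"unit1": 41, "unit2": 42, "unit3": 42}))

try:
    vote({"unit1": 41, "unit2": 42})
except ValueError as err:
    print("two units:", err)
```

      With three units the odd one out is identified; with two, a disagreement can be detected but not attributed.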

      --
      If a job's not worth doing, it's not worth doing right.
    4. Re:Monday morning quarterback by Johnny+Mnemonic · · Score: 1


      I'm sure the NASA engineers have computers at home, mostly running either Windows 98, 2000 or XP.

      You can see in the control room video lots of Sun Terminals, and a fair amount of Apple PowerBooks running OS X. I suspect that these guys are more Unix-savvy than your average bear :)

      --

      --
      $tar -xvf .sig.tar
    5. Re:Monday morning quarterback by morgue-ann · · Score: 2, Informative

      I will never understand why Linux and NetBSD are currently looked down upon in the embedded corporations

      Because they're fucking HUGE.

      The uCLinux kernel for 68k which is more compact than SPARClite, but maybe less so than x86, is 512K.

      That's a stripped-down kernel with no MMU support and the special uClib C standard library designed to take less space.

      I'm working on a digital camera with 512K of flash and 8MB of SDRAM. That flash is divided into 7 64K sectors and 8+16+16+32K little sectors. We use the upper 64K for sensor calibration data and the lower 64K for a boot block that can be locked so you can recover a camera with a bad firmware load.

      That leaves 384K for everything else. Our kernel is Precise/MQX from ARC International and it's 30K!
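
      For anyone checking the arithmetic, here is the budget as a tiny sketch. It uses only the totals quoted above (512K of flash, two reserved 64K regions, a 30K kernel); the `free_flash` helper is just for illustration.

```python
def free_flash(total_kib, reserved_kib, kernel_kib):
    """Return (KiB left after reserved regions, KiB left after the kernel too)."""
    after_reserved = total_kib - sum(reserved_kib)
    return after_reserved, after_reserved - kernel_kib


# 512K total, 64K calibration + 64K boot block reserved, 30K kernel
print(free_flash(512, [64, 64], 30))  # (384, 354)
```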

      Oh, and the RAM is needed for image processing and buffering movie frames on their way out to NAND flash, so your piggy kernels can't have it.

      While I'd like some things from uCLinux and busybox and netBSD, I have to be very selective. I'm presently porting elf2flt to the Metaware tools for ARC so we can dynamically load code resources. We'll also get a real log facility and monitor soon and maybe someday the Almquist shell.

      At least MQX and the Metaware tools are reasonably cheap and we get kernel and library sources (and ARC CPU hackable RTL instead of a giant impenetrable lump like ARM). I've heard nothing but irritation with WindRiver's high pricing and closed-IP attitude.

    6. Re:Monday morning quarterback by Anonymous Coward · · Score: 0

      Two computers can detect a discrepancy and forcibly reset each other. That lets them catch a single-event upset.

    7. Re:Monday morning quarterback by Anonymous Coward · · Score: 0

      I remember asking 'why no memory protection' to a WindRiver guy once, and his argument, besides performance, was that it doesn't help that much for most applications. So you have a wild pointer in a task and you're going to kill that one, fine, but what next? A typical real-time application will be built around a producer/consumer model, and in most cases the system will hang pretty quickly due to queue overruns anyway when a task is missing. Just restarting the task won't be that easy, either. What if you had some stateful information in there? You'd have to build a pretty elaborate system if you wanted to be able to restart a single task.

      Dunno, I found the argument somewhat convincing if not completely satisfactory...
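
      To make the queue-overrun point concrete, here is a small sketch using Python's standard `queue` module rather than any RTOS API. The consumer task is assumed to have been killed, so nothing drains the queue; the queue size and the `run_producer` helper are arbitrary choices for illustration.

```python
import queue


def run_producer(q, items):
    """Push items into a bounded queue; return how many fit before it overran."""
    sent = 0
    for item in items:
        try:
            q.put_nowait(item)
        except queue.Full:
            break  # consumer is gone, so the producer is now stuck
        sent += 1
    return sent


q = queue.Queue(maxsize=4)       # bounded, like a fixed-size message queue
print(run_producer(q, range(10)))
```

      Killing the faulty task cleanly only moves the failure: the producer stalls as soon as the queue fills.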

    8. Re:Monday morning quarterback by mnmn · · Score: 1

      For a digital camera or another device that should eventually cost below $99, I can understand a kernel that is 30K and doesn't need 4MB of RAM to boot. But we're talking about Spirit, and at least Intel sells an 8 Mbit flash with a tiny BGA footprint, more bits per gram regardless of price.

      Given two exactly similar OSes with differing kernel sizes, I'd choose the simpler, smaller one, which will inevitably be more robust. I guess NASA ran into the same problem owners of cheap USB memory sticks run into all the time. Flash filesystems are still immature compared to standard Linux/BSD filesystems. I think Spirit has become negative advertisement for Windriver.

      --
      "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
  16. Huh? Flash? by ajlitt · · Score: 1

    I didn't even know they made rad-hard flash!

    1. Re:Huh? Flash? by DougHalfWay+AroundTh · · Score: 2, Informative

      They don't. See DoD Bids, about 3/4 of the way down the page.

      Title: Rad Hard Flash Technology

      Abstract: The highest density radiation hardened non-volatile (NV) memory currently available is a 256 kbit EEPROM based on SONOS technology. One of the major limitations in developing rad hard NV memory has been the cost in bringing up the NV technology in a dedicated rad hard process facility, especially when weighed against the limited market size. One way to bring radiation hardening to an advanced electronic product on a cost-effective basis is to leverage the commercial product by applying the hardening to the commercial fab instead of bringing the commercial technology to the rad hard fab. NV flash memory technology is popular in the commercial marketplace, with densities up to 256 Mbit in production. Unfortunately, flash memory is not available, at any density, in total dose rad hard versions. And, most commercial flash memories are so soft that impractical amounts of shielding are required to survive even moderate radiation environments. This effort will be the first step in developing rad hard flash technology at a commercial fab. Rad hard flash technology will be a near-term solution to the problem of high density NV memory for space applications. It will enable the development of rad hard flash memories and embedded NV memory for rad hard ASICs.

      Flash...the weakest link...

  17. Hmmm... by linuxpyro · · Score: 1

    A similar thing happened with an old router I had. The only problem was, we needed Win98 in order to reflash it...

    --
    Saying "I'll probably get modded down for this" in a post is the best way to get it modded up.
    1. Re:Hmmm... by Anonymous Coward · · Score: 0

      Finally, someone's found a use for win98... never thought it'd happen in my lifetime!

    2. Re:Hmmm... by vrmlknight · · Score: 1

      this is off topic but we had a drive duplicator that needed Win98 to flash upgrade it to support larger drives. so apparently Win98 will never be completely useless.

      --
      This must be Thursday, I never could get the hang of Thursdays.
    3. Re:Hmmm... by toddestan · · Score: 1

      Definitely off topic, but the DOS-based versions of Windows do have their uses from time to time. For that reason, I don't think I'll ever get rid of my Pentium 133 running Windows 95b. Just a couple of weeks ago I had to fire it up to make a boot disk to flash the BIOS on a misbehaving motherboard. If I didn't have Win95 I don't know what I would have done.

    4. Re:Hmmm... by vrmlknight · · Score: 1

      ok i usually hate off topic ramblings but
      http://bootdisk.com has images of floppies of various stuff always helpful.

      --
      This must be Thursday, I never could get the hang of Thursdays.
  18. Static Discharge? by seven+of+five · · Score: 5, Interesting

    Is there a chance that the problem could've been caused by electrostatic discharge? Rover bounces on rubber airbags on sand, bags fold up, Rover rolls off, Rover touches rock - zap!??

    1. Re:Static Discharge? by juglugs · · Score: 2, Insightful

      Doubt it

      I'd hope that the RAM is in a shielded box given the amount of radiation it's getting from the sun and the rest of space.

      Could be Soft Errors caused by Alpha particles though - depends on the technology used in the flash - unlikely, but possible...

      --
      This sig is in Spanish when you're not looking....
    2. Re:Static Discharge? by Anonymous Coward · · Score: 0

      And you got a 0 for that??!
      I think that if the flash chip was damaged during the ascent, the problem would have been detected in the beginning.

    3. Re:Static Discharge? by iminplaya · · Score: 1

      Naw...It probably just got wet

      --
      What?
    4. Re:Static Discharge? by myowntrueself · · Score: 2, Interesting

      I had been wondering.

      The sequence of events that led up to this was, IIRC,

      1. Rover extends arm ready to take a grinder to a rock.

      2. Contact with Rover lost due to bad weather in Australia.

      3. Rover bad.

      So it had just moved part of its structure closer to the rock just before this happened.

      --
      In the free world the media isn't government run; the government is media run.
    5. Re:Static Discharge? by srleffler · · Score: 1
      Could be Soft Errors caused by Alpha particles though - depends on the technology used in the flash - unlikely, but possible...

      Why alpha particles? Alpha radiation is really easy to block. I find it hard to believe that any would penetrate the shielding around the electronics, much less the electronics themselves.

    6. Re:Static Discharge? by DerekLyons · · Score: 1
      Could be Soft Errors caused by Alpha particles though - depends on the technology used in the flash - unlikely, but possible...
      Considering that Alpha particles can be stopped by anything much thicker than tissue paper... I'd say very unlikely.
    7. Re:Static Discharge? by juglugs · · Score: 1

      Yes, Alpha radiation can be blocked very easily, but Alpha particles exist in the packaging and the substrates of the chips.

      Hence the (perceived) problems with Soft Errors on 90nm and smaller technologies.

      But the reason I doubted it was because there aren't many (any?) 0.13um flash RAMs out there...

      --
      This sig is in Spanish when you're not looking....
    8. Re:Static Discharge? by DerekLyons · · Score: 1
      Yes, Alpha radiation can be blocked very easily, but Alpha particles exist in the packaging and the substrates of the chips.
      Alpha radiation consists of Alpha particles. (The same is true for Beta as well. Only Gamma is pure energy.)

      However, I was thinking of Alphas from external sources, not from the chips themselves. (As a former worker with radiological materials without any connection to modern electronics, I tend to think of external sources first. :) I'm aware of the problem with internal Alphas; they just didn't occur to me.
  19. Obligatory... by incom · · Score: 0

    They should have used linux...

    --
    True genius is grasping a situation like a piece of fruit, and piercing it just right so that it drains dry.
    1. Re:Obligatory... by floamy · · Score: 1

      They need something realtime. Even QNX would be better than Linux in this situation.

    2. Re:Obligatory... by Anonymous Coward · · Score: 0

      Yes. Some Emacs Lisp macros would've been more robust.

    3. Re:Obligatory... by username.h · · Score: 1

      Except that even Linux can't run if you have bad RAM (not that Linux doesn't rock, I'm using it now).

      --
      #include "sig.h"
    4. Re:Obligatory... by pantherace · · Score: 1

      Another Obligatory: Bad RAM Patch for Linux

    5. Re:Obligatory... by incom · · Score: 1

      Damn, somebody is going after me with overrated mods today. I'd better go check my freaks list. That overrated modifier is just a cop-out anyways imho. If you see a problem with a post, actually mod it with a modifier that has to be backed up. Overrated is just a way for abusive mods to avoid meaningful meta-moderation.

      --
      True genius is grasping a situation like a piece of fruit, and piercing it just right so that it drains dry.
  20. Cosmic rays... by bc90021 · · Score: 4, Interesting

    ...will apparently cause one out of every trillion bits on Earth to flip randomly... I guess with less of an atmosphere, it is a bigger problem on Mars! ;)

    1. Re:Cosmic rays... by Anonymous Coward · · Score: 0

      Not if you use ECC ram.
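
      For the curious, here is a toy version of the idea behind ECC: a Hamming(7,4) code that repairs any single flipped bit in a 7-bit word. Real ECC memory typically uses a wider code that corrects single-bit errors and detects double-bit errors, so treat this purely as a sketch of the principle.

```python
def encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(c):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit, 0 if none
    if pos:
        c[pos - 1] ^= 1          # flip it back
    return [c[2], c[4], c[5], c[6]]


word = encode([1, 0, 1, 1])
word[4] ^= 1                     # simulate a cosmic-ray bit flip
print(decode(word))              # [1, 0, 1, 1]
```

      Two flips in the same word would be miscorrected by this toy code, which is why wider codes add an extra parity bit for detection.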

    2. Re:Cosmic rays... by shadowmatter · · Score: 5, Interesting

      Funny you mention that. I'm taking a class on design of digital systems at my university, and my professor works for JPL. He helps design the control systems onboard space vehicles such as the Mars rover. Anyway, a majority of the class grade is based on an end-of-the-quarter project, which we complete in groups of 2 to 4. On Wednesday he expressed interest in a group developing some sort of redundancy for FPGAs that would be suitable in spacecraft. You see, on Mars, you're not shielded from huge doses of radiation as you are on Earth. A healthy dose of radiation bombardment could easily reprogram an FPGA chip on the surface of Mars; ASIC chips are used to overcome this problem.

      Maybe he was gung-ho about anti-radiation redundancy because he already knew the likely problem of the Spirit. Who knows?

      - sm

    3. Re:Cosmic rays... by mnmn · · Score: 2, Informative

      Rockets have blasted off into space since Sputnik 1, and with all the communication satellites, we know a lot about high-radiation electronics. We've had solar flares corrupting electronic equipment for decades, and ASIC companies have entire lines of chips for high-radiation resistance, partly for military applications.

      So I think the rovers electronics are well protected from at least the Suns radiation. I think Mars is 1.3AUs from the Earth, making it 2.3AUs from the Sun, so it should receive less than a quarter of the radiation per square inch the earth gets, but I strongly feel I could be wrong there. Martian dust getting into the compartments IMHO can be a more likely reason.

      If electronics break on Mars, I'd put the highest chance on the initial impact on landing. Besides that, it's just sitting on barren land, under full solar radiation, exposed to some dust but in close-to-vacuum. It's a simpler environment to deal with compared to, say, sending a rover to a planet like Earth where it must be able to swim and walk through the forests.

      --
      "Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
    4. Re:Cosmic rays... by pod · · Score: 1

      Well, this would be +4 Interesting (Informative even!) if you said how often this bit gets flipped by cosmic rays.

      --
      "Hot lesbian witches! It's fucking genius!"
    5. Re:Cosmic rays... by DerekLyons · · Score: 1
      So I think the rovers electronics are well protected from at least the Suns radiation. I think Mars is 1.3AUs from the Earth, making it 2.3AUs from the Sun, so it should receive less than a quarter of the radiation per square inch the earth gets, but I strongly feel I could be wrong there.
      The radiation dosage at Mars orbit is one quarter that of Earth. However, Mars has virtually no magnetic field, and an atmosphere an order of magnitude thinner. (These two things are what block most radiation from reaching Earth's surface.) Therefore the dose at Mars's surface is far higher than that seen in LEO.
    6. Re:Cosmic rays... by devonbowen · · Score: 1
      I think Mars is 1.3AUs from the Earth, making it 2.3AUs from the Sun

      You're right that it's 1.3AUs from earth at the moment but you can only add the two when it's at opposition. Mars is actually about 1.5AUs from the sun.

      Devon
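
      For a rough sense of the numbers, here is a minimal Python sketch of the inverse-square scaling being argued about. It assumes a mean Mars-Sun distance of about 1.52 AU and only covers flux arriving from the Sun; it says nothing about galactic cosmic rays or about shielding by an atmosphere or magnetic field. The function name is made up for illustration.

```python
# Inverse-square scaling of solar flux with distance from the Sun.
# Distances are in astronomical units (AU); 1 AU is the Earth-Sun distance.
EARTH_AU = 1.0
MARS_AU = 1.52  # assumption: approximate mean Mars-Sun distance


def relative_solar_flux(distance_au: float) -> float:
    """Return solar flux at distance_au as a fraction of the flux at 1 AU."""
    return (EARTH_AU / distance_au) ** 2


if __name__ == "__main__":
    for label, distance in [("Mars at ~1.52 AU", MARS_AU), ("the 2.3 AU guess", 2.3)]:
        print(f"{label}: {relative_solar_flux(distance):.2f} of the flux at Earth")
```

      At about 1.52 AU the ratio comes out near 0.43, so "less than a quarter" only follows from the mistaken 2.3 AU figure.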

    7. Re:Cosmic rays... by Anonymous Coward · · Score: 1, Interesting

      Someone may have already said this or you may have already seen it, but Xilinx has a press release about having their chips (FPGAs of course) in the rovers. They're touting the FPGA as being used as the "main brain" so you may be right on.

      http://www.xilinx.com/prs_rls/design_win/0412_marsrover.htm

      A funny side note, I think they posted the press release on the same day as the Spirit failure. That's some good pr timing ;)

    8. Re:Cosmic rays... by dszd0g · · Score: 1

      There was a program on TV about their involvement in the Spirit. They were used in some of the systems controlling the landing. They also control some of the wheels, steering, arms, and cameras.

      They supposedly are not involved in any of the communications systems that could cause the problem the Spirit is experiencing.

      --
      This message is encrypted with Quad ROT-13 to protect the author's copyright under the DMCA.
  21. vnc by david_594 · · Score: 1

    someone should have just loaded up a nice copy of PCanywhere on the thing before it left.

  22. rover by Anonymous Coward · · Score: 0

    mars rover.. what is it all about? is it good, or is it whack?

  23. Last time they buy generic ghetto ram eh? by OgreFade · · Score: 1

    As always, go for the name brand, or high-quality parts, when performance is a necessity. Goes back to the problem with government stuff: the lowest bidder always gets the contract and they tend to buy the cheapest parts.

    I must admit I'm very happy they managed to get something back from Spirit. I hope they get the little guy running again. Hopefully when we get to the Martian surface the astronauts won't have to trip over the carcasses of many dead rovers.

    1. Re:Last time they buy generic ghetto ram eh? by Anonymous Coward · · Score: 0

      Apparently you haven't been paying attention to the news about Halliburton. Part of the Brave New US is our willingness to accept highest bidders, if they provide enough kickbacks.

  24. Redundancy by juglugs · · Score: 0

    I wonder why they didn't use any redundancy in stuff like the flash RAM? Wouldn't that be an obvious thing to do in a mission-critical system (especially when you have no way of changing the parts)?

    Also, don't they use ChipKill? (Chipkill can identify a bad chip on the SIMM and bypass that particular chip, keeping the rest of the SIMM operational.)

    --
    This sig is in Spanish when you're not looking....
  25. Software / Hardware Breakthrough? by Saeed+al-Sahaf · · Score: 4, Insightful
    This is remarkable, and a testament to good software / hardware integration. It is true that I think this money could have been better spent elsewhere in terms of our understanding of the universe, but still, these types of projects and the hardships that come with them teach miles of experience in remote software / hardware problems.

    I do seriously wonder if these types of projects will tell us anything more than esoteric wonders of Mars, but from a strictly engineering standpoint, perhaps it's worth it after all.

    --
    "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
    1. Re:Software / Hardware Breakthrough? by Anonymous Coward · · Score: 0

      I don't know about the esoteric wonders of Mars, but I rather suspect that NASA (and the ESA) will be taking a damned hard look at their specifications for "radiation-hardened" semiconductors, as well as enhanced shielding and other requirements.

      The issue, IMHO, is not so much that it's on Mars, but that it had to traverse such a long distance in interplanetary space to get there, and the long time that it took to get there. A lot of (sub)atomic particles and various kinds of radiation probably baked the spacecraft in that duration. That, in itself, is an important lesson learned. The remote debugging aspect is very interesting in itself, but they're lucky they still have something to debug at this point.

    2. Re:Software / Hardware Breakthrough? by Saeed+al-Sahaf · · Score: 3, Informative

      A lot of the components in this craft came from my former employer, www.InterPoint.com, who laid off half their staff a few years ago (I was one of those). Little boxes the size of a pack of cards, hand built. Really amazing stuff.

      --
      "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
  26. The Full Story by DrunkenTerror · · Score: 5, Informative

    Here is the link to the real story. The one given in the /. article is getting pushed down spaceflight's page.

  27. Nice by Omega1045 · · Score: 4, Interesting

    I have a friend who works in the field. Space travel hoses electronics bad. Triple redundancy and over-engineering is the name of the game. This is nice to hear. I would imagine that something went wrong in transit or on landing, but they can keep going.

    --

    Great ideas often receive violent opposition from mediocre minds. - Albert Einstein
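
    To make "triple redundancy" concrete, here is a minimal Python sketch of a bitwise 2-of-3 majority vote over three copies of a stored word. It is a generic illustration of the idea, not a description of how the rover's hardware is built, and the function name is hypothetical.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Return the bitwise 2-of-3 majority of three redundant copies of a word."""
    # A result bit is 1 only where at least two of the three copies have a 1.
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1011_0010
    upset = good ^ 0b0000_1000  # one copy takes a single-bit flip
    # The two intact copies out-vote the damaged one.
    print(bin(majority_vote(good, upset, good)))
```

    The scheme only helps while at most one copy is wrong in any given bit position; if two copies take the same hit, the vote confidently returns the bad value.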

    1. Re:Nice by fermion · · Score: 1

      It hoses most all components. The stress of launch, the radiation, even the vacuum can be an issue. That is why they use really expensive hardened components, which everyone then complains about. After all, why pay 1K for a component that is certified to work when you could go down to the office store and buy it for 50 bucks?

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
  28. This looks familiar.. by Anonymous Coward · · Score: 0

    DUPE.
    Way to go, Timothy.

  29. Somebody here on Slashdot nailed it... by tinrobot · · Score: 1

    I remember in the last thread about the rover, someone opined that it was bad memory, then proceeded to give a half dozen reasons why. Totally nailed it.

    You're all so damn smart. Sometimes I don't think I'm not worthy of posting here.

    1. Re:Somebody here on Slashdot nailed it... by onemorehour · · Score: 1
      Sometimes I don't think I'm not worthy of posting here.
      I always don't think I'm not worthy of posting here. ^_^
    2. Re:Somebody here on Slashdot nailed it... by prockcore · · Score: 5, Funny

      I remember in the last thread about the rover, someone opined that it was bad memory, then proceeded to give a half dozen reasons why. Totally nailed it.

      Yeah, in the future NASA should just submit an Ask Slashdot whenever something goes wrong..

    3. Re:Somebody here on Slashdot nailed it... by jobugeek · · Score: 2, Funny
      Why, so everyone can start their response with

      IANANE (I am not a NASA engineer), but.....

      --
      I'm not drunk, I just have a speech impediment. And a stomach virus. And an inner ear infection.
    4. Re:Somebody here on Slashdot nailed it... by Anonymous Coward · · Score: 1, Funny

      99% of everything posted here is INANE to start with...

    5. Re:Somebody here on Slashdot nailed it... by Anonymous Coward · · Score: 0

      On the other hand, maybe NASA really doesn't have a clue what's wrong, but some alert NASA engineer saw the Slashdot comment and convinced their P.R. to just go along with that explanation..

    6. Re:Somebody here on Slashdot nailed it... by pod · · Score: 1

      And I'm sure NASA engineers knew it too. But until they can reanimate the rover, verify their theory and have some good news to report, no one's gonna be saying anything.

      --
      "Hot lesbian witches! It's fucking genius!"
    7. Re:Somebody here on Slashdot nailed it... by Anonymous Coward · · Score: 0

      In that case slashdotters would require that nasa employees should work without salary and propose the following businessmodel.

      1) Work for free.
      2) ?
      3) Fly a spaceship somewhere.
      4) Profit!

    8. Re:Somebody here on Slashdot nailed it... by bill_mcgonigle · · Score: 1

      I remember in the last thread about the rover, someone opined that it was bad memory, then proceeded to give a half dozen reasons why. Totally nailed it.

      You're all so damn smart.


      Nah, one guy was so damn smart. Remember there were 944 other posts each with a different theory. Since we covered every possible theory somebody had to be right. :)

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  30. Inquiring minds want to know, by pair-a-noyd · · Score: 1

    What's the OS on this critter??

    1. Re:Inquiring minds want to know, by Anonymous Coward · · Score: 0

      MarsOS

    2. Re:Inquiring minds want to know, by Anonymous Coward · · Score: 0

      Go to Google

      Search "mars spirit operating system"

      You're Welcome.

      Sincerely,
      Pasha
      (Gaza Israeli Prison, 7 years old)

    3. Re:Inquiring minds want to know, by vrmlknight · · Score: 1

      Wind River's VxWorks. It's a real-time embedded OS, and it's actually used on surprisingly many things.

      their site is http://www.windriver.com/

      --
      This must be Thursday, I never could get the hang of Thursdays.
    4. Re:Inquiring minds want to know, by mlmurray · · Score: 1

      Go to Google

      Search "mars spirit operating system"

      You're Welcome.


      Nick Burns? Is that you? M O O V E !

  31. Adding injury to insult.... by Anonymous Coward · · Score: 0
    The rebate form NASA sent in for the flash card in the rover didn't get submitted on time. It ended up costing them $50 instead of $35 - AR/AC/PM.

    Next time, no more deals from gotapex.com!

  32. flash ram is known to fail on writes after a while by Anonymous Coward · · Score: 2, Interesting

    I know a lot of ppl are using flash ram in smaller computers for booting linux or what not. Well if they are writing their logs and other things to that flash be aware that you can only write to it so many times before it fails.

    Was NASA writing to that flash or just reading? A ram drive in flash sounds like it will access/write thousands of times a ?minute? This should wear it out quickly.
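
    A back-of-the-envelope way to think about flash wear, as a Python sketch. The endurance figure (100,000 erase/write cycles per block), the write rate, and the block count are illustrative assumptions, not specifications of the rover's flash, and the function names are hypothetical. Note that the story describes the ramdisk as living in the rover's RAM, not in flash.

```python
def days_until_wearout(endurance_cycles: float, writes_per_block_per_day: float) -> float:
    """Days until a block reaches its rated erase/write cycle count."""
    return endurance_cycles / writes_per_block_per_day


def levelled_writes_per_block(total_writes_per_day: float, num_blocks: int) -> float:
    """Per-block write rate if writes are spread evenly (ideal wear levelling)."""
    return total_writes_per_day / num_blocks


# Illustrative assumptions only.
ENDURANCE_CYCLES = 100_000   # rated cycles per block
WRITES_PER_DAY = 24 * 60     # one log rewrite per minute
NUM_BLOCKS = 1024            # blocks available for wear levelling

if __name__ == "__main__":
    hot_block = days_until_wearout(ENDURANCE_CYCLES, WRITES_PER_DAY)
    levelled = days_until_wearout(
        ENDURANCE_CYCLES, levelled_writes_per_block(WRITES_PER_DAY, NUM_BLOCKS)
    )
    print(f"always rewriting one block: {hot_block:.0f} days")
    print(f"ideal wear levelling:       {levelled:.0f} days")
```

    Under these made-up numbers, hammering a single block wears it out in roughly 69 days, while spreading the same writes over 1024 blocks stretches that by a factor of 1024. How and where the writes land matters more than the raw write count.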

  33. Good News Everyone! by spun · · Score: 1

    NASA has a report, and it's very bad news!

    Well, bad news anyway. Bad flash? Maybe it was the solar storms. Can't they knock out flash, at least in space?

    --
    - None can love freedom heartily, but good men; the rest love not freedom, but license. -- John Milton
  34. Re:Monday morning quarterback: RTOS tradeoffs by G4from128k · · Score: 3, Informative

    If I was sending an embedded control computer to another planet, I would have chosen an OS with memory protection, not VxWorks.

    Actually, they might have protected memory if they use VxWorks AE RTOS/Tornado Tools 3.0. Spirit uses VxWorks, but I don't know what version they used or when they had to commit to a particular version of VxWorks.

    Also, as the article mentions, memory protection adds overhead and can affect real-time performance. Hard real-time software cannot afford to have a complex layered structure and lots of conditional code that adds unpredictable delays. For that reason, many really real-time applications run very close to the hardware (for better or for worse.)

    --
    Two wrongs don't make a right, but three lefts do.
  35. Redundancy? by jasonfncsu · · Score: 0, Redundant

    This is horrible planning by NASA! Web servers have more redundancy than a Mars Rover? Ok, lets see. 128 mb RAM -- (assuming it's a DDR sodimm) about $100 to add an extra 256 mb flash ram -- ~$100 Any reason why they didn't add any backups?

    --
    Jason Faulkner
    Old Os Administrator
    jason@oldos.org
    oldos.
    1. Re:Redundancy? by TheKidWho · · Score: 1

      NASA has 50-million-dollar toilets on their spacecraft... they sure as hell aren't using $100 RAM, more like $40,000 RAM.

    2. Re:Redundancy? by leeward · · Score: 1

      Sheesh... so you are going to condemn NASA based on a blurb on Slashdot? I don't suppose that you considered the possibility that the blurb is inaccurate or leaves out some critical details?

    3. Re:Redundancy? by Anonymous Coward · · Score: 0

      Because they're not using cheapass commodity PC components?
      I want to see you find something that could survive reentry into a thin atmosphere down at the local Best Buy or Ratshack. Not to mention the hideous amounts of radiation this thing must have been exposed to.

    4. Re:Redundancy? by Anonymous Coward · · Score: 0

      leeward clearly doesn't belong here.

    5. Re:Redundancy? by Anonymous Coward · · Score: 0

      They DO have redundant 256 mb flash mem modules on board as noted during today's press conference.

      Each time it booted, it tried an alternate flash module, but the error was the same on both.

      The error was written to both modules.

    6. Re:Redundancy? by Mr.+Darl+McBride · · Score: 1
      Ok, lets see. 128 mb RAM -- (assuming it's a DDR sodimm) about $100 to add an extra 256 mb flash ram -- ~$100 Any reason why they didn't add any backups?

      No problem. It's landing tonight. :)

  36. Damn, who do they have working for them? by ericdano · · Score: 0, Troll

    Damn, it sounds like freaking Microsoft. Say they know about a bug or a problem, and weeks LATER they fix it......

    --
    It's either on the beat or off the beat, it's that easy.
    I moderate therefore I rule!
    --
    1. Re:Damn, who do they have working for them? by vrmlknight · · Score: 1

      The only problem is that it's a hardware problem and it's 103 million kilometers away. It's tough to get the onsite tech there the next day.

      --
      This must be Thursday, I never could get the hang of Thursdays.
  37. Steal SOME by MajorDick · · Score: 5, Funny

    I mean like beagle isnt using its flashram anymore, just go and jack some off it. While your at it TAG the Beagle with some PRO-US graffiti :) hell maybe its got nicer rims too

    Seriously, can you imagine the first manned expiditon seeing the Beagle Jacked up, tagged , up on little martian cinderblocks, All that and we already got a head start on building martian cities

    1. Re:Steal SOME by fredmosby · · Score: 1

      hell maybe its got nicer rims too

      I don't know, the spirit rover has pretty fancy rims.

    2. Re:Steal SOME by flossie · · Score: 1
      I mean like beagle isnt using its flashram anymore, just go and jack some off it. While your at it TAG the Beagle with some PRO-US graffiti :) hell maybe its got nicer rims too

      Seriously, can you imagine the first manned expiditon seeing the Beagle Jacked up, tagged , up on little martian cinderblocks, All that and we already got a head start on building martian cities

      What, like this? (from a comment in an earlier article).

  38. Information on the MER hardware. by elrond1999 · · Score: 5, Interesting

    Ive been unable to find any hard information on the design of the MER memory systems. If anyone can point me to a technical brief id be very happy.

    From what ive pieced together the MER system is something like this:

    One RAD6000 powerpc cpu.
    Connected via probably compact pci to 128 mb of ecc sdram.
    256 mb of flash. No info on what make of flash, but likely Intel since they are the biggest. There was some info from the press conference that there are actually two flash chips and that the flight software is redundantly stored on each. So does this mean that there is actually 128mb of redundant flash? Also it was said that they had problems even with the redundancy, could they possibly have overwritten something? We all know that even a redundant raid does not stop filesystem corruption.

    No information on how the flash is connected (parallel or serial?) or on how the redundancy works.

    Btw, I guess flash is rather radiation hard since they require 10 - 20V to erase / write.
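
    The point about redundancy not stopping corruption can be shown with a toy model. This Python sketch assumes a simple two-copy mirror where every write goes to both copies; the class and its methods are hypothetical and are not meant to describe how the MER flash is actually organized.

```python
class MirroredStore:
    """Toy two-copy mirror: each write is applied to both copies."""

    def __init__(self) -> None:
        self.copies = [{}, {}]

    def write(self, name: str, data: bytes) -> None:
        # The mirror faithfully duplicates whatever it is given, good or bad.
        for copy in self.copies:
            copy[name] = data

    def read(self, name: str, dead_copy=None) -> bytes:
        # Read from the first copy that has not been marked as failed.
        for index, copy in enumerate(self.copies):
            if index != dead_copy:
                return copy[name]
        # Only reached if there is no copy other than the failed one.
        raise IOError("no surviving copy")


if __name__ == "__main__":
    store = MirroredStore()
    store.write("flight_sw", b"good image")
    # A hardware failure of one copy is survivable.
    print(store.read("flight_sw", dead_copy=0))
    # A bad write from software lands on both copies at once.
    store.write("flight_sw", b"corrupted image")
    print([copy["flight_sw"] for copy in store.copies])
```

    Mirroring protects against one device dying, not against the software above it writing the wrong thing, which is consistent with the report that the same error showed up on both modules.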

    1. Re:Information on the MER hardware. by Anonymous Coward · · Score: 0

      In these situations you'd want something with LOW density. If my RAM/Flash chip gets hit with an alpha particle then I don't want a huge load of memory getting corrupted as any ECC will be useless.
      Low density means each cell is quite large so you won't get as much corruption and ECC has a better chance of correcting or noticing it.
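
      Why a multi-bit hit defeats simple ECC can be seen with a textbook Hamming(7,4) code, sketched here in Python. This is a classroom single-error-correcting code chosen for brevity; real ECC memory typically uses wider codes with double-error detection, and nothing here reflects the rover's actual scheme.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword (positions 1..7)."""
    code = [0] * 8  # index 0 unused so indices match bit positions
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]  # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]  # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]  # parity over positions 4,5,6,7
    return code[1:]


def decode(word):
    """Correct at most one flipped bit, then return the 4 data bits."""
    code = [0] + list(word)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome:
        code[syndrome] ^= 1  # assumes exactly one bit is wrong
    return [code[3], code[5], code[6], code[7]]


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    word = encode(data)

    single = word[:]
    single[4] ^= 1  # one upset: corrected
    print(decode(single) == data)

    double = word[:]
    double[0] ^= 1
    double[1] ^= 1  # two upsets in one word: "corrected" into wrong data
    print(decode(double) == data)
```

      One flipped bit per word is repaired; two flipped bits in the same word make the decoder flip a third bit and hand back wrong data. That is the case for larger cells, where a single particle strike is less likely to take out several neighbouring bits.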

    2. Re:Information on the MER hardware. by georgewilliamherbert · · Score: 2, Informative
      Ive been unable to find any hard information on the design of the MER memory systems. If anyone can point me to a technical brief id be very happy.

      RAD6000 6U Compact PCI page at BAE Systems.

      It's not great, but there are more detailed links around the BAE website.

      It doesn't list how the FLASH is connected; that's not a standard built-in on the RAD6000 computer. I would guess, hung off the FPGA interface device, but I don't know that for sure.

    3. Re:Information on the MER hardware. by haggar · · Score: 1

      No info on what make of flash, but likely Intel since they are the biggest.

      As far as I know, the biggest in Flash RAM is AMD, with the Atmels and Winbonds coming distant second. And Intel is among them.

      --
      Sigged!
    4. Re:Information on the MER hardware. by mkramer · · Score: 1

      As for rad-hard flash devices, they're often purchased from BAE or Honeywell, who purchase the technology rights from the bigguns and do their own low production rate fabbing.

      There just isn't much business for rad-hard devices, and the big producers don't find it worth their time and money, usually.

    5. Re:Information on the MER hardware. by Anonymous Coward · · Score: 0

      It's a vfat partitioned flashrom, which has a lot of problems. I don't know why vfat is being used on a mission like this.

      dave m.

    6. Re:Information on the MER hardware. by cosmo7 · · Score: 1

      The slightly unnerving thing is the obvious reason why BAe Systems - which makes the business end of the Eurofighter - would need to develop expertise in radiation-hardened computer components.

    7. Re:Information on the MER hardware. by mkramer · · Score: 1

      I don't know much about BAe's product line, other than various military-application sensors and power supplies we've bought from them for US DoD projects.

      Like Honeywell, though, I think their foundry services were started for producing parts to military requirements (namely, temperature and packaging constraints). At that point, given how low the yield is in that market, it was probably very worth their while to do other custom foundry work for defense customers, including rad-hard ASICs for god only knows who.

    8. Re:Information on the MER hardware. by bill_mcgonigle · · Score: 1
      It's a vfat partitioned flashrom, which has a lot of problems. I don't know why vfat is being used on a mission like this.

      VxWorks supports it? Apparently FAT and a PDP-11 filesystem.

      Not everybody thinks it's a good FAT implementation.
      Are there any known filesystem problems?

      During the course of our internal testing, we came across three problems with the dosFs 2.0 filesystem that warranted patches from Wind River Systems. We strongly recommend you upgrade to dosFs 2.2, SPR 79795 (x86) and SPR 79569 (PPC) which fixes all of these problems and many more. You should ask Wind River Systems for the patches to these problems if you encounter them and are unable to upgrade to dosFs 2.2.

      The first problem is that files will seem to disappear. You should look at SPR 31480 in the Wind River Systems' Support pages for a more detailed description of this problem.

      The second problem is a semaphore deadlock within the dosFs filesystem code. Looking at a stack trace via CrossWind, you will see two or more of your application's tasks waiting in semaphore code within dosFs. The patch for this problem is under SPR 33221 at Wind River Systems. There are several SPR numbers at Wind River Systems that refer to this particular problem.

      The third problem is that all tasks will hang on a dosFs semaphore. You should look at SPR 72063 in the Wind River Systems' Support pages for a more detailed description of this problem.
      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
  39. Hmm, who made the ram? by raque · · Score: 1

    And if my palm locks up do I have to send it to mars and reboot it?

  40. Follow the status? by Anonymous Coward · · Score: 0

    You can read all about it at: Spaceflight Now - where you can continue to follow the status of both spirit and opportunity (which currently is hours away from landing).

    Yeah, sure. What a better way for a geek to spend a Saturday night than checking Spaceflight Now's website for up-to-the-minute reports on what some hi-tech gizmo is doing on Mars. Time to fire up the microwave and break out the extra-butter-flavor PopSecret! This is gonna be one exciting night!

    1. Re:Follow the status? by tealover · · Score: 1, Offtopic

      Yeah, he's not cool like us studs spending our Saturday nights here on Slashdot.

      --
      -- You see, there would be these conclusions that you could jump to
    2. Re:Follow the status? by Joey7F · · Score: 1

      Man, I must be really bad off. I know that was supposed to be sarcastic, but I have a bag of popcorn cooking as we speak...sigh

      This is like a reality tv show, I love Nasa Tv!

      --Joey

    3. Re:Follow the status? by the_mad_poster · · Score: 2, Funny

      This is like a reality tv show, I love Nasa Tv!

      With the exception that this is actually real...

      --
      Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
    4. Re:Follow the status? by datan · · Score: 1

      me too... who's the asian dude in the flag? he was wearing it the last time too.

    5. Re:Follow the status? by Basehart · · Score: 2, Funny

      A little too much grandstanding though.

      I've noticed that a few people stand facing the cameras a lot, gesticulating wildly as if talking about something important.

      I also saw one guy go from reading a magazine and sipping a martini to furiously typing away at a keyboard as the camera panned across the room!

    6. Re:Follow the status? by SYFer · · Score: 1

      Ich bin ein geek.

      --
      "...all the labours of the ages, all the devotion, all the inspiration, all the noonday brightness..." yada yada
    7. Re:Follow the status? by datan · · Score: 2, Insightful

      just wondering something. when they say 'currently' do they mean now or light-time ago? eg. they confirmed cruise stage separation less than a minute after it "happened"
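
      The delay in question is easy to estimate. This Python sketch computes the one-way light travel time for the Earth-Mars distance of about 1.3 AU quoted elsewhere in the thread; the function name is hypothetical, and whether a given announcement refers to event time or receipt time is not something the arithmetic can settle.

```python
AU_KM = 149_597_870.7              # kilometres per astronomical unit
SPEED_OF_LIGHT_KM_S = 299_792.458  # kilometres per second


def one_way_light_time_s(distance_au: float) -> float:
    """One-way signal travel time in seconds over distance_au."""
    return distance_au * AU_KM / SPEED_OF_LIGHT_KM_S


if __name__ == "__main__":
    seconds = one_way_light_time_s(1.3)
    print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes one way")
```

      At 1.3 AU a signal takes nearly 11 minutes to arrive, so an event "confirmed" on Earth had already happened on the spacecraft about that long before.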

    8. Re:Follow the status? by ckaminski · · Score: 1

      He was just on AIM "finger"ing his girlfriend...

      Either that or his wife told him to come home now, and get some milk and bread on the way, or he can live in his martian 24hour, 36m world cuz she's moving out.

  41. Wrong colours again... by daina · · Score: 2, Funny
    NASA should never have used a Sony WHITE memory stick with built-in DRM. That rover probably took a picture of something that looked a little too much like a Disney character, and - bam - total shutdown!

    They should stick with purple next time.

    1. Re:Wrong colours again... by the+pickle · · Score: 1

      That rover probably took a picture of something that looked a little too much like a Disney character

      I told you I saw a face on the surface of Mars!

      Jeez, no one believed me until now...

      p

  42. Congrats by jrc313 · · Score: 1

    Congratulations to the hackers at NASA. I don't know whether to be jealous or glad that it isn't me sitting there hacking away at a machine that is sitting on another planet.

  43. Good news but lost time by Linus+Sixpack · · Score: 1

    The handle on their problem is a very good thing.

    Knowing about the problem before the twin lands is probably a good thing because they might anticipate the problem.

    But if it takes weeks to fix, the solar panels on the lander will be degrading in the Martian atmosphere. They will miss the downtime for Spirit's task list.

    It must be so frustrating to sit on a possible fix and wait for a communication window, or computer response to see if you're right.

  44. screw GDB by borgdows · · Score: 0

    man!! it's what I call remote debugging!!

  45. MOD PARENT AS 'REDUNDANT' by Anonymous Coward · · Score: 0

    That'll show him where the redundancy is!

  46. Yeah! by Anonymous Coward · · Score: 0

    Yay! We win.

  47. Heh by SeaDour · · Score: 1

    The flash card they used on the rover must've been made in Taiwan. ;)

  48. It's a good thing the Spirit had an F8 key by michaelmalak · · Score: 3, Funny

    ...and it's amazing NASA could press it at the right time from 124 million miles away (1.3 AU). Although I wonder how many times NASA did have to press it before they got the timing right -- we only know about the success :-)

    1. Re:It's a good thing the Spirit had an F8 key by MachDelta · · Score: 1

      Ah, they probably just used the tried and true, highly-scientific method of:

      Mash-Mash-Mash-Mash-Mash-Mash-Mash-Mash-Mash-Mash- Mash-Mash-Mash
      "Got it!"

    2. Re:It's a good thing the Spirit had an F8 key by Anonymous Coward · · Score: 0
      ...and it's amazing NASA could press it at the right time from 124 million miles away (1.3 AU).


      The Mars rovers are using VXWorks, not Windows. What does hitting F8 have to do with VXWorks? If anything, blame UNIX for the troubles. Perhaps they SHOULD have used Windows XP as it may have been more fault tolerant than VXWorks.

    3. Re:It's a good thing the Spirit had an F8 key by Anonymous Coward · · Score: 0

      Well, it was rebooting 60 times a day so....

    4. Re:It's a good thing the Spirit had an F8 key by Anonymous Coward · · Score: 0

      VXWorks doesn't have anything to do with UNIX, you dolt.

  49. Salute the Helpdesk by Papa+Legba · · Score: 5, Funny

    I have had some tough calls in my time but I have never had to walk a robot 283 million miles away through brain surgery. Man, I am glad I did not get that call. This is going to blow their call averages all to hell. I raise a cup of Joe to you, Rover Help Desk man.

    --
    Papa Legba come and open the gate
    1. Re:Salute the Helpdesk by Anonymous Coward · · Score: 0

      The only thing is that the first question the Help Desk man will ask NASA will be:

      "Is your system plugged in?".

      From there it only took them a couple of days before the NASA got transferred to the second tier help.

      -cmh

  50. last photo from Spirit by djupedal · · Score: 5, Funny

    This is the last image received prior to the recent issues with Spirit...

    1. Re:last photo from Spirit by HotNeedleOfInquiry · · Score: 1

      Well, at least it wasn't goat.se

      --
      "Eve of Destruction", it's not just for old hippies anymore...
    2. Re:last photo from Spirit by Anonymous Coward · · Score: 0
      Well, at least it wasn't goat.se

      Spirit landed on Mars, it didn't fall into a "black hole" like Beagle... :-)

    3. Re:last photo from Spirit by zcat_NZ · · Score: 1


      Nah, I think this guy had something to do with it...

      --
      455fe10422ca29c4933f95052b792ab2
    4. Re:last photo from Spirit by WankersRevenge · · Score: 1

      I thought Michael Jackson was in court?

    5. Re:last photo from Spirit by Anonymous Coward · · Score: 0

      goatse is a red hole you insensitive clod

  51. Why are they sending its twin so early? by James+Lewis · · Score: 1, Interesting

    I know those guys at NASA are smart... so does anyone know why they sent Opportunity right after Spirit? I would think it would be better to wait and see if any problems occurred with Spirit, and learn from their mistakes. I'm sure there is an optimum alignment of the planets for a launch... but is that really so rare that they couldn't wait?

    1. Re:Why are they sending its twin so early? by Anonymous Coward · · Score: 0

      During the launch window for both these missions, Mars was approaching the closest point to Earth in its orbit in 59,600 years. So the answer is flight time.

    2. Re:Why are they sending its twin so early? by the+eric+conspiracy · · Score: 1

      This close approach was truly historic; the last time Earth and Mars were this close was 60,000 years ago. Moderately close approaches are more common, but it will still be 284 years until the next such very close approach event.

    3. Re:Why are they sending its twin so early? by Anonymous Coward · · Score: 0

      Because that's what makes them twins and not siblings.

  52. Re:Monday morning quarterback: RTOS tradeoffs by GGardner · · Score: 4, Insightful
    memory protection adds overhead and can affect real-time performance

    This is the conventional wisdom, and in my experience, this particular nugget causes more embedded and real time software projects to fail than any other.

    First off, on a modern PowerPC processor, memory protection (that is, without virtual memory support) can be implemented very cheaply. If you can do it just with the IBAT/DBAT registers, it should be a constant-time overhead, which is good enough for hard-real time. Oddly enough, I can't find a single reference on the net that measures the cost of memory protection alone on a modern CPU. Anyone? Anyone?

    Secondly, though the rover certainly may have some software components that have hard real-time requirements, that doesn't mean that every single line of code does. Typically, less than 1 percent of the code in a real-time system is hard real-time. In that case, you can run the real-time code in ISRs, or perhaps in a dual-mode system, like RT-Linux, or in high-priority kernel threads (as with QNX). In any of these situations, you can run all the rest of the code in protected memory space.

  53. Flash? by curious.corn · · Score: 1

    Oh, shite (I'm struggling not to swear trollish gibberish!)... how can the fools even think of using flash in a space mission? What's more rad-sensitive than a bunch of trapped electrons hovering on a thin insulating layer, tweaking a threshold voltage?

    --
    I wonder who is behind all the stupid things I do - Altan
    1. Re:Flash? by geekoid · · Score: 1

      You're right, those morons at NASA, with their PhDs and years of experience, don't know what the hell they're doing.

      --
      The Dunning-Kruger effect explains most posts on /. http://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect
    2. Re:Flash? by Anonymous Coward · · Score: 0

      They obviously don't know what they are doing.
      I expect the other rover to experience the same problem.

    3. Re:Flash? by mkramer · · Score: 1

      There are multiple implementations of rad-hard flash memory that are used regularly in space. Usually the charge pumps are removed, with external voltage sources used instead to reduce Total Ionizing Dose problems. The other major source of upsets is latch-up, the behavior of which is very implementation-dependent. Some designs are practically immune to single-event latch-up; some are very susceptible.

      We've been using flash in space for a while now, with little trouble. Actually, with lower error rates than SRAM (pre-correction, if ECC).

  54. Opportunity by loconet · · Score: 3, Interesting

    Opportunity is fast approaching the red planet. It should be an interesting night at JPL. Excellent work guys, good luck.

    --
    [alk]
  55. You think that's neat by chazR · · Score: 5, Informative

    Here's a rant by a JPL guy about appropriate technologies for software on deep space probes. He recounts one story of a failed probe "100 million dollars, and 100 million miles away".

    They fixed it. The fact there was a lisp REPL running on the spacecraft helped.

    That's cool:

    (unwind-protect
    (progn (do-science)(talk-to-earth))
    (wait-in-repl-for-earth))
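
    For anyone who doesn't read Lisp: unwind-protect runs its cleanup form no matter how the protected form exits, so the probe always ends up waiting in the REPL. A rough Python analogue of the same shape, as a sketch only. The function names are hypothetical stand-ins that mirror the Lisp above, not anything from real flight software:

```python
def run_mission(do_science, talk_to_earth, wait_in_repl_for_earth):
    """Python analogue of (unwind-protect (progn ...) cleanup).

    All three callables are hypothetical placeholders.
    """
    try:
        do_science()
        talk_to_earth()
    finally:
        # Runs whether the body finished normally or raised,
        # like the cleanup form of unwind-protect.
        wait_in_repl_for_earth()


# Demo: the science step fails, but we still end up in the REPL.
log = []


def broken_science():
    log.append("science")
    raise RuntimeError("instrument fault")


try:
    run_mission(broken_science,
                lambda: log.append("talk"),
                lambda: log.append("repl"))
except RuntimeError:
    pass  # the error still propagates after the cleanup has run

print(log)  # ['science', 'repl']
```

    Note that, like unwind-protect, try/finally does not swallow the error; it only guarantees the fallback step runs first.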

    1. Re:You think that's neat by be-fan · · Score: 3, Interesting

      This is a bit OT, but I need to rant:

      A quote from his site: "It is incredibly frustrating watching all this happen... I can't even say the word Lisp without cementing my reputation as a crazy lunatic who thinks Lisp is the Answer to Everything"

      I feel his pain. I was introduced to Lisp not too long ago, and within a short time, a Lisp-derived language (Dylan) became my favorite. I also found that many of the features I loved from Python were very Lisp-y in nature. Now, I see Java and C# either neglecting all the knowledge garnered from the Lisp-family of languages, or reinventing it --- badly. The features in C# 2.0 have either been in Lisp for decades (lambdas, closures) or are not necessary in Lisp (iterators, enumerators --- which, btw, are theoretically not necessary in C# 2.0 either because of lambdas and closures!) This new "Xen" (or X#) language Microsoft Research is pushing takes a great idea (extending the language to fit the problem domain) that has been a part of Lisp for decades, and chops it off at the knees. Instead of having proper macros, so you can extend the language to fit *your* problem domain, they hack support for a single problem domain (back-end business programming) into the language itself!

      That said, the Lisp community is to blame as well. Part of the reason people stop listening the moment somebody says Lisp is that the Lisp community is *so* rabid and *so* unyielding. Especially some high-profile members who are highly respected within the community despite the fact that they are completely obnoxious and lack any human sense of manners.

      --
      A deep unwavering belief is a sure sign you're missing something...
    2. Re:You think that's neat by DAldredge · · Score: 1

      If you would have just mentioned Ruby you would have had the trifecta of Languages that People Have To Fit Into Every Slashdot Post. LPHTFIESP.

    3. Re:You think that's neat by be-fan · · Score: 1

      Hey, I didn't start it! The thread was about Lisp already :)

      PS> That 'sdr' in sdrlabs couldn't possibly have something to do with software-defined radios, could it?

      --
      A deep unwavering belief is a sure sign you're missing something...
    4. Re:You think that's neat by DAldredge · · Score: 1

      No. It's the first letters of my kids' names. :->

    5. Re:You think that's neat by greenrd · · Score: 1
      The features in C# 2.0 have either been in Lisp for decades (lambdas, closures) or are not necessary in Lisp (iterators, enumerators --- which, btw, are theoretically not necessary in C# 2.0 either because of lambdas and closures!)

      Lambdas are just an aesthetics/keystrokes issue. In my opinion, Lisp-lovers need to stop focusing so much on petty, marginal issues like that, and start focusing on really useful subjects - like formal verification and strong typing (which go very well together).

      Part of the reason people stop listening the moment somebody says Lisp is that the Lisp community is *so* rabid and *so* unyielding.

      That's probably because the only people who think Lisp Is The Answer to Everything are a little bit insane.

    6. Re:You think that's neat by be-fan · · Score: 2, Interesting

      Lambdas are just an aesthetics/keystrokes issue.
      --- :: jaw drops :: :: gets on knees ::

      I bow down to your ignorance, oh mighty King of the Clueless!

      Seriously, though, please research lambdas. They don't just save typing. They are *everything*. All of computation can be described just with lambdas of a single parameter. Everything else is just syntactic sugar. If you ease one restriction of the lambda calculus (no side-effects), lambdas can do procedural code, functional code, and even object-oriented programming. That's why I said iterators, enumerators, etc, are not necessary in C# 2.0, because it has proper lambdas. All of those can be implemented very easily on top of lambdas.
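
      A quick sketch of the iterator point, in Python rather than Lisp or C#. This is only an illustration under my own assumptions: make_iterator and for_each are made-up names, and the (has_value, value) pair is just one convenient protocol. The whole "iterator" is nothing but a closure over a counter:

```python
def make_iterator(items):
    """Return a zero-argument closure that hands out successive items.

    Each call returns (True, value) while items remain, then (False, None).
    """
    index = 0

    def next_item():
        nonlocal index
        if index < len(items):
            value = items[index]
            index += 1
            return True, value
        return False, None

    return next_item


def for_each(next_item, action):
    """Drive a closure-based iterator, applying action to each value."""
    while True:
        has_value, value = next_item()
        if not has_value:
            break
        action(value)


seen = []
for_each(make_iterator(["spirit", "opportunity"]), seen.append)
print(seen)  # ['spirit', 'opportunity']
```

      No iterator interface and no special language support: the state lives in the captured variable, which is all an enumerator object really is.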

      and start focusing on really useful subjects - like formal verification
      ---
      You can do formal verification in Lisp. Look up ACL2 (first link in Google search for that term). I'd still say that Haskell or Clean are a bit better for such purposes, but mainly because they are designed from the beginning to be comfortable to program without side-effects, while Lisp was not.

      and strong typing (which go very well together).
      ---
      C'mon. You're making this too easy. Lisp has a strong type system. All type errors are caught, unless you disable type-checking in your compiler. Maybe you mean *static* typing instead?

      That's probably because the only people who think Lisp Is The Answer to Everything are a little bit insane.
      ---
      Um, wasn't that my point? It's the "Lisp is the Answer to Everything" people that make it harder for normal people to push Lisp to areas where it would be really useful.

      --
      A deep unwavering belief is a sure sign you're missing something...
    7. Re:You think that's neat by torpor · · Score: 1

      That's probably because the only people who think Lisp Is The Answer to Everything are a little bit insane.
      ---
      Um, wasn't that my point? It's the "Lisp is the Answer to Everything" people that make it harder for normal people to push Lisp to areas where it would be really useful.


      no no, you don't get it, he's just one of those 'likes to prove smart lisp people wrong' insane people ... those guys are all just a little bit insane as well.

      --
      ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
    8. Re:You think that's neat by Anonymous Coward · · Score: 0

      Your career has been a mirror of mine. How ironic. I checked your link and if I didn't know any better I'd think it was me (except for the Google part: my Google was the NSA). Keep coding, bud... while I glance over at my chugging CP/M Ampro64 over in the corner running a firmware version of Lisp.

  56. It will work fine... by inteller · · Score: 1

    ....until Windows Update pushes out another Critical Update next week.

    NASA needs to turn off Automatic Update. Oh and they need to uninstall MAME32....the martians are having too much fun playing space invaders.

  57. Re:flash ram is known to fail on writes after a wh by Anonymous Coward · · Score: 0

    Welcome to Slashdot. News for Nerds and 10 million experts on every damn subject on the planet. Does everybody here suddenly have a PhD in engineering and space technology? Because I sure haven't. Every whacked out theory gets +3 for Informative now.

    A normal commodity Flash memory has around 10,000 erase cycles minimum. You should be very safe in assuming that whatever NASA is using that it's got plenty of erase cycles for whatever tasks it's going to do.

    A programmer may miss out a - in code or mistake miles for kilometres but somebody would have to be seriously looking the other way for their entire life if they'd specified something which would die this easily.

  58. that line from armageddon comes to mind... by MoFoQ · · Score: 5, Funny

    where the russian cosmonaut says "American components, Russian components. They're all made in Taiwan!"

  59. Imagine... by Anonymous Coward · · Score: 0

    ...a Beowulf cluster of two working rovers! Go NASA!

  60. Did they really use COTS parts ? by Anonymous Coward · · Score: 0

    I find it very difficult to believe that, given the money involved, NASA used Commercial Off The Shelf (COTS) components for something this critical.

    Surely they have enough pull to have custom semiconductors designed and manufactured for this type of task?

    I would have expected some serious on-die redundancy for all the OS boot media as well as RAM, preferably with triple redundancy and bit-level voting for all bits.
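
    To make "triple redundancy with bit-level voting" concrete, here is a minimal software sketch. It assumes three hypothetical copies of the same data; real triple-modular redundancy is done in hardware, and nothing here reflects how the rover's memory is actually built. Each output bit is whatever at least two of the three copies agree on:

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: a bit is set if it is set in at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def read_voted(copies):
    """Combine three equal-length byte strings into one majority-voted result."""
    first, second, third = copies
    return bytes(vote(x, y, z) for x, y, z in zip(first, second, third))


good = b"BOOT"
# Simulate a single-bit upset in the first byte of one copy.
flipped = bytes([good[0] ^ 0x10]) + good[1:]

print(read_voted([good, flipped, good]))  # b'BOOT'
```

    Voting masks any fault confined to one copy per bit position; if two copies are hit in the same bit, the wrong value wins.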

    Oh well, there's always the next time :-)

  61. Solar Flares by anubi · · Score: 1
    Oh yes...

    Remember, when we launched this probe, no sooner than we had it in space, butt-nekid out there, we get assaulted with every kind of solar flare imaginable. So many I lost count, but they were the subject of numerous slashdot stories.

    Kudos, NASA, that this thing works at all!!!

    --
    "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]

    1. Re:Solar Flares by rajid · · Score: 1

      Also, it's interesting that I was reading about a large high energy particle emission which was hitting earth on the same night when Spirit went dead. There was a lot of talk about auroras as far south as middle latitudes in the US (which would be a rather large emission). Interesting coincidence? I'm seriously wondering if the flash memory got fried by this same high energy particle stream? You would think someone would mention this or NASA would address this possibility.

  62. Re:Monday morning quarterback: RTOS tradeoffs by Jeff+DeMaagd · · Score: 1

    How about just simple ECC?

    Maybe I'm crazy but the systems I run that have ECC are incredibly stable even when using Windows. My Alpha got 100+ days uptime with daily use on Windows NT4. I have an old Xeon that easily did 40 days, and I shut it down by mistake.
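
    For what "simple ECC" means at the smallest scale, here is a toy Hamming(7,4) code: 4 data bits, 3 parity bits, and any single flipped bit gets located and corrected. This is only a sketch with made-up function names; real ECC memory uses wider codes implemented in hardware.

```python
def encode(data_bits):
    """Encode 4 data bits as a 7-bit Hamming codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position of the bad bit.
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a single-bit upset
print(decode(word))  # [1, 0, 1, 1]
```

    Two flipped bits in the same codeword would be miscorrected, which is why production ECC adds an extra parity bit to at least detect double errors.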

  63. Imagine by LZ_Mordan · · Score: 0

    Imagine you could decide WHEN you die. Wouldn't that be great? Biological engineering will allow us to do just that. Which do you prefer: seeing a few unmanned missions go to Mars, or ACTUALLY going to Mars? The battle against time is on. Getting the anti-age pill should be top priority for any geek. Imagine unlimited time for geekness? Bliss. Unlimited space exploration for your OWN consciousness!!! Every time I hear space news, that's what I think about....

  64. can't go shopping by whovian · · Score: 1

    Damn! Where's a Wal-Mart when you need one?

    --
    To-do List: Receive telemarketing call during a tornado warning. Check.
  65. Re:Monday morning quarterback: RTOS tradeoffs by anubi · · Score: 2, Interesting
    From where I sit, I think they did damn good.

    I don't know that much about VxWorks, but I heard that one of its main assets is having a very small, tight multitasking kernel.

    They were able to regain the system, despite loss of a major computational component. Remotely. Through a debug link. That sure says a helluva lot for the robustness of the OS and how they configured it.

    Good job, JPL.

    --
    "Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]

  66. Flash error ?? by borgdows · · Score: 0

    Damn Macromedia!!

  67. As someone else said by Viadd · · Score: 5, Funny

    The Spirit is willing, but the flash is weak.
    (Posted by Jane Slee and John Stracke in separate usenet postings.)

    1. Re:As someone else said by genka · · Score: 1

      This is THE funniest joke! Why don't I have any mod points when I need them...

    2. Re:As someone else said by Teach · · Score: 1

      I do have mod points, but it's already at +5. I'd give a 6 or 7 if I could. Holy dang, that's funny!

      --
      Graham "Teach" Mitchell, computer science teacher, Leander HS
    3. Re:As someone else said by Anonymous Coward · · Score: 0

      The spirit is willing but the flesh is soft and spongy.

      Thanks to Zapp Brannigan

  68. Re:You mite listen to Jimmy, But you can't hear Ji by Anonymous Coward · · Score: 0

    Wonder wehre they got they flash ram from?

    From the same place that you got your typing lessons, apparently.

  69. Wrong planet by Anonymous Coward · · Score: 0
    Spirit is on Mars, not Uranus.

    Ba-da-boom. I'm here 'til Tuesday.

  70. Thank God by mtfbwy · · Score: 2, Funny

    They didn't use Windows CE. Remember the diplomat months back that got locked in his 7 series BMW because of a computer crash? :)

  71. Radiation hardened Flash by andygrace · · Score: 5, Informative

    There is a big difference between standard flash and radiation-hardened flash. In fact we are designing a project with one of these VME bus units as a storage array.

  72. Re: Damn it by Anonymous Coward · · Score: 0

    I wonder if they store the initial, by definition very critical boot section in flash or another really non-writable ROM?
    Does anyone know further facts about that topic?

  73. Relative positions of Earth and Mars by LouisvilleDebugger · · Score: 3, Insightful

    The present series of orbiters/landers (Nozomi, Mars Express, Spirit, Opportunity) was launched to take advantage of the best Mars-Earth configuration in something like 60,000 years. I believe the bottom line is that it was a time you could get the most science there for the least launch cost.

    Shame on my fellow American who said we should strip Beagle 2 and leave it up on cinderblocks. If Beagle is ever discovered to have soft landed, I would think the only proper thing to do would be to restore whatever's wrong with it, and let it complete its mission. (HAL, V'Ger, anyone?) Given the discussion of things like the effects of radiation exposure on electronics, you'd just have to be interested to know what a 50-or-150-year-old "dead" lander might be able to wake up and do.

    If Spirit's problems aren't resolved, the Mars Scorecard should at least reflect that Beagle was the less expensive failure.

    (Disclaimer: I visited England for the first time last year, and falling in love with the whole place doesn't begin to describe it. R.I.P. Beagle 2. *sniff*)

  74. same launch window by babazaroni · · Score: 1

    They needed to be in the same launch window, otherwise you would have to wait a long time to send another.

  75. We learn from our mistakes... by Chordonblue · · Score: 4, Interesting

    So... I wonder if they'll consider validating MRAM more quickly if Flash is found to be more error prone.

    You know how NASA works. The Space Shuttle running on 486's and whatnot. I understand the science behind that reasoning, as sad as a 66 MHz processor seems to us geeks nowadays, but I wonder if MRAM will prove more flexible and stable for future space missions.

    --
    "...Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam..."
    1. Re:We learn from our mistakes... by Detritus · · Score: 2, Informative

      The Space Shuttle does not use 486s. It uses IBM AP-101s, which are architecturally similar to the IBM 360/370 series of computers. See the Second Generation Computers FAQ.

      --
      Mea navis aericumbens anguillis abundat
    2. Re:We learn from our mistakes... by John+Courtland · · Score: 1

      No way.... I actually took a course in IBM S/390 Assembler. Awesome. I wonder if they use MVS and JCL too...

      --
      Slashdot is proof that Sturgeon's Law applies to mankind.
  76. RAEFR? by mobby_6kl · · Score: 1

    Redundant Array of Expensive Flash Rams?

  77. Remote nonsense by fm6 · · Score: 2, Insightful
    Just wait until we have Interplanetary, Interstellar, Intergalactic Remote Desktop. I'm only half-joking.
    No you're not. All these Mars glitches are exactly why real space exploration entails sending an actual carbon-based unit, not a glorified laptop.

    Consider that an interstellar probe will take years to receive updated instructions. By which time, any fix will probably be irrelevant. Plus if they're more than 30 light-years away (practically next door by galactic standards) the guy who sent out the instructions probably won't live long enough to find out if they worked!

    1. Re:Remote nonsense by TheLink · · Score: 1

      Humans have a lot more expensive on-site dependencies than computers.

      An interstellar probe using current tech will take years to get where it's aimed for anyway, whether carrying humans or computers. By the time it reaches the nearest star it'd be pretty much irrelevant to anyone except itself.

      --
    2. Re:Remote nonsense by mi · · Score: 1
      sending an actual carbon-based unit, not a glorified laptop

      According to a recent Economist article (on "Bush's grand but costly visions"), NASA reckons that adding a human to a space mission increases its cost ten-fold. Which means we could've sent 10 (probably more, thanks to economies of scale) instead of 2 rovers to Mars at half the price of a single manned mission.

      --
      In Soviet Washington the swamp drains you.
    3. Re:Remote nonsense by fm6 · · Score: 1

      Except that a manned mission will (or should, if it's properly conceived) build up to a permanent presence. This builds national pride in the program, and holds out the hope of long term benefits. No robot probe can hope to do these things -- even on the rare occasions they work!

    4. Re:Remote nonsense by mi · · Score: 1
      build up to a permanent presence

      Did not happen on the Moon yet...

      This builds national pride in the program

      I thought these missions should be about advancing science, not bragging rights. But don't worry, the US still holds the "national pride" record of the Moon landing, and, should a need to show off arise again (if the Chinese land on the Moon, for example), our experience with robotic Mars missions will help us get an astronaut there first.

      --
      In Soviet Washington the swamp drains you.
    5. Re:Remote nonsense by fm6 · · Score: 1
      We don't have a permanent presence on the moon because the Apollo program wasn't designed to create one. A fundamental flaw in the whole concept. They didn't build a system that could be used as a foundation for further space exploration -- they just built this big expensive kludge, the main design goal of which was to get there before the Soviets.
      I thought, this missions should be about advancing science not bragging rights.
      Depends on who you talk to. But that's a moot point -- you can't have one without the other. Taxpayers aren't going to get all emotional about a mission that mainly results in a few scientific papers. They need something more.

      Congress gladly spent $25 billion (almost $120 billion in today's dollars) to put a man on the moon, because it was a matter of national pride. But as soon as it was obvious that the Soviets had dropped out of the race, funding for anything past the initial landing began to dry up.

      I'm not saying plans for a moon base would have saved the manned space program. But we obviously needed to do more than go and fetch a few moon rocks to maintain popular support.

    6. Re:Remote nonsense by mi · · Score: 1

      So your true goal is advancing science (which -- technically -- can be done without human presence), but you think that the only way to get popular support (hence Congress funding) is by fooling the masses into thinking that human presence is needed or somehow useful?

      Should not we be above such things?..

      --
      In Soviet Washington the swamp drains you.
  78. how ms might handle the problem by Anonymous Coward · · Score: 0

    The mars rover seems to have problems. I read that the rover was rebooting itself at least sixty times a day.

    This got me thinking about what a hypothetical service call to microsoft about the problem might be like.

    ring.... ring.... ring......

    Microsoft:
    hello, welcome to microsoft international automated service sys ....... hello, welcome to microsoft international ..... hello welcome to
    microsoft int ..... hell0 ..... hel-l0 .... hell .... no ....

    Your call will be answered in just a moment by Microsoft's advanced, state of the art fishy intelligence system designed to give you the
    illusion of speaking to a real trout while preserving all that you have come to expect from microsoft technology. Please speak naturally to the oi vey when your call is answered.

    pause,

    Hi there, my name is sivekenanda vishnu ramakrishna singh, this is not my real name but rather a randomly generated name designed to prevent you guessing my country of origin; you can call me art. Your call has been routed to me by microsoft artificial intelligence routines which
    have examined your internet traffic for the last three years and determined you are in paris, poland and speak yiddish but due to a minor
    system anomaly in my programming I am speaking to you in english.

    Please accept feltup condolences from everyone at microsoft on the death of your piano.

    How may I be a pittance to you mrs calabash?

    Caller:
    Well, my name is spike mcman and I'm calling you from the jet propulsion laboratories in pasadena california. We seem to be having a little
    problem with one of our spacecraft on Mars, you see it's rebooting itself more than 60 times a day.

    Microsoft:

    And your problem is?

  79. Where's the redundancy? by Anonymous Coward · · Score: 0

    It's named Opportunity.

  80. Flash Memory Brand by Jafafa+Hots · · Score: 1

    They shoulda used some decent Sandisk or Viking flash memory instead of that lousy Mr. Flash crap.

    --
    This space available.
  81. Secure Digital SanDisk by Fubar411 · · Score: 1

    Sandisk Compact Flash is as good as anyone's. It is their SD flash that is terrible.

    1. Re:Secure Digital SanDisk by ncc74656 · · Score: 1
      Sandisk Compact Flash is as good as anyone's. It is their SD flash that is terrible.

      FWIW, I've had no problems with 256MB and 512MB SD cards, used with either a no-name reader or a Palm Tungsten T. What are the problems of which you speak?

      --
      20 January 2017: the End of an Error.
    2. Re:Secure Digital SanDisk by Anonymous Coward · · Score: 0

      Put it in a Dell Axim X5. You'll end up *exactly* like this rover.

      My first thought when they said it was the flash RAM... Did Dell design the controller? Oh God! We're doomed!

      Of course, yes, I know, Dell didn't actually design the X5; they outsourced it. Either way, it's got a POS memory controller, just like the rover.

    3. Re:Secure Digital SanDisk by NDPTAL85 · · Score: 1

      And of course you posted as an AC so that you can't be educated with the correct information.

      The Axim DID have a problem like this but a fix was issued by Dell at no cost to the owners.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
  82. it could happen... by cybin · · Score: 1

    jpl:~ sokeefe$ ssh spirit
    sokeefe@spirit's password:

    Today is Prickle-Prickle, the 24th day of Chaos in the YOLD 3170

    Welcome to Spirit!

    spirit:~ sokeefe$ mkdir /tmp/ramdisk0
    spirit:~ sokeefe$ mke2fs /dev/ram0
    spirit:~ sokeefe$ mount /dev/ram0 /tmp/ramdisk0
    spirit:~ sokeefe$ dd if=/dev/flash1 of=/tmp/ramdisk0/flash1.img

    spirit:~ sokeefe$ reboot
    Connection to host lost.

    jpl:~ sokeefe$

  83. Solution for Next Time by wildsurf · · Score: 2, Interesting

    Flash RAID array.

    (Can this even be done?)
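
    In principle, yes: RAID-style parity is just XOR, and XOR doesn't care what the storage medium is. A toy sketch, assuming three hypothetical flash banks plus one parity bank (the names and layout are invented here and say nothing about how the rover is wired):

```python
from functools import reduce


def parity(blocks):
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing block from the survivors and the parity block."""
    return parity(list(surviving_blocks) + [parity_block])


banks = [b"SOL1", b"SOL2", b"SOL3"]
parity_bank = parity(banks)

# Pretend the middle bank died; rebuild it from the other two plus parity.
recovered = rebuild([banks[0], banks[2]], parity_bank)
print(recovered)  # b'SOL2'
```

    This only covers the case where you know which bank failed. Silent corruption needs a checksum or ECC on top to tell you which copy to distrust.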

    --
    Weeks of coding saves hours of planning.
  84. Nasa TV by Nucleon500 · · Score: 3, Informative

    If you don't get it on cable, you can watch NASA TV here.

    1. Re:Nasa TV by CodeWheeney · · Score: 1

      You think any of those guys in mission control are reading /. right now?

      --
      C8H10N4O2 | Developer > Code
  85. The uncensored news release by Anonymous Coward · · Score: 0

    JPL, Pasadena -- January 21, 2004

    Ground controllers were able to send commands to the Mars Exploration Rover
    Spirit early Wednesday and received a simple signal acknowledging that the
    rover heard them, but they did not receive expected scientific and
    engineering data during scheduled communication passes during the rest of
    that martian day.

    Project managers have not yet determined the cause, but similar events
    occurred several times during the Mars Pathfinder mission. NASA scientists
    suspect, however, that the re-use of AOL floppy disks to hold the
    mission-critical computer software may have contributed to the failure. A
    scientist who requested anonymity said "We were just trying to save a little
    money. We should have stopped using them after the shuttle disasters."

    Full details on the rover's status will be described in the next daily news
    conference Thursday at 9 a.m. Pacific time at the Jet Propulsion Laboratory,
    which will be broadcast live on NASA Television.

  86. Flash Crash by the+eric+conspiracy · · Score: 1

    Eh ... I always carry a couple of flash cards when I go out on a photo shoot with my digital camera (on Earth). If I was sightseeing on Mars I'd go whole hog and bring half a dozen in my kit. The air fare is so ungodly expensive.

    Everybody knows these things crash every now and then and you need to carry spares. I hope Spirit is set up so it doesn't just have one flash RAM device. I'd hate to lose those pictures.

    1. Re:Flash Crash by Anonymous Coward · · Score: 0

      I agree, that sounds like a serious-ass weak design point. Flash RAM is probably among the most radiation-vulnerable stuff around, and it weighs next to nothing, so they should have included at least one redundant bank of it.

  87. One Question by runestar · · Score: 1

    Was the Flash Memory made by Simpletec?

    Okay, now for the back story so fellow Slashdotters are not confused. I work for a Consumer Electronics Manufacturer, and our device uses MultiMediaCards (MMC). Well, we had to design our own MMC cards because those produced by SimpleTec fail way too often for our quality controls. Our techs are always fielding calls about applications closing due to poor memory cards.

  88. Re:Monday morning quarterback: RTOS tradeoffs by rekoil · · Score: 1

    100 days uptime is "incredibly stable"? *snrrk*

  89. Simple Solution by Anonymous Coward · · Score: 0

    Don't write code that fucks with other processes' memory.

    What is the OS supposed to do? "Hey, driver that controls the communication chip, you die!!!" Then what? No link to Earth.

    Just write good code, test it, review it.

  90. Radiation by panxerox · · Score: 2, Interesting

    With all the radiation and very high energy particles zipping through the spacecraft on its way there, I'm surprised any computerized spacecraft get anywhere intact.

    --
    "It's so convenient to have a system where everyone is a criminal" - A. Hitler
  91. Opportunity log last entry by Muhammar · · Score: 1

    "0354 GMT (10:54 p.m. EST Sat.)
    Opportunity is currently 8,268 miles from Mars, traveling at 7,758 miles per hour. The craft's speed will continue to increase as the Martian gravity pulls Opportunity to the planet."

    Right on target. To be confirmed by a bright flash in a few minutes.

    --
    I doubt that we will ever figure out - and I suspect that even if we did figure out we couldn't do much about it
  92. 0417 GMT (11:17 p.m. EST Sat.) by Anonymous Coward · · Score: 0

    Mission control feeling effefcts of /.'ing
    Comun icationss di tvie

    [No Carrier]

  93. Re: Technically... by MachDelta · · Score: 2, Informative
    I hate to be a nitpick, but the exact quote (with context) is:
    Andropov: Excuse me, but I think I know how to fix this.
    Watts: Move it! You don't know the components!
    Andropov: [annoyed] Components. American components, Russian Components, ALL MADE IN TAIWAN!!!

    Oh, and he has another quote I liked too:
    Lev Andropov: This is how we fix things on Russian space station!
    [hits panel with tool]

    But maybe I just like it because that's how I tend to fix things too ;)
  94. Re: Copying links from the topic by some+guy+I+know · · Score: 1
    You can read all about it at: Spaceflight Now -
    You can also read about it at Slashdot, along with informative viewer comments that repeat the link that was given in the original article.
    --
    Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
  95. Spirit is flash-based? by Anonymous Coward · · Score: 0

    Couldn't NASA just hit the "Skip Intro" button in the corner to avoid all these problems?

    1. Re:Spirit is flash-based? by SmurfButcher+Bob · · Score: 1

      The truth about the DEBUG info... the problem was caused by the orbiter being wide-open to relay anything sent to it. Luckily, I've got a dish array in my back yard, and was able to sniff some of that big burst of data they'd received...

      <HTML><HEAD>
      <STYLE></STYLE>
      </HEAD>
      <BODY bgColor=#ffffff>
      <DIV><FONT face=Arial size=2>
      <P><FONT size=2>Buy cheap medications online, no prescription needed.<BR>We have
      Viagra, Pherentermine, Levitra, Soma, Ambien, Tramadol and many more
      products.</FONT> <BR><FONT size=2>No embarrasing trips to the doctor, get it
      delivered directly to your door.<BR><BR>Experienced reliable service.<BR>Most
      trusted name brands.<BR></FONT></P>
      <P><FONT size=2>Your solution is here: <U><FONT color=#000080
      size=1>http://nclxaibfbr.mypillsour ce.com</P></U>< /FONT></FONT></FONT> </DIV></BODY></HTML>

      --

      help me i've cloned myself and can't remember which one I am

  96. Re: Technically... by MoFoQ · · Score: 1

    it was paraphrased!

    also... don't forget duct tape and WD-40 (or a paper clip and a straw for MacGyver).

  97. Soulsuckers bite Spirit by Doc+Ruby · · Score: 1

    The Martian probe "Spirit" has been captured and subverted by the Martian vampires. They have now inserted their virus code into the "debug" data received from Spirit. What security measures has NASA applied to the tainted data?

    Time will tell if the vampire digital virus does to computers what their live virus does to humans. When the Net is swarming with zombie processes that can only be killed -SILVER, the biters' shadow will have fallen across our whole planet. Meanwhile, work on the SOLASER (Solar Optical Light Amplification by Stimulated Emission of Radiation for Killing Vampires) proceeds at a feverish pace. Flooding Earth's corrupted fiber veins with the beneficent bandwidth of the Sun might just keep us out of their icy clutches, and a longrange transorbital freespace burst might even take them out decisively. If you're pulling allnighters in a NOC, or have some surplus SDI gear, please volunteer for betatesting the SOLASER. Stake 'em & bake 'em!

    --

    --
    make install -not war

  98. God Speed Opportunity by Anonymous Coward · · Score: 0

    Your six minutes of hell are nearly over.

    1. Re:God Speed Opportunity by Anonymous Coward · · Score: 2, Informative

      Well Done NASA.. bringing space to us is the next best thing to taking us to space.

      0508 GMT (12:08 a.m. EST)
      A good signal is still being received! Unlike the Spirit landing where signal was lost immediately after touchdown, Opportunity continues to talk to Earth.

      0506 GMT (12:06 a.m. EST)
      After a short loss of signal from the rover, a strong signal is now being received as Opportunity arrives on Mars!

      0505 GMT (12:05 a.m. EST)
      BOUNCING ON MARS! Mission Control has received a signal of Opportunity bouncing on the surface of Mars.

  99. +5 Hysterical by thatguywhoiam · · Score: 1

    This is one of the funniest things I've ever read on Slashdot.

    --
    If Jesus wants me it knows where to find me.
  100. "Opportunity" for a repeat performance? by NoData · · Score: 1

    The system is rebooting no matter which flash memory is being accessed, it has the same bug both ways, so the flash ram itself looks to be OK, but the interface between the flash ram and the software looks to be causing resets ...Even if there were more backup flashrams, it looks like they'd still have this problem... ...But then sending two rovers would also negate problems, and thats just what they've done

    Except, isn't the flash interface mechanism identical on Opportunity? Is this a design flaw or a flawed instance? I wonder how they will circumvent this same scenario with Opportunity...

  101. We need open source rover software by HangingChad · · Score: 3, Interesting
    I'd put the /. community up against NASA any day. Instead of trying to be so secret about everything, open the software up to the community and let the collective propose solutions to some of these issues. Hey, it's our tax dollars developing all this stuff, why can't we play too?

    Besides robot exploration software would be handy right here. It would be neat to be able to send a research bot out in the deserts, deep oceans and jungle canopies of the world. Machines can go where we can't.

    Individually you can be damn annoying sometimes, but I'm constantly amazed and delighted by the collective intelligence of the /. pack.

    --
    That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
  102. Update on Opportunity landing: by Jmstuckman · · Score: 1

    The Opportunity spacecraft has stopped bouncing and has come to rest on the surface of Mars safely! At this time, it appears that the landing was flawless and everything occurred as expected. Al Gore and Arnold witnessed the event with NASA from the JPL. After landing, the spacecraft bounced and then rolled for several minutes (the extended roll was due to the flatness of the landing site). Initial diagnostics performed by the spacecraft detected no faults.

  103. I believe they have two... by kngborg · · Score: 1

    but they both fail, so they think it might be the I/O channel or the I/O bus that is damaged.

  104. It's simple actually by HarveyBirdman · · Score: 2, Informative
    I've actually consulted on this for another group inside my company. You don't wait for a cosmic ray to change a programming bit in an FPGA.

    You have two or more running in parallel. While one is running, the next reloads from ROM. When it's loaded and synchronized, you switch to it, and load the next one. You do that in series, over and over, so you're only using any particular FPGA for a couple of seconds at a time, and their configurations are constantly being refreshed. It's a very simple idea that can be done now.
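
    A rough way to picture the rotation described above is a toy model. This is only a sketch under my own assumptions: GOLDEN, the Fpga class and run() are hypothetical names, one time slot stands in for the "couple of seconds" each device is in use, and the state synchronization a real design needs is not modeled.

```python
# Toy model of round-robin FPGA reconfiguration. All names are hypothetical.
GOLDEN = 0b10110010  # stand-in for the bitstream held in ROM (assumed value)


class Fpga:
    def __init__(self):
        self.config = GOLDEN

    def reload_from_rom(self):
        # Reloading always restores the known-good configuration.
        self.config = GOLDEN

    def is_corrupt(self):
        return self.config != GOLDEN


def run(num_fpgas, slots, upsets):
    """Simulate `slots` time slots of the rotation.

    `upsets` maps a slot number to (fpga index, bit position) to flip
    during that slot. Returns the slots in which the active FPGA was
    running with a corrupted configuration.
    """
    fpgas = [Fpga() for _ in range(num_fpgas)]
    active = 0
    bad_slots = []
    for slot in range(slots):
        if slot in upsets:
            index, bit = upsets[slot]
            fpgas[index].config ^= 1 << bit  # simulated cosmic-ray bit flip
        if fpgas[active].is_corrupt():
            bad_slots.append(slot)
        # While the active device runs, the next one reloads, then takes over.
        standby = (active + 1) % num_fpgas
        fpgas[standby].reload_from_rom()
        active = standby
    return bad_slots
```

    With two devices, run(2, 6, {2: (0, 3)}) reports only slot 2 as bad: the flipped bit is wiped by the next reload of that device. An upset that lands on the standby device, as in run(2, 6, {2: (1, 0)}), never reaches the active path at all in this model.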

    --
    --- Ban humanity.
    1. Re:It's simple actually by juglugs · · Score: 1

      Or design FPGAs with ECC built in to the configuration RAM.

      ASICs aren't too practical because they can't be re-programmed if anything goes wrong - FPGAs can get a new bitstream from Earth...

      --
      This sig is in Spanish when you're not looking....
    2. Re:It's simple actually by HarveyBirdman · · Score: 1

      Yeah, that's the idea for making the actual download error free, but once it's downloaded and configured, you have a further problem. A cosmic ray comes along and flips a bit in the FPGA's configuration SRAM *after* configuration, and everything is ruined. An OR gate somewhere is now an AND gate, or a signal is rerouted to the wrong register. You need to do a reload. So just do constant reloads, and when a bad event happens, its effect lasts just for a second or less.

      --
      --- Ban humanity.
    3. Re:It's simple actually by juglugs · · Score: 1

      True, but then you have to consider that the configuration RAM only has finite reloads before it is rendered unusable - granted, it's in the millions, so for most applications, this isn't an issue, but for the kind of mission we are talking about here (to Mars) with a reload rate of 1 second, you're only going to last a couple of weeks. With an ECC algorithm running constantly in the configuration space you can correct only when necessary.
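
      The "couple of weeks" estimate above can be checked with back-of-the-envelope arithmetic. The figure of 1,000,000 reloads is my own placeholder for "in the millions", and the premise that the configuration RAM wears out at all is taken from the comment as given, not verified.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_worn_out(reload_limit, reload_period_s):
    """Days of operation before `reload_limit` reloads are used up."""
    return reload_limit * reload_period_s / SECONDS_PER_DAY


# Assumed limit of one million reloads, one reload per second.
one_million_at_1hz = days_until_worn_out(1_000_000, 1.0)
```

      Under those assumptions one_million_at_1hz comes out a little under 12 days, which is in line with the estimate; a limit of several million would stretch that to a month or two.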

      --
      This sig is in Spanish when you're not looking....
  105. Opportunity has landed by cybrthng · · Score: 1

    Opportunity has confirmed landing :)

    wahooo

    1. Re:Opportunity has landed by jpatokal · · Score: 1
      Opportunity has confirmed landing :)

      Landing itself isn't too hard, but unlike Beagle 2, Opportunity appears to have landed in one piece. Congrats!

      Cheers,
      -j.

  106. Historical landing sites by aauu · · Score: 1

    We should begin to designate all sites of landers and moon landings as historical sites. No entry within some distance. Otherwise some kid will stomp man's first footprints on the moon out of existence. All the landers on mars will be recycled into local products for the first colonists. History is now and for the future.

    --
    When I was young, I had to rub sticks together to compute.
    1. Re:Historical landing sites by Anonymous Coward · · Score: 0

      IMHO if the colonists can make use of them, I think that's much better than having them sit around collecting dust.

  107. Me too! by theendlessnow · · Score: 1

    I just noticed my Mars rover also has a bad flash. Anyone else? I smell a class action lawsuit on this one.

    Sincerely yours,
    Colin

  108. Cut it out! by Dun+Malg · · Score: 3, Insightful

    OK, you dorks (you know who you are) need to stop postulating about the memory failures having to do with static electricity, martian dust, or lack of redundancy. This is JPL and (the one case of metric vs. standard aside) they thought of all the obvious stuff during the design stage. Do you really think they're slapping their foreheads and saying "the dust! we forgot about the dust!" over in the design lab? Get real, people.

    --
    If a job's not worth doing, it's not worth doing right.
    1. Re:Cut it out! by Capt'n+Hector · · Score: 1

      Don't you find it the least bit suspicious that the thing arrives there just fine after a long time in space, and fails only when it touches a rock with its metallic arm? Static electricity is a very plausible culprit in my mind, simply because of the statistical improbability of a flash RAM failure happening now, of all times.

      --
      Quid festinatio swallonis est aetherfuga inonusti?
      Africus aut Europaeus?
    2. Re:Cut it out! by HeghmoH · · Score: 1

      Maybe you forgot about that one probe where the people at JPL were slapping their foreheads and saying, "Meters?! That number was in meters?! We thought it was feet!"

      And then there was the one where they slapped their foreheads, saying, "What, you wanted average velocity on that line? We thought that was instantaneous velocity!" after the probe in question got dumped into the Atlantic.

      Then there was, "What do you mean, 'too cold to launch'? It's a frigging rocket! We have a schedule to keep!"

      I could go on. I think you get the point. They are only human, and they make stupid mistakes just like everybody else.

      --
      Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
    3. Re:Cut it out! by Anonymous Coward · · Score: 0

      Well to be fair, JPL didn't mess up the metric conversion, a company they bought a part from did.

    4. Re:Cut it out! by Anonymous Coward · · Score: 0

      Considering the fact that the thing is driving around with metal wheels, and didn't fail until well after touching the rock, then no, I don't find it likely that static damaged it. Even if there was a discharge, it would be very unlikely to damage flash RAM in the heart of the rover, as it would be dissipated through the entire chassis. You ever see a computer fried by someone touching the case? Neither have I.

    5. Re:Cut it out! by HeghmoH · · Score: 1

      I haven't found any references to what you say, can you back it up? (If an AC will read this reply and respond to it....)

      The failure was actually not what I quoted, but it was rather a mix-up between pounds of force and newtons. This resulted in mid-flight course corrections not being correct.

      Can you tell me what a part supplier was doing calculating mid-course correction data?

      --
      Mod down posts with a "Free Mac Mini/iPod" sig, they're spam!
    6. Re:Cut it out! by bill_mcgonigle · · Score: 1

      Can you tell me what a part supplier was doing calculating mid-course correction data?

      IIRC, it was Lockheed Martin. The "part" was the software that put the bird into orbit.

      They used pound-force units (pound-seconds of impulse), apparently the standard in aerospace (newtons are not standard in the field).

      NASA assumed newton-seconds. It's not clear whether metric units were specced in the software specification or not (that I've heard).

      --
      My God, it's Full of Source!
      OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
    7. Re:Cut it out! by Old+Wolf · · Score: 1

      If they can forget about a lens cap [Viking], they can forget about dust

    8. Re:Cut it out! by SuiteSisterMary · · Score: 1

      My rule of thumb is that anything that's too suspicious to be a coincidence, is probably a coincidence.

      --
      Vintage computer games and RPG books available. Email me if you're interested.
  109. opportunity is on the ground and stable by rebelcool · · Score: 1

    it has apparently landed on its side petal. That's okay, because it was designed to flip itself to an upright position no matter how it lands.

    --

    -

  110. Totally Recalling This... by JollyGoodChase · · Score: 1
    Also in Mission Control is Arnold Schwarzenegger, California's governor. He and Gore are talking to the entry team and looking at data screens just moments after touchdown.

    [Douglas Quaid seeing his real personality on the screen]
    Douglas Quaid: Now get your ass to Mars.

  111. S.H.I.T. Happens! by Anonymous Coward · · Score: 0

    I couldn't believe my ears when one of the guys at the 12:00 PST briefing said: "It's a *S*oftware *H*ardware *I*nterface *T*hing"

  112. Re:They found the problem MAA help needed by saskboy · · Score: 1

    I should have added MOL insurance to the Martian Automobile Association brochure.

    Those MOLiens get the IntergalacticNet so screwed up, their email doesn't work right, IE is messed up, and when they installed MOL on the rover it started saying:
    "You've Got Rocks!"

    --
    Saskboy's blog is good. 9 out of 10 dentists agree.
  113. Transcend humanity first! by Thinkit3 · · Score: 0

    It is likely that we don't even need flight to transcend humanity. Isolate the area of consciousness in the brain, and put it in charge of a silicon computer. Then it's trivial for a single highly protected, highly intelligent sentient being to go to Mars (if he would even want to).

    --
    -Libertarian secular transhumanist
  114. An opportunity? by pair-a-noyd · · Score: 1

    Now that opportunity has arrived safely maybe it can run over and hit control-alt-delete for the poor little thing? (will we really ever know the truth that Spirit is suffering a case of BSOD?)

    One more and they'll have Huey, Dewey and Louie..

  115. Of course. by Anonymous Coward · · Score: 0

    Of course the politicians showed up. If the first one failed, do you think they'd have been there for this one?

  116. Parent is a moron. by Anonymous Coward · · Score: 0
    Defective flash RAM just happens sometimes
    What they should have done is tested the ram on earth

    Are you kidding? You cannot possibly be serious. They should have tested it? This is freaking NASA. There is no doubt in my mind that every piece of flash RAM was tested 50 gazillion times and simulated more than that. They test everything that goes through there more than most people would, and with good reason (i.e., their stuff has to work in space). I remember reading that it takes 'em like a couple years to get a new meal approved to go up, to test it out. *Years* for a freaking meal, and you think they just sent up some random flash card they bought at Best Buy?!?! I doubt it.

    Plus, if you had bothered to read most (or any) of the reports out there, you'd see that it doesn't seem like it's defective flash ram; it's more likely the interface between the software and that ram.

    I hate to be mean, but god you're a moron. The fact that anyone would moderate your drivel as "insightful" is just scary.

  117. Re:Monday morning quarterback: RTOS tradeoffs by fobfob · · Score: 1

    "Hard real-time software cannot afford to have a complex layered structure and lots of conditional code that adds unpredictable delays."

    VxWorks is the worst of both worlds: not only does it not have the memory protection (unless you use AE), but its "hard" real time performance is not so good because system calls cannot be interrupted.

    A modern operating system such as INTEGRITY has excellent latency characteristics as well as memory and CPU bandwidth protection...

    BTW I am not a salesman for Green Hills, just a humble engineer who recommended them for his embedded PPC project only to be overruled by non-software people and ended up with VxWorks... which by the way has done its job on this project very well. I would still be keen on trying something a little more up to date but overall WRS has come to the party with most major issues. The only problem is you have to ask, and know what to ask, to get a result. They are just as clueless/lazy when it comes to finding problems as you are... unless you are NASA of course.

    We never bought FFS though, maybe that was a good decision! hehe

  118. Dodgy flash RAM by Anonymous Coward · · Score: 0

    I bet they wish they took out an on-site warranty on that Flash Memory now.

  119. Flash!! by maroberts · · Score: 1

    Saviour of the Universe!!

    [with apologies to 1930s movie serials and Queen]

    --

    Donte Alistair Anderson Roberts - hi son!
    Karma: Chameleon

  120. First man on Mars! by JaredOfEuropa · · Score: 1
    I hope they bought the extended warranty.
    Which would mean on-site repair service. First man on Mars will be a Dell repair guy swapping out a bad memory module.
    --
    If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
  121. Cripple mode!?! by niittyniemi · · Score: 1

    That's even worse than talking about man-uals and other sexist remarks, the insensitive clods!

    I note also that RAM is a male sheep. Why not EWE? Because NASA is full of women and disabled person oppressors that's why!

    Pull their funding and disband NASA.

    --
    The Machine stops.
  122. Hear! Hear! I couldn't say it any better! by Teancum · · Score: 3, Interesting

    I was going to mod this one up, but I decided to give this reply some more emphasis by actually replying with some thoughtful encouraging words instead.

    It would be nice to be able to have some folks at JPL throw down the source code and engineering schematics and say to the geek/space/engineering community at large "We have a problem here and could use your suggestions to see if we can get this fixed."

    This (the Mars missions) is obviously a big hit, as measured by replies on Slashdot, the number of hits on the website at JPL, stories in mainstream media, and other reasonable metrics to gauge popularity of a project. I'm sure that there are several geeks out there that wouldn't mind digging into the source code.

    The only reason I could see the engineers not wanting to do that is that it would open them up to obvious scrutiny for poor engineering and coding. (Whadda you mean the global variable named temp is the only variable. We also have temp2, temp3, and temp4. What do the numbers in those mean? You can get it from context, can't you?) That and some people just aren't used to allowing others into their "domain".

    Being 100% funded by public money should also be further reason for why this should be opened up. I also totally agree.

  123. /. was LUCKY by MisterSquid · · Score: 1

    Speculation is one thing, diagnosis and debugging is another. /. may be full of kibitzers, but not a damn one of us who doesn't work for NASA did a thing, really. In the end, even those who hit it "right on," what did they actually know about the situation? Next to nothing, and this means that their solution was a guess. Just because the guess was correct does not mean anything merit-worthy was done by the guesser.

    NASA engineers, 1: Slashdot, 1/2 (a lucky guess is worth something, I suppose)

    --
    blog
  124. Cripple mode? by thedillybar · · Score: 1
    Cripple mode?

    Sounds a lot like safe mode to me. Let's hope it works a hell of a lot better than safe mode or we're screwed.

  125. Re:Monday morning quarterback: RTOS tradeoffs by AaronW · · Score: 3, Insightful

    As someone who has programmed VxWorks (including AE) for several years, I can say AE is a buggy piece of crap. We moved to AE for our project and eventually had to dump it since it was so buggy and slow. Also, as far as flash filesystems go, VxWorks ONLY SUPPORTS FAT, and not even FAT32, so it isn't a very robust filesystem. Not only that, because it's FAT there is no wear-leveling support. I believe there isn't an equivalent of chkdsk either. I also imagine that it can't handle faults in the filesystem (as if anything ever could deal with faults in a FAT filesystem very well).

    With VxWorks you can often get away without any filesystem because all the code is linked together in one big monolithic file. Separate tasks are not separate files (although you can have loadable object files).

    Yes, AE does provide memory protection domains, but it still doesn't clean up after a task dies. Sure, you can free the memory, but not open files, semaphores, pipes, or other things. Malloc in AE is improved over the braindead implementation in standard VxWorks, but it still has a long way to go. For example, AE can't free up open file descriptors, semaphores, or other items a task was using, because such resources usually aren't associated with a particular task. So if you have a task that acquired a semaphore and dies, that semaphore will never be released.

    Hell, Wind River couldn't even get malloc right! Their malloc has got to be the worst implementation I've ever seen! They place free blocks in sorted order (smallest to largest) in a linked list after attempting to combine a new free block with neighboring free blocks. The next time you allocate, it walks the entire linked list until it finds a block large enough! In our case we wound up with tens or even hundreds of thousands of small blocks causing our watchdog timer to kick in because malloc became impossibly slow. AE improves this to use a tree instead of a list, but it still fragments. I ripped out the Wind River implementation and replaced it with Doug Lea's dlmalloc and all our malloc problems were solved, and the fragmentation went from tens of thousands of fragments to only a few dozen.
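
    To illustrate why that list walk hurts, here is a minimal sketch of the lookup as described above, next to a logarithmic lookup over the same data. It is not Wind River's code: first_fit_sorted and tree_like_lookup are hypothetical names, the free list is modeled as a plain sorted list of block sizes, and the binary search merely stands in for a tree index (it is not how AE or dlmalloc are actually implemented).

```python
import bisect


def first_fit_sorted(free_sizes, request):
    """Walk a size-sorted free list from the smallest block upward.

    Returns (index of the chosen block or None, number of nodes visited).
    """
    visited = 0
    for index, size in enumerate(free_sizes):
        visited += 1
        if size >= request:
            return index, visited
    return None, visited


def tree_like_lookup(free_sizes, request):
    """Find the same block with a binary search over the sorted sizes."""
    index = bisect.bisect_left(free_sizes, request)
    return index if index < len(free_sizes) else None


# Many tiny fragments and a single large block, as in the fragmented heap above.
fragmented = [16] * 100_000 + [4096]
```

    On the fragmented list, first_fit_sorted(fragmented, 1024) has to visit all 100,001 nodes before it reaches the one block that fits, while tree_like_lookup(fragmented, 1024) finds the same index in a handful of comparisons. Neither addresses the fragmentation itself, which is what the allocator swap fixed.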

    For an RTOS being pushed for networking it isn't very good there either. It comes with an ancient BSD TCP/IP stack. If you have a device and want to see if it runs VxWorks, just run nmap against it. If it says TCP sequence number guessing is trivial, you can bet it's probably running VxWorks.

    In today's world, VxWorks doesn't cut it any more. Any complex project should choose a real OS like QNX or even embedded Linux over VxWorks. For realtime, Linux usually isn't very good, but Timesys appears to have solved that problem nicely.

    VxWorks isn't even that good at realtime. Usually you can't get any better resolution than half the system tick rate (usually 10ms), so you can't get better than 20ms of resolution in many cases.

    I've also heard many rumours that Wind River is dropping AE, or at least not pushing it. We're not the only ones to have been burned by it. I've heard of only one other company that used it, and they were also burned. I think it was a startup that went out of business.

    In VxWorks, all tasks share the same memory space. Think of every "task" as really a thread and you get the idea. In other words, if a "task" dies, the only way to clean up the system is to reboot.

    Also, VxWorks doesn't scale. The more tasks you have, the slower it runs (i.e. no O(1) scheduler). And with the shared memory, the more complex the code, the harder it is to debug and develop a stable system.

    QNX would have been a much better solution. In QNX, the core OS is very small, and if a task dies it can easily be restarted. In QNX, everything is a task with memory protection. The TCP/IP stack is separate from the core OS, for example, as are all the other drivers. If a driver crashes, it won't take the OS with it. Context switching in QNX is also very fast, faster than VxWorks even though memory protection is involved.

    -Aaron

    --
    This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
  126. Bad flash experiences by Anonymous Coward · · Score: 0

    So my roommate worked for an engineering firm, and his job was designing an interface for a piece of circuitry that used flash memory to hold its programming. He sent some data to the circuit, and the thing stopped responding. He checked it inside and out, looked at the data he sent, checked the whitepapers about the electronics, and came to the conclusion that, because he had sent improperly-formatted data to the circuit, the little controller chip freaked out and deleted everything on the flash memory. He didn't send "delete everything please" or "wipe the flash" data, just something like "0225" when the chip was waiting for "225". If your circuitry isn't robust enough to withstand improperly-formatted data to a very small degree, you should probably redesign it to be more reliable.

    Not that NASA did the same thing; when NASA builds a piece of equipment to last, it almost invariably does. That only 17 people have died in the thousands of launches and recoveries the space program has been through is a testament to the engineering abilities in the organization.

  127. Re:Monday morning quarterback: RTOS tradeoffs by AaronW · · Score: 2, Informative

    I can tell you that AE is in many ways WORSE than the standard VxWorks. It has a lot more bugs and is quite a bit slower. Think of regular VxWorks with memory protection hacked in, not designed in from the ground up.

    As a VxWorks programmer for the last 5 years, I can honestly say VxWorks is a PoS that is losing market share at a tremendous rate to the likes of embedded Linux and QNX. Wind River decided to spend tons of money buying add-in products like Routerware instead of improving their RTOS. It was a huge waste of money and now they're paying for it. They're losing money hand over fist and have had a lot of layoffs lately. They were good at one time, but they have fallen far behind the curve now in embedded RTOS design, especially for complex systems.

    VxWorks comes with support for a FAT flash file system, a completely broken malloc implementation, an ancient BSD TCP/IP stack, poor RT support, no memory protection, and no way to clean up after a task that dies. Not only that, it usually costs a fortune, but I've heard they're willing to sell it very cheap now because they're desperate.

    I looked into embedded Linux for our next generation hardware and software and Timesys appears to have a very nice solution with hard real-time support. The kernel is fully preemptable using semaphores instead of spinlocks and has priority inversion support. They also offer resource reservations, so I can say "I want this task to be guaranteed 5.73ms of execution time every 9.8ms" where after 5.73ms the task either gives up the CPU entirely, or else changes to a non-RT priority to not starve other tasks. It's really quite clever. Not only that, unlike RT-Linux there isn't a separate API for RT vs non-RT tasks. Monta Vista Linux is soft real-time. It cannot guarantee context switching time, nor does it deal with priority inversion. In RT, priority inversion can be a major problem (see the first Mars rover for an example).
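
    For a feel of what such a reservation costs, the "5.73ms every 9.8ms" example works out to roughly 58.5% of the CPU. The sketch below is not the Timesys API; utilization and admit are hypothetical helpers, and the check that total utilization stays at or below 1.0 is only a necessary condition for the reservations to fit on one CPU, not a full schedulability test.

```python
def utilization(reservations):
    """Total CPU fraction claimed by (budget_ms, period_ms) reservations."""
    return sum(budget / period for budget, period in reservations)


def admit(existing, new):
    """Accept a new reservation only if the CPU is not over-committed."""
    return utilization(existing + [new]) <= 1.0


# The reservation quoted in the comment above.
quoted = [(5.73, 9.8)]
```

    With the quoted reservation already in place, admit(quoted, (4.0, 10.0)) is accepted, while admit(quoted, (5.0, 10.0)) is rejected because the two together would ask for more than 100% of the CPU.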

    For an example of priority inversion say you have 3 tasks, a low priority, medium priority, and a high priority. The low priority task acquires a mutex semaphore to protect a critical section and starts processing. It is interrupted by a medium priority task. Meanwhile, a high-priority task unblocks and attempts to grab the mutex. The high priority task will block until the medium priority task blocks so that the low priority task can release the semaphore. A common solution is priority inheritance. With priority inheritance, as soon as the high priority task attempts to acquire the mutex semaphore, the low priority task has its priority bumped to that of the high priority task until it releases the semaphore. In this way, the low priority task will interrupt the medium priority task so that the high priority task won't have to wait as long.
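
    The three-task scenario above can be played out in a toy tick-based scheduler. This is a sketch under my own assumptions, not any RTOS's implementation: the task lengths and arrival times are made up, simulate is a hypothetical name, and the low and high priority tasks are assumed to hold the mutex for their entire run.

```python
def simulate(inherit):
    """Return the tick at which the high priority task finishes."""
    tasks = {
        "low":  {"prio": 1, "arrive": 0, "left": 3, "needs_lock": True},
        "med":  {"prio": 2, "arrive": 1, "left": 5, "needs_lock": False},
        "high": {"prio": 3, "arrive": 2, "left": 1, "needs_lock": True},
    }
    owner = None  # name of the task currently holding the mutex
    for tick in range(100):
        ready = [name for name, task in tasks.items()
                 if task["arrive"] <= tick and task["left"] > 0]
        # A task that needs the mutex while another task holds it is blocked.
        blocked = [name for name in ready
                   if tasks[name]["needs_lock"] and owner not in (None, name)]
        runnable = [name for name in ready if name not in blocked]
        if not runnable:
            continue

        def effective_prio(name):
            prio = tasks[name]["prio"]
            if inherit and name == owner:
                # Priority inheritance: the owner borrows the highest
                # priority among the tasks waiting on its mutex.
                prio = max([prio] + [tasks[b]["prio"] for b in blocked])
            return prio

        current = max(runnable, key=effective_prio)
        if tasks[current]["needs_lock"]:
            owner = current
        tasks[current]["left"] -= 1
        if tasks[current]["left"] == 0:
            if owner == current:
                owner = None  # finishing the task releases the mutex
            if current == "high":
                return tick + 1
    raise RuntimeError("high priority task never finished")
```

    Without inheritance, simulate(False) returns 9: the high priority task waits for all of the medium task's work plus the rest of the low task's critical section. With inheritance, simulate(True) returns 5, because the low priority task is boosted past the medium one as soon as the high priority task blocks on the mutex.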

    QNX is also a very good alternative. Very fast context switching and extremely robust memory protection. I think with QNX you can even buy a license suitable for use in medical devices (i.e. you absolutely cannot afford to have the OS crash for any reason).

    I've heard rumours that Wind River is dropping AE since nobody is using it. After our experience, I pity whoever tries it.

    Also, unless you get the source to VxWorks, which usually costs a lot of $$$, debugging is a complete nightmare, especially when you hit a bug in the Wind River code (and there's a lot of them). Hell, they couldn't even implement malloc right!

    Wind River is coming out with version 6 of VxWorks, but it is basically an enhanced version of AE. I'm not holding my breath.

    -Aaron

    --
    This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
  128. Did you learn nothing from minority report? by CarlDenny · · Score: 1

    You should always trust the results from the more powerful female computer.

  129. martian hackers by stuuf · · Score: 1

    There goes my theory of an alien changing inittab to make it to boot to runlevel 6.

    --

    Everyone is born right-handed; only the greatest overcome it

  130. Re:Monday morning quarterback: RTOS tradeoffs by gnalre · · Score: 1

    Interesting stuff.

    Just one correction: Wind River now sells the VxWorks source code quite cheaply as long as you are on the current licence model (which unfortunately costs the earth). Not as cheap as Linux, however...

    --
    Choose your allies carefully, it is highly unlikely you will be held accountable for the actions of your enemies
  131. Spirit problem duplicated in lab by dellis78741 · · Score: 1

    NASA engineers reported on Monday that they reproduced in their test facility the same error that hit the Spirit Rover last week. Basically, the file system didn't have any proper limits set and had created too many files, leading to a situation where the software would gag. They are now deleting older files such as those generated while traveling to Mars. see: http://www.nytimes.com/2004/01/27/science/space/27MARS.html

    --
    ======= ~\_/~\_O Burmese
    1. Re:Spirit problem duplicated in lab by dellis78741 · · Score: 1

      Here's a detailed blow-by-blow from Jennifer Trosper on how they have worked through Spirit's problem so far. http://www.spaceflightnow.com/mars/mera/040126spirit.html

      --
      ======= ~\_/~\_O Burmese
  132. Re:Monday morning quarterback: RTOS tradeoffs by pslam · · Score: 1
    This is the conventional wisdom, and in my experience, this particular nugget causes more embedded and real time software projects to fail than any other.

    The conventional wisdom is "a production system doesn't have bugs so we don't need to protect against them". They obviously never use their own chip designs. It takes so much longer to ever get the damn system working, and when it works you still don't know if you're just being lucky with memory corruption. I'm really going to have to shoot the next manufacturer I see quoting that memory protection is unnecessary. Unfortunately, embedded chip manufacturers all seem to have forgotten that development time needs to be factored into cost.

    First off, on a modern PowerPC processor, memory protection (that is, without virtual memory support) can be implemented very cheaply. If you can do it just with the IBAT/DBAT registers, it should be a constant-time overhead, which is good enough for hard-real time. Oddly enough, I can't find a single reference on the net that measures the cost of memory protection alone on a modern CPU. Anyone? Anyone?

    I don't know about PowerPC, but a typical ARM MMU has a very small overhead, especially if you use large page sizes. From memory, there's 1 page directory entry (or large page) per 4MB of address space, which points to a page table containing 32-bit entries for each page (4KB or 64KB pages). There's something like 32 TLB entries, and lookups are done in hardware straight from memory.

    So, every time there's a TLB miss, there's an additional latency of two 32 bit SDRAM reads. You can put the tables in internal SRAM (if you have it), which means it only takes a few extra clocks on a TLB miss. TLB misses generally aren't an overhead in my experience - you're going to be dwarfed by the cache miss cost anyway.
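
    As a sanity check on the "two reads per TLB miss" claim, here is a small software model of the two-level lookup. It uses the figures quoted from memory above (4MB per directory entry, 4KB pages, 32 TLB entries) without verifying them against any ARM manual; the Mmu class is hypothetical, and the LRU replacement policy is my own assumption, since real TLBs often use round-robin or random replacement.

```python
from collections import OrderedDict

PAGE_SIZE = 4 * 1024         # 4KB pages
DIR_SPAN = 4 * 1024 * 1024   # address range covered by one directory entry
TLB_ENTRIES = 32


class Mmu:
    def __init__(self):
        self.tlb = OrderedDict()  # virtual page number -> physical page number
        self.table_reads = 0      # memory reads spent walking page tables

    def translate(self, vaddr, page_tables):
        """Translate a virtual address, counting reads caused by TLB misses."""
        vpn = vaddr // PAGE_SIZE
        if vpn in self.tlb:
            self.tlb.move_to_end(vpn)  # hit: no table reads at all
        else:
            # Miss: one read for the directory entry, one for the page entry.
            table = page_tables[vaddr // DIR_SPAN]
            self.table_reads += 1
            ppn = table[(vaddr % DIR_SPAN) // PAGE_SIZE]
            self.table_reads += 1
            self.tlb[vpn] = ppn
            if len(self.tlb) > TLB_ENTRIES:
                self.tlb.popitem(last=False)  # evict the least recently used
        return self.tlb[vpn] * PAGE_SIZE + vaddr % PAGE_SIZE


# One directory entry whose table maps virtual page i to physical page i + 100.
example_tables = {0: {i: i + 100 for i in range(1024)}}
```

    Translating 0x1234 with example_tables costs two table reads; translating 0x1238 right afterwards hits the TLB and costs none. The extra cost only shows up again once the working set exceeds the 32 cached pages.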

    Basically, anyone telling you an MMU has overhead is talking out of the wrong orifice. There are very, very few things which would ever need better latency than a couple of cache misses and TLB misses. If you need that kind of low latency, it should be done in hardware.