Slashdot Mirror


Spirit Sends Debug Information to Earth

gfilion writes "NASA has issued a press release that says: 'Shortly before noon, controllers were surprised to receive a relay of data from Spirit via the Mars Odyssey orbiter. Spirit sent 73 megabits at a rate of 128 kilobits per second.'" They've been having communications troubles with Spirit since Wednesday, so it's good to hear from it again, even if the data is just filler.

5 of 477 comments

  1. Spirit rebooting 60 times a day by G4from128k · · Score: 5, Informative

    CNN is reporting that Spirit is rebooting itself 60 times a day. NASA suspects a hardware fault that is causing the processor to detect trouble and automatically reboot.

    --
    Two wrongs don't make a right, but three lefts do.
    1. Re:Spirit rebooting 60 times a day by spaceyhackerlady · · Score: 5, Informative
      Something like 2/3 of NASA's recent missions have failed in some way or another. Is it quite possible that NASA engineers simply have not mastered the art and science of designing hardware and software operable in the harshest of environments?

      Maybe they have. That's how they know how difficult a task it is to get it right.

      I am something of an aerospace engineer, and work professionally with real-time systems (based on VxWorks - fancy that!). Let me illustrate the kind of bizarre bug that can happen on a spacecraft, and how it was fixed from the ground.

      Consider a satellite with a simple on-board computer. To guard against the OS locking up (no matter how good the software is, you can't protect against radiation-induced bit flips in memory), it has a hardware watchdog timer. The software resets the timer periodically, before the hardware can reboot the system. Things run well for a while.
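
      A minimal sketch of the watchdog arrangement described above, written as a tick-by-tick simulation rather than real flight code. The class name, the tick counts, and the idea that a reboot clears the hang are all assumptions made for illustration.

```python
class Watchdog:
    """Toy hardware watchdog: forces a reboot if not kicked within timeout_ticks."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks
        self.counter = 0
        self.reboots = 0

    def kick(self):
        # The software calls this periodically to prove it is still alive.
        self.counter = 0

    def tick(self):
        # One step of the hardware clock. Returns True if a reboot was forced.
        self.counter += 1
        if self.counter >= self.timeout_ticks:
            self.reboots += 1
            self.counter = 0  # the countdown restarts after the reboot
            return True
        return False


def run(total_ticks, kick_every, timeout_ticks, hang_at=None):
    """Simulate software that kicks every `kick_every` ticks.

    If `hang_at` is given, the software locks up at that tick and stays
    locked up until the watchdog reboots it (an assumption of this sketch).
    """
    dog = Watchdog(timeout_ticks)
    hung = False
    for t in range(1, total_ticks + 1):
        if t == hang_at:
            hung = True
        if not hung and t % kick_every == 0:
            dog.kick()
        if dog.tick():
            hung = False
    return dog.reboots
```

      With these made-up numbers, run(100, kick_every=5, timeout_ticks=10) never reboots, while adding hang_at=50 produces exactly one reboot, after which the software resumes kicking.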

      Then the on-board system starts resetting for no apparent reason. No suggestion of memory problems, no apparent hardware problems. The problem is traced to a radiation-induced change in component values in the watchdog timer, causing the timer to go off sooner than expected. Until the satellite was finally shut down a few years later, an important task of the ground stations was checking for watchdog resets and adjusting the software watchdog task accordingly. When the software eventually spent all its time resetting the watchdog timer, the satellite could no longer function and was shut down.
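
      To put rough numbers on why a shrinking timeout eventually eats the whole processor, here is a back-of-the-envelope sketch. The kick cost, the safety margin, and the timeout values are invented; the story gives no figures.

```python
def kick_overhead(timeout_ms, kick_cost_ms, margin=0.5):
    """Fraction of CPU time spent servicing the watchdog.

    Assumes (hypothetically) that the software kicks once every
    timeout_ms * margin, and that each kick costs kick_cost_ms of CPU.
    """
    period_ms = timeout_ms * margin
    return min(1.0, kick_cost_ms / period_ms)


# As the hardware timeout drifts shorter, the same 1 ms kick costs more.
overheads = {t: kick_overhead(t, kick_cost_ms=1.0) for t in (1000, 100, 10, 2)}
```

      Under these assumptions the overhead climbs from 0.2% at a 1000 ms timeout to 100% at 2 ms, at which point nothing else can run.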

      The moral of the story: space is weird and hostile. Things happen. No matter how hard you try, you cannot always get it right.

      ...laura

  2. It wasn't exactly 'filler' by Eevee · · Score: 5, Informative

    Only a couple of frames were fillers of random values. Most of the frames were engineering data. No actual scientific data came down, though.

    Still, it's a good sign that it's still able to talk.

  3. Re:Linux Cost Tax Payers at least $410M...nothing by TheGrayArea · · Score: 5, Informative

    You might want to check your facts before you spew. While the ground system is heavy on Linux according to the article you referenced, the actual OS on the rover itself is VxWorks from Wind River.
    http://www.windriver.com/news/press/20040105.html

    --

    This space for rent.
  4. Re:Wind river by AaronW · · Score: 5, Informative

    I wouldn't brag. I've been programming VxWorks for several years now and all I can say is it's a piece of crap for a complex system.

    VxWorks does not provide any memory protection (well, AE does, but it's so buggy nobody uses it).

    If a task dies, VxWorks does not clean up after it. All memory is global, i.e. any task can overwrite memory belonging to any other task.

    Wind River couldn't even write a decent malloc. I had to replace it with Doug Lea's DLMalloc code (which glibc's malloc is based on). The stock allocator fragments horribly, and gets slower and slower the more free blocks exist.

    Just by replacing malloc, I brought the time down on our box from 50 minutes to under 3 minutes and went from tens of thousands of fragments to a couple of dozen.
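
    A rough illustration of why allocation can slow down as free blocks pile up. It assumes a plain first-fit allocator that walks a linear free list, which the post does not state is what the stock malloc does; the block sizes and addresses are made up.

```python
def first_fit(free_list, size):
    """Return (index, blocks_examined) for the first free block >= size.

    free_list is a list of (address, block_size) tuples, scanned in order.
    """
    for examined, (_addr, block_size) in enumerate(free_list, start=1):
        if block_size >= size:
            return examined - 1, examined
    return None, len(free_list)


def fragmented_heap(num_fragments, fragment_size=16, big_block=4096):
    """Hypothetical heap: many small free fragments, then one large block."""
    free_list = [(i * 64, fragment_size) for i in range(num_fragments)]
    free_list.append((num_fragments * 64, big_block))
    return free_list


def scan_cost(num_fragments, request=256):
    """How many free blocks a first-fit search touches for one request."""
    _, examined = first_fit(fragmented_heap(num_fragments), request)
    return examined
```

    With a couple of dozen fragments, scan_cost(24) touches 25 blocks; with scan_cost(20000) it touches 20001, for every single allocation that doesn't fit in a fragment.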

    If you want a reliable embedded system with a lot of complexity, go with QNX or perhaps a good embedded Linux (I like Timesys Linux myself - good realtime support).

    At least with QNX if there's a problem in a task, it's much easier to isolate it and not kill the entire system. As it is on the product I'm working on, if a task dies about the only way to recover is to reboot. Also, VxWorks has piss-poor built-in debugging support. Sometimes you can get a stack trace. Tracing the heap is virtually impossible (and because it's a global memory pool, you don't even know what blocks were allocated by what task or even how much memory each task has allocated). In the product I'm working on I added such support to find memory leaks and detect memory corruption.
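
    The kind of heap tracking described above might look roughly like this. It is a toy model over a bytearray, not the code from the product: the class, the guard pattern, and the task names are all hypothetical.

```python
GUARD = b"\xde\xad\xbe\xef"  # hypothetical marker written after every block


class TrackedHeap:
    """Toy global heap that records which task owns each block."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.next_free = 0
        self.blocks = {}  # address -> (owner task name, size)

    def alloc(self, task, size):
        addr = self.next_free
        end = addr + size
        if end + len(GUARD) > len(self.mem):
            raise MemoryError("toy heap exhausted")
        self.mem[end:end + len(GUARD)] = GUARD
        self.blocks[addr] = (task, size)
        self.next_free = end + len(GUARD)
        return addr

    def free(self, addr):
        # Simplification: this bump allocator never reuses freed space.
        del self.blocks[addr]

    def usage_by_task(self):
        """Bytes currently allocated per task: leaks show up as growth here."""
        totals = {}
        for task, size in self.blocks.values():
            totals[task] = totals.get(task, 0) + size
        return totals

    def corrupted_blocks(self):
        """Blocks whose trailing guard was overwritten (a buffer overrun)."""
        bad = []
        for addr, (task, size) in self.blocks.items():
            end = addr + size
            if bytes(self.mem[end:end + len(GUARD)]) != GUARD:
                bad.append((addr, task))
        return bad


heap = TrackedHeap(256)
a = heap.alloc("taskA", 16)
b = heap.alloc("taskB", 32)
heap.mem[a:a + 20] = b"x" * 20  # taskA writes 4 bytes past its 16-byte block
```

    After the overrun, heap.corrupted_blocks() reports taskA's block, and heap.usage_by_task() shows how much each task still holds. Nothing stops the bad write itself, since the memory is one global pool; the guard only makes it detectable afterwards.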

    VxWorks AE does provide memory protection. We tried to use it, but it was so buggy and slow we had to drop it and go back to standard VxWorks.

    VxWorks hasn't really changed in the last few years and Wind River is losing customers like crazy to the better alternatives. They're hemorrhaging money at an astronomical rate and quickly losing market share to the likes of QNX and Linux.

    Even the realtime performance of VxWorks isn't that great. The finest granularity for a reliable timer is 1/2 the system tick rate (often no more than 20ms resolution).
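
    For a sense of what tick-based timing does to short delays, here is the generic arithmetic for any timer that only fires on system tick boundaries. It is not VxWorks-specific, and the 50 Hz tick rate is an assumption picked to match the 20ms figure.

```python
import math


def tick_period_ms(tick_rate_hz):
    """Length of one system tick in milliseconds."""
    return 1000.0 / tick_rate_hz


def delay_in_ticks(delay_ms, tick_rate_hz):
    """Smallest whole number of ticks that covers delay_ms."""
    return math.ceil(delay_ms / tick_period_ms(tick_rate_hz))


def delay_bounds_ms(delay_ms, tick_rate_hz):
    """(shortest, longest) real delay when waiting that many tick boundaries.

    The spread comes from where in the current tick the wait begins.
    """
    ticks = delay_in_ticks(delay_ms, tick_rate_hz)
    period = tick_period_ms(tick_rate_hz)
    return (ticks - 1) * period, ticks * period
```

    At 50 Hz, a requested 5 ms delay becomes 1 tick and can take anywhere from almost 0 to 20 ms; a requested 30 ms delay becomes 2 ticks, i.e. 20 to 40 ms.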

    VxWorks doesn't have a shell as such either. The commands you type in are functions with parameters to those functions. You can do things like

    my_global = global_a + 7

    or

    my_func(&my_global, 3)

    on the command line, but it's not at all like a traditional command line.

    Most real-time Linux implementations aren't all that great either, from my research into them. Most don't deal with priority inversion, or require a completely separate set of APIs for RT tasks (i.e. RT Linux). I found Timesys Linux to solve most of these issues and it looks like our next generation will be based on either Timesys Linux or QNX.
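
    For anyone who hasn't run into priority inversion, a small simulation of the classic three-task case, with and without priority inheritance. This is a generic sketch, not how Timesys or any particular kernel implements it; the task names, priorities, and tick counts are made up.

```python
def high_task_start_tick(priority_inheritance, max_ticks=100):
    """Tick at which the high-priority task first gets to run."""
    # Larger number = higher priority. All values are illustrative only.
    prio = {"low": 1, "medium": 2, "high": 3}
    arrival = {"low": 0, "medium": 2, "high": 1}
    work = {"low": 3, "medium": 5, "high": 1}  # ticks of CPU each task needs
    needs_lock = {"low", "high"}               # medium never touches the lock
    lock_owner = None

    for tick in range(max_ticks):
        ready = [t for t in prio if arrival[t] <= tick and work[t] > 0]
        # A task that needs the lock while another task holds it is blocked.
        blocked = [t for t in ready
                   if t in needs_lock and lock_owner not in (None, t)]
        runnable = [t for t in ready if t not in blocked]
        if not runnable:
            continue

        def effective(t):
            if priority_inheritance and t == lock_owner and blocked:
                # The lock holder borrows the priority of its highest waiter.
                return max(prio[t], max(prio[b] for b in blocked))
            return prio[t]

        task = max(runnable, key=effective)
        if task in needs_lock:
            lock_owner = task
        if task == "high":
            return tick
        work[task] -= 1
        if work[task] == 0 and lock_owner == task:
            lock_owner = None  # finished its critical section
    return None
```

    Without inheritance the medium task preempts the lock holder and the high task doesn't run until tick 8; with inheritance the low task finishes its critical section first and the high task runs at tick 3.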

    -Aaron

    --
    This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.