Stress-Testing Software For Deep Space
kenekaplan writes "NASA has used VxWorks for several deep space missions, including Sojourner, Spirit, Opportunity and the Mars Reconnaissance Orbiter. When the space agency's Jet Propulsion Laboratory (JPL) needs to run stress tests or simulations for upgrades and fixes to the OS, Wind River's Mike Deliman gets the call. In a recent interview, Deliman, a senior member of the technical staff at Wind River, which is owned by Intel, gave a peek at the legacy technology under Curiosity's hood and recalled the emergency call he got when an earlier Mars mission hit a software snag after liftoff."
While I buy that the landing systems need an RTOS, I doubt Curiosity does. Image processing that happens with "precision"? Do x86 processors not process images precisely enough? I get the idea of being hardened to radiation but it was my understanding we have newer processors that fit the bill on this. The rest of this seems like a rationalization for using old hardware. However, as an engineer for the government it's possible I'm just old and embittered.
So, You work for Microsoft?
With my long experience with VxWorks this doesn't surprise me. VxWorks is not the most robust RTOS. Think of it as a multi-tasking MS-DOS. The version they used has no memory protection between processes and I have found numerous areas of VxWorks to be badly implemented or downright buggy. Up through version 5.3 the malloc() implementation was absolutely horrid and suffered from severe fragmentation and performance problems. On the platform I was working with I replaced the VxWorks implementation with Doug Lea's implementation (which glibc was based off of) and our startup time dropped from an hour to 3 minutes. I was also able to easily add instrumentation so we could quickly find memory leaks or heap corruption in the field, something not possible with Wind River's implementation. After reading about the problems with the filesystem I looked at the Wind River filesystem code. It was rather ugly. They map FAT on top of flash memory (not the best choice) and the corner cases were not well handled (like a full filesystem).
Similarly, their TCP/IP stack sucked as well. If you can drop to the T-shell through a security exploit you totally own the box (i.e. Huawei's poor security record).
VxWorks is fine for simple applications, but for very complex applications it sucks. At least the 5.x series do not clean up after a task if it crashes because it does not keep track of what resources are used by a task. A task is basically just a thread of execution. All memory is a shared global pool. At the time it did have one feature that was useful that was lacking in Linux, priority inheritance mutexes. These are a requirement for proper real-time performance and I believe are now included in Linux.
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Up through version 5.3 the malloc() implementation was absolutely horrid and suffered from severe fragmentation and performance problems.
I talked to one of Curiosity's software engineers the day it landed... he mentioned that one of their coding rules was: no malloc() allowed.
I don't care if it's 90,000 hectares. That lake was not my doing.