Stress-Testing Software For Deep Space


Posted by samzenpus on Wednesday October 10, 2012 @02:24PM from the phone-the-help-desk dept.

kenekaplan writes "NASA has used VxWorks for several deep space missions, including Sojourner, Spirit, Opportunity and the Mars Reconnaissance Orbiter. When the space agency's Jet Propulsion Laboratory (JPL) needs to run stress tests or simulations for upgrades and fixes to the OS, Wind River's Mike Deliman gets the call. In a recent interview, Deliman, a senior member of the technical staff at Wind River, which is owned by Intel, gave a peek at the legacy technology under Curiosity's hood and recalled the emergency call he got when an earlier Mars mission hit a software snag after liftoff."

Re:Seems like a rationalization by Sasayaki · 2012-10-10 14:56 · Score: 5, Insightful

My understanding is that the thinking goes like this.
Sure, there are newer processors that claim to fit the bill. But space hasn't changed so much since the Apollo days that we need all-new processors; by and large, anything that needs "heavy lifting" CPU-wise can be transmitted back to Earth. For unmanned probes, there's very little demand for high-speed CPU tasks that can't be offloaded to Earth. And even if there were, when your latency back to your operator is about 14 minutes (with an extra 14 to receive further instructions, plus the time it takes to interpret the previous data set, determine new instructions, then program those instructions), that's a lot of downtime to work on various tasks.
The Mars rover CPUs, I imagine, spend the vast majority of their time idling.
However... the old stuff works. It has its faults and flaws, sure, but they're extremely well known and documented. You can work around them. You have the old grognards that have been kicking around since Apollo who know every damn thing about them. They're risky, sure, but it's a managed, controlled, limited and understood risk. But new processors are *new*. You lose that element of certainty, and the CPU is the heart of a probe. You lose it, you're fucked.
You're trusting the mission, a mission that costs billions of bucks, to a new, untested device that hasn't been field tested, hasn't got that certainty, and *you just don't need*.

--
Check out my sci-fi book "Lacuna" at http://goo.gl/MVxX8
Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 15:02 · Score: 5, Informative

That's why land-based projects like the SKA, for example, which also take decades to complete, are designed with Moore's law in mind. This leads to a very funny situation: the project starts and they begin building stuff, but the computers that will run the thing are still 10 years away (and I guess everybody just hopes computers will keep up, or else...).
Also, you must take into account that the actual instruments are built fairly early, i.e. 5 or more years before launch, since there is a LOT of testing, calibration, more testing, etc. Additionally, when the stake is a billion-dollar project like these, you tend to leave out fancy new things and favor old, proven, well-documented tech. Just in case...
If not, and you have the space and money, you just mount two instruments: a fancy new one and the old usual thing (as is the case for Solar Orbiter, for example).
Re:"earlier Mars mission" == MER-A Spirit by AaronW · 2012-10-10 15:20 · Score: 5, Informative

With my long experience with VxWorks this doesn't surprise me. VxWorks is not the most robust RTOS. Think of it as a multi-tasking MS-DOS. The version they used has no memory protection between processes and I have found numerous areas of VxWorks to be badly implemented or downright buggy. Up through version 5.3 the malloc() implementation was absolutely horrid and suffered from severe fragmentation and performance problems. On the platform I was working with I replaced the VxWorks implementation with Doug Lea's implementation (which glibc was based off of) and our startup time dropped from an hour to 3 minutes. I was also able to easily add instrumentation so we could quickly find memory leaks or heap corruption in the field, something not possible with Wind River's implementation. After reading about the problems with the filesystem I looked at the Wind River filesystem code. It was rather ugly. They map FAT on top of flash memory (not the best choice) and the corner cases were not well handled (like a full filesystem).
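To make the instrumentation idea above concrete, here is a minimal sketch in C of a debug wrapper around the standard malloc() and free(). It is not Wind River's allocator, Doug Lea's allocator, or the code described in this comment; the names dbg_malloc, dbg_free, dbg_check and dbg_live_blocks, the guard value, and the header layout are all hypothetical. The wrapper counts outstanding blocks (a nonzero count at shutdown hints at a leak) and brackets each block with guard words so that simple overruns and underruns can be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DBG_GUARD 0xDEADBEEFu   /* arbitrary marker value (assumption) */

/* Bookkeeping stored in front of every user block. The union with
   max_align_t keeps the pointer handed to the caller suitably aligned. */
typedef union {
    struct { size_t size; uint32_t guard; } info;
    max_align_t align;
} dbg_header;

static size_t live_blocks = 0;  /* blocks allocated but not yet freed */

/* Allocate n bytes with a guard word before and after the user area. */
void *dbg_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(dbg_header) - sizeof(uint32_t)) return NULL;
    dbg_header *h = malloc(sizeof *h + n + sizeof(uint32_t));
    if (h == NULL) return NULL;
    h->info.size = n;
    h->info.guard = DBG_GUARD;
    uint32_t tail = DBG_GUARD;
    /* The trailing guard may be unaligned, so copy it byte-wise. */
    memcpy((unsigned char *)(h + 1) + n, &tail, sizeof tail);
    live_blocks++;
    return h + 1;
}

/* Return 1 if both guard words are intact, 0 if the block looks corrupted. */
int dbg_check(const void *p) {
    assert(p != NULL);
    const dbg_header *h = (const dbg_header *)p - 1;
    uint32_t tail;
    if (h->info.guard != DBG_GUARD) return 0;
    memcpy(&tail, (const unsigned char *)p + h->info.size, sizeof tail);
    return tail == DBG_GUARD;
}

/* Release a block. Returns 0 on a clean free, -1 if a guard was damaged. */
int dbg_free(void *p) {
    if (p == NULL) return 0;
    int intact = dbg_check(p);
    live_blocks--;
    free((dbg_header *)p - 1);
    return intact ? 0 : -1;
}

/* Number of blocks currently outstanding. */
size_t dbg_live_blocks(void) { return live_blocks; }
```

A field build could call dbg_check() on suspect pointers or log dbg_live_blocks() periodically. A real allocator would also need locking and a policy for what to do when corruption is found, both of which this sketch leaves out.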
Their TCP/IP stack sucked as well. If you can drop to the T-shell through a security exploit, you totally own the box (see, e.g., Huawei's poor security record).
VxWorks is fine for simple applications, but for very complex applications it sucks. At least the 5.x series does not clean up after a task if it crashes, because it does not keep track of what resources are used by a task. A task is basically just a thread of execution. All memory is a shared global pool. At the time, it did have one useful feature that was lacking in Linux: priority-inheritance mutexes. These are a requirement for proper real-time performance, and I believe they are now included in Linux.
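For readers who have not used priority-inheritance mutexes, here is a minimal sketch in C of how one can be requested through the POSIX threads API, which is the interface Linux exposes today. This is not VxWorks code; the helper name pi_mutex_init is hypothetical, while pthread_mutexattr_setprotocol and PTHREAD_PRIO_INHERIT are standard POSIX. With this protocol, a low-priority thread holding the mutex temporarily runs at the priority of the highest-priority thread blocked on it, which bounds priority inversion.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose the mutex protocol API */
#endif
#include <assert.h>
#include <errno.h>    /* ENOTSUP, for callers checking the result */
#include <pthread.h>

/* Initialise *m as a priority-inheritance mutex (hypothetical helper).
   Returns 0 on success, or the error code from the failing pthread call;
   ENOTSUP means the platform does not support the protocol. */
int pi_mutex_init(pthread_mutex_t *m) {
    assert(m != NULL);
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    /* Ask for priority inheritance instead of the default protocol. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

After a successful call, the mutex is locked and unlocked with the usual pthread_mutex_lock() and pthread_mutex_unlock(). The inheritance only matters when the threads involved run under a real-time scheduling policy with distinct priorities.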

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re:Seems like a rationalization by Animats · 2012-10-10 19:02 · Score: 5, Informative

"I get the idea of being hardened to radiation but it was my understanding we have newer processors that fit the bill on this."
Radiation-hardened processors are hard to get. For one thing, they're export-controlled, so if you make them in the US, you can't sell many. Atmel makes a rad-hard SPARC CPU, and they've sold 3000 of them. Nobody seems to have built a modern x86 design or even an ARM in a rad-hard technology.
There's a basic conflict between small gate size and radiation hardness. The smaller the transistors, the more likely a stray particle can damage or switch them. So the latest small geometries aren't as suitable. Also, the more radiation-hard processes, like Silicon on Sapphire, aren't used much for high-volume products.
As a result, rad-hard parts are an expensive niche product. It's not inherently expensive to make them, but the volume is so small that the cost per part is high.