Stress-Testing Software For Deep Space

Posted by samzenpus on Wednesday October 10, 2012 @02:24PM from the phone-the-help-desk dept.

kenekaplan writes "NASA has used VxWorks for several deep space missions, including Sojourner, Spirit, Opportunity and the Mars Reconnaissance Orbiter. When the space agency's Jet Propulsion Laboratory (JPL) needs to run stress tests or simulations for upgrades and fixes to the OS, Wind River's Mike Deliman gets the call. In a recent interview, Deliman, a senior member of the technical staff at Wind River, which is owned by Intel, gave a peek at the legacy technology under Curiosity's hood and recalled the emergency call he got when an earlier Mars mission hit a software snag after liftoff."

87 comments

Seems like a rationalization by gadzook33 · 2012-10-10 14:36 · Score: 4, Interesting

While I buy that the landing systems need an RTOS, I doubt Curiosity does. Image processing that happens with "precision"? Do x86 processors not process images precisely enough? I get the idea of being hardened to radiation but it was my understanding we have newer processors that fit the bill on this. The rest of this seems like a rationalization for using old hardware. However, as an engineer for the government it's possible I'm just old and embittered.
1. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 14:47 · Score: 2, Insightful
  
  Remember, too, that Curiosity has been in the works for almost a decade. They had to commit to a spec for the computers a long time ago, so it's no wonder by today's standards things seem out of date.
2. Re:Seems like a rationalization by Sasayaki · 2012-10-10 14:56 · Score: 5, Insightful
  
  My understanding is that the thinking goes like this.
  Sure, there are newer processors that claim to fit the bill. But space hasn't changed so much since the Apollo days that we need all new processors; by and large anything that needs "heavy lifting" CPU wise can be transmitted back to Earth. For unmanned probes, there's very little demand for high speed CPU tasks that can't be offloaded to Earth. And even if there was, when your latency back to your operator is about 14 minutes (with an extra 14 to receive further instructions, plus the time it takes to interpret the previous data set, determine new instructions, then program those instructions), that's a lot of down time to work on various tasks.
  The Mars rover CPUs, I imagine, spend the vast majority of their time idling.
  However... the old stuff works. It has its faults and flaws, sure, but they're extremely well known and documented. You can work around them. You have the old grognards that have been kicking around since Apollo who know every damn thing about them. They're risky, sure, but it's a managed, controlled, limited and understood risk. But new processors are *new*. You lose that element of certainty, and the CPU is the heart of a probe. You lose it, you're fucked.
  You're trusting the mission, a mission that costs billions of bucks, to a new, untested device that hasn't been field tested, hasn't got that certainty, and *you just don't need*.
  
  --
  Check out my sci-fi book "Lacuna" at http://goo.gl/MVxX8
3. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 15:02 · Score: 5, Informative
  
  That's why land-based projects like the SKA, which also take decades to complete, are designed with Moore's law in mind. This leads to a very funny situation: the project starts and they begin building stuff, but the computers that will run the thing are still 10 years away... (and I guess everybody just hopes computers will keep up, or else...)
  You must also take into account that the actual instruments are built fairly early, i.e. 5 or more years before launch, since there is a LOT of testing, calibration, more testing, etc. Additionally, when the stake is a billion-dollar project like these, you tend to leave out fancy new things and favor old, proven and well-documented tech. Just in case...
  If not, and you have the space and money, you just mount two instruments: a fancy new one and the old usual thing (such is the case for Solar Orbiter, for example).
4. Re:Seems like a rationalization by Required+Snark · 2012-10-10 15:11 · Score: 0, Troll
  
  Yep, you're obviously correct. Everyone at NASA is stupid, and just by looking at a summary on Slashdot you have reached a conclusion that escaped them. Windows or plain old Linux would work just fine.
  If your comment is any indication of your native intelligence I don't know how you manage to put your clothes on by yourself. It's surprising that you haven't wandered into the street and been killed by a car. (That's just my wishful thinking, by the way.)
  Any autonomous vehicle is, by definition, a real-time system. It's working in a physical environment that requires hard real-time response. If the control action is not delivered in a specified interval, it is useless. The result of missing a hard deadline is crashing. Not such a good idea on Mars. The speed of light delay time is 14 minutes one way, and it's going to get longer since the Earth and Mars are now moving apart.
  All indications imply that you are as stupid as you look. Your SIG implies that you are a knee-jerk right-wing asshole who assumes that all government activity is useless. I worked at JPL years ago, and everyone I met there was bright, creative and dedicated. There were no slackers. I doubt you would last in that environment for two pay periods. You're not smart enough.
  
  --
  Why is Snark Required?
5. Re:Seems like a rationalization by Grave · 2012-10-10 15:12 · Score: 2
  
  That's just it - this sort of computing task cannot get by with 5 9's or 7 9's or a hundred 9's of reliability. It needs to be 100% reliable, which means that every potential hiccup, flaw, or design quirk is understood and documented to the nth degree, and thus can be worked around. It also means you can reliably simulate the hardware and throw all sorts of stress testing at it.
6. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 15:16 · Score: 2, Interesting
  
  Reminds me of about 10 years ago when I was working at Motorola on cell phone base stations. We switched from VxWorks to Linux and got... nothing. No performance gains, no reliability gains. Just a free OS instead of something we had to spend money licensing.
  Of course, all the extra time spent switching and testing certainly cost a lot of money in man hours.
7. Re:Seems like a rationalization by gadzook33 · 2012-10-10 15:26 · Score: 2
  
  Oh good, someone more embittered than me. I especially like how rather than provide any sort of argument as to why an RTOS is required (because...a pedestrian might walk in front of the rover?) you'd rather insult me.
  
  Not for nothing but I'm about a step away from being a hippie and I've served the government faithfully for many years. I work with some of the best and brightest and if you weren't able to cut it there, the fact that you're incredibly negative and seem like a jerk would likely only be a few of the reasons why.
8. Re:Seems like a rationalization by gadzook33 · 2012-10-10 15:31 · Score: 1
  
  Yeah, I agree except that wasn't really how his argument goes...and yes, old stuff works. But new stuff works too (also, new here could be 5 years old). Anyway, I'm not really (or at least overly) questioning their rationale. I've just seen too many programs where the same people have been there forever and it's easier to keep doing the same thing rather than try something new. Again, hopefully that's not the case at NASA but it's sure as hell the case at the Pentagon.
9. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 15:42 · Score: 3, Informative
  
  And that's where you (and most people) are mistaken.
  An RTOS is not an OS that acts "quickly"; it's an OS which provides a 100% guarantee that a task will be executed within a definite time frame, whether that is 1 microsecond or 1 hour, and which provides guarantees about what happens if the task cannot be completed in that time frame. That is a job neither Windows nor any flavor of Linux can achieve.
10. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 16:15 · Score: 0
  
  The cpu they used cost $50,000 plus another $350,000 for the board. This is radiation hardened high grade stuff here. You put that kind of hardware in things that require hardcore components, things like spaceships and interplanetary probes. The cpu was also spec'd out a decade ago.
  Any old x86 cpu would not fit that bill, not by a long shot.
11. Re:Seems like a rationalization by TubeSteak · 2012-10-10 17:31 · Score: 2
  
  First of all, there's a typo in TFA.
  They state the chip is a "RAD760" but they link to the RAD750 wikipedia page.
  
  Do x86 processors not process images precisely enough? I get the idea of being hardened to radiation but it was my understanding we have newer processors that fit the bill on this.
  The problem with x86 technology is that it has gotten too advanced.
  The chips have become so dense that radiation hardening is much much more difficult than it used to be.
  Increased difficulty = increased expense
  Further, I don't think you appreciate the specs of that old PowerPC chip.
  Its tolerance to 1 megarad of radiation exposure is a lot.
  You literally get what you pay for with this chip, ranging from 200 rads to 1 megarad.
  Even 500 rads is more than most space applications require.
  So in order to save money, some companies use cheaper hardware in a triple redundant configuration, in order to avoid paying out big bucks for radiation hardened boards + chips. But for a mission to Mars, where reliability and power usage are critical, two old 133 MHz processors are better than any of the other choices.
  The rover has just enough processing power to talk to NASA, look around, and do one other thing. And that's just fine.
  They've partly split up the workload between two processors, but if one processor failed, NASA could juggle everything with one hand.
  
  --
  [Fuck Beta]
  o0t!
12. Re:Seems like a rationalization by khallow · 2012-10-10 17:45 · Score: 3, Insightful
  
  It's worth noting that in the overall mission they generally get by with one or two 9's of reliability. There's no 100% reliability out there and nobody would be able to afford it, if there was.
13. Re:Seems like a rationalization by Meditato · 2012-10-10 17:59 · Score: 4, Informative
  
  Look, that guy ("Required Snark") might have been an asshole, but you didn't really acquit yourself well either in your original post. I cofounded and work for a real-time telemetry contractor. We use Android, but the Linux kernel isn't built to handle real-time applications reliably. There are too many things to handle in terms of time-safe task-switching, execution, multi-processing, and internal consistency in order for it to be a good RTOS. So keeping that in mind, I had to implement a real time environment in userspace that uses root and some native code in order to collect data, send data, and operate hardware in a safe, timely manner. But this isn't the best solution because I still have to deal with the fact that it's all just a frustrating abstraction sitting on top of a kernel that isn't at all concerned with what I'm actually trying to do, despite my best efforts to single-handedly make the necessary changes.
  Your "newer processors" bit is also completely off the mark. Radiation-hardened processors lag generations behind owing to the need for extensive redesign and testing. Complicating this picture is the fact that even then, they still have varying levels of reliability and power efficiency. You don't want a processor that has a microcode architecture that makes your targeted code difficult to semantically evaluate and verify. You don't want (or need) a recent processor that hasn't had extensive real-world user testing. You want a processor in the goldilocks zone, one that you've worked with before and has a community behind it.
  Keeping that all in mind, they chose a good processor, and already had an OS largely built for it based on previous missions with earlier versions of the same processor.
14. Re:Seems like a rationalization by Electricity+Likes+Me · 2012-10-10 18:21 · Score: 2
  
  The classic example is the Pentium math error: imagine if you were 2 years into the mission and then discovered that the new high speed chip you put in gives incorrect floating point calculations.
15. Re:Seems like a rationalization by Electricity+Likes+Me · 2012-10-10 18:24 · Score: 1
  
  Yeah, I agree except that wasn't really how his argument goes...and yes, old stuff works. But new stuff works too (also, new here could be 5 years old). Anyway, I'm not really (or at least overly) questioning their rationale. I've just seen too many programs where the same people have been there forever and it's easier to keep doing the same thing rather than try something new. Again, hopefully that's not the case at NASA but it's sure as hell the case at the Pentagon.
  On the other hand, NASA really doesn't have the budget to spend working up for something new either. A processor switch means new simulators, new architectures etc. I imagine for a space probe - i.e. something you can't get at ever if it breaks down - then you go with the processor you have when you start designing it, and you pick the most reliable thing you can.
16. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 18:37 · Score: 0
  
  That's it. 640KB is enough.
17. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 18:47 · Score: 0
  
  Pedestrian crossing?
  Do you even know what a real time constraint is used for?
18. Re:Seems like a rationalization by lordholm · 2012-10-10 18:56 · Score: 3, Informative
  
  Newer missions collect too much data to transmit everything back to Earth. They typically need to do local processing of, for example, images and other data. There are also AI aspects: for the ExoMars rover (made by Europe), the onboard computer will have a virtual scientist embedded. This virtual scientist looks at the camera pictures and decides if something is worth an extra look, and may order the rover to carry out opportunistic science. I am not sure whether this is the case with Curiosity, but I could easily imagine it is. In fact, newer missions have substantial need for computational power. But there is no software reason to do these computational tasks on the main computer; the task may as well be sent to a soft realtime helper computer that may as well run Linux or something else. A lost image is typically not the end of the world.
  In many cases the spacecraft and rovers are also not hard realtime, but they are also not soft realtime either (i.e. we compute thruster response for t=0, only to have the thrusters fired at t+0.1 or something in that range, whether they fire within this time does not really matter except during docking, landing and separation), I was trying to push through the notion of firm realtime when I was working in the space sector, but the main problem with this notion is that we do not yet know what effects it has in terms of sw design. Any way...
  The primary reasons for running 10 year old CPUs is that, 1) specs are chosen early in the project, this is important as the CPU specs are guiding the development of the SW requirements and the actual implementation of the SW and 2) as you say, the older CPU will be battle tested before they are sent into deep space.
  
  --
  "Civis Europaeus sum!"
19. Re:Seems like a rationalization by Animats · 2012-10-10 19:02 · Score: 5, Informative
  
  I get the idea of being hardened to radiation but it was my understanding we have newer processors that fit the bill on this.
  Radiation-hardened processors are hard to get. For one thing, they're export-controlled, so if you make them in the US, you can't sell many. Atmel makes a rad-hard SPARC CPU, and they've sold 3000 of them. Nobody seems to have built a modern x86 design or even an ARM in a rad-hard technology.
  There's a basic conflict between small gate size and radiation hardness. The smaller the transistors, the more likely a stray particle can damage or switch them. So the latest small geometries aren't as suitable. Also, the more radiation-hard processes, like Silicon on Sapphire, aren't used much for high-volume products.
  As a result, rad-hard parts are an expensive niche product. It's not inherently expensive to make them, but the volume is so small that the cost per part is high.
20. Re:Seems like a rationalization by pedestrian+crossing · 2012-10-10 20:25 · Score: 4, Funny
  
  Pedestrian crossing?
  You rang?
  
  --
  A house divided against itself cannot stand.
21. Re:Seems like a rationalization by kauaidiver · 2012-10-10 20:35 · Score: 1
  
  Exactly, and imagine if we sent people there. It's amazing we sent people to the moon so many years ago!
22. Re:Seems like a rationalization by hackertourist · 2012-10-10 21:26 · Score: 1
  
  The RTOS may not be needed for image processing, but I'll bet it's handy when driving, or running other mechanical aspects.
  And once you have an RTOS for those tasks, it'd be silly to add another OS for the non-time-critical tasks.
23. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 21:53 · Score: 0
  
  Yes, it's amazing what can happen when a country decides that a stunt is very important and dedicates vast resources to it. Mars is a hundred times further away.
24. Re:Seems like a rationalization by Anonymous Coward · 2012-10-10 22:28 · Score: 0
  
  In the end, if you look at the numbers it's not that expensive. A big space project like Curiosity or ESA's JUICE or Planck is on the order of a billion spent over 20 years.
  Medium and small spacecraft cost from 50 to a couple of hundred million plus launch costs, also over a period of 10-20 years.
  Just compare it with other numbers you saw lately during the elections, for example... (a small satellite is one fourth of Romney's fortune, the bank bailout in '08 is 700 Curiosity rovers)
  In terms of scientific results, which I can speak of, since I am in research and not in the industry, it makes all the difference for the scientific output of a lab.
  To put it clearly, those little things we send out there are not just for fun and cute pictures to hang in our nerdy offices...
25. Re:Seems like a rationalization by jittles · 2012-10-10 22:32 · Score: 1
  
  The government LOVES old hardware. Trust me. The AH-64D uses 486 processors. You know what? They aren't the only ones, either. I used to work for a company that designed and manufactured analog and digital video surveillance systems. They are still using 486's in some of their hardware as well (key components that require an insane MTBF to comply with regulations for casinos, military installations, etc). Why? Because it runs nice and cool compared to modern processors, and it is a tried and true processor. Can you imagine launching a robot to Mars with a Pentium chip in it, only to find that Intel still hasn't gotten their floating point right in that old chipset? I'm not saying it's likely that float problems still exist in Pentium hardware, but for the cost you go with what you know works. In 10 or 20 years from now, when Ivy Bridge is the tried and true processor, you can bet that the government and many corporations will be using them in satellites and other mission critical hardware.
26. Re:Seems like a rationalization by Lumpy · 2012-10-10 22:54 · Score: 1
  
  Then as an engineer you understand why it's super stupid to have it all run off of 1 processor.
  Guidance and maneuvering is 1 processor/system. Science package another, imaging another, etc... When you can't get to it to press the reset button, you don't make the dumb mistakes made on consumer hardware, like the automotive industry does.
  Example: GM having 90% of the car run on the BCM, and Honda running the WHOLE car, including the engine, off of the single ECM. My AC quit working because of a faulty sensor shorting out the IO port on the ECM. The only fix is to replace the WHOLE ECM for the car at $2200.00
  That kind of design is only done by really really dumb engineers.
  
  --
  Do not look at laser with remaining good eye.
27. Re:Seems like a rationalization by gadzook33 · 2012-10-10 22:58 · Score: 1
  
  Yeah, I agree. In fact, I think that was the point I was trying to make (albeit unsuccessfully as it turns out).
28. Re:Seems like a rationalization by Lumpy · 2012-10-10 23:03 · Score: 1
  
  Except RTLinux.
  http://en.wikipedia.org/wiki/RTLinux
  
  --
  Do not look at laser with remaining good eye.
29. Re:Seems like a rationalization by AmiMoJo · 2012-10-10 23:53 · Score: 1
  
  TFA explains it: "VxWorks has to react immediately in order to survive while exploring Mars' surface."
  It isn't hard to imagine why this would be the case. If a sensor suddenly reports a fault you might want to react extremely quickly to prevent the rover being damaged. Say a wheel jams or something like that. Since the rover can't be repaired a great deal of caution is necessary.
  
  --
  const int one = 65536; (Silvermoon, Texture.cs)
  SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
30. Re:Seems like a rationalization by Anonymous Coward · 2012-10-11 02:45 · Score: 0
  
  I know this is a crazy observation, but why get parts that are individually hardened against radiation, rather than just putting them in a DU or lead box?
31. Re:Seems like a rationalization by DickBreath · 2012-10-11 03:18 · Score: 1
  
  That "stunt", which maybe it was, resulted in driving a lot of advances in microelectronics that led directly to the cool toys we have today. We would have gotten here without the space race, but it would have taken longer.
  
  It's funny how once money is put into developing something 'impractical', that other uses are found for it that lead to it being useful for everyone. Examples are many, but I'll just mention: GPS, Communication and Weather Satellites, The Internet and Xtube.
  
  --
  
  I'll see your senator, and I'll raise you two judges.
32. Re:Seems like a rationalization by DickBreath · 2012-10-11 03:27 · Score: 1
  
  > That kind of design is only done by really really dumb engineers.
  
  Or maybe by mid level managers?
  
  Hey, if this one component goes bad, (A) they MUST fix it because it controls so much, and (B) we make a boatload of money replacing it. Therefore, it's a great idea. Engineer promoted. Everyone happy.
  
  Oh, wait. Not everyone?
  
  --
  
  I'll see your senator, and I'll raise you two judges.
33. Re:Seems like a rationalization by mcgrew · 2012-10-11 03:53 · Score: 1
  
  Example: GM and having 90% of the car run on the BCM, and Honda running the WHOLE car including engine off of the single ECM. My AC quit working because of a faulty sensor shorting out the IO port on the ECM. only fix is to replace the WHOLE ECM for the car at $2200.00
  That kind of design is only done by really really dumb engineers.
  
  At one point I agreed with that sentiment. Hanlon's razor says don't assume malice when stupidity will explain, but mcgrew's razor says don't assume stupidity when greedy self-interest explains.
  I once remarked "if the idiots who designed cars had to actually work on them, they'd be designed better." It was pointed out to me that the automaker makes more money for their dealerships in repair when they're expensive to repair.
  Tell me, why does a car need a $2200 computer for the heater and AC when a couple of potentiometers and switches will do the same job for five bucks?
  Don't assume stupidity, the engineers are doing what they're told: Make it expensive to fix.
  
  --
  Free Martian Whores!
34. Re:Seems like a rationalization by chihowa · 2012-10-11 04:20 · Score: 2
  
  Shielding is heavy and expensive to launch (and to land softly). Then, for every extra mm of lead shielding you add, there's a more energetic photon just waiting to flip a bit. It ends up being cheaper to make radiation hardened electronics than to accommodate the extra shielding.
  
  --
  If you want a vision of the future, imagine a youtube comments section scrolling - forever.
35. Re:Seems like a rationalization by 0123456 · 2012-10-11 04:24 · Score: 1
  
  That "stunt", which maybe it was, resulted in driving a lot of advances in microelectronics that led directly to the cool toys we have today.
  No, it didn't. ICs existed before Apollo, and the primary benefit was ramping up production and pushing for improved reliability.
  
  We would have gotten here without the space race, but it would have taken longer.
  Indeed. We might still be using Core 2s. Not a big deal in the grand scheme of technology when most i5s and i3s spend most of their time idle.
  
  Examples are many, but I'll just mention: GPS, Communication and Weather Satellites, The Internet and Xtube.
  None of which have anything to do with Apollo. It was a great achievement with the technology of its time, but the 'spinoff' arguments are just bogus.
36. Re:Seems like a rationalization by Anonymous Coward · 2012-10-11 05:08 · Score: 0
  
  Actually, there is one pair of flight computers that are on Curiosity. That pair of computers has directed everything since the rocket lifted off the launch pad. They ran the flight to Mars, the ED&L, and now they run all the operations on the surface. Surface ops include coordinating the science packages as well as driving the rover, avoiding obstacles, and communicating with the orbiters and with Earth.
  There really aren't newer processors that are qualified to do deep-space work, the environment - radiation wise - calls for computers that can tolerate in excess of a mega-rad of exposure. There aren't very many of those available.
  For operating on the surface of Mars you could probably get away with a 500K-rad hard computer, but you'd still have to get it there somehow.
37. Re:Seems like a rationalization by toolie · 2012-10-11 09:05 · Score: 1
  
  The AH-64D uses 486 processors.
  You didn't even get the architecture right, much less the processor.
  
  --
  -- toolie
38. Re:Seems like a rationalization by sjames · 2012-10-11 09:26 · Score: 1
  
  How many hours in space had the new stuff logged when the design of Curiosity was completed?
39. Re:Seems like a rationalization by jittles · 2012-10-11 09:37 · Score: 1
  
  Depends on which aircraft system you're talking about. I can promise you that they have at least one 486 on board. I've dealt with the aircraft for years.
40. Re:Seems like a rationalization by Anonymous Coward · 2012-10-11 15:57 · Score: 0
  
  NEC V40, a 80188 clone with 8080 emulation, had a radiation-hardened version. http://www.amsat.org/amsat-new/satellites/satInfo.php?satID=48
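
Reply 11 above mentions using cheaper hardware in a triple redundant configuration instead of radiation-hardened parts. A minimal sketch of the idea, assuming three units each report the same machine word and a voter keeps whatever value at least two of them agree on for each bit. The function names and sample values below are hypothetical illustrations, not taken from any flight system.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three redundant copies of the same word.

    Each output bit is 1 only if at least two of the three inputs
    have a 1 in that position, so a single upset bit in any one
    copy is outvoted by the other two.
    """
    return (a & b) | (a & c) | (b & c)


def disagreeing_units(a: int, b: int, c: int) -> list:
    """Return the labels of the units whose word differs from the voted word."""
    voted = majority_vote(a, b, c)
    return [label for label, value in (("A", a), ("B", b), ("C", c))
            if value != voted]


# Hypothetical example: unit B suffers a single bit flip.
GOOD_WORD = 0b10110010
UPSET_WORD = GOOD_WORD ^ 0b00001000   # one bit flipped in unit B

VOTED = majority_vote(GOOD_WORD, UPSET_WORD, GOOD_WORD)
SUSPECTS = disagreeing_units(GOOD_WORD, UPSET_WORD, GOOD_WORD)
```

Here VOTED equals GOOD_WORD and SUSPECTS is ["B"]. The vote only masks a fault in one unit at a time; if two copies are corrupted in the same bit, the wrong value wins, which is part of why the reply treats this as a cost trade-off rather than a free replacement for hardened parts.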
"earlier Mars mission" == MER-A Spirit by Anonymous Coward · 2012-10-10 14:38 · Score: 2, Interesting

"recalled the emergency call he got when an earlier Mars mission hit a software snag after liftoff."
From TFA:

Back when Spirit Rover landed on Mars in 2004, it experienced file system problems. I got a call on landing day while I was in Southern California. I fired up my laptop and worked with three groups who were dealing with a variety of time zones: California, Japan and Mars. Since I had a RAD6000 system on my desk running simulations, by the end of the first week we figured it out and were able to fix it.
I'm glad I don't program for NASA by Press2ToContinue · 2012-10-10 14:52 · Score: 0

The last thing I would want to do is program mission-critical systems. Thank G*d my programming mistakes are hidden in the mire of a thousand other programmers' mistakes, and never make it to the front page of /.

--
Sent from my ENIAC
1. Re:I'm glad I don't program for NASA by jimmydevice · 2012-10-10 15:14 · Score: 4, Funny
  
  So, you work for Microsoft?
VxWorks has a nice track record in space by Anonymous Coward · 2012-10-10 14:56 · Score: 2, Interesting

At least one instrument running VxWorks has been flying on the ISS since 2001. I'd be surprised if it were the only one.
1. Re:VxWorks has a nice track record in space by toygeek · 2012-10-10 15:54 · Score: 0
  
  Hopefully it worked better in space than it did in WRT54G's.
  
  --
  Nobodies Prefect
  Tidbits for Techs Technology Blog
Keeping up with the kardashians... by Anonymous Coward · 2012-10-10 15:08 · Score: 0

If you can survive eight hours, you can survive *ANYTHING*...
Re:"earlier Mars mission" == MER-A Spirit by AaronW · 2012-10-10 15:20 · Score: 5, Informative

With my long experience with VxWorks this doesn't surprise me. VxWorks is not the most robust RTOS. Think of it as a multi-tasking MS-DOS. The version they used has no memory protection between processes and I have found numerous areas of VxWorks to be badly implemented or downright buggy. Up through version 5.3 the malloc() implementation was absolutely horrid and suffered from severe fragmentation and performance problems. On the platform I was working with I replaced the VxWorks implementation with Doug Lea's implementation (which glibc was based off of) and our startup time dropped from an hour to 3 minutes. I was also able to easily add instrumentation so we could quickly find memory leaks or heap corruption in the field, something not possible with Wind River's implementation. After reading about the problems with the filesystem I looked at the Wind River filesystem code. It was rather ugly. They map FAT on top of flash memory (not the best choice) and the corner cases were not well handled (like a full filesystem).
Similarly, their TCP/IP stack sucked as well. If you can drop to the T-shell through a security exploit you totally own the box (i.e. Huawei's poor security record).
VxWorks is fine for simple applications, but for very complex applications it sucks. At least the 5.x series does not clean up after a task if it crashes because it does not keep track of what resources are used by a task. A task is basically just a thread of execution. All memory is a shared global pool. At the time it did have one useful feature that was lacking in Linux: priority inheritance mutexes. These are a requirement for proper real-time performance and I believe are now included in Linux.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
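
The priority inheritance mutexes mentioned in the comment above exist to limit priority inversion, where a high-priority task waits on a lock held by a low-priority task that is itself being starved by a medium-priority task. The following is a toy tick-based simulation of that situation, not VxWorks or Linux code; the task set, the Task fields and the simulate function are all hypothetical and assume a single CPU, a single mutex, and tasks that hold the mutex for all of their work.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int     # larger number = more urgent
    arrival: int      # tick at which the task becomes ready
    work: int         # ticks of CPU time the task needs
    needs_lock: bool  # True if all of its work is done holding the mutex


def simulate(tasks, inheritance):
    """Run a fixed-priority preemptive schedule; return finish tick per task."""
    remaining = {t.name: t.work for t in tasks}
    finish = {}
    owner = None  # task currently holding the single mutex
    tick = 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks
                 if t.arrival <= tick and t.name not in finish]
        # A task is blocked if it needs the mutex and someone else holds it.
        blocked = [t for t in ready
                   if t.needs_lock and owner is not None and owner is not t]
        runnable = [t for t in ready if t not in blocked]

        def effective(t):
            # With inheritance, the lock owner runs at the priority of
            # the most urgent task it is currently blocking.
            if inheritance and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if runnable:
            current = max(runnable, key=effective)
            if current.needs_lock:
                owner = current
            remaining[current.name] -= 1
            if remaining[current.name] == 0:
                finish[current.name] = tick + 1
                if owner is current:
                    owner = None  # releasing the mutex ends any boost
        tick += 1
    return finish


TASKS = [
    Task("low", priority=1, arrival=0, work=4, needs_lock=True),
    Task("high", priority=3, arrival=1, work=2, needs_lock=True),
    Task("medium", priority=2, arrival=2, work=5, needs_lock=False),
]

WITHOUT = simulate(TASKS, inheritance=False)
WITH = simulate(TASKS, inheritance=True)
```

In this made-up task set the high-priority task finishes at tick 11 without inheritance, because the medium task preempts the lock holder, and at tick 6 with inheritance, because the lock holder is temporarily boosted and releases the mutex sooner. The total work is the same in both runs; only the order changes, which is the point for real-time behavior.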
My PVR by GrahamCox · 2012-10-10 15:21 · Score: 1

My PVR also runs VxWorks. Given that it still crashes randomly now and again, I hope they have a better version for space probes.
pshaw, we use RTEMS by Anonymous Coward · 2012-10-10 15:30 · Score: 3, Informative

the other big player in space RTOS: RTEMS.
Free, open source, rtems.org.
Has all the same problems as VxWorks.. no process memory isolation (because space flight hardware doesn't have the hardware to support it usually)....
One thing that VxWorks has that RTEMS doesn't, and I wish it did, is dynamic loading and linking of applications. You're basically back in 1960s monolithic image days, not even with overlay loaders.
1. Re:pshaw, we use RTEMS by jimmydevice · 2012-10-10 15:41 · Score: 1
  
  Why not FORTH?
  It was the go-to system for exploration satellites for years.
  I believe Voyager is running it still.
2. Re:pshaw, we use RTEMS by Hans+Lehmann · 2012-10-10 16:04 · Score: 1
  
  If a better OS came along since the start of the Voyager program, which I'm sure is true, I highly doubt that the Voyager crafts would get their disks wiped and a new OS installed, so to speak, while on their way to the edge of the solar system.
  
  --
  09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0
3. Re:pshaw, we use RTEMS by Anonymous Coward · 2012-10-10 16:34 · Score: 3, Interesting
  
  FORTH is great. From about 2 dozen core instructions an entire operating environment can be built. Unfortunately, FORTH takes in-depth knowledge of not only the hardware, but also a firm grasp of the scope of the tasks that need to be performed. Most programmers today cannot handle FORTH -- imagine building your own TCP/IP stack, filesystem, and RTOS operating environment from scratch. That talent is found only in a dying breed of programmers, literally.
  For a robust 100% reliable radiation-hardened space environment, even the processor, data paths, and memory need to be self-correcting for each data bit. Silicon-on-Sapphire processors are only the beginning of solving the reliability issues. Commercial Off The Shelf solutions are an invitation to disaster, but nobody wants to invest the time and money for proper solutions any more.
  You can thank Just In Time supply chains, quarterly corporate focus on maximizing profits, and Globalization for the current sad state of space exploration. Without a paradigm shift in attitude, there will be no more Voyagers. I know. I used to work for rocket scientists.
4. Re:pshaw, we use RTEMS by Anonymous Coward · 2012-10-11 08:16 · Score: 0
  
  RTEMS is much better overall. There is hardware flying around Mars running RTEMS, because I picked it long ago. There is no upside to a lack of memory protection when the hardware supports it.
The same VxWorks.... by gQuigs · 2012-10-10 16:35 · Score: 1, Flamebait

that is (or was?) in newer Linksys routers, that are much less stable than the older Linux based versions..
http://en.wikipedia.org/wiki/VxWorks#Networking_and_communication_components
1. Re:The same VxWorks.... by Anonymous Coward · 2012-10-10 18:57 · Score: 1
  
  That's actually the fault of the device drivers and glue code.
  Unfortunately VxWorks does have a small, well understood and deterministic core, despite various suck points. That's important in many areas of control, but it has so many cons. The buggy POS crap all over the world running VxWorks is testament to that. Once you consider the human factor in commanding the beast, why bother, because it's just going to be less reliable in the end. AFAIK you get to pay royalties too... This is truly one of the things in the world I don't understand. You'd think the world would move on to something like QNX or Green Hills... On that topic, Linux doesn't actually have my vote in this domain of control and RTOS...
Ever seen a time table for a space mission? by dutchwhizzman · 2012-10-10 18:05 · Score: 2

They start planning this years, years and years ahead. It is not uncustomary to have decided on a hardware platform five years before launch. Since there's a lot at stake for these bigger missions to succeed, they usually don't take risks and put stuff up there that hasn't proven itself. Maybe some evolution like a higher clock rate or more memory or something like that, but a new processor architecture gets tried on other things that have redundancy, lower cost or less exposure and preferably a combination of those.
I discussed some technology that could possibly go into an instrument on a weather/climate sat with the principal investigator of the then-current mission, who was named to lead the next mission as well. This was around 2007. They had to choose the technology then, so they could work on plans and get funding around now. Once they get their funding, it will still be three to five years before it goes up there. Back then, due to the reliability demands they had for the sensor and the relatively unproven state of CMOS sensors for photon capture (commonly used in digital consumer cameras in 2007), they chose to go with the previous solution, the one in the current instrument. That means that they will probably launch a pre-CMOS sensor equipped instrument around 2015, because that was the best option available to them when it was decision time.
Unless we change the way we "go to space" in a radical way, I don't see the latest and greatest tech making it into missions like this. It's up there, sure it is, but only a handful of people know it is and they don't want their precious black ops budget exposed or taken away from them. Once the statistics they get from the successes and failures (failing in secret "testing missions" once in a while is allowed) reach a rating that makes it commercially viable to sell the tech for civilian use, and the state of technology used for espionage and military purposes is such that there is no tactical threat in doing so, more modern tech will be used for missions like this.

--
I was promised a flying car. Where is my flying car?
Re:"earlier Mars mission" == MER-A Spirit by Jeremi · 2012-10-10 19:01 · Score: 4, Interesting

Up through version 5.3 the malloc() implementation was absolutely horrid and suffered from severe fragmentation and performance problems.

I talked to one of Curiosity's software engineers the day it landed... he mentioned that one of their coding rules was: no malloc() allowed.

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:"earlier Mars mission" == MER-A Spirit by Anonymous Coward · 2012-10-10 19:51 · Score: 0

While I agree with most of the sentiments on the 5.x VxWorks versions, it has to be said that VxWorks is now at version 6.9 and is a much improved beast with a far better IP stack, support for 'proper' processes, etc. That said, it comes at the cost of dropping or modifying a lot of APIs, making upgrades difficult, and to be honest it looks so much like Linux once you've finished with it that you wonder why you spent $50000 on a developer seat.
Slightly off thread I know... by Anonymous Coward · 2012-10-10 19:59 · Score: 0

Slightly off thread, but I always wondered why Intel bought Wind River. One of the issues we have is that finding someone who knows the OS well is difficult, because there is no way of getting exposure to it unless you have a lot of money.
I can't help thinking Intel has missed a trick here. With the rise of the embedded hobbyist and things like the Raspberry Pi, a new generation of engineers is learning, but their experience is based around ARM and Linux, which further marginalises Intel in the embedded world, likely to be the big growth area in the future.
If Intel were smart they would create their own hobbyist board based around an embedded Core Duo or the like and provide a free version of VxWorks to run on it. It wouldn't need some of the high-end features, but it would provide early exposure to the OS as well as raising Intel's profile in the embedded space.
Just a thought....
1. Re:Slightly off thread I know... by gnalre · 2012-10-10 20:02 · Score: 1
  
  Slightly off thread, but I always wondered why Intel bought Wind River. One of the issues we have is that finding someone who knows the OS well is difficult, because there is no way of getting exposure to it unless you have a lot of money.
  I can't help thinking Intel has missed a trick here. With the rise of the embedded hobbyist and things like the Raspberry Pi, a new generation of engineers is learning, but their experience is based around ARM and Linux, which further marginalises Intel in the embedded world, likely to be the big growth area in the future.
  If Intel were smart they would create their own hobbyist board based around an embedded Core Duo or the like and provide a free version of VxWorks to run on it. It wouldn't need some of the high-end features, but it would provide early exposure to the OS as well as raising Intel's profile in the embedded space.
  Just a thought....
  Whoops posted as AC
  
  --
  Choose your allies carefully, it is highly unlikely you will be held accountable for the actions of your enemies
2. Re:Slightly off thread I know... by tippen · 2012-10-11 00:47 · Score: 1
  
  Slightly off thread, but I always wondered why Intel bought Wind River. One of the issues we have is that finding someone who knows the OS well is difficult, because there is no way of getting exposure to it unless you have a lot of money.
  Intel has loads of cash and a near monopoly on processors in most major market segments. They need somewhere to grow, and PCs and servers aren't it. The big segments they are weak in are mobile (or more generally, low power) and networking.
  VxWorks is very common in networking equipment and in embedded (low power / low processing capability) systems.
  I can see where WindRiver looked attractive to Intel. Of course, the risk is that they scare traditional VxWorks customers off by focusing WindRiver too heavily on x86 processors.
3. Re:Slightly off thread I know... by default+luser · 2012-10-11 04:28 · Score: 1
  
  Owning VxWorks also gives Intel a way to get into military designs. These are high margin, low-volume parts just like server CPUs, so it's a lucrative market for Intel to get into.
  That said, they've only made half-assed commitments, offering just 7 years availability of embedded processors (most places do 10+ years). That works for simpler projects, but the bigger government designs may require a CPU upgrade before the finished product even ships!
  And yes, in the Windows desktop world it's no big deal to upgrade a CPU, but in the embedded world, where board support packages vary from one board to another (regardless of processor compatibility), upgrading your computer can range in difficulty from simple to incredibly complex. And since these things are always low-volume, you constantly run the risk of hitting driver/hardware bugs on a new platform, so there are lots of reasons to avoid changing the hardware powering a project as much as possible.
  
  --
  Man is the animal that laughs.
  And occasionally whores for Karma.
Wind River? by Viol8 · 2012-10-10 20:42 · Score: 1

Didn't they used to do Linux distros back in the day?
Yes , I know, off topic , but just asking...
1. Re:Wind River? by Anonymous Coward · 2012-10-10 21:40 · Score: 1
  
  Yup. Wind River Linux. I remember using it and thinking what a /big/ pile of shit it was. It was slated as an embedded target but the smallest they could shrink it was a gig or so. Their support had no idea how to shrink it, and I only shaved it down by a couple of hundred meg.
  Suffice to say, for what they were charging we were able to build our own glibc based distro with newer, more stable components and cram it in under 200M. There was even change.
Re:"earlier Mars mission" == MER-A Spirit by kauaidiver · 2012-10-10 20:43 · Score: 2

No malloc()? Interesting, I worked on a project at NG and we had the same policy. Everything was on the stack or global. We had the chance to run with MontaVista embedded Linux but someone higher up decided to go with "tried and true" VxWorks. I agree with a poster above about re-training costs and all that adding up, but if embedded Linux became standard with big companies I don't think it would take too long to make up the costs of re-training and all the other stuff that goes with it.
It demonstrate how inefficient desktop software is by Viol8 · 2012-10-10 20:56 · Score: 3, Insightful

An old PowerPC can fly a spaceship to Mars, execute a difficult landing and now semi-autonomously drive a robot across the surface of a planet tens of millions of miles away, yet it's not up to the job of writing documents using the latest word processors. What's wrong with this picture?
Secret space designs? by scsirob · 2012-10-10 21:13 · Score: 1

I find the most revealing part of the interview that he publicly acknowledges his customers working on secret designs for space.
I'm sure those customers will deny any such project exists.

--
To Terminate, or not to Terminate, that's the question - SCSIROB
Re:"earlier Mars mission" == MER-A Spirit by AaronW · 2012-10-10 22:27 · Score: 1

That is a good policy if you can do it, but in this case it was impossible. We had to use some 3rd party software which used malloc and realloc extensively. To make matters worse, for a long time we could only get obfuscated code to support the network processor we were using, meaning that it was impossible to make changes to it. We also had to make use of it because of the dynamic nature of the software. In our case it really wasn't feasible to avoid malloc. Replacing Wind River's malloc had some huge advantages. Fragmentation was horrible with the VxWorks malloc, to the point where there were many tens of thousands of fragments of memory. VxWorks used a linked list of free blocks sorted from smallest to largest. Due to the extensive dynamic reallocs, this linked list turned into a huge bottleneck.
Replacing the code with Doug Lea's malloc eliminated the fragmentation problem completely. By including the task ID and calling function's program counter in each block allocated it made it trivial to find memory leaks and keep track of how much memory and how many blocks were allocated per task or even by function.
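The tagging idea can be sketched in a few lines of C. This is only an illustration layered on the standard malloc(), not the Doug Lea integration described above; the names tagged_malloc, tagged_free and MAX_TASKS are hypothetical, the caller address is passed in explicitly (on GCC or Clang something like __builtin_return_address(0) could supply it), and the counters are not protected against concurrent tasks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TASKS 8   /* hypothetical limit for this sketch */

/* Header stored in front of every block handed out. The union with
 * max_align_t keeps the payload suitably aligned for any type. */
typedef union tag_header {
    struct {
        int         task_id;  /* task that requested the block */
        const void *caller;   /* address of the requesting code */
        size_t      size;     /* payload size in bytes */
    } info;
    max_align_t align;
} tag_header;

/* Per-task bookkeeping: blocks and bytes currently allocated. */
static size_t live_blocks[MAX_TASKS];
static size_t live_bytes[MAX_TASKS];

void *tagged_malloc(size_t size, int task_id, const void *caller)
{
    assert(task_id >= 0 && task_id < MAX_TASKS);
    tag_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->info.task_id = task_id;
    h->info.caller  = caller;
    h->info.size    = size;
    live_blocks[task_id] += 1;
    live_bytes[task_id]  += size;
    return h + 1;   /* payload starts right after the header */
}

void tagged_free(void *p)
{
    if (p == NULL)
        return;
    tag_header *h = (tag_header *)p - 1;   /* step back to the header */
    live_blocks[h->info.task_id] -= 1;
    live_bytes[h->info.task_id]  -= h->info.size;
    free(h);
}

size_t tagged_live_blocks(int task_id) { return live_blocks[task_id]; }
size_t tagged_live_bytes(int task_id)  { return live_bytes[task_id]; }
```

A task whose live block count keeps climbing is the leak suspect, and the stored caller address says which function to look at.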
There really was no good reason why VxWorks was chosen, since there were no hard real-time requirements. The product (a router and broadband remote access server) was a mess, since each box had to include a Sun UltraSPARC computer running Solaris (we required big-endian) where most of the software ran. Solaris was an even worse choice. Trying to write STREAMS drivers for Solaris was a nightmare compared to Linux drivers, especially when trying to tie into the TCP/IP stack. Not only that, Solaris was quite slow. Give me Linux any day.
The great thing about writing applications in Linux user space is that you can use tools like Valgrind to catch many of these memory leaks, uninitialized variables, etc.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re:"earlier Mars mission" == MER-A Spirit by AmiMoJo · 2012-10-10 23:28 · Score: 1

You probably shouldn't be using malloc() on an embedded system like that anyway. Statically allocate everything. That way you know exactly how much memory will be consumed at any time and can budget appropriately. It also reduces the chance of having a bug malloc() all your memory or running out of stack space.
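Where some run-time allocation is unavoidable, one common middle ground is a fixed-block pool carved out of static storage. A minimal sketch, assuming single-threaded use and made-up sizes; pool_init, pool_alloc, pool_free, BLOCK_SIZE and BLOCK_COUNT are hypothetical names, not any RTOS API:

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  64   /* bytes per block (hypothetical) */
#define BLOCK_COUNT 16   /* blocks in the pool (hypothetical) */

/* A block is either user payload or, while free, a link in the free list. */
typedef union pool_block {
    union pool_block *next;
    max_align_t       align;
    unsigned char     payload[BLOCK_SIZE];
} pool_block;

static pool_block  pool_storage[BLOCK_COUNT];  /* reserved at link time */
static pool_block *pool_free_list;
static size_t      pool_free_count;

/* Thread every block onto the free list. Call once at start-up. */
void pool_init(void)
{
    pool_free_list = NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool_storage[i].next = pool_free_list;
        pool_free_list = &pool_storage[i];
    }
    pool_free_count = BLOCK_COUNT;
}

/* Pop the head of the free list: constant time, no searching. */
void *pool_alloc(void)
{
    pool_block *b = pool_free_list;
    if (b == NULL)
        return NULL;              /* pool exhausted; caller must cope */
    pool_free_list = b->next;
    pool_free_count--;
    return b;
}

/* Push the block back onto the free list: also constant time. */
void pool_free(void *p)
{
    pool_block *b = p;
    assert(b >= pool_storage && b < pool_storage + BLOCK_COUNT);
    b->next = pool_free_list;
    pool_free_list = b;
    pool_free_count++;
}

size_t pool_available(void) { return pool_free_count; }
```

The memory budget is visible in the map file (BLOCK_SIZE * BLOCK_COUNT bytes, plus padding), and because every block is the same size the pool cannot fragment.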
VxWorks claims to have memory protection; chances are it is the CPU they are using which lacks an MMU to support it.

--
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Re:"earlier Mars mission" == MER-A Spirit by Anonymous Coward · 2012-10-10 23:51 · Score: 1

If your objective requires an RTOS, you're probably not going to malloc(). There are edge cases, but we've successfully banished them. We don't use VxWorks, thank god, but we do use a real memory machine instead of a virtual memory machine. Getting young programmers to understand that is challenging, and getting CS grads, of all fucking people, to program for a real memory machine is just fucking impossible. We make them managers instead.
Re:It demonstrate how inefficient desktop software by edcalaban · 2012-10-11 00:25 · Score: 1

I wonder what the CPU and memory load graphs would look like for a probe versus some standard desktop applications. Might explain a lot.
Re:"earlier Mars mission" == MER-A Spirit by datapharmer · 2012-10-11 00:31 · Score: 1

Priority inheritance can be handled through PI futexes. Issues due to lock contention are handled by the kernel via FUTEX_LOCK_PI.
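From application code, priority inheritance is usually requested through POSIX mutex attributes rather than raw futex calls. A minimal sketch, assuming a Linux/glibc target where PTHREAD_PRIO_INHERIT is available; pi_mutex_init is a hypothetical helper name, not part of any library:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialise *m as a mutex using the priority-inheritance protocol.
 * Returns 0 on success, or an error number such as ENOTSUP when the
 * platform cannot provide priority inheritance. */
int pi_mutex_init(pthread_mutex_t *m)
{
    assert(m != NULL);

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for priority inheritance: while a low-priority thread holds
     * the lock and a higher-priority thread waits on it, the holder is
     * boosted to the waiter's priority. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

After that the mutex is locked and unlocked with the ordinary pthread_mutex_lock() and pthread_mutex_unlock(). As far as I know, glibc builds such mutexes on the PI futex operations, so the FUTEX_LOCK_PI call only happens under contention and the application never issues it directly.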

--
Get a web developer
Re:It demonstrate how inefficient desktop software by datapharmer · 2012-10-11 00:33 · Score: 1

Your desktop word processing software also didn't have a licensing cost in the hundreds of thousands of dollars...

--
Get a web developer
Re:It demonstrate how inefficient desktop software by Viol8 · 2012-10-11 01:41 · Score: 1

I would imagine that landing a spaceship takes a lot more CPU than reformatting some text and drawing a blinking cursor.
Re:"earlier Mars mission" == MER-A Spirit by Anonymous Coward · 2012-10-11 02:00 · Score: 3, Funny

Malloc is non-deterministic. The request for a pointer to return contiguous free bytes will need to search a fragmented memory map to complete the request. The duration of the search depends upon the algorithms and the amount of fragmentation relative to the size of the request. It is worse if it must rearrange memory to accommodate the request. Thus, use of malloc() is typically avoided for time-critical code in a real-time operating system.
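The search cost is easy to see with a toy first-fit allocator. This is a sketch of the general free-list technique, not any particular malloc implementation; free_block and first_fit are hypothetical names, and the visited counter exists only to make the cost observable:

```c
#include <assert.h>
#include <stddef.h>

/* One node of a singly linked free list. */
typedef struct free_block {
    size_t             size;   /* bytes available in this free block */
    struct free_block *next;
} free_block;

/* First-fit search: return the first free block of at least `want`
 * bytes, or NULL if none exists. *visited receives the number of
 * nodes examined, which is the part that varies with fragmentation. */
free_block *first_fit(free_block *head, size_t want, size_t *visited)
{
    assert(visited != NULL);

    size_t n = 0;
    free_block *b = head;
    while (b != NULL) {
        n++;
        if (b->size >= want)
            break;              /* found a block that is big enough */
        b = b->next;
    }
    *visited = n;
    return b;
}
```

With a free list of three 16-byte fragments followed by one 256-byte block, an 8-byte request is satisfied after one node while a 200-byte request walks all four. Scale that to tens of thousands of fragments and the worst-case time for a single allocation becomes hard to bound.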
Re:It demonstrate how inefficient desktop software by Anonymous Coward · 2012-10-11 02:49 · Score: 0

Why do you think that? Landing a space ship can be done using analogue electronics as a control system in the 60s. That means it was simple and light enough even when analogue in design that it made it into space. The rate on the feedback loops doesn't have to be more than a few kHz and the amount of processing per loop is very low. More intensive than blinking a cursor yes but is it more intensive than reformatting text? Perhaps not so.
A PID controller is say 10 arithmetic operations per evaluation and only has to be evaluated at the rate of the feedback loops. No, it's not very much processing to control a spaceship landing.
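For a sense of scale, here is a textbook discrete PID step in C. It is a generic sketch, not anything taken from flight software; pid_ctrl, pid_init and pid_step are hypothetical names, and a real controller would add things like output clamping and integral anti-windup:

```c
#include <assert.h>

/* Controller gains plus the two pieces of state carried between steps. */
typedef struct {
    double kp, ki, kd;   /* proportional, integral, derivative gains */
    double integral;     /* running integral of the error */
    double prev_error;   /* error seen on the previous step */
} pid_ctrl;

void pid_init(pid_ctrl *c, double kp, double ki, double kd)
{
    c->kp = kp;
    c->ki = ki;
    c->kd = kd;
    c->integral = 0.0;
    c->prev_error = 0.0;
}

/* One controller evaluation; dt is the time since the previous step. */
double pid_step(pid_ctrl *c, double setpoint, double measured, double dt)
{
    assert(dt > 0.0);

    double error = setpoint - measured;                 /* 1 subtract */
    c->integral += error * dt;                          /* 1 multiply, 1 add */
    double derivative = (error - c->prev_error) / dt;   /* 1 subtract, 1 divide */
    c->prev_error = error;

    return c->kp * error                                /* 3 multiplies, 2 adds */
         + c->ki * c->integral
         + c->kd * derivative;
}
```

That comes to ten floating-point operations per evaluation, so even at a few kHz per loop the arithmetic is a tiny load for any CPU.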
Re:It demonstrate how inefficient desktop software by Viol8 · 2012-10-11 03:10 · Score: 1

"Landing a space ship can be done using analogue electronics as a control system in the 60s"
I don't remember any system in the 60s where a skycrane had to hover in place, lower a lander down, release it then fly off. Or navigate using image recognition. If you know otherwise fill me in.
"more intensive than reformatting text?"
Oh please. Reformatting text algorithms were running on 8 bit home computers in the 70s!
Re:It demonstrate how inefficient desktop software by Anonymous Coward · 2012-10-11 06:07 · Score: 0

Oh please. Reformatting text algorithms were running on 8 bit home computers in the 70s!
I'm sure that explains why we still don't have hyphenation in web browsers and justified text sucks. Hey, browser guys! This one has a clue! You have to use 8 bit home computers!
Re:It demonstrate how inefficient desktop software by Anonymous Coward · 2012-10-11 06:11 · Score: 0

Or navigate using image recognition.
Well, you can call anything image recognition these days, like what university students do in their robot-fight competitions (based on maybe an 8-bit luminosity sensor) or what any laser-based mouse does to detect movement across a surface. The devil is in the details, isn't it?
Re:It demonstrate how inefficient desktop software by mcgrew · 2012-10-11 07:19 · Score: 1

Your desktop word processing software also didn't have a licensing cost in the hundreds of thousands of dollars
It would if you were the only customer and it was only going to run on one computer. Do you have any idea how many programmers MS has and what it costs for salaries and other overhead?

--
Free Martian Whores!
Re:"earlier Mars mission" == MER-A Spirit by dfries · 2012-10-11 12:41 · Score: 1

It is worse if it must rearrange memory to accommodate the request.

You were going okay until here. You can't rearrange memory: malloc returns pointers, and there isn't any callback to ask for that pointer back so the block can be moved to another location.
Bytecode-compiled languages like Java can rearrange memory, but there you call new, not malloc, so I know you weren't talking about them. Garbage collection is a much bigger problem, especially if you think about mixing Java and real-time operations. C/C++ in real time means following the best practices, but for Java, get a different Java: http://en.wikipedia.org/wiki/Real_time_Java