ESA: European Mars Lander Crash Caused By 1-Second Glitch (space.com)

Posted by BeauHD on Wednesday November 23, 2016 @07:00PM from the what-went-wrong dept.

An anonymous reader quotes a report from Space.com: The European Space Agency (ESA) on Nov. 23 said its Schiaparelli lander's crash landing on Mars on Oct. 19 followed an unexplained saturation of its inertial measurement unit (IMU), which delivered bad data to the lander's computer and forced a premature release of its parachute. Polluted by the IMU data, the lander's computer apparently thought it had either already landed or was just about to land. The parachute system was released, the braking thrusters were fired only briefly and the on-ground systems were activated. Instead of being on the ground, Schiaparelli was still 2.3 miles (3.7 kilometers) above the Mars surface. It crashed, but not before delivering what ESA officials say is a wealth of data on entry into the Mars atmosphere, the functioning and release of the heat shield and the deployment of the parachute -- all of which went according to plan. In its Nov. 23 statement, ESA said the saturation reading from Schiaparelli's inertial measurement unit lasted only a second but was enough to play havoc with the navigation system. ESA said the sequence of events "has been clearly reproduced in computer simulations of the control system's response to the erroneous information." ESA's director of human spaceflight and robotic exploration, David Parker, said in a statement that ExoMars teams are still sifting through the voluminous data harvest from the Schiaparelli mission, and that an external, independent board of inquiry, now being created, would release a final report in early 2017.

110 comments

This never happened to me before... by hyades1 · 2016-11-23 19:05 · Score: 5, Funny

Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...

--
I've calculated my velocity with such exquisite precision that I have no idea where I am.
1. Re:This never happened to me before... by rsmith-mac · 2016-11-23 19:19 · Score: 5, Funny
  
  Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...
  Then you'd still be broke. This is Slashdot; you're not fooling anyone.
2. Re:This never happened to me before... by Freischutz · 2016-11-23 20:40 · Score: 2
  
  Man, if I had a nickel for every time some kind of sensory saturation forced a premature release...
  Then you'd still be broke. This is Slashdot; you're not fooling anyone.
  Visit the lair of any Slashdot poster, buried deep in the basement of his parent's house, and you will find that the height and majesty of the tissue mountain on the nightstand next to his bed thoroughly discredits your hypothesis. If you need further confirmation, shine a black light at his laptop and prepare yourself to be blinded by the glow.
3. Re:This never happened to me before... by Anonymous Coward · 2016-11-23 20:41 · Score: 1
  
  I believe he's referring to the manual override.
4. Re:This never happened to me before... by hyades1 · 2016-11-23 20:49 · Score: 1
  
  What...you never put lipstick on your hand for those special evenings?
  Fairness forces me to note that you, too, are a Slashdot denizen.
  
  --
  I've calculated my velocity with such exquisite precision that I have no idea where I am.
5. Re:This never happened to me before... by Anonymous Coward · 2016-11-23 22:29 · Score: 0
  
  But, but, under the black light, the "ejecta" will not glow but is actually dark.
  Your "shiny glow" is only from TV sitcoms ... like a plethora of stupid "police detective shows" ...
6. Re:This never happened to me before... by stealth_finger · 2016-11-23 22:44 · Score: 1
  
  Visit the lair of any Slashdot poster, buried deep in the basement of his parent's house, and you will find that the height and majesty of the tissue mountain on the nightstand next to his bed thoroughly discredits your hypothesis. If you need further confirmation, shine a black light at his laptop and prepare yourself to be blinded by the glow.
  Just because your house is like that doesn't mean everyone's is.
  
  --
  Wanna buy a shirt?
  https://www.redbubble.com/people/stealthfinger/shop?asc=u
7. Re:This never happened to me before... by silentcoder · 2016-11-23 23:17 · Score: 3, Insightful
  
  He never said there was another person involved. He's just complaining about never managing to make it to the end of the pornhub clip.
  
  --
  Unicode killed the ASCII-art *
8. Re:This never happened to me before... by Anonymous Coward · 2016-11-24 01:21 · Score: 0
  
  I wouldn't give a dime to see that...
9. Re: This never happened to me before... by Anonymous Coward · 2016-11-24 01:29 · Score: 0
  
  Not so. That stuff glows. Maybe you should see a doctor about yours.
In retrospect by Anonymous Coward · 2016-11-23 19:13 · Score: 0

They should've accounted for +-2 seconds of random delays in sensory data in their simulations.
1. Re: In retrospect by Anonymous Coward · 2016-11-23 22:32 · Score: 0
  
  I've always wondered why so many probes' control systems seem to be based on making instantaneous decisions, such as do this once this input passes X value.
  Surely it should track height position continuously, then it could know the input is spurious by the sudden shifts which are physically impossible.
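A minimal sketch of the continuous-tracking idea described above, assuming a single altitude stream and a made-up upper bound on vertical speed. The class name, the 500 m/s limit, and the hold-last-good-value policy are all hypothetical choices for illustration, not anything from the Schiaparelli flight software.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical bound: assume no descent phase exceeds this vertical speed.
MAX_DESCENT_RATE_M_S = 500.0


@dataclass
class AltitudeTracker:
    """Tracks altitude continuously and rejects physically impossible jumps."""
    last_altitude_m: Optional[float] = None
    last_time_s: Optional[float] = None
    rejected: int = 0

    def update(self, time_s: float, altitude_m: float) -> float:
        """Return the altitude to act on: the new sample if plausible, else the last good one."""
        if self.last_altitude_m is None:
            # First sample: nothing to compare against, so accept it.
            self.last_altitude_m, self.last_time_s = altitude_m, time_s
            return altitude_m
        dt = time_s - self.last_time_s
        max_change_m = MAX_DESCENT_RATE_M_S * dt
        if dt <= 0 or abs(altitude_m - self.last_altitude_m) > max_change_m:
            # Impossible shift: count it and keep the last good value.
            # last_time_s is left alone, so the allowed change widens
            # the longer the stream stays implausible.
            self.rejected += 1
            return self.last_altitude_m
        self.last_altitude_m, self.last_time_s = altitude_m, time_s
        return altitude_m
```

Feeding it 3700 m, 3650 m, then a sudden -50 m one second later would hold 3650 m and bump the rejection counter. Whether holding the last good value is the right reaction is exactly the design question raised in the replies.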
2. Re:In retrospect by TWX · 2016-11-24 05:13 · Score: 1
  
  Heh. When I heard about the crash landing I literally said to a friend of mine, "I bet the subroutine that cuts the parachute loose so it doesn't land on top of the payload detected the thump of the parachute strings going-taught, determined that meant it was on the ground, and cut the parachute." Didn't something like this happen to an American probe sent to Mars around twenty years ago?
  
  --
  Do not look into laser with remaining eye.
3. Re: In retrospect by turbidostato · 2016-11-24 10:27 · Score: 2
  
  And then, what? Should it take an "instantaneous" decision about what to do next?
4. Re:In retrospect by newcastlejon · 2016-11-24 13:49 · Score: 1
  
  ...detected the thump of the parachute strings going-taught, determined that meant it was on the ground, and cut the parachute...
  Just think about that again for a moment.
  It's "taut" by the way, but apart from that I see a bright new career in space exploration in your future.
  
  --
  If God forks the Universe every time you roll a die, he'd better have a damned good memory.
5. Re: In retrospect by Anonymous Coward · 2016-11-24 21:28 · Score: 0
  
  That's what they did in the Ariane rocket. When the first speed sensor gave an out of range value, it was disregarded as broken. When the second speed sensor gave an out of range value, it too was disregarded as broken. A few milliseconds later, all four speed sensors were marked as broken, and the computer was out of options for controlling the rocket.
  Turns out the sensors were correct, the software simply hadn't been updated for the new, more powerful rocket.
  This is what it looked like: https://www.youtube.com/watch?v=PK_yguLapgA
Cheater? by Anonymous Coward · 2016-11-23 19:27 · Score: 2, Funny

They're blaming lag?
Kalman filter by little1973 · 2016-11-23 19:29 · Score: 4, Insightful

https://en.wikipedia.org/wiki/...
How in hell did they test their Kalman filter to allow such bad data to reach the decision logic? (I assume they used one.)

--
Government cannot make man richer, but it can make him poorer. - Ludwig von Mises
1. Re:Kalman filter by gTsiros · 2016-11-23 20:14 · Score: 3, Insightful
  
  I find it more weird that *one* sensor misbehaving led to the entire mission failing.
  I have more robustness in my thrust measuring rig made of wood beams and zipties :|
  
  --
  Looking for people to chat about multicopters, coding, music. skype: gtsiros
2. Re: Kalman filter by Anonymous Coward · 2016-11-23 20:23 · Score: 0
  
  Exactly. Not very good at the lateral thinking at the ole ESA.
3. Re:Kalman filter by 0100010001010011 · 2016-11-23 20:38 · Score: 2
  
  "Check for integer overflow" is a checkbox in Simulink.
  How was this not caught on the Hardware in the Loop test benches?
  Jesus people, is this amateur hour.
4. Re: Kalman filter by Anonymous Coward · 2016-11-23 21:26 · Score: 0
  
  I bet something returned -1 on error.
5. Re:Kalman filter by Ramze · 2016-11-24 04:38 · Score: 1
  
  This. So much this. I don't know about the EU, but when NASA builds spacecraft, it tends to put in multiple redundancies where it can and add a little logic to determine if and when a sensor fails given other data. If you're going to send up a multi-million dollar craft for a project that will last months, have a backup plan for each and every thing that could possibly go wrong so long as it doesn't significantly add to the expense.
  We know that rocket scientists can fire an object into orbit and hit a spot only a few meters wide on Mars -- while the Earth and Mars are both in motion -- rotation and revolution! I bet there's even a checklist as to when each system should come online and be deployed. If a sensor says "hey, we're on the ground or very near it!" when you're still 2 or 3 miles up... and you KNOW there hasn't been enough time for descent, maybe ignore that sensor and fire the descent engines based upon other data. Maybe even include a second sensor that works on different principles -- or fire a laser at the ground and have an optical sensor figure out the altitude from the reflected light. Maybe even throw down a reverse GPS system for Mars... some planetary markers so that orbital and landing craft can triangulate their location and proximity to the ground in real time.
6. Re:Kalman filter by vtcodger · 2016-11-24 06:09 · Score: 2
  
  "How in hell did they test their Kalman filter to allow such bad data to reach the decision logic? (I assume they used one.)"
  1) A Kalman Filter probably is not really appropriate here because the parachute has just been deployed and you wouldn't have state statistics available to filter the input data. Doesn't mean they didn't use one with ad hoc statistics. That's not as uncommon as perhaps it should be.
  2) Presumably the IMU is expected to tell you the probe has run into the planet (i.e. landed) and it's time to get rid of the 'chute before it lands on your probe and also time to shut down the thrusters lest they bounce the probe across the landscape or flip it upside down. Depending on how often the IMU is read out during landing, a full second of bad data may be pretty convincingly NOT noise.
  Not that I know anything about landing Mars probes.
  
  --
  You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
7. Re:Kalman filter by Anonymous Coward · 2016-11-24 06:41 · Score: 0
  
  I find it more weird that *one* sensor misbehaving led to the entire mission failing.
  It wasn't an entire mission failure. The large majority of the mission is ExoMars, which is working fine.
  The Schiaparelli lander was supposed to be a demonstrator of ESA's ability to get a lander down. With Beagle 2, Philae, and now Schiaparelli all failing to one degree or another, ESA do need to work on their landers!
8. Re: Kalman filter by eliphalet · 2016-11-24 09:28 · Score: 1
  
  Or it's an off-by-one coding error.
9. Re:Kalman filter by RockDoctor · 2016-11-25 16:20 · Score: 1
  
  Presumably the IMU is expected to tell you the probe has run into the planet (i.e. landed) and it's time to get rid of the 'chute before it lands on your probe and also time to shut down the thrusters
  Wrong landing sequence. This spacecraft was intended to parachute down to some hundreds of metres, then fire up retro-rockets and jettison the parachutes, then descend to a few metres on the retro-rockets, then drop to the ground. So, the signal from the IMU would vary between free-fall and various substantial decelerations several times during the planned descent.
  Well, they achieved the desired state of not having the parachute land on top of the lander.
  
  --
  Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
They didn't learn the lesson by LordHighExecutioner · 2016-11-23 19:34 · Score: 4, Informative

Overflows and bad data problems happened to ESA before.
1. Re:They didn't learn the lesson by thegarbz · 2016-11-23 21:23 · Score: 5, Insightful
  
  To be fair, few people did. Multiple cases of overflows and bad data problems have occurred and still continue to occur not just in space programs around the world but in other industries too.
Teach it Phenomenology! by Bongo · 2016-11-23 19:43 · Score: 2

"Obligatory" Dark Star reference.
Ariane 5 by MichaelSmith · 2016-11-23 19:46 · Score: 1, Interesting

Brings to mind the failure of the first Ariane 5 launcher because control software spat an Ada stack trace over a line which was supposed to only contain kinematic data.

--
http://michaelsmith.id.au
1. Re:Ariane 5 by Anonymous Coward · 2016-11-23 21:07 · Score: 3, Informative
  
  > ...control software spat an Ada stack trace over a line...
  Eh, no. The failure of the INS's control software caused the INS to send diagnostic data (rather than sensor data) to the control systems, which then did what they _thought_ they were being commanded to do.
  None of the code in the system was modified in flight.
Filter or not by evanh · 2016-11-23 19:47 · Score: 4, Informative

When the altitude stops changing for a whole second the filter is going to have to be a long one! And that ain't desirable for responsive control.
The real question is how could the sensory processor have overloaded in the first place? My money is on simple code bloat. I.e., they used a bunch of generic libraries that use further libraries that use further libraries that use further libraries that use further libraries that use further libraries ...
1. Re:Filter or not by Anonymous Coward · 2016-11-23 20:07 · Score: 1
  
  Bloat itself typically just makes things consistently slow.
  To get stalling you either need buffers upon buffers put there "to make things faster" or you need a runaway process that hangs.
  To not realize you have them, you need abstractions that hide them away.
2. Re:Filter or not by Anonymous Coward · 2016-11-23 20:55 · Score: 1
  
  I seriously doubt they cobbled the software together with a bunch of generic libraries; but if they did they got what they deserved.
3. Re:Filter or not by Solandri · 2016-11-24 04:17 · Score: 4, Insightful
  
  Dynamic range. Sensors which can measure from 0 - 100 g's are not as sensitive in the 0-1 g range you may be more concerned about. So you instead opt for accelerometers which max out at 10 g's and try to deal with the periods of max acceleration in software.
  
  A more elegant solution is to use both the sensitive accelerometer and an accelerometer with a greater max threshold. That way you keep the higher max limit without giving up low-gain sensitivity. But spacecraft tend to be both weight- and budget-constrained...
  
  More troubling to me was that there wasn't some basic sanity checking going on. Like a calculation that says "3 seconds ago I was at 4 km high. Now I think I'm on the ground. Does it make sense that I could've traveled that far in that little time? No? Then the instruments saying I'm on the ground are probably wonky, and I should give other instruments a higher priority in calculating my altitude for a bit." Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
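The "calculate it several different ways and make sure they agree" habit described above could look roughly like the sketch below. The function name, the tolerance, and the idea of raising an error when no majority agrees are assumptions made for illustration. A real lander would need a fallback rather than an exception.

```python
from statistics import median
from typing import Sequence


def cross_checked_altitude(estimates_m: Sequence[float], tolerance_m: float) -> float:
    """Average the estimates that lie within tolerance of the median.

    Raises ValueError if no strict majority of estimates agrees.
    """
    if not estimates_m:
        raise ValueError("no estimates supplied")
    mid = median(estimates_m)
    agreeing = [e for e in estimates_m if abs(e - mid) <= tolerance_m]
    # Require a strict majority before trusting the result.
    if len(agreeing) * 2 <= len(estimates_m):
        raise ValueError("estimates disagree; no majority within tolerance")
    return sum(agreeing) / len(agreeing)
```

With three hypothetical estimates of 3700 m, 3680 m and -20 m and a 100 m tolerance, the outlier is dropped and the two that agree are averaged. With only two sources that disagree, the function refuses to pick one.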
4. Re:Filter or not by little1973 · 2016-11-24 07:11 · Score: 1
  
  This is not how a Kalman filter works. Even if it gets totally wrong data for one second it outputs "correct" values based on previous data.
  So, in our case the altitude output would have changed for this one second and the output values would have been quite close to the real altitude.
  
  --
  Government cannot make man richer, but it can make him poorer. - Ludwig von Mises
5. Re:Filter or not by Anonymous Coward · 2016-11-24 07:39 · Score: 0
  
  You don't seem to understand what Little 1973 is saying. The lander judged, based on faulty data, that it was much closer to the planet than it was just moments before. During the initial approach everything went right, then the saturation happened, and not only did this saturation not get flagged, but the logic that deduces its position then judged that the entire craft had suddenly skipped position to a new point it couldn't possibly have been at unless all previous data was profoundly wrong.
  This isn't a problem limited to ESA though. A long time ago I used to work with industrial control software and judging by the code other programmers write, nobody ever stops to ask the what-if questions. People make a model based on correctly working sensors. Taking into account what might happen (and what, based on my experience, will happen, 100% guaranteed) and having the system behave sensibly simply doesn't occur to programmers.
6. Re:Filter or not by Kjella · 2016-11-24 08:18 · Score: 2
  
  More troubling to me was that there wasn't some basic sanity checking going on. (...) Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
  Well it's not exactly like the lander can abort, it's do or die. So you got inconsistent or unlikely data, but what's good and what's bad? Is it a glitch, is it defective, did a misfire flip us around or put us in a spin or block the sensor? Can we salvage it or is the mission fucked no matter what? That's really the million dollar question, is there a contingency plan that could work and if so what should trigger it.
  I'm guessing that with combinatorics you'll have potentially very many possible failure modes and it's hard to find the realistic and salvageable ones while not screwing up non-sensor issues because you don't trust the data. I imagine you can spend a lot of time and resources on a few paths only to find it fails in a completely different way you didn't expect. It's a lot easier in hindsight...
  
  --
  Live today, because you never know what tomorrow brings
7. Re:Filter or not by Cochonou · 2016-11-24 10:11 · Score: 1
  
  This is not as simple as that. The inertial sensor is not outputting bogus or noisy data that can be easily discarded from previous data. It is saturating because the actual acceleration or rotation of the spacecraft is higher than any value the sensor can measure. Any integration algorithm used to compute the position of the spacecraft, including a Kalman filter or not, is going to have trouble in those conditions. Of course, there are methods to estimate what could be the correct measurement value during sensor saturation, but they are far from being a silver bullet.
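To see why saturation is different from noise, here is a small worked example under invented numbers: a true rotation rate of 1500 deg/s held for one second, read through a sensor that clips at 1000 deg/s (the limit figure is the one quoted elsewhere in this thread; the true rate and the 100 Hz sample rate are hypothetical). Plain rectangular integration stands in for whatever the real navigation code does.

```python
def integrate_rate(rates_deg_s, dt_s, limit_deg_s=None):
    """Integrate angular-rate samples into an angle.

    If limit_deg_s is given, each sample is clipped to +/- that limit first,
    mimicking a sensor that saturates.
    """
    angle_deg = 0.0
    for rate in rates_deg_s:
        if limit_deg_s is not None:
            rate = max(-limit_deg_s, min(limit_deg_s, rate))
        angle_deg += rate * dt_s
    return angle_deg


# Hypothetical profile: one second at 1500 deg/s, sampled at 100 Hz.
true_rates = [1500.0] * 100
true_angle = integrate_rate(true_rates, 0.01)
measured_angle = integrate_rate(true_rates, 0.01, limit_deg_s=1000.0)
error_deg = true_angle - measured_angle
```

Every clipped sample looks perfectly valid on its own, yet the integrated attitude ends up roughly 500 degrees off after that one second, and nothing later in the integration removes that offset.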
8. Re:Filter or not by presidenteloco · 2016-11-24 14:33 · Score: 1
  
  You want sanity checking (based on physical possibility/impossibility) on individual input data streams to the Kalman filter prior to allowing them to get into the filter's weighted averaging. If a given single measurement stream (the position measurement by integrated acceleration) is indicating impossible changes in position over various near-past time ranges, exclude the whole measurement-type from the averaging immediately.
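A rough sketch of that gating step, with a plain weighted average standing in for the Kalman filter's update (this is not a Kalman filter, just the "check each stream before it gets averaged" part). The class, the weights, the 500 m/s limit, and the latch-once-excluded policy are all assumptions for illustration.

```python
from collections import deque


class GatedStream:
    """One measurement stream with a physical-plausibility gate in front of the fusion step."""

    def __init__(self, name, weight, max_rate_m_s, history=5):
        self.name = name
        self.weight = weight
        self.max_rate_m_s = max_rate_m_s
        self.samples = deque(maxlen=history)  # (time_s, altitude_m) pairs
        self.excluded = False

    def push(self, time_s, altitude_m):
        # Compare against every retained sample, i.e. several near-past time ranges.
        for t_old, alt_old in self.samples:
            dt = time_s - t_old
            if dt > 0 and abs(altitude_m - alt_old) > self.max_rate_m_s * dt:
                self.excluded = True  # latched: the stream stays out once it fails
        self.samples.append((time_s, altitude_m))


def fuse(streams):
    """Weighted average of the latest sample from every stream that is still trusted."""
    usable = [s for s in streams if not s.excluded and s.samples]
    if not usable:
        raise RuntimeError("no trusted measurement streams left")
    total_weight = sum(s.weight for s in usable)
    return sum(s.weight * s.samples[-1][1] for s in usable) / total_weight
```

Usage would be to push each new reading into its stream and call `fuse()` on the list of streams each cycle. Whether an excluded stream should ever be readmitted is left open here.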
  
  --
  
  Where are we going and why are we in a handbasket?
9. Re:Filter or not by presidenteloco · 2016-11-24 14:35 · Score: 1
  
  To be fair, either doesn't occur to programmers, or it doesn't occur to managers to allow sufficient development and test time to consider stuff like that.
  I got my part of the lander done on time on budget. Gold star for me.
  
  --
  
  Where are we going and why are we in a handbasket?
10. Re:Filter or not by cwsumner · 2016-11-27 12:52 · Score: 1
  
  When the altitude stops changing for a whole second the filter is going to have to be a long one! And that ain't desirable for responsive control.
  The real question is how could the sensory processor have overloaded in the first place? ...
  
  ... When I heard about the crash landing I literally said to a friend of mine, "I bet the subroutine that cuts the parachute loose so it doesn't land on top of the payload detected the thump of the parachute strings going-taught, determined that meant it was on the ground, and cut the parachute." ...
  Mechanical devices can have really long "bounce" times; when the system includes a parachute and riser lines, it can easily be over a second.
  Not only was their mechanical testing lacking, their simulation software should have also picked this up. And they had a similar failure when the landing gear opened, in a previous lander.
  It sounds like they had a lot of scientists, but no engineers!
11. Re:Filter or not by strikethree · 2016-11-29 02:46 · Score: 1
  
  Same way I write my code (and spreadsheets) to calculate important numbers two, three, or sometimes even four different ways to make sure they all agree before proceeding to act on it.
  You sir, are not a typical coder. I would go so far as to say your error checking and thought processes are superior to (random very high number) 90% of the programmers I have seen. Mind you, I still think that is a bare minimum for calling yourself a programmer but each instance of proper coherency checking requires notice so that others can learn from it.
  
  --
  "Someone needs to talk to the tree of liberty about its ghoulish drinking problem." by ohnocitizen
What the? by NewtonsLaw · 2016-11-23 20:03 · Score: 5, Informative

So they didn't correlate the IMU data with ranging radar or even barometric altitude information so as to avoid this?
I know weight and volume are at a premium on such craft but a barometric sensor (even one capable of operating in Mars's rarefied atmosphere) is the size of a thumbnail and weighs just a fraction of a gram.
Sigh!
1. Re:What the? by MichaelSmith · 2016-11-23 21:17 · Score: 1
  
  You can even correlate it with your own kinematic model. The scenario which the vehicle followed is impossible. It can't land one second after dropping the parachute, and so timing alone should have made it reject the invalid data.
  
  --
  http://michaelsmith.id.au
2. Re:What the? by thegarbz · 2016-11-23 21:26 · Score: 3, Insightful
  
  even barometric altitude information
  I'm interested to know how you calibrate your barometric altitude information, and even more so what vacuum followed by a sudden atmospheric entry will do to such a sensor.
  If I'm going to take a guess I'd say no, an instrument capable of operating over that range of pressures, temperatures, vibration, etc. is not the size of a thumbnail weighing a gram.
3. Re:What the? by Ihlosi · 2016-11-24 03:08 · Score: 2
  
  I know weight and volume are at a premium on such craft but a barometric sensor (even one capable of operating in Mars's rarefied atmosphere) is the size of a thumbnail and weighs just a fraction of a gram.
  Even one that works at the velocity encountered during atmospheric entries?
  Sounds like you're suggesting putting a Pitot tube on a space probe ...
4. Re:What the? by vtcodger · 2016-11-24 06:31 · Score: 1
  
  Will a barometric sensor work properly while descending through gases emitted from thrusters that are trying to slow the vehicle?
  
  --
  You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
5. Re:What the? by RockDoctor · 2016-11-25 16:25 · Score: 1
  
  So they didn't correlate the IMU data with ranging radar or even barometric altitude information so as to avoid this?
  How do you know the barometric pressure profile before you enter the atmosphere? Mars has a trickily variable atmosphere.
  There was a large dust storm developing at the time, which is a (potentially) global event. How much does that affect barometric pressure? (On Mars, not necessarily on Earth.)
  
  --
  Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
Oops by wonkey_monkey · 2016-11-23 20:38 · Score: 4, Funny

Should've used metric seconds.

--
systemd is Roko's Basilisk.
IMU by dohzer · 2016-11-23 20:57 · Score: 1

What kind of IMUs are normally used in these craft? The same kind used in aircraft and weapons?
1. Re:IMU by Cochonou · 2016-11-24 09:13 · Score: 1
  
  A Honeywell MIMU.
2. Re:IMU by Anonymous Coward · 2016-11-29 10:36 · Score: 0
  
  No, it was a Northrop Grumman (Litton Guidance & Controls) LN-200S solid-state strapdown FOG IMU, the same one that has been running reliably on Opportunity for almost 11 years, not to mention many other spacecraft. The IMU did not fail. I believe that it sends a max-limit message, which the Kalman filter code in the CPTU ignored and treated as good data after one second. The max angular velocity for this unit is 1000 degrees/sec, so Schiaparelli's yaw rate was faster than this for some reason, such as the chute aerodynamics. But they did not test this; as the Italian Space Agency claims, "the contract to test the flight and entry into atmosphere of the module, which was worth EUR 1.1 million. ARCA, however, ran into a series of problems and cancelled the tests, ASI told Italian newspaper La Repubblica". If they had run these critical tests, they would have discovered the 1000 degree/second yaw rate limit being exceeded for more than a second and would have included an Ada precondition to ignore it. I believe they had redundant LN-200s, but they also had a set of accelerometer triads and a sun sensor, which probably could at least provide a crude yaw rate. But despite all this, as someone mentioned, there is always system time as a precondition to release of the backshield chute assembly. There probably was not enough fuel on board for a safe landing since the chute was released too early, so the fact that the thrusters operated for only 3 of the intended 30 seconds is irrelevant. It's a software error from lack of flight testing and good coding. --Slaith, Electronic Engineer w/ 24 years of aerospace experience.
This wouldn't be a problem by Anonymous Coward · 2016-11-23 21:10 · Score: 0

if they used metric seconds.
Meanwhile.. by Mascot · 2016-11-23 21:25 · Score: 1

... $1000 quadcopters back here on Earth ship with multiple IMUs for redundancy, since the bloody things are about as trustworthy as your average politician.
Having made that glib remark, I'm sure it either did have redundancy, or if it didn't that was for a good reason (e.g. risk of failure deemed too low to warrant the weight penalty in adding redundancy). I would also like to think that they're using somewhat more reliable IMUs than those found in quads.
Radar by Viol8 · 2016-11-23 21:48 · Score: 1

Yes, you have to wonder why on a mission of this expense and complexity the height above the ground is essentially done by mathematical dead reckoning. Would adding a ranging radar really have added so much to the weight and/or required package size that it was infeasible to include it? Obviously they must have considered it and I'd be interested to know why in the end it was not seen as a viable part of the solution.
1. Re:Radar by MichaelSmith · 2016-11-23 21:58 · Score: 1
  
  The article says that the radar was working. But the data from the radar seems to have been ignored at this point.
  
  --
  http://michaelsmith.id.au
2. Re:Radar by khallow · 2016-11-24 04:01 · Score: 1
  
  Yes, you have to wonder why on a mission of this expense and complexity the height above the ground is essentially done by mathematical dead reckoning.
  
  Because it works really well. The other replier, MichaelSmith indicated it had radar as well.
3. Re:Radar by ddtmm · 2016-11-24 04:14 · Score: 1
  
  The radar unit plugs in to the lander's headphone jack. Unfortunately, the headphone jack was removed on the new landers.
Re: by Anonymous Coward · 2016-11-23 21:52 · Score: 0

If only we can remove the human-factor from our experiments, we can make things perfect!
Java by Anonymous Coward · 2016-11-23 22:47 · Score: 0

I knew it was going to be a bad idea...
Abstraction is bloat by evanh · 2016-11-23 22:54 · Score: 1

Hidden malloc()'s is a good example of the bloat problem I'm referring to.
1. Re: Abstraction is bloat by Anonymous Coward · 2016-11-24 01:35 · Score: 0
  
  malloc is lightning fast. You have to watch out for hidden blocking on I/O or semaphores and stuff.
2. Re: Abstraction is bloat by maugle · 2016-11-24 09:00 · Score: 1
  
  MISRA disallows using malloc in a car's critical systems code. I imagine a Mars lander would be subject to at least that level of stringent requirements.
Single point of failure? by Visarga · 2016-11-23 22:58 · Score: 1

Why wasn't the IMU sensor doubled by other ways of detection? There was no fallback in case it malfunctioned.
State machine/discrete event control? by Ihlosi · 2016-11-23 23:21 · Score: 1

No basic sanity checks? As in "This phase must last at least X seconds", or "No switching to landing behavior if altitude measurement from 1 second ago still said '2 miles above surface'"?
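Those two checks could be sketched as a guard on phase transitions, roughly as below. The phase names, the minimum dwell times, and the 10 m landing threshold are invented for illustration and are not taken from the actual descent sequence.

```python
from enum import Enum, auto


class Phase(Enum):
    PARACHUTE = auto()
    POWERED_DESCENT = auto()
    LANDED = auto()


# Hypothetical minimum time each phase must last before it may be left.
MIN_PHASE_DURATION_S = {Phase.PARACHUTE: 60.0, Phase.POWERED_DESCENT: 20.0}
# Hypothetical ceiling on the altitude reported one second before touchdown.
MAX_ALTITUDE_FOR_LANDED_M = 10.0


def next_phase(phase, time_in_phase_s, requested, altitude_1s_ago_m):
    """Return the phase to enter, refusing transitions that fail the sanity checks."""
    if requested == phase:
        return phase
    # "This phase must last at least X seconds."
    if time_in_phase_s < MIN_PHASE_DURATION_S.get(phase, 0.0):
        return phase
    # No switching to landed behaviour if we were still high up a second ago.
    if requested is Phase.LANDED and altitude_1s_ago_m > MAX_ALTITUDE_FOR_LANDED_M:
        return phase
    return requested
```

A request to go to `LANDED` with an altitude of 3200 m one second earlier is simply refused, and the lander stays in powered descent. What the controller should do after refusing is a separate question.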
1. Re:State machine/discrete event control? by Anonymous Coward · 2016-11-24 03:42 · Score: 0
  
  They never sent it to Mars, this is all an excuse so that they don't have to bother faking any photographs or data. People are waking up to the scammers that run NASA. Mars Curiosity is on Devon Island on Earth.
  All your objections are very relevant and are yet more reasons to believe that NASA isn't doing what it says it is.
2. Re:State machine/discrete event control? by Ihlosi · 2016-11-24 03:47 · Score: 1
  
  Um ... this is about a joint mission by ESA and Roscosmos.
3. Re:State machine/discrete event control? by khallow · 2016-11-24 04:16 · Score: 1
  
  You're talking to a genuine flat earther. He did the experiments to show that the world is flat, but the Freemasons are holding him back.
What? No landing sensor? by Larsen+E+Whipsnade · 2016-11-23 23:37 · Score: 1

If the landing struts are subject to a compressive force, you've probably landed. If not, you haven't. Why wouldn't the computer make use of this?

Am I missing something, or is this a stupid design?
1. Re:What? No landing sensor? by Xolotl · 2016-11-24 00:55 · Score: 1
  
  It was supposed to have released the parachute and made the last part of the descent using retro-rockets for final braking, and only then would you get compression of the struts. In this case it released the parachute at 3+km rather than a few tens of metres ... oops.
Calculations were prescient by tomhath · 2016-11-24 00:04 · Score: 2

FTFA:

"[T]he erroneous information generated an estimated altitude that was negative," ESA said.
Which resulted in an actual altitude that was negative.
1. Re:Calculations were prescient by Anonymous Coward · 2016-11-24 00:28 · Score: 0
  
  Not my Programmers!
2. Re:Calculations were prescient by Anonymous Coward · 2016-11-25 06:10 · Score: 0
  
  Man, you've really got a bad altitude.
The Martians' gravity weapon test worked! by msk · 2016-11-24 00:30 · Score: 1

A brief burst was enough.
Apollo 11 overcame similar problem by Anonymous Coward · 2016-11-24 01:04 · Score: 0

Margaret Hamilton (who just received the Presidential Medal of Freedom) and J. Halcombe Laning's clever software design overcame such an event on the Apollo 11 Moon landing. From Wikipedia:
In one of the critical moments of the Apollo 11 mission, the Apollo Guidance Computer together with the on-board flight software averted an abort of the landing on the Moon. Three minutes before the Lunar lander reached the Moon's surface, several computer alarms were triggered. The computer was overloaded with interrupts caused by incorrectly phased power supplied to the lander's rendezvous radar.[18][19][5] The program alarms indicated "executive overflows", meaning the guidance computer could not complete all of its tasks in real time and had to postpone some of them.[20] The asynchronous executive designed by J. Halcombe Laning [18][21] allowed the computer to cope with the increased demand by prioritizing tasks. Hamilton's priority alarm displays interrupted the astronauts' normal displays to warn them that there was an emergency “giving the astronauts a go/no go decision (to land or not to land)”.[22] Jack Garman, a NASA computer engineer in mission control, recognized the meaning of the errors that were presented to the astronauts by the priority displays and shouted, "Go, go!" And on they went.
Sounds like a significant design flaw by Anonymous Coward · 2016-11-24 01:08 · Score: 0

Generally such mission-critical systems include redundancies and fail-safes. In this case the probe should have looked at its mission profile and said "wait a minute, I'm not supposed to be this low yet?" and taken a look at its other systems (radar, LiDAR, etc.) to confirm/dispute the location being provided by the IMU. Putting all of your eggs in one basket (the IMU) was about as daft as a car that loses all steering/braking if its engine stalls.
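For what it's worth, the kind of cross-check described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `DescentState` fields, the `PROFILE_FLOOR` table and its numbers, and the 500 m radar tolerance are hypothetical and have nothing to do with the actual ExoMars software.

```python
from dataclasses import dataclass


@dataclass
class DescentState:
    imu_altitude_m: float      # altitude estimate derived from IMU integration
    radar_altitude_m: float    # altitude reported by the radar altimeter
    time_since_entry_s: float  # elapsed time since atmospheric entry


# Hypothetical mission-profile envelope: (seconds since entry, lowest plausible altitude in m).
# Placeholder numbers for illustration only.
PROFILE_FLOOR = [(0.0, 100_000.0), (200.0, 8_000.0), (240.0, 1_000.0), (300.0, 0.0)]


def min_plausible_altitude(t: float) -> float:
    """Linearly interpolate the lowest altitude the profile allows at time t."""
    if t <= PROFILE_FLOOR[0][0]:
        return PROFILE_FLOOR[0][1]
    for (t0, a0), (t1, a1) in zip(PROFILE_FLOOR, PROFILE_FLOOR[1:]):
        if t <= t1:
            return a0 + (a1 - a0) * (t - t0) / (t1 - t0)
    return PROFILE_FLOOR[-1][1]


def imu_altitude_is_plausible(state: DescentState, radar_tolerance_m: float = 500.0) -> bool:
    """Reject the IMU altitude if it is below the profile floor or far from the radar."""
    below_profile = state.imu_altitude_m < min_plausible_altitude(state.time_since_entry_s)
    disagrees_with_radar = abs(state.imu_altitude_m - state.radar_altitude_m) > radar_tolerance_m
    return not (below_profile or disagrees_with_radar)
```

A negative IMU altitude while the radar still reports several kilometres would fail both tests here. What the lander should then do about it is the hard part, and this sketch does not address it.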
Sure NASA, we believe you... by Anonymous Coward · 2016-11-24 01:18 · Score: 0

They have never been to Mars. Mars Curiosity is on Devon Island.
Open Source? by Midnight+Thunder · 2016-11-24 01:19 · Score: 2

I wonder whether making the source code of these probes available to the public, for vetting would help spot bugs like these? I am also curious whether releasing the code would be problematic for any reason?

--
Jumpstart the tartan drive.
1. Re:Open Source? by pr100 · 2016-11-24 02:08 · Score: 1
  
  I wonder whether making the source code of these probes available to the public, for vetting would help spot bugs like these? I am also curious whether releasing the code would be problematic for any reason?
  Dunno. But I suppose it might be that the code is written by a contractor and they hope to make money out of the code in other contexts.
2. Re:Open Source? by volodymyrbiryuk · 2016-11-24 04:59 · Score: 1
  
  Very unlikely. They didn't do it in the past. However, this story will make another good anecdote for a software engineering lecture.
  
  --
  sudo rm -r -f --no-preserve-root /
3. Re:Open Source? by Anonymous Coward · 2016-11-24 06:33 · Score: 0
  
  They probably used formal methods but failed to use extensive simulation due to lack of time or access. It would be embarrassing if they opened the source and some person on the Internet found the bug in question by using a static analyzer.. :)
4. Re:Open Source? by DerekLyons · 2016-11-24 14:38 · Score: 1
  
  I wonder whether making the source code of these probes available to the public, for vetting would help spot bugs like these?
  Probably not, as the public wouldn't spend the months needed to study the hardware and interface specifications needed to understand what's going on in the software. Seriously, this is a tightly integrated system not a standalone program - without understanding the system, you can't tell a bug from working as intended.
5. Re:Open Source? by Anonymous Coward · 2016-11-24 21:54 · Score: 0
  
  I'd use it for my own mission to Mars.
All you flight software noobies.. by Anonymous Coward · 2016-11-24 03:19 · Score: 1

Lots of just plain old ignorant comments here. I say this in a nonpejorative sense - if you've not worked on flight software, there's no way you could know.
1) Space is unforgiving, hardware designs change very, very slowly. Project schedules move fast and have limited budgets. Just because you can buy a MEMS based IMU for your quadcopter does not mean that you can get one for a spacecraft that will work reliably from -40 to +80C, withstand the vibe tests, the pyroshock, etc. Oh, yeah, and it (and the surrounding electronics) has to not suffer any ill effects from a stray high energy particle: Single Event Effects, generally, but Latch-up, gate rupture, etc. are the issues one would worry about - bit flips and SEFI are something software can potentially handle, but destructive latchup is, well, destructive. Parts made for consumer or automotive high-rel applications just don't worry about this kind of thing. Sure, you could test your brand new whiz-bang IMU in an accelerator, but that costs money and schedule.
2) Yes, not range checking or reasonableness checking the data from the IMU was a definite design problem. Or, maybe, the entry descent and landing algorithm was such that if you DID get a hiccup and detect it, the missing data means all is lost anyway, so why bother. There's a basic principle in spacecraft design that you don't add hardware or software (which might fail, and costs time, money, and maybe mass or power) to give you information which you cannot use.
3) With respect to the landing radar - it's very possible that it has poor accuracy up high and is only used in the final descent stages.
4) Fault handling is tricky - you can easily go down a rat maze of low probability events generating code (and hardware) to handle obscure corner cases, thereby increasing your test costs and time, and potentially introducing other faults. For a lot of plausible error scenarios, it's likely you're going to fail for other reasons, so there's no point in trying to do things like estimate state from other sensors.
Summary - Entry, Descent, and Landing on another planet is really, really hard. It's even harder when you have time and budget constraints.
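To make point 2 concrete, a range/saturation check on incoming rate samples could be as small as the sketch below. The full-scale value, the margin, and the function names are all hypothetical, chosen only for illustration. The sketch deliberately stops at detection, since deciding what to do afterwards is the trade-off points 2 and 4 are about.

```python
import math

FULL_SCALE_DPS = 300.0    # hypothetical sensor full-scale range, in deg/s
SATURATION_MARGIN = 0.98  # treat readings within 2% of full scale as saturated


def classify_sample(rate_dps: float) -> str:
    """Classify one angular-rate sample as 'ok', 'saturated', or 'invalid'."""
    if not math.isfinite(rate_dps):
        return "invalid"  # NaN or infinity
    if abs(rate_dps) >= SATURATION_MARGIN * FULL_SCALE_DPS:
        return "saturated"
    return "ok"


def longest_saturated_run(samples) -> int:
    """Return the length of the longest run of consecutive saturated samples."""
    longest = current = 0
    for sample in samples:
        if classify_sample(sample) == "saturated":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest
```

Comparing `longest_saturated_run` against an expected maximum (a brief spike at parachute deployment versus a full second of saturation) is one way a downstream consumer could decide whether to keep integrating the data.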
1. Re:All you flight software noobies.. by presidenteloco · 2016-11-24 14:51 · Score: 1
  
  It's really really hard, granted.
  In my experience in the systems engineering industry, there was rarely any re-use of design or code from one project to the next similar project. Silo-ism and misaligned incentives.
  Imagine if the reliability of this kind of EDL system and its software could be improved by evolution where different space agencies and subcontractors shared and re-used their ideas for improving solutions to the complex problem.
  Imagine all the landers... living for today.
  
  --
  
  Where are we going and why are we in a handbasket?
2. Re:All you flight software noobies.. by cwsumner · 2016-11-27 13:11 · Score: 1
  
  ...
  4) Fault handling is tricky - you can easily go down a rat maze of low probability events generating code (and hardware) to handle obscure corner cases, thereby increasing your test costs and time, and potentially introducing other faults. For a lot of plausible error scenarios, it's likely you're going to fail for other reasons, so there's no point in trying to do things like estimate state from other sensors. ...
  That's true, but it can also encourage a habit of laziness in the designs. And an acceleration spike when the parachute opens or the landing gear locks is not something that has low probability. It sounds like a lot of "not my job".
Have you considered that it was... hacked? by Anonymous Coward · 2016-11-24 03:25 · Score: 0

Was someone inside your systems where the firmware and other software was being developed? Did the vehicle only accept encrypted and signed software updates sent back from ESA, or was it possible for a malicious actor to compromise the vehicle with different software?
Don't think for a second that there's not a space-race going on, and that sabotage of this kind is unthinkable. This is a genuinely concerning question that must be asked.
no backup by khallow · 2016-11-24 04:34 · Score: 1

The Schiaparelli EDM lander is an example of the typical one-off missions that humanity does these days. It's worth noting that they could have built and launched two or more of these vehicles, with each additional one costing much less than the first, and already be correcting the erroneous code on a second spacecraft. Then they wouldn't have to wait years for a replacement mission and would have a much better chance of mission success.
1. Re:no backup by vtcodger · 2016-11-24 06:47 · Score: 1
  
  Could be wrong, but I believe there are follow-up missions and the Schiaparelli probe was intended as a great-if-everything-works-but-if-it-doesn't-we'll-probably-learn-a-lot proof-of-concept mission.
  
  --
  You can't see ANYTHING from a car, You've got to get out of the goddamned contraption and walk...Edward Abbey
2. Re: no backup by Anonymous Coward · 2016-11-24 07:51 · Score: 0
  
  Look, it was just an experiment and nobody who really matters in the EU (Germany and to some extent France) really wanted an Italian lander on Mars.
3. Re: no backup by Anonymous Coward · 2016-11-24 08:23 · Score: 0
  
  Look dude, Mars needs pizza. If we are going to send humans there, we have to have pizza waiting for them, it's a long trip and they will be hungry!
  The secret mission of the Schiaparelli probe was to install and make ready the pizza ovens for the manned missions to Mars.
4. Re: no backup by Anonymous Coward · 2016-11-24 08:40 · Score: 0
  
  Except that the lander wasn't Italian, only the name was.
  It was managed and funded by ESA, whose director happens to be German ( https://en.wikipedia.org/wiki/... ), just like most of the staff. Which once again proves what a worldwide laughing stock Germany is, and that you shouldn't put a German in command of any relevant organization, after all it's a country that has been ridiculed, defeated and leveled down in any war it has ever been part of.
5. Re: no backup by Anonymous Coward · 2016-11-24 13:11 · Score: 0
  
  who really matters in the EU (Germany...)
  "Matters" and "Germany" are two words that simply cannot be in the same sentence. Germany is a meaningless, ugly country that has been authorized to survive by Russia and the US after being ridiculously defeated in both the world wars, occupied, divided into 4 parts, and disarmed. No nukes, no geopolitical standing, Putin could turn it into a radioactive cemetery in a few seconds, just for fun.
  And the EU is basically a geopolitical corpse, the European political parties that are winning the elections are those who want it to die.
6. Re:no backup by Anonymous Coward · 2016-11-24 16:11 · Score: 0
  
  The Schiaparelli EDM lander is an example of the typical one-off missions that humanity does these days.
  Indeed. It's a result of the developed world's attitude these days, the attitude of being highly skeptical at best for any big collective action, especially ones with government involvement.
  Getting people on board for anything long term will be met with huge resistance, so governments settle for these one time projects.
7. Re:no backup by khallow · 2016-11-25 02:56 · Score: 1
  
  It's a result of the developed world's attitude these days, the attitude of being highly skeptical at best for any big collective action, especially ones with government involvement.
  Stuff like this gives them just cause to be skeptical. Funny how the people who are right are the ones getting blamed here.
8. Re: no backup by mcswell · 2016-11-25 15:45 · Score: 1
  
  I thought all the space pizza was on Io? http://www.space.com/18272-jup...
Divide By Zero by Anonymous Coward · 2016-11-24 07:46 · Score: 0

-1.#INF -1.#INF -1.#INF -1.#INF
yay, we're on the surface! Deploy the chutes!
FUUUUUUUUUUUUUUUUUUUUCKKKK!!
Not gonna be open source by Anonymous Coward · 2016-11-24 08:14 · Score: 1

Software to land a probe on mars is quite similar, if not identical, to software to put a (nuclear) warhead on a target. That's an important strategic capability for "first world" nations - otherwise you're in the category of Saddam firing Scuds, which are basically V2s with newer parts, and quite literally cannot hit the broad side of a barn (albeit from 100 km away).
So, the hard parts of solving the problem (after you've done the basic college physics part) are likely to not be open source. Things like handling the rapidly changing aerodynamic effects at hypersonic speeds are a long way from "considering air as an incompressible fluid": Your state estimator has inputs with wildly varying magnitudes (dynamic range is huge compared to, say, a quadcopter), with equally wildly varying uncertainties. The transitions as you go through various deployments are also something that does not have a lot of commonality with other applications: airplanes do have deployments (landing gear, flaps, stores), quads and hexes do not; and the deployment of flaps on a plane occurs in a fairly narrow, and well understood, set of dynamic conditions.
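As a toy illustration of what "inputs with varying uncertainties" means for an estimator, here is the textbook inverse-variance weighting of independent measurements. It is a generic sketch, not anything taken from an actual EDL system, and the function name is made up for the example. A real estimator would be a full filter with dynamics, not a one-shot average.

```python
def inverse_variance_fuse(measurements):
    """Fuse independent measurements given as (value, variance) pairs.

    Each measurement is weighted by 1/variance, so a noisy input counts for less.
    Returns (fused_value, fused_variance). Variances must be positive.
    """
    weights = [1.0 / variance for _, variance in measurements]
    total_weight = sum(weights)
    fused = sum(w * value for w, (value, _) in zip(weights, measurements)) / total_weight
    return fused, 1.0 / total_weight
```

With two inputs of equal variance this reduces to a plain average. When one input's variance grows by orders of magnitude, as it can across a deployment event, its influence on the result shrinks accordingly.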
It was only a functional example by evanh · 2016-11-24 11:20 · Score: 1

This is not a bugginess issue.
Point is the layers create bloat. Any hidden dynamic memory allocations that occur, by whatever system call, are just one more part of the bloat.
1. Re: It was only a functional example by maugle · 2016-11-25 10:47 · Score: 1
  
  Ah, yes, good point. One nested function call too many could exhaust available memory just as easily as malloc.
Off by one by Anonymous Coward · 2016-11-24 12:06 · Score: 0

Fucking off by one errors!
Sanity checking code by presidenteloco · 2016-11-24 14:21 · Score: 2

So say they're doing some kind of weighted average of an altitude computation from the inertial navigation unit and an altitude computation from the doppler radar altimeter.
They should have some code in there saying: If these two values that we're averaging are wildly off from each other, let's not take the average. Instead, let's go into some exception handling code which uses some kind of heuristic (and a little time perhaps) to determine which of the two instruments should become the solely trusted source of the altitude value.
Sounds like a lack of hazard analysis / fault tree analysis and/or fault-tolerant design in the design process.
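A minimal sketch of that idea, with loud caveats: the function name, the 200 m disagreement threshold, and the "closest to the predicted altitude wins" heuristic are all assumptions invented for this example. Nothing here is known about how the real lander combined its sensors.

```python
def fuse_altitude(imu_alt_m, radar_alt_m, predicted_alt_m,
                  imu_weight=0.5, max_disagreement_m=200.0):
    """Return (altitude_m, source) from two altitude estimates.

    If the estimates agree to within max_disagreement_m, return their weighted
    average. Otherwise skip the average and trust whichever source is closer to
    predicted_alt_m, e.g. a value propagated from the last accepted state.
    """
    if abs(imu_alt_m - radar_alt_m) <= max_disagreement_m:
        fused = imu_weight * imu_alt_m + (1.0 - imu_weight) * radar_alt_m
        return fused, "average"
    # Exception path: the two instruments are wildly off from each other.
    if abs(imu_alt_m - predicted_alt_m) <= abs(radar_alt_m - predicted_alt_m):
        return imu_alt_m, "imu"
    return radar_alt_m, "radar"
```

For example, `fuse_altitude(-50.0, 3700.0, 3720.0)` returns the radar value rather than averaging a negative altitude into the result. Whether a propagated prediction is itself trustworthy after a sensor glitch is a separate question this sketch ignores.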

--

Where are we going and why are we in a handbasket?
1. Re:Sanity checking code by lucien86 · 2016-11-25 23:50 · Score: 1
  
  It's basic engineering. You need to consider and have plans for as many worst-case scenarios as possible, while at the same time maintaining a good position vis-a-vis spurious fault events. So agree with you totally..
  
  --
  Below the speed of light Special Relativity is one of the most accurate theories in physics - above the speed of light..
Java? by Anonymous Coward · 2016-11-24 16:07 · Score: 0

Interesting. Was Java garbage collecting?
Fat finger? by Anonymous Coward · 2016-11-24 19:29 · Score: 0

Guess they should have used a uint32_t rather than an int32_t. I would like to think that the system should be able to handle a flood of data even if the data were right. There is not a lot that you can do with a flood of incorrect data.
Exactly - for most errors, the mission's a goner by Anonymous Coward · 2016-11-25 10:32 · Score: 0

You've hit the nail on the head for flight systems development - for the vast majority of detectable errors, there's no reasonable recovery possible, so why bother checking, especially for low probability errors. The chance of introducing some new problem from the code you're adding for the check/recovery is probably greater than the event you're checking for. Add that to the non-zero time/budget to rigorously define, code and test that corner case.
If you're running a real-time system with a time-critical calculation, and your math check throws an exception, what are you going to do? Substitute the value from the last time tick? Or are you going to write a more complex control system which can deal with missing data? That then has to be tested, debugged, etc.
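For reference, the "substitute the value from the last time tick" option is about this much code. This is a sketch with a hypothetical class name and a made-up limit on consecutive substitutions. The extra state and the extra paths to test are exactly the cost being described above.

```python
import math


class HoldLastValue:
    """Replace non-finite samples with the last good one, up to max_holds times in a row."""

    def __init__(self, initial: float, max_holds: int = 3):
        self.last_good = initial
        self.max_holds = max_holds  # hypothetical limit on consecutive substitutions
        self.holds = 0

    def update(self, sample: float):
        """Return (value_to_use, healthy) for this time tick."""
        if math.isfinite(sample):
            self.last_good = sample
            self.holds = 0
            return sample, True
        # Bad sample: reuse the previous value and count the substitution.
        self.holds += 1
        return self.last_good, self.holds <= self.max_holds
```

Once `healthy` goes false the caller still has to decide what to do, which is the original question all over again.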