Mars Rovers Have Incorrect Instruments Installed
Christopher Reimer writes "The New Scientist is reporting that the twin Mars rovers, Opportunity and Spirit, have instruments installed in the wrong rovers. From the article: 'While the bungle does not undermine the main scientific conclusions drawn from the data collected by the rovers, it is an embarrassing slip-up for a space agency that once lost a Mars spacecraft because engineers mixed up metric and imperial units.'"
It annoys me that so much is made of this problem. This in no way compares to the lost spacecraft error; it's simply a calibration adjustment to a sensor. I think the fact that they have two rovers that have performed extremely well under harsh conditions 4x over their rated life is an incredible accomplishment. This just sounds like someone looking for sensationalism in a non-issue.
Why is it hard to support them when they're in the middle of a hugely successful Mars mission?
No one outside the community even noticed this until recently, and in the end it really made no difference. So where's the beef?
While the lead scientist says that it wasn't a big deal and no investigation will be held, I think he isn't analyzing the significance of this event. While scientists are more focused on the validity of data, engineers have to analyze not just events that occur (like loss of a rover), but also events that could occur. Putting the wrong instrument into a rover is due to "failure to follow procedure". This is a big deal. Failure to follow procedures could have been caught by a better QA system, better monitoring of the installation, and better training (including walkthroughs on the installation of the instruments).
Even though this minor event has had no impact on the mission, it has shown that there are holes in JPL's QA system, their monitoring system, and their training program for building these rovers. If you want to dig further you might find that all of these problems were caused by an unnecessary sense of urgency which may have been caused by poor project planning. These exact problems have caused the loss of spacecraft before (and many of them were cited for the loss of Challenger and Columbia).
No investigation? The lead scientist really needs to take a look at his project management priorities. Having worked in nuclear power, I have learned and been trained that small problems are often the only symptoms of much larger problems. The lead scientist's attitude on the problem gives me no confidence in his ability to run a more complicated mission. Like in gambling, one or two successes don't mean that you are going to win on the next roll.
Suddenly, the hairy finger of a familiar monkey tapped me on the shoulder. It was time.--G. T.
"There was a point when both of them were sitting on the same bench, and that has to have been it."
Wouldn't they have been labeled? What does this have to do with anything?
This instrumentation calibration error issue does not surprise me, and if it were work hours I'd be making a couple phone calls to bolster my own guess at the root cause.
There reason the MAJORITY of recent mars missions failed is gender and race bias in hiring and promotion against whites and asians.
Vital FACT! Nasa switched to forced female hiring in most of the recent Mars failures.
For the first time ever ONLY WOMEN called the shots on the largest mars mission that failed. read
http://www.nytimes.com/library/national/science/ 04 1899nasa-women.html
for the first time ever all three KEY positions of the failed mars missions were female
Sarah A. Gavit = the mars project manager
Suzanne E. Smrekar, 37, the lead mars scientist
Kari A. Lewis= the mars project's chief engineer
Current hiring rules from the new top level NASA female administration dictate this new female forced hiring policy.
NASA has hiring policies that try to hire women DESPITE IQ or experience. In fact they now PREVENT job related award honors and bonuses based on how many females you hire and how many females and black contractors you hire!!! This is a fact!
NASA publicly has stated this from the woman in charge. I can't tell you about my own memos.
NASA is proud to boast 2% female active engineers minimum and that is WAY out of whack with societies norms.
The mars missions are even more than 2% female.
The average IQ of a Caucasian US Male holding a medical degree is IQ 124, but as the front page of the San Jose Mercury proclaimed in huge block letter headlines, and millions of IQ scores show (see the Bell Curve book data), the chance of a FEMALE obtaining a test score of 124 is EIGHT TIMES LESS LIKELY than an equivalent male. EIGHT TIMES LESS LIKELY. Conversely very low IQ people are almost always males. The average IQ is the same for both genders 100, but the IQ distribution bell curves are dramatically different shapes.
NASA boasts a female-minority web site documenting how not only are contractors hired by whether or not they are female or black but what state their small companies reside in! NASA apparently requires all 50 states to have minority participation in parts design and supply for the mars missions! REGARDLESS of competence! Sex and race are the prime criteria for 1999. Check out NASA own detailed list of female and minority small contractors at : http://sbir.nasa.gov. SBIR is a euphemistic acronym for small business innovation research, but as you can easily see it is actually a gender and race quota based system spearheaded by the new women helping to run NASA now.
from the female mars leader
"Women have really added to the workplace because we do come at things from a different angle," she said.
"For the same reason that cultural diversity works, gender diversity is wonderful, too, especially when you're trying to do something creative."
Also from the female mars leader Gavit:
"The fact that we're women hasn't made a difference," she said. "It's not an issue here. But it's good that young girls see that engineering and technical fields are wide open to women. That's the good thing about saying it's a woman-led team."
The report in The Guardian (British) December 7th a couple years ago included the following comment: "The total launch and development costs of NASA's lost Mars spacecraft is put at $320 million.
Forced hiring of women disregarding IQ score or talent created this staggering $320 million loss and many more female related losses are already in the works.
Kennedy Space Center rents out IMAX II theaters for a wizbang "Take Our Daughters To Work Day" the recent theme was about how the shuttle is now COMMANDED by a female and last years motto was "The Future is Me".
Even study grants awarded from NASA are targeted to females now at expense of males : refer to Federal Re
So, let me get this straight: NASA has managed to successfully send two completely functional rovers to the planet Mars 45 million miles away. Since they arrived, the two rovers have expanded our understanding of the planet greatly and have had few and mostly correctable errors. They are now way, way past their expected mission time and are still running, and a few people around here have the nerve to bash NASA for their horrible, numerous mistakes?
This stuff isn't easy. Just because you reap the benefits of the entire space program from your living room couch via the TV without actually contributing one bit does not mean you have any understanding of how complex and spectacular these great accomplishments are.
To the NASA / JPL engineers and scientists: Thanks.
See, that's what I thought happened at first. I assumed it was something like one had an X-Ray detector while the other had a mass-spectrometer or something (I would think NASA could tell the difference between a drill and a spectroscope). It was nothing of the sort, they got the calibration files mixed up between the rovers (technically the rovers mixed up between the calibration files, but it's the same end result).
This isn't journalism, this is headline mongering. Especially throwing in that metric/imperial thing. This would be journalism if it was "NASA discovers error in rover calibrations, corrects data". Since they have all the raw data they just stick it back through the computer and it's like it never happened.
Instead they try to make the public think NASA screwed up big again, like where one rover was supposed to have a camera and the other some kind of gas meter and they swapped 'em.
You can argue about whether there is bias in the media (and whether it's liberal or conservative), but the BIGGEST problem is crap like this. Why report the good stuff ("US troops build new school in Iraq despite RPG fire") when you can report just the bad ("US troops attacked by RPG fire"). The former spokesman (he was temporary, can't remember name or title) for the Bush administration recently said that this was what he thought was wrong with the media in this country first and foremost, and I agree. I just wish whoever submitted this to /. had found a less sensationalistic source to link to rather than promoting this kind of crap.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
Then ask yourself how many times identical twins that you've known managed to play some trick on you.
And can we tone down the headline sensationalism a bit? You'd think the rovers have a core drill where there should be a camera or something. They somehow managed to switch two spectrometers, as identical as modern metallurgy can make them, destined for two similarly identical rovers - and now the error's been uncovered and the data recomputed. Jeesh...
Right, a Post-It. On a spacecraft to Mars? These are highly sensitive one of a kind instruments. You don't just go sticking paper and glue all over it.
Post-Its are not static dissipative. You could have a static discharge damage components and you wouldn't even know until the rover had landed on Mars. You could accidentally leave a Post-It on the spacecraft and cause damage. How do you know residue from the glue on the Post-It won't cause damage? Now you have to test for that. It is amazing how one stupid thing like a Post-It note could add more complexity and make things even worse.
Now what would have been smart is to have devices like this keyed so that they can't possibly be installed in the wrong place. But that tends to add complexity to the design and when you are only building a handful of rovers in highly controlled conditions, it can be hard to justify.
What is stupid is that there is no investigation of what happened. Sure, in this case the mixup was relatively harmless, but the next one might not be. NASA needs to be more proactive and not wait until things blow up to have an investigation. I don't expect perfection, but they at least have to understand their flaws.
What I don't understand is why this is a big deal.
It isn't a big deal. Instead of "Mars Rovers Have Incorrect Instruments Installed", a better headline would have been "Mars Rover Data Analyzed With Incorrect Calibration Data Files". But the editors would have rejected a headline like that.
It's true that the swap occurred when the instruments were installed. But it's really just a matter of semantics whether you consider the instruments to be swapped in the rovers on Mars, or their calibration files to be swapped in a computer's filesystem on Earth. Once the swap is discovered, it's over.
but let's keep it in perspective... these people PUT ROVERS ON MARS
a whole boatload of things had to go exactly correct for it to work at all. to find one chink in the system and think of it as a screwup is like looking at the -- well I can't think of anything it's like, but it's lame.
Small problems lead to medium sized problems which lead to big problems. Example: In the 1970's the NRC was similar to the Department of Transportation or FAA (pre 9/11) in that their job was to help facilitate the nuclear economy, not to beat down offenders. In the early 70's plant managers at a nuclear power plant in Alabama, Browns Ferry Nuclear Power Plant, received reports that their insulation connecting to a cable room was not in accordance with fire specifications (small problem). Since this was not a significant problem, managers ignored it. Later workers testing the air-tightness of the room failed to follow the correct procedures by using candles to check the air tightness (if the flame is deflected, air is moving in that direction--small problem). Managers were aware but dismissed the problem. During testing for air leaks the flame of a candle was sucked into insulation and a fire erupted. The cable run that caught fire was non-redundant and carried all of the control features for two nuclear reactors. Control of the reactors was lost and reactor safety was severely compromised. Problems that occurred included that the operators of the reactors did not know how to properly respond to this casualty (including attempts to put out a large class A fire with portable CO2 extinguishers). Over $100 million in damages occurred, but the reactors narrowly escaped tragedy (medium sized problem). This occurred in 1975 and the NRC mostly covered up the problem. No congressional hearings were held. No significant corrective actions were issued, and the ability of the operators to fight a casualty at a nuclear power plant was not reviewed. Fast forward four years and we arrive at Three Mile Island (big problem), where many of the shortcomings of the Browns Ferry plant and of the NRC's ability to regulate and control the nuclear industry were exposed.
The lesson to learn here: if small problems exist, dig at them to see how far you can get and then fix *all* of the problems that you uncover. There are many other examples (including the 9/11 incident) but I think the point is obvious: there are problems at JPL that are not being looked at because *nothing* happened. They should be examined and corrected prior to a medium or large problem occurring.
No, it wasn't. Yes, there was error accumulation. But the accumulation was due to a metric-english conversion factor that had been dropped during the port of the flight software from a previous program. The lack of decent documentation for the software meant that the folks assigned to do the port were unaware of the significance of the conversion factor. Without the conversion factor, thruster burns were executed incorrectly, resulting in a deviation from the designed trajectory. Every burn resulted in worse errors. The mission still could have been saved, but the mission managers elected to ignore the growing deviations, and just "hope that things get better".
Please -- don't post if you don't know what you are talking about. It's much more expensive to test and bin passive components than to just build them with higher or lower tolerances as appropriate.
Had either of the Mars Rovers crashed or broken in some way, this mistake would never have been discovered. With only 1 rover's data, there would be no mysterious discrepancy to solve and this mistake would have never been resolved.
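To make the accumulation argument concrete, here is a minimal sketch of how a dropped unit conversion compounds over successive burns. It assumes, purely for illustration, that each burn's impulse is recorded in pound-force seconds but then treated as if it were newton-seconds. The function name, the impulse values, and the spacecraft mass are all hypothetical and are not taken from any mission data.

```python
# 1 pound-force second expressed in newton-seconds (exact by definition).
LBF_S_TO_N_S = 4.4482216152605


def accumulated_delta_v_error(impulses_lbf_s, mass_kg):
    """Return the running velocity error (m/s) after each burn.

    Hypothetical model: the true impulse is the recorded value converted
    to N*s, while the trajectory model uses the raw number unconverted.
    """
    error = 0.0
    history = []
    for impulse in impulses_lbf_s:
        true_n_s = impulse * LBF_S_TO_N_S   # what actually happened
        assumed_n_s = impulse               # conversion factor dropped
        error += (true_n_s - assumed_n_s) / mass_kg
        history.append(error)
    return history


if __name__ == "__main__":
    # Three made-up burns on a made-up 600 kg spacecraft.
    for n, err in enumerate(accumulated_delta_v_error([5.0, 5.0, 5.0], 600.0), 1):
        print(f"after burn {n}: velocity error {err:.4f} m/s")
```

No single burn is off by much in this toy model, but the error never cancels, so it only grows until someone acts on the deviation.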
So scientists would have spent the next 10 years developing their theories of martian geology based on incorrect data if either one of those rovers hadn't deployed and you call this a minor issue?!
This kind of error is inexcusable. But of course, it'll get brushed over because NASA was lucky enough to be in a position to fix it.
They were certainly the wrong instruments, as they are providing incorrect data.
It is only by virtue of the luck that both Rovers are functional that NASA discovered this problem. If either one had ended up dysfunctional after landing, this error would have remained uncorrected and scientists would be basing the next decade of Mars geology on incorrect measurements!
...and it proceeded to install those instruments all over the surface of mars.
Spirit and Opportunity have performed incredibly well. These guys deserve nothing but respect.
No, small problems lead to no serious consequences. That's why they're called small problems. If they can lead to serious consequences then by definition they are not small problems. The magnitude of the problem is determined by the worst case scenario (Murphy's Law being what it is and all). Let's look at your example:
In the early 70's plant managers at a nuclear power plant in Alabama, Browns Ferry Nuclear Power Plant, received reports that their insulation connecting to a cable room was not in accordance with fire specifications (small problem).
What is the worst case scenario if there should be a fire and the cables fail? If this is the cabling to the coffee pot, not much (small problem). If this is the cabling to the non-redundant control features of the nuclear reactor then this is a BIG problem and should have been treated as such.
Later workers testing the air-tightness of the room failed to follow the correct procedures by using candles to check the air tightness
What is the worst case of using this alternate procedure? In this case, there is an increased likelihood of fire. Even if the cabling was not faulty ANY fire is bad, so this should have been flagged as a BIG problem as well.
Both of these should have been recognized as big problems and not ignored. The fault is not that small problems were ignored, it was that they were not properly classified and prioritized. It sounds like there may have been many other problems as well, but they are not related to your main point.
The lesson to learn here: if small problems exist, dig at them to see how far you can get and then fix *all* of the problems that you uncover.
This sounds very profound but it is a fallacy. The lesson to learn from your example is to properly classify and prioritize potential problems. It is a major waste of time and effort to address every single tiny problem which creeps up, especially in highly complex systems it is close to impossible. There are only a limited amount of resources available. You must prioritize the truly important vs the trivial or you will never accomplish anything. BTW, way to pull out the nuclear bogeyman to help make your case.
Of course, this really has nothing to do with the NASA screw up since it really is a small problem. I doubt that the sensors were really that far off to begin with, and now that the problem has been discovered it can be 100% fixed with no loss of data. No harm no foul. Problems like this will continue to happen because everything NASA builds is a prototype! These are not mass produced items. When you build something (or write code) for the first time, is it perfect? I am also suspect of your conclusion that this problem indicates that "there are problems at JPL that are not being looked at." There may very well be problems in the bureaucracy, however this problem is indicative of nothing more than "shit happens." Of course, don't let this get in the way of a good NASA/JPL bashing.
When you lose something irreplaceable, you don't mourn for the thing you lost, you mourn for yourself. - Harpo Marx
Oops, you just blew up a spacecraft with that spelling error. See how easy that was?
Unless you've tried this you have no idea how hard it is. Try designing a flight program to make sure all the i's are dotted and t's are crossed, then having the budget slashed over and over again until you can barely manufacture, test, and launch. Then try the same thing in a three-shift environment that goes on for a couple of years, and make sure that not a single mistake happens.
Please understand that a lot of time and money is spent testing for mission failure scenarios, and there isn't much left for the two-identical-instruments-switched scenarios. For all you know the Scientists missed, misread, or forgot the memo telling them the switch occurred.
Of course it did, and they are. Do you know how many pieces originally intended for Spirit were installed on Opportunity because of schedule issues, and vice-versa?
I did not have the honor of serving with the people that worked that mission, but I surely respect the sacrifices they made to make it happen, and the level of success they have achieved. If you think you can do better, feel free to send in a resume.
What got me is that surely you'd calibrate it after putting it in the rover. You don't calibrate something, install it, and then test it, you install it, test it, and then calibrate it. (Then test the calibration.)
So maybe they're confused, and the problem isn't that they swapped the instruments, it's that they saved the data under the wrong filenames or whatever. Someone tested the two, and then sat down and filed Spirit's test results under Opportunity, and vice versa.
If corporations are people, aren't stockholders guilty of slavery?
"The lesson to learn from your example is to properly classify and prioritize potential problems. It is a major waste of time and effort to address every single tiny problem which creeps up, especially in highly complex systems it is close to impossible. There are only a limited amount of resources available. You must prioritize the truly important vs the trivial or you will never accomplish anything."
The classification of large and small is by the person observing the problem, not the person interpreting it. With 20/20 hindsight we can say that the failure of airport security to find weapons on the 9/11 hijackers was a big problem, but to the supervisors, it was a small problem (they were looking for guns and drugs).
While your logic works well for software development projects where no one can be killed if it fails, high-risk or high-value technologies cannot follow the same procedure, especially when they are of an integrated design (where many items can affect the operations of remote items). NASA operates high-risk and high-value technologies. So do nuclear plants. The QA system for a nuclear reactor is not Bugzilla. The developers of Browns Ferry nuclear power plant did not realize that having a non-redundant cable run was a problem. No one did. They did know, however, that their materials were not correct and that their personnel were not following procedures. They took no action and almost had a nuclear accident. At no time did they have a meeting discussing the safety of the foam insulation, what procedures to take until the foam could be replaced, and what would be the worst case scenario if the foam caught fire. By your definition this isn't a problem because a reactor meltdown didn't occur. But if the nuclear community had learned from it, TMI might not have occurred. (By the way, I'm not overemphasizing it because I believe that "any reactor malfunction is 30 minutes to meltdown". I work as a reactor operator. This was a very serious incident.)
"When you build something (or write code) for the first time, is it perfect? I am also suspect of your conclusion that this problem indicates that "there are problems at JPL that are not being looked at." There may very well be problems in the bureaucracy, however this problem is indicative of nothing more than "shit happens."
Wrong attitude. When you build something, you build it to specification, and you write procedures for it. If it ever deviates, you carefully analyze the problem and fix it. If something breaks, you determine why it broke, because you might have a bigger problem. And then you test your product to verify that it meets the standards. You never say that "shit happens". "Shit happens" is just a codeword for "I'm too lazy to determine the real cause".
When did I say this primarily applies to software development? I don't even work in software development. Your point was that every problem, large or small, needs to be addressed with the same diligence. This is ridiculous and impossible. Problems must be categorized and prioritized. This applies to everything, including software, hardware, as well as high-risk high-value technologies. Actually more so in this case since the system is more complex. If you do not evaluate and prioritize nothing will get done.
By your definition this isn't a problem because a reactor meltdown didn't occur.
Nice strawman, when did I ever say anything like that? My exact words were "Both of these should have been recognized as big problems and not ignored." That seems to be the exact opposite of claiming that there wasn't a problem. My point was not that there was no problem (obviously there was), my point was that the problem was not what you said it was (a small problem being ignored leading to an almost catastrophe).
I work as a reactor operator.
And I assume you spend your days tracking down and solving every problem, no matter how trivial? If someone forgets to change the water in the coffee pot you track it down and fix it, because any small problem could lead to catastrophe, right?
Wrong attitude.
Sorry, that's reality, where things do not always go according to plan, no matter how carefully you plan or test.
When you build something, you build it to specification, and you write procedures for it.... And then you test your product to verify that it meets the standards.
Of course you do, however sometimes you don't foresee everything when you write the spec or procedures, or sometimes you have a new assembler or mechanic come on board and something is not done quite correctly and the tests don't catch it, or any of a hundred other things. I am a mechanical engineer working in aerospace and I can tell you that SHIT HAPPENS, no matter how much we wish it didn't, and no matter how many tests you plan or steps you take in order to make sure it doesn't. Engineers are not supermen, and the people who actually put it together and test it are human too.
If it ever deviates, you carefully analyze the problem and fix it. If something breaks, you determine why it broke, because you might have a bigger problem.
Of course you do, I never said you just ignore the problem. However, my point was that this is a prototype, some problems are expected. What if your tests don't catch the problem and it isn't discovered until it is out in the field? In my field we fix the part and sometimes retrofit the fix back to units in the field, however NASA only has one shot at this. If NASA manufactured hundreds of these rovers you can bet that they would become super reliable and all of these issues would be caught and fixed. However, it is simply impossible to catch all of the potential issues in the lab, and a one-off prototype is going to have a few mistakes.
You never say that "shit happens". "Shit happens" is just a codeword for "I'm too lazy to determine the real cause".
I'm getting pretty sick of your strawman arguments and misrepresentation of my position. Once again, I never said the problem should be ignored, I just said that this is properly classified as a small problem. It does not impact the mission success, and in fact it can be corrected 100%. Obviously NASA should still investigate the cause and take steps to prevent it in the future, however to expect an experimental prototype to be perfect is ridiculous, and taking NASA to task for this error is equally ridiculous.
I disagree - it is a minor thing. This project had a finite development budget, so a Risk Analysis (RA) was performed. RA would tell you that in this instance, because the "data processing" was done back here on Terra, it made no difference which instrument was installed in which rover. Risk Mitigation would also have pointed out that it's an easily correctable problem and therefore time/money shouldn't be spent verifying it. Time would be better spent on making sure that the instruments worked. There were many, many, many more important things to make sure were right - like trajectories, timings, Radar altimeters, etc. That's where you focus on the "major things."
Bill
It's my Sig and you can't have it. Mine! All Mine!
Because the devices went into the wrong rovers, they were using curve A to correct the results of device B and curve B for device A. I'm not sure how they realized this, but once the determination was made, it is trivial to fix the problem by swapping a couple of files and reprocessing the data....
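A rough sketch of why the fix is trivial once the raw data has been kept. It assumes each calibration curve is a simple linear gain and offset; that is an illustration only, and the real curves, file formats, and rover/device pairings are not described in the article. All names and numbers below are made up.

```python
# Hypothetical calibration curves: corrected = gain * raw + offset.
CURVE_A = (2.0, 1.0)    # measured on the bench for device A (made-up values)
CURVE_B = (0.5, -1.0)   # measured on the bench for device B (made-up values)


def apply_calibration(raw_readings, curve):
    """Correct a list of raw readings with a (gain, offset) curve."""
    gain, offset = curve
    return [gain * x + offset for x in raw_readings]


def reprocess(raw_by_rover, curve_by_rover):
    """Recompute every rover's corrected data from its archived raw data."""
    return {
        rover: apply_calibration(raw, curve_by_rover[rover])
        for rover, raw in raw_by_rover.items()
    }


# Raw data is archived untouched, so nothing is lost by the mixup.
raw_by_rover = {"rover_1": [10.0, 20.0], "rover_2": [10.0, 20.0]}

# The mixup: each rover's data was corrected with the other device's curve.
swapped = {"rover_1": CURVE_B, "rover_2": CURVE_A}
# The fix: swap the two entries and run the same pipeline again.
fixed = {"rover_1": CURVE_A, "rover_2": CURVE_B}

if __name__ == "__main__":
    print("before fix:", reprocess(raw_by_rover, swapped))
    print("after fix: ", reprocess(raw_by_rover, fixed))
```

The point of the design is that calibration is applied as a separate, repeatable step on top of stored raw readings, so correcting the assignment costs one rerun.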