10 Years In, Mars Rover Opportunity Suffers From Flash Memory Degradation
astroengine writes Mars Exploration Rover Opportunity has been exploring the Martian surface for over a decade — that's an amazing ten years longer than the 3-month primary mission it began in January 2004. But with its great successes, inevitable age-related issues have surfaced and mission engineers are being challenged by an increasingly troubling bout of "amnesia" triggered by the rover's flash memory. "The problems started off fairly benign, but now they've become more serious — much like an illness, the symptoms were mild, but now with the progression of time things have become more serious," Mars Exploration Rover Project Manager John Callas, of NASA's Jet Propulsion Laboratory in Pasadena, Calif., told Discovery News.
Memory bristles
Like Scottish thistles
Make operation tough
Plus the interplanetary stuff
Burma Shave
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
But to claim it under warranty, you have to return it to the manufacturer
It is time to start building out the Martian rover maintenance infrastructure so these guys can be towed in for repairs and upgrades.
the growth in cynicism and rebellion has not been without cause
At least they have identified a fix. But it surely won't be too long before more of the flash memory banks start exhibiting similar behaviour.
Still, 44x longer lifespan than originally planned == win in anyone's books.
Would it have cost much to ship it with a RAID array of flash drives?
If only they had over-engineered it to last, this never would have happened!
http://mars.nasa.gov/mer/mission/status_opportunityAll.html
I don't know that one could expect similar behavior from the other banks on a similar schedule. This is fairly old technology in terms of design and software, so I don't think they're doing any sort of automatic wear leveling, for instance. It's probably "manually leveled" if at all. For all we know, bank 7 was used the most and it's worn out. Or, it's taking more total ionizing dose (TID) because of the physical location on the card. Or, it's just a process variation when making the flash chips themselves. They were probably fabricated in 2000, most likely at Micron, since for a 2003 launch, the computer was probably assembled by early 2002, if not earlier.
Or, the software is not optimized for "space flight use" but, rather, for "consumer camera memory card", which has a different read/write/erase pattern and error tolerance.
http://spinroot.com/gerard/pdf/25MC.pdf describes an improved file manager under development, but also describes the existing flash architecture.
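For anyone wondering what automatic wear leveling would look like compared to using one bank until it dies, here is a toy sketch in Python. It is not the rover's flight software or anything taken from the linked paper; the class name, the seven-bank count, and the erase limit are all made up for illustration. The idea is just to always erase and write the bank with the fewest erase cycles so far.

```python
class FlashBanks:
    """Toy model: a handful of flash banks, each with an erase counter."""

    def __init__(self, num_banks, erase_limit):
        self.erase_counts = [0] * num_banks
        self.erase_limit = erase_limit  # hypothetical endurance rating

    def pick_bank(self):
        """Return the index of the least-worn bank that is still usable."""
        usable = [i for i, n in enumerate(self.erase_counts)
                  if n < self.erase_limit]
        if not usable:
            raise RuntimeError("all banks worn out")
        return min(usable, key=lambda i: self.erase_counts[i])

    def erase_and_write(self):
        """Simulate one erase/write cycle on the least-worn bank."""
        bank = self.pick_bank()
        self.erase_counts[bank] += 1
        return bank


# Assumed numbers: 7 banks, 1000-cycle limit, 700 write cycles.
banks = FlashBanks(num_banks=7, erase_limit=1000)
for _ in range(700):
    banks.erase_and_write()
print(banks.erase_counts)
```

With even rotation like this, no single bank ends up carrying most of the erase cycles, which is the opposite of the "bank 7 was used the most" scenario.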
If it was long-known that long-duration, low-intensity heat would revive failed flash, why did these rovers leave without the ability to do so?
And why am I not able now to buy flash memory that will heat itself to 800 degrees and heal itself?
And why isn't flash memory sold in ceramic housings that can stand me baking them in an oven for a few days to fix failed flash manually?
I'd like to buy hardware that works, or that can be repaired. That's not flash.
if the issue turned out to be mould.
Or at least the failing flash isn't the reason the problem is serious. Software bugs involving how the failed flash is handled are the problems, causing infinite loops and automatic reboots.
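To illustrate the point about handling, here is a rough Python sketch of the kind of bounded retry that avoids looping forever on a bad bank. This is a guess at the general technique, not how the rover's software works; the function names, the retry budget, and the choice of bank 7 as the failing one are assumptions for the example.

```python
MAX_ATTEMPTS = 3  # hypothetical retry budget per bank


def mount_storage(banks, try_mount):
    """Try each bank a bounded number of times; skip banks that keep failing.

    `banks` is an iterable of bank ids and `try_mount` is a callable that
    returns True on success. Returns (mounted, masked) lists of bank ids.
    """
    mounted, masked = [], []
    for bank in banks:
        for _ in range(MAX_ATTEMPTS):
            if try_mount(bank):
                mounted.append(bank)
                break
        else:
            # Retry budget exhausted: mask the bank and carry on,
            # rather than retrying forever or rebooting.
            masked.append(bank)
    return mounted, masked


def flaky_mount(bank):
    """Stand-in for real hardware: pretend bank 7 always fails."""
    return bank != 7


mounted, masked = mount_storage(range(1, 8), flaky_mount)
print(mounted, masked)
```

The design choice is that a dead bank costs you some storage, not the whole boot sequence.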
So, does that mean that NASA needs to go back to the plated wire memory and tape systems like the Honeywell systems that ran the Viking and Voyager systems for decades on Mars and in space?
No, the article says that you either need low-intensity, long duration heat (which has apparently long been known), or high-intensity, short-duration:
We are still buying flash that we can't fix because of the packaging. We're still shipping this unfixable flash in mission-critical applications. When does it get fixed?
Let me guess: they used OCZ flash memory?
The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
Dave, my mind is going. I can feel it. I can feel it. My mind is going. There is no question about it. I can feel it. I can feel it. I can feel it. I'm a... fraid.
"Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
12 years ago there were no smartphones, tablets, or flash laptops due to the expense of flash. Even iPods had micro-disks. There were just a few cameras and MP3 players with very limited memory. Those devices, or at least their chips, were upgraded long ago.
Now I'll just fire up my Steampunk Mars Exploratron and off we go!
-- Tigger warning: This post may contain tiggers! --
Until a gear strips. Or a bearing freezes up. Or... there's lots of things to go wrong with a mechanical window that will render it non functional.
If good science will still be available after a decade (Opportunity) or many decades (Voyager), then at least light components like flash, and electronics in general, should be designed with a good degree of redundancy. Or else, if the probe has a limited mission and has accomplished it, there is nothing wrong with abandoning it and focusing money and talent on new missions. Would engineers working on attempts to fix Opportunity be more useful working on the newer Curiosity mission? My gut feeling is that making existing missions last longer is much more cost effective than launching new ones. But I am not a space scientist. The point is that mission planning should have a clear focus one way or the other.
The memory on the Voyager spacecraft, as little as it is, must be of a different sort. Launched in the late 1970s, their electronics are still functioning, although with a few issues. That'll soon be four times longer than the rover.
The Voyager craft were intended to operate for many years. The Mars rovers weren't. The Mars rovers also reside in a much harsher environment than the space probes, which float weightlessly in a vacuum at a constant temperature.
There was no reason to design the flash memory to last much longer than the expected lifetimes of the wheel bearings or solar panels. Just because by some miracle those both lasted much longer than expected, it doesn't mean that additional investments of resources into the memory would have been justified.
That, I tell friends, is why I'm happy to drive a 30+ year old car. It has issues, but the hardware it's built from is inherently more long-lived than that in today's cars. A crank-up window just keeps working. One driven by an electric motor doesn't.
False. Cars from that era were routinely sent to the scrapyard when they were less than 10 years old because they were rusted beyond repair. Now the average age of US cars is over ten years, twice what it was in the 1960s. Old cars also required constant maintenance of problem-prone mechanical parts such as ignition points and carburetors.
This is an interesting event. Failure of the flash memory can only really be overcome by either replacing it or having a secondary flash that's on standby, syncing up periodically so that it has much less wear on it, so you can extend the mission by switching over to the backup/secondary flash memory. However, this would add precious ounces to the payload, thereby requiring more fuel, etc.
Awk! Pieces of eight. Pieces of eight. Pieces of seven... ERROR: General Protection Fault. [Paroty Error.]
Hah, FlashZheimer :D
But then again, who does?