Houston, We Have a Software Problem
An anonymous reader writes "The computer system that launches the Space Shuttle is an old, but important, computer system. It is built from mid-'70s technology and features SSI chips like 7400s, which are getting hard to find. It has 64k of memory and no room to repair any software bugs. NASA started the CLCS project in 1996, which uses state-of-the-art computer languages, OO methodologies, and hardware. Everything that you could actually hire people off the street for. However, NASA is in a budget crunch with the Space Station cost overruns. It is looking to trim costs to keep the Space Station going. There are stories about CLCS getting cancelled here and these guys say it's already cancelled."
Certainly the 7400 series as a whole is still widespread and used in hobbyists' kits, and I'm not that old. Maybe the original 7400 is becoming obsolete, being replaced with the 74LS (low-power Schottky) or CMOS chips? If so, it shouldn't be too difficult to replace the TTL logic with CMOS logic, given a few adjustments to the voltage levels, or they could use chips that are compatible with both TTL and CMOS logic.
Of course, the 5400 series SSIs (small-scale integrated circuits) are preferred over the 7400s for industrial purposes, and as a plus they are completely backwards compatible. Why isn't NASA using those?
"The lesson to be learned is not to take the comments on slashdot too literally." --Vinnie Falco, BearShare
Auditing the emulator and the host OS would be a problem - the code they've currently got has a very low rate of bugs, and has been extensively audited. NASA knows everything from the hardware up, exactly what the failure rate is and so forth.
Now, imagine you take modern commodity hardware (which changes periodically - look at how often Intel silently releases new steppings of its CPUs). You're not going to have a guarantee of consistency there. You're going to have to boot an OS off it - and even the simplest RTOSes are still much, much bigger than the whole current platform. Then you need an emulator. Then you need the system. And the only problem you've solved with all that work is the unavailability of the old hardware - you still have an old machine language on a tiny platform which can't be easily extended for new functionality.
What?
"shuttle_launcher_0_1"
Excellent. That'll save a few dollars. What's the development status?
"1 - Planning, sir"
Ah.
(1) Print up 50,000 numbered authenticity certificates...
(2) Break down the old mainframes until you have roughly 50,000 pieces...
(3) Sell them on eBay (or other auction sites) as space memorabilia, mention that the computer the parts came from was responsible for guiding the Apollo missions to the moon, and so on... The machines are SO obsolete now that the only way they could pose a security risk is by sending them back in time...
(4) Profit!
(5) Buy a nice little Beowulf cluster, hire 20 Linux geeks and feed each of them $50 in Dew and pizza in exchange for setting up the system...
(6) Use remaining funds to pay the Russian space agency to have a little "airlock accident" for that Nsync guy...
Just because you can mod me down, doesn't mean you're right. Shoes for industry!
It's not like this is rocket science!
Oh, wait....
$0.02 (CDN)
I'm not one to replace things that are working fine, but as I understand it, newer designs could be a whole lot cheaper to operate. So I wonder if pouring more into the Space Shuttle program is the best thing to do.
I'm not saying "let's throw out the space shuttle" but it bothers me that there's apparently nothing in the works with a decent shot at replacing it any time soon. It seems the field of space exploration is becoming antiquated.
There comes a time in every product's lifetime when it's time to start over.
Exactly. And that includes the shuttle. It has never lived up to what it was envisioned to be and it is only going to become more costly and more failure prone in the future as every bit of hardware on that pig is already showing signs of fatigue.
There are many launch systems that cost far less per pound to throw things into orbit. The reasons we still have those monstrosities flying are political only, not technological or scientific.
Sure, this is flamebait. (Gosh, getting rid of the old karma system is so LIBERATING!) But if we can discuss how some little bits of hardware in the shuttle are past their time, why can't we discuss the big bit?
This is a very pertinent point that appears to have been lost on the initiators (and now burger flippers) of the replacement-launch-thingy project.
What they have, right there, is one spectacularly reliable piece of software. I suspect it's significantly more bug-free than even the microcode in a modern processor, let alone the companion chips, BIOS, operating system, and virtual machine for some god-awful p-code language (not that I'm naming names here).
The question that should have been asked is "how can we make a sustainable process for making extremely reliable control computers?" That covers how to go about cutting custom silicon, writing tiny OSes, etc., and how to save the happy taxpayer hundreds of millions of dollars by reselling these services to people making nuclear power stations, heart pacemakers, etc., instead of going shopping for big Sun boxes.
Oh well, reality strikes again.
Dave
I write a blog now, you should be afraid.
From an article in the Sydney Morning Herald.
The software is built in a similar way - lots of internal checks, tell-me-thrice memory, soft-failure-bit-flip-correcting daemons etc. In this case, lives aren't at stake, but the people doing the programming are used to situations where they are.
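For anyone wondering what "tell-me-thrice" memory might look like, here is a minimal sketch in Python. It is an illustration only, not the flight code or anything described in the article: the names `vote` and `TripleStore` are made up for the example, and it assumes the simplest scheme, where every word is stored three times and a read takes a bitwise majority vote and repairs any copy that disagrees.

```python
def vote(a: int, b: int, c: int) -> int:
    """Return the bitwise majority of three copies of a word.

    Each output bit is 1 only if at least two of the inputs have that bit set.
    """
    return (a & b) | (a & c) | (b & c)


class TripleStore:
    """Hypothetical 'tell-me-thrice' memory: every word is kept in three copies."""

    def __init__(self, size: int) -> None:
        # Three independent banks of the same size, all zeroed.
        self.copies = [[0] * size for _ in range(3)]

    def write(self, addr: int, value: int) -> None:
        # A write updates all three banks.
        for bank in self.copies:
            bank[addr] = value

    def read(self, addr: int) -> int:
        # Vote across the three banks, then write the winner back so that a
        # single flipped bit does not linger and pair up with a later flip.
        a, b, c = (bank[addr] for bank in self.copies)
        good = vote(a, b, c)
        for bank in self.copies:
            bank[addr] = good
        return good

    def scrub(self) -> None:
        # A background "daemon" pass is just a read of every address.
        for addr in range(len(self.copies[0])):
            self.read(addr)


if __name__ == "__main__":
    store = TripleStore(4)
    store.write(2, 0b10110010)
    store.copies[1][2] ^= 0b00001000      # simulate a bit flip in one bank
    print(bin(store.read(2)))             # the other two banks outvote it
```

The vote is done per bit rather than per word, so the scheme still recovers if two different banks each take a hit in different bit positions of the same word.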
Zoe Brain - Rocket Scientist
Replacing it can be harder. I used to work in newspaper publishing; the core editorial systems of one employer were old ATEX J11 systems with a proprietary, tightly integrated OS and application suite. Over time, various aspects of the system were offloaded to more modern systems (e.g., PostScript output and integration with graphics from desktop systems had dedicated AIX systems, imagesetters were driven by PostScript RIPs, and dumb terminals run from dedicated I/O boards were replaced with terminal emulators on the desktop).
Despite all this tweaking, the crufty old systems stayed in place. Why? Well, on each of these old boxes, we could support 25-30 journos and the systems just worked, grinding out newspapers day after day.
People kept talking about replacing them, not least because we had to train up operators and engineers on them every time new staff came in, parts were hard to come by (the standards-not-compatible SCSI and ethernet interfaces were picky about what they talked to, and the filesystem could only address 600 MB of disk per system), and they used huge amounts of power and floor space.
For the three years I worked there, and in the three years since, no-one has been able to deliver an editorial system that just works. When vendors rolled their rigged demos in, they crashed. The major vendors like CyberGraphics and ATEX couldn't point to successful implementations of their new systems producing a decent number of newspapers on the basis of more than one edition per day.
Would it have been nice to have a Unix or Windows based system? Sure. Reduced overheads and training burdens, able to buy the latest and greatest hardware, and so on. But no-one could actually deliver something that worked better than the crufty old J11 systems.
NASA are probably in a similar bind; it's a very familiar problem: old systems developed by tight, focused, skilled teams and developed over the years are very, very hard to replace.
Yes, he does mean Core Memory, and yes, the AP-101 as flown in the Shuttle from mid-70s through to mid-90s did indeed use Core memory.
Indeed, the upgrade to the AP-101s with (I think) static-column RAM took so long because Core memory has the lovely property of retaining information even when the power dies - a key factor, sadly, in the ability to retrieve information from Challenger's onboard computers after the 1986 crash. Another key factor is that Core memory is remarkably resilient to bit-flipping caused by cosmic rays and other radiation (events known as "SEUs" or "Single Event Upsets").
All of which meant that it was a major project just to replace that memory with more modern RAM. And it's not just a couple of sticks of SDRAM either - most of the space savings you'd expect from replacing bulky core with nice compact RAM chips are taken up by additional hardware to a) provide sufficient power support to retain memory in the event of main power failure, and b) continually scan through memory doing parity checks to detect and correct for SEUs...
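To make point b) concrete, here is a rough Python sketch of a memory scrubber. One caveat up front: a single parity bit can only detect a flipped bit, not say which one it was, so to actually correct anything this sketch assumes a Hamming(7,4) code instead. That choice, and the names `encode`, `correct`, `decode` and `scrub`, are mine for illustration; I am not claiming this is what the AP-101 upgrade hardware does.

```python
DATA_POSITIONS = (3, 5, 6, 7)    # 1-indexed codeword positions holding data bits
PARITY_POSITIONS = (1, 2, 4)     # 1-indexed positions holding check bits


def syndrome(word: int) -> int:
    """XOR together the positions of all set bits; 0 means the word is clean."""
    s = 0
    for pos in range(1, 8):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def encode(nibble: int) -> int:
    """Pack 4 data bits into a 7-bit Hamming codeword."""
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (nibble >> i) & 1:
            word |= 1 << (pos - 1)
    # Set whichever check bits are needed to bring the syndrome to zero.
    s = syndrome(word)
    for p in PARITY_POSITIONS:
        if s & p:
            word |= 1 << (p - 1)
    return word


def correct(word: int) -> int:
    """Repair a single flipped bit; the syndrome is the position of the flip."""
    s = syndrome(word)
    if s:
        word ^= 1 << (s - 1)
    return word


def decode(word: int) -> int:
    """Extract the 4 data bits from a (corrected) codeword."""
    nibble = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> (pos - 1)) & 1:
            nibble |= 1 << i
    return nibble


def scrub(memory: list[int]) -> int:
    """Walk all of memory, fix single-bit upsets in place, return how many."""
    fixed = 0
    for addr, word in enumerate(memory):
        repaired = correct(word)
        if repaired != word:
            memory[addr] = repaired
            fixed += 1
    return fixed


if __name__ == "__main__":
    memory = [encode(n) for n in range(16)]
    memory[5] ^= 1 << 2          # simulate two SEUs in different words
    memory[9] ^= 1 << 6
    print(scrub(memory), [decode(w) for w in memory])
```

The reason for scanning continually is that a code like this only survives one flipped bit per word, so each upset has to be repaired before a second one lands in the same word.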
Don't diss Core, man...
--
I'd rather have a bottle in front of me than a frontal lobotomy