Houston, We Have a Software Problem
An anonymous reader writes "The computer system that launches the Space Shuttle is old but important. It is built from mid-'70s technology and features SSI chips like 7400s... which are getting hard to find. It has 64k of memory and no room to repair any software bugs. NASA started the CLCS project in 1996, which uses state-of-the-art computer languages, OO methodologies, and hardware: everything that you could actually hire people off the street for. However, NASA is in a budget crunch with the Space Station cost overruns, and it is looking to trim costs to keep the Space Station going. There are stories about CLCS getting cancelled here and these guys say it's already cancelled."
But I thought 64k should be enough for anybody...
I have nothing to allude to, and I am alluding to it.
And what plans do they have to keep this from happening again in a decade?
Sorry if the article answers this, I can't get to it.
"a quote" -me
Given today's hardware, why can't you just simulate the old system if finding parts for repair becomes a problem? You would just run your old software on the simulated machine.
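A minimal sketch of what "simulate the old system" could look like, assuming a made-up 8-bit accumulator machine with 64K of memory. The opcodes and the `run` helper are hypothetical and are not the real launch computer's instruction set:

```python
# Toy emulator for a hypothetical 8-bit accumulator machine with 64K of memory.
# The opcodes below are invented for illustration only.
LOAD_IMM, ADD_MEM, STORE, HALT = 0x01, 0x02, 0x03, 0xFF


def run(program, max_steps=10_000):
    """Load `program` at address 0, execute it, and return the memory image."""
    mem = bytearray(64 * 1024)            # the whole 64K address space
    mem[:len(program)] = program
    pc, acc = 0, 0
    for _ in range(max_steps):            # step limit guards against runaway code
        op = mem[pc]
        if op == HALT:
            return mem
        if op == LOAD_IMM:                # acc <- immediate byte
            acc = mem[pc + 1]
            pc += 2
        elif op in (ADD_MEM, STORE):      # two-byte big-endian address operand
            addr = (mem[pc + 1] << 8) | mem[pc + 2]
            if op == ADD_MEM:
                acc = (acc + mem[addr]) & 0xFF   # wrap around like 8-bit hardware
            else:
                mem[addr] = acc
            pc += 3
        else:
            raise ValueError(f"unknown opcode {op:#04x} at {pc:#06x}")
    raise RuntimeError("step limit exceeded")
```

The old binary stays untouched; only the interpreter loop has to be trusted on the new hardware. A real emulator would also have to reproduce timing and I/O behaviour, which this sketch ignores.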
At the beginning was at.
At some point it might be cheaper to give up on computers and just pilot the Shuttle by hand.
-- Ed Avis ed@membled.com
This is a common problem in big projects. The time it takes to design a system and then actually implement that system is so great that by the time the system is complete, the hardware used to make that system is 'obsolete.' You can't just add more memory and speed, because then you'd have to go through and make sure that everything still works perfectly, and that would take so long as to make the current hardware 'obsolete.' The real problem here is public hype. You don't need 4 GHz and 40GB of memory to program the space shuttle, but if the public finds out that NASA only uses 64k, they will think NASA is behind the times, even though 64K is enough for the system. Of course, the space shuttle is already considered obsolete by some, and new systems are being created, so don't fret much over this.
Stephen
Fault loves the past, worry loves the future, but content enjoys the present.
Certainly the 7400 series as a whole is still widespread and used in hobbyists' kits, and I'm not that old. Maybe the original 7400 is becoming obsolete, being replaced with the 74LS (low-power Schottky) or CMOS chips? If so, it shouldn't be too difficult to replace the TTL logic with CMOS logic, given a few adjustments in voltage levels, or they could use chips that are compatible with both TTL and CMOS logic levels.
Of course, the 5400 series SSIs (small-scale integrated circuits) are preferred over the 7400s for industrial purposes, and as a plus they are completely backwards compatible. Why isn't NASA using those?
"The lesson to be learned is not to take the comments on slashdot too literally." --Vinnie Falco, BearShare
Nothing to do with the government. Pretty much every replacement of a so-called legacy system I've ever seen blows out the same way. Anyone who's seen replacement banking systems, SAP rollouts, you name it, will have the same experience.
What?
"shuttle_launcher_0_1"
Excellent. That'll save a few dollars. What's the development status?
"1 - Planning, sir"
Ah.
(1) Print up 50,000 numbered authenticity certificates...
(2) Break down the old mainframes until you have roughly 50,000 pieces...
(3) Sell it on eBay (or other auction sites) as space memorabilia, mention that the computer the parts came from was responsible for guiding the Apollo missions to the moon, etc and so on... The machines are SO obsolete now that the only way they could pose a security risk is by sending them back in time...
(4) Profit!
(5) Buy a nice little beowulf cluster, hire 20 Linux geeks and feed each of them $50 in dew and pizza in exchange for setting up the system...
(6) Use remaining funds to pay the Russian space agency to have a little "airlock accident" for that Nsync guy...
Just because you can mod me down, doesn't mean you're right. Shoes for industry!
It's not like this is rocket science!
Oh, wait....
$0.02 (CDN)
It has 64k of memory and no room to repair any software bugs.
LOAD "NASASHUTTLE",8,1
Just what is the space station actually for?
The money spent on this (and the space shuttle) could be spent on real science and could get a thousand off-the-shelf spaceprobes to interesting places.
I suppose getting rid of Lance Bass would have made it worthwhile, but even that's not going to happen anymore (unless /.ers contribute to a paypal account for this purpose...)
roses are red
violets are blue
the Russians have satellite laser weapons
so why can't we too?
## W.Finlay McWalter ## http://www.mcwalter.org ##
I'm not one to replace things that are working fine, but as I understand it, newer designs could be a whole lot cheaper to operate. So I wonder if pouring more into the Space Shuttle program is the best thing to do.
I'm not saying "let's throw out the space shuttle" but it bothers me that there's apparently nothing in the works with a decent shot at replacing it any time soon. It seems the field of space exploration is becoming antiquated.
Hire John Carmack to do the job. He's into rocketry so he gets to learn more about the whole thing, you get a kickass system, and he may even do it for free.
The guy's so good he may do a better job than a bloated team of 400 contractors.
They need a cheap replacement for a 7400? No problem! I have an old 7800 they can have for free. I'll throw in some 2600 games that it can play - StarMaster & Missile Command, that should get them back into orbit in no time, right?
Remember "Bring 'em on"? *sigh
I was thinking this. Why don't they open some of the current code and some of the requirements they need to "the community".
Think of who space enthusiasts are and what a lot of them do; software and hardware development. In a budget crunch a good strategy would be to allow interested hobbyists to write some of the code, and then have NASA's boys peer review it.
-- The unsig...
From an article in the Sydney Morning Herald.
The software is built in a similar way - lots of internal checks, tell-me-thrice memory, soft-failure-bit-flip-correcting daemons etc. In this case, lives aren't at stake, but the people doing the programming are used to situations where they are.
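A rough sketch of the "tell-me-thrice" idea, reading it as triple redundancy with a majority vote (that reading is an assumption, as are the names `vote` and `TripleStore`; the real software is not shown in the article):

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise majority: each result bit is the value at least two copies agree on."""
    return (a & b) | (a & c) | (b & c)


class TripleStore:
    """Keeps three copies of every word and repairs disagreements on read."""

    def __init__(self, size: int):
        self.copies = [[0] * size for _ in range(3)]

    def write(self, addr: int, value: int) -> None:
        for copy in self.copies:
            copy[addr] = value

    def read(self, addr: int) -> int:
        value = vote(*(copy[addr] for copy in self.copies))
        for copy in self.copies:      # scrub: rewrite any copy that drifted
            copy[addr] = value
        return value
```

A single flipped bit in one copy is outvoted by the other two and corrected on the next read. Two copies corrupted in the same bit position would defeat the vote, which is why such schemes are usually paired with periodic scrubbing.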
Zoe Brain - Rocket Scientist
Get a TI-89 and write an assembly program to control the space shuttle. The TI-89 runs off a MC68000 chip, and has (almost) a meg of space. That's about the programming power of the Apollo computers in a pocket-sized object--plenty of power to calculate the orbital trajectory/angle of entry/etc. It even has built-in calculus functions in case the astronauts forget the Fundamental Theorem of Calculus :) .
I'm the Devil the Windows users warned you about.
Some interesting information in the article... like the main reasons for cancelling the project are a lack of significant improvements in safety, reliability, or cost savings over the shuttle program's remaining lifetime. I'm no fan of keeping obsolete systems hobbling along beyond their years, but this reasoning doesn't seem outrageous to me. The outrageous thing is that it took 400 contractors to develop something that won't outperform a 30-year-old system that runs in 64k.
I love the idea and would love to see this. It also makes sense, since NASA, like most national space agencies, is for the good of humankind.
On the other hand, trying to see it from your average government official's point of view, there would be paranoia since, I believe, NASA shares technology with the military, and even without the military ties there would still be fearmongering over 'national security'. Also, your average programmer probably wouldn't have access to the hardware to run and test the stuff on.
Jumpstart the tartan drive.
Might I suggest using FPGAs to emulate the old system's hardware so the software doesn't have to be thrown out?
Assuming that circuit layouts are available for these old chips, it would be a piece of cake to emulate them in VHDL (a hardware description language) because they are simple compared to today's integrated circuits. Once the chip descriptions are written in VHDL, it would be relatively easy to 'port' the hardware over to a new FPGA if the old one dies or whatever. Then it would not be necessary to truly port or re-code any of the currently working code, and it would be much easier to fix bugs and extend it because you don't have the memory and speed limitations of the old system.
C _has been_ the state of the art in procedural languages for over 30 years. It's not like such a simple thing has a lot of room for improvement. There are other areas where new languages can be created, but honestly how many ways are there to do things like preprocessor, functions, variables, etc.?
OO languages' authors may feel that they are doing something more "advanced" but in fact they are working on a completely orthogonal area of development. And most of them are far from C's elegance (ex: Stroustrup doesn't even understand C design properly, so C++ is even more inconsistent than what its origin would suggest, and I don't even consider the rotting pile of shit that its "standard" library is to be a part of the language), or are simply badly designed (ex: Java), or are not languages but eclectic messes made by including specific libraries' and object models' design into the language itself (ex: C#).
This is the area where we can use a lot of progress until it reaches the state where we can keep calling the same thing "state of the art" for 30 years, but I won't hold my breath -- OO language design is dead, everyone is just making "OO" languages as various vehicles to promote their narrow-minded ideas. So I won't be surprised if Stroustrup's mess will remain the most useful semi-OO language for the next 30 years, too (but those libraries HAVE TO GO, and so should the attitude that students should learn that atrocity without studying C first).
Contrary to the popular belief, there indeed is no God.
At the time of the Challenger inquiry, the late physicist Richard Feynman was part of the investigation committee. He found that most of NASA at the time was in full delusional mode about how reliable the Shuttle really was.
The only exception was the computer systems group, in particular the software side. They had metrics, procedures and rigour. At the time of the inquiry the hardware was already old.
It's the attitude that counts, not the hardware, not the methodology of the month. OO is not going to solve NASA's problem, it's going to be difficult. Myself I'd just make sure that the hardware would always be available, and not change a thing.
NASA falls under the classification of "independent agency" within the Federal government. The budget is hooked up with other agencies such as the Veterans Administration, if that tells you anything about how things are considered.
"player 4 hit player 1 with 0 stroms"
Not only that, a single space launch of even a fairly small satellite still costs over a billion dollars. If there's a software glitch, it could render the satellite totally inoperable, and I doubt that these engineers want to tell their source of funding that a glitch they're responsible for just wasted the whole launch...
Which is also why Microsoft doesn't do aerospace embedded systems. :) Whoops, Satellite Redmond I just had a BSOD...
Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
They obviously don't need very high performance, since it runs on 1970s hardware, but they do need high reliability and low development costs.
That means that they should be using a safe, secure high-level language. Something with a virtual machine might be a good idea so that it will be easy to adapt to new hardware platforms: you verify the virtual machine on the new machine and then have reasonable confidence that your code runs.
If they want something in widespread use, a home-built Java byte-code interpreter (not a JIT--they are too buggy) might be a reasonable choice--it's well specified and there are lots of people who know how to program it. They should probably avoid JNI like the plague and instead add new bytecodes for I/O and communications and verify them the same way that they do the virtual machine itself. VLISP might be another good choice--or at least a source of ideas for how to implement a verified Java interpreter--DARPA already has paid for its development.
And they should hire someone who doesn't recommend COTS with C++, lest we see the next shuttle go up in flames again.
No, they're all big into Ada95. Based on Pascal, so not too far off, though it does have OO capabilities.
"These people look deep within my soul and assign me a number based on the order in which I joined" --Homer re:
I used to work for GSFC (Goddard Space Flight Center). It was wonderful... many years ago.
Anywho... they had *shitloads of unbelievable equipment... ages old... *name that piece of hardware*. We could wander from building to building, and look/view/see the equipment.
Lots were there because they were running projects that took many many years to see results, thus they could not upgrade *in-the-field* because it would stop the project.
Indeed, part of GSFC's job when I was there was to back up Houston on launches. When they upgraded they built a totally new floor above the existing backup, and on a *grand* day they transferred power, with one big switch, from one floor to the next - why? Because they had to. It had to be well tested and well checked before it could be put in live production, yet the existing systems had to be on-line to back up Houston.
It was fantastic walking through the various buildings and rooms... I've seen equipment I've no idea what it did. For example, one room had these rather large, circular platforms with clear plastic or glass domes. Inside the domes were flat plates - think silicon... but BIG.. 1 1/2 ft octagon. Stacked with about 2 inches spacing, about 10 of them. I'd say, looking at the room, some very old old old type of RAM.
That's the wonder of NASA :)
(* Surely OO is a bit risky for such a thing. *)
Amen Brotha!
oop.ismad.com
Table-ized A.I.
Replacing it can be harder. I used to work in newspaper publishing; the core editorial systems of one employer were old ATEX J11 systems with a proprietary, tightly integrated OS and application suite. Over time, various aspects of the system were offloaded to more modern systems (eg, PostScript output and integration with graphics from desktop systems had dedicated AIX systems, imagesetters driven by PostScript RIPs, dumb terminals run from dedicated I/O boards replaced with terminal emulators on the desktop).
Despite all this tweaking, the crufty old systems stayed in place. Why? Well, on each of these old boxes, we could support 25-30 journos and the systems just worked, grinding out newspapers day after day.
People kept talking about replacing them, not least because we had to train up operators and engineers on them every time new staff came in, parts were hard to come by (the standards-not-compatible SCSI and ethernet interfaces were picky about what they talked to, and the filesystem could only address 600 MB of disk per system), and they used huge amounts of power and floor space.
For the three years I worked there and in the three years since, no-one has been able to deliver an editorial system that just works. When vendors rolled their rigged demos in, they crashed. The major vendors like CyberGraphics and ATEX couldn't point to successful implementations of their new systems producing a decent number of newspapers on the basis of more than one edition per day.
Would it have been nice to have a Unix or Windows based system? Sure. Reduced overheads and training burdens, able to buy the latest and greatest hardware, and so on. But no-one could actually deliver something that worked better than the crufty old J11 systems.
NASA are probably in a similar bind; it's a very familiar problem: old systems developed by tight, focused, skilled teams and developed over the years are very, very hard to replace.
I think it's important to realize that the Shuttle also represents the pinnacle of 1970's computing and that the whole of computing has changed significantly in the last ~25 years. In the 1970's, you didn't worry about things like GUIs (and all the "bloat" that they entail), TCP/IP stacks, extensive amounts of code to deal with the wide variety of hardware configuration, etc.
It's not so much an issue of bloated code as it is an attempt to cover all the bases. The shuttle software was designed with one purpose in mind -- get that shit-heap into orbit. You can't compare it to a modern Linux distro without invoking an apples-to-oranges counter-argument.
Furthermore, the launch of the shuttle isn't handled by a single onboard computer. It's handled by several. Please reference The Space Shuttle Operator's Manual for more on the systems aboard the shuttle. It's a general, non-technical overview, but a great reference, nonetheless.
You ask "where will it stop?" Here's a hint: it won't. And this same argument probably came up in the 1970's when they started writing the spec for the shuttle. The computer aboard the shuttle is more capable than Apollo for a mission profile that isn't significantly more difficult in any regard (generally speaking). Hell, the PDA you have sitting on your desktop right now has far more computing power than all the computers involved in the Apollo program put together, and it certainly doesn't do anything like putting men on the moon.
But again, it's all a matter of the scope of usage.
"State of the Art" is a good way to run your pocket book into the ground. Jumping on the newest, fanciest programming language doesn't usually make a business successful.
Here's yet another example: My company's (former) largest competitor invested *millions* into Sun hardware and development in Java. Why? "State of the Art". And guess what! With all of their "state of the art" infrastructure, their system was still slow as molasses.
What did we do? We spent less than a tenth of what they did to develop with Perl on x86 servers. Our site handles huge traffic loads pretty effectively, and we did it without running ourselves to the bankruptcy court.
steve
Oh, you're not stuck, you're just unable to let go of the onion rings.
NASA is currently struggling with obtaining a reasonably modern rad-hard CPU. The market is so dinky that nobody wants to bother with it. But they have been able to retrofit flat panel displays, at least.
I used to work with military electronics and found that the best gear was always from the 80s. The stuff from the 60s and 70s (yes, some of that is still in service) was too primitive. The 90s hardware was too complicated and suffered from unreliable software.
In the 80s the microcontroller technology was just good enough to embed a processor with 64k of ROM full of finely crafted code written by a single programmer and it always just worked, perfectly, every time.
Stop worrying about the risks of nuclear power and start worrying about the risks of not using nuclear power.
John Carmack might be a kick-arse game programmer and a very smart guy, but he is not an expert compiler designer, complexity theorist, or, as is most relevant here, embedded systems programmer for safety-critical systems (though I'm sure he's rapidly learning about it with his rocketry hobby).
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
If NASA's budget is hurting so badly, why not swallow a bit of pride and recruit help from fans of the space program who may also happen to be hardware and software engineers?
Perhaps the crew at, say, ham radio organizations like AMSAT, or other groups that already combine volunteer engineering effort with an interest in space exploration, would be happy to help out with modernizing the systems. I wonder if anyone's asked them?
NASA would, of course, keep enough engineering staff around to check the improvements out, but why limit themselves to paid labor if the resource to pay is drying up?
Bruce Lane, KC7GR,
Blue Feather Technologies
Yeah. I'd trust my life to a bunch of geeks with no clear design principles.
"Information wants to be paid"
No, that article was based on a stable situation (i.e. same technology, stable requirements and same methodology for each project). Because of stringent control and heaps of experience, they had perfected the process. With this project they picked new technologies and new methodology, hired a bunch of contractors and ran into the inevitable problem of shifting requirements, more new technology and methodological issues (off the shelf methods always need to be fine tuned and tailored).
The worrying thing is that with their expertise, they could have known that this would happen. The appropriate thing to do would have been pilot projects followed by more ambitious projects. Now they've bet everything on one horse and are left empty handed. What they had before they started was a piece of shit (presumably that was why they wanted to get rid of it), and now they have to face maintaining that piece of shit again.
It may have been an elegant system when it was first designed. However by now it has probably seen countless adaptations, comes with thousands of pages of documented changes (they are control freaks) and probably is very hard to understand and maintain. Likely many of its original designers are deceased or retired by now.
The decision to cancel the project appears to be a panic reaction by management. What they will soon find out is that their old stuff no longer can be modified cost effectively to new requirements. Replacing it will easily take half a decade if you start from scratch and they have just thrown away the efforts of half a decade of development.
Jilles
Note that they are most likely using GNU software. Here is a list of the software development environments for these chips, and Here is the European Space Agency's web page for the tools and emulator.
They obviously don't need very high performance, since it runs on 1970s hardware, but they do need high reliability and low development costs
this raises an interesting question: how much better would a rocket with fast-response feedback mechanisms be?
and what are the time-scales involved?
how much can you raise efficiency and reliability (automated problem detection and solving) with better computing?
would a "real-time" (at the time-scales involved) automated simulation and analysis of the machinery involved (using inputs from the hardware) be beneficial at all? how?
Working for necessity's mother.
Um... Functional programming might be a much more appropriate tool for this job than any OO language I know. And I program C++ for a living. :-)
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
That really depends on your definition of reliability. If you're talking about things like buffer overflows and memory leaks, then yes, good basic programming technique in C++ makes it a far more powerful tool than most give it credit for. OTOH, C++ compilers are complex and buggy, and the imperative rather than declarative nature of the language makes verification of algorithms much harder than it might be. There is still plenty of scope for unreliable C++, on the scales we're talking about here.
In a case like this, you could well afford to go for a more advanced language and make sure you've got a well-trained development team and a verified compiler. The programming world has better tools than C++ available, but pragmatism puts them beyond mainstream use for the time being, which is why C++ remains such a useful tool. In this case, though, you have both the resources and the motivation to use better.
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
It strikes me that this is exactly the sort of project where you don't want to attempt to construct an ambitious, all-singing, all-dancing, state of the art, eighth wonder of the world. This misses the point about what is actually needed. Instead, you go for something as simple and straightforward as you can design which will have the capacity to do the job and continue doing the job for the foreseeable future. It needs to be simple so that you can analyse its behaviour and failure modes with a high degree of confidence. You can push the sexy bells and whistles out to helper boxes, but the core systems must 'just work'. And technology that's far enough behind the bleeding edge for its characteristics to be well understood is definitely a Good Thing in these situations.
Remember the old engineering rule of thumb: "when in doubt, make it stout, out of things you know about".
I know the Java JVM is already stack based, but it is far too complex for the generated code to be verified. Stick with a very simple FORTH-based stack machine with three data stacks, long (64) int, Floating point ( 80/128? ). Note: no strings at all, all object/array access via int syscalls.
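To give a feel for how small such a machine can be, here is a sketch of a FORTH-like interpreter. It is a simplification under stated assumptions: a single 64-bit integer stack only (no floating point stack, no syscalls), and the word set and the name `run_stack` are made up for illustration:

```python
MASK64 = (1 << 64) - 1   # keep every value within an unsigned 64-bit cell


def run_stack(source: str) -> list:
    """Interpret whitespace-separated words and return the final data stack."""
    stack = []

    def pop() -> int:
        if not stack:
            raise IndexError("stack underflow")
        return stack.pop()

    for word in source.split():
        if word == "+":
            b, a = pop(), pop()
            stack.append((a + b) & MASK64)
        elif word == "*":
            b, a = pop(), pop()
            stack.append((a * b) & MASK64)
        elif word == "dup":
            a = pop()
            stack.extend([a, a])
        elif word == "swap":
            b, a = pop(), pop()
            stack.extend([b, a])
        elif word == "drop":
            pop()
        else:
            # Anything else must be an integer literal; int() raises ValueError otherwise.
            stack.append(int(word) & MASK64)
    return stack
```

For example, `run_stack("2 3 + dup *")` leaves a single value, 25, on the stack. With only a handful of words and no hidden state beyond the stack, every word can be checked exhaustively against its specification, which is the property the comment is after.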
You just don't see it.
What was "state of the art" in the '80s is now ubiquitous and hidden from the end user.
Ever found a bug in your portable Nomad MP3 player? (The flash-based ones, not any of the disk-based ones - although the disk ones are still pretty strong.)
Has your car ever shut down because its computer crashed? (Note: Hardware failure doesn't count, although automotive ECU failure is RARE unless you've done something to screw with its cooling.)
What about your VCR?
These are all cases of coding like you described - fitting as much as possible into as little space as possible. In Lucent's (now Avaya) business communications division, there was (maybe still is) a raging debate on whether the usability benefits of using two LEDs rather than one justified the *pennies* of extra cost on an item that sold for a few hundred dollars. In a cost-cutting environment that intensive, you're not going to spec a processor with 8k of flash and 2k RAM when a processor with 2k/256 bytes will do. (Note: popular microcontrollers such as the Atmel AVR, Microchip PIC, Motorola 68HC11, etc. are all in this range.)
And it's quite easy for a single programmer to do all of this. I've seen CD-based MP3 players developed in a few weeks by a team of two college students for their Microcontrollers course taking 3-4 other classes at the same time.
retrorocket.o not found, launch anyway?
So we'll end up having to rocket jump into orbit...destroying a large portion of Florida in the process.
Hmm, might not be such a bad idea after all...;)
How much of what's in the system is really not currently available? They only mentioned 7400-series logic...variations (74LS, 74HCT, etc.) of that are readily available from companies such as DigiKey and Mouser, last time I checked. For what's not available, I'd think that old databooks would have functional descriptions and/or block diagrams from which VHDL could be written. While a transistor-level diagram would be nice, I'm not sure that it would be necessary--or even useful. (Example: if the 7400 wasn't still available, the databook would tell you that it was a quad 2-input NAND chip. That ought to be enough to duplicate its functionality.)
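The databook-level description really is enough for a functional model. A small sketch in Python rather than VHDL, covering logic behaviour only (timing, voltage levels, fan-out and pinout are ignored, and the function names are made up for illustration):

```python
def nand(a: int, b: int) -> int:
    """One 2-input NAND gate: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def quad_nand_7400(inputs):
    """Functional model of a quad 2-input NAND package.

    `inputs` is a sequence of four (a, b) pairs, one per gate;
    the result is the list of the four gate outputs.
    """
    if len(inputs) != 4:
        raise ValueError("a quad package has exactly four gates")
    return [nand(a, b) for a, b in inputs]


# NAND is functionally complete, so other gates can be built from it.
def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))
```

Feeding the four input combinations (0,0), (0,1), (1,0), (1,1) to the four gates gives outputs 1, 1, 1, 0, matching the NAND truth table. The same behavioural description carries over to a hardware description language almost line for line.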
(Then again, I changed majors from computer engineering to computer science, so I could be all wet here. :-) )
20 January 2017: the End of an Error.