Slashdot Mirror


Why Hubble Broke and How It Was Fixed

angry tapir writes "I recently had the opportunity to sit down with Charles (Charlie) Pellerin, who was NASA's director of astrophysics when the Hubble Space Telescope launched with its seemingly fatally flawed optical system. Pellerin went on to head up the servicing mission that finally fixed the telescope and for that was awarded NASA's highest honor, a Distinguished Service Medal. Since Hubble he has done a lot of thinking about the problems that led up to the error and how organizations can best avoid making similar mistakes."

29 of 73 comments

  1. The real hero by vonshavingcream · · Score: 4, Informative

    The real hero of that project was a man called Story Musgrave. http://en.wikipedia.org/wiki/Story_Musgrave There was a lot of planning put into fixing it, but without him actually up there in space improvising when stuff went south, the Hubble would be useless today.

    1. Re:The real hero by JosKarith · · Score: 5, Funny

      I guess the flawed optics were another example of NASA's short-sightedness...

      --
      'Don't worry' said the trees when they saw the axe coming, 'The handle is one of us.'
    2. Re:The real hero by Big+Hairy+Ian · · Score: 4, Funny

      > I guess the flawed optics were another example of NASA's short-sightedness...

      Groan.... I should have seen that coming!

      --

      Build a Man a Fire, and He'll Be Warm for a Day. Set a Man on Fire, and He'll Be Warm for the Rest of His Life.

    3. Re:The real hero by decsnake · · Score: 4, Interesting

      yeah, the guys that designed the corrective optics, the mechanism that deployed them, all the tooling, processes and procedures that were needed to install them and trained the astronauts didn't matter at all. It was all Story. Yup, he's the real hero.

      The real driver behind the repair missions was this guy: http://en.wikipedia.org/wiki/Frank_Cepollina

    4. Re:The real hero by line-bundle · · Score: 3, Insightful

      Looks like you did not read the article, but this is slashdot after all.

      He talks about teamwork. Individuals contribute, but group dynamics are very important, and perhaps deserve a larger share of the success than any individual.

      By picking someone out as the hero you are committing the same errors as in the past.

  2. The real story... by jbrandv · · Score: 5, Informative

    I worked at Ball Aerospace years ago and found out the real story. NASA cut the budget for Hubble so that a final optical train alignment task was never done. The engineers had designed a laser test to check the optical path but NASA wanted to save the $50000 the test would take. So until it was turned on, in space, they had no clue how bad it was. Working with NASA was tough mostly due to their arrogance.

    1. Re:The real story... by trout007 · · Score: 5, Insightful

      It's not only the $50k for the test. Most likely there were millions in cuts and this test happened to be in the mix. It would be nice if you only had to pay for the tests that showed problems. It would make engineering much easier. Unfortunately you have to test for everything, even the stuff that works fine.

      --
      I love Jesus, except for his foreign policy.
    2. Re:The real story... by Anonymous Coward · · Score: 5, Interesting

      I worked for NASA at the time of the repair. Sadly, because of the ridiculous cost of the shuttle, the cost of repair could have built 3 Hubbles, launched two using Atlas boosters to a higher, clearer and more useful orbit and kept one in reserve. Just. for. the. rescue. mission. STS was a horrendous waste of talent and opportunity.

    3. Re:The real story... by T.E.D. · · Score: 5, Insightful

      You ought to RTFA. That was just one test out of many, and all the previous tests showed the mirror failing too. They just didn't report the failures. Why? Well, because they had other big "emergencies" going all the time, and (this is key) they were under intense pressure from management to solve all these other "emergency" problems quickly, since the whole project was already over budget by nearly a billion dollars.

      Your anecdotal story is interesting, but it fits right into what he was talking about with the management failures at NASA. Clearly it wasn't the lack of that test that caused the problem. It was a management decision to not perform it. Probably under the exact same pressures. Even if it had been performed though, who's to say they wouldn't have rationalized away the results like they did all the other failed tests?

      "We tested that mirror over and over and over with a different kind of device, the old style refractive null corrector," Pellerin says. The results? "Half wave of error, half wave of error, half wave of error." "So some people sat down and said, 'What's going on?'" Pellerin recalls. "The mindset was that the mirror cannot be other than perfect. So something else is happening. They concluded that the mirror was sagging under the force of gravity in the three point mount rather than being on the bed of nails by half a wave." "Well it turned out that was wrong. But they rationalised, rationalised, rationalised."

      ...

      The project had suffered other challenges beyond fabricating and mounting the mirror; staff were being "hammered" all the time, Pellerin says. In addition there was constant angst about how far the project had gone over budget. "Hubble's initial budget was $434 million. We closed it at $1.8 billion just for the flight segment; big overruns." "So the way it works is you tend to blame the people doing the work," Pellerin says. "So we're hammering on them, hammering on them so they had no free time or no inclination to track down anything that wasn't a critical problem because we have other critical problems. Difficult technical things that we couldn't solve yet." The review board also found that a hostile environment had been created for the contractor, which meant "they told us about any problem at their peril," Pellerin says.

    4. Re:The real story... by careysub · · Score: 2

      > Is NASA really so underfunded they cannot afford a 50k dollar test to make sure their 1.5 billion dollar telescope works?

      Read the article (I know, this is Slashdot...). The problem was not that they didn't do obvious tests that would have revealed a major flaw - THEY DID! But they didn't believe the results since the far more sensitive null corrector should/could not possibly have made an error of this magnitude (it should have been accurate to 1/65 wave, but was off by 1/2 wave). They assumed that the test set-up they were using was flawed -- and the environment they were working in (schedule and budget overruns, prestige and jobs on the line putting pressure to get the mirror out the door) discouraged taking pains to understand fully what was happening.

      --
      Starships were meant to fly, Hands up and touch the sky - Nicky Minaj
    5. Re:The real story... by careysub · · Score: 2

      > According to TFA, they did do the final test, and it showed problems. Unfortunately, they came to the conclusion that the test was bad, not the mirror. They assumed that since the mirror was no longer on its 'bed of nails', it was sagging under gravity, and that was causing the test error.

      > Given the thickness error in the mirror was less than the thickness of a piece of paper, that is a reasonable explanation. It was a really small error given the size and weight of the mirror, and gravity unfortunately does have a huge effect in slightly deforming such heavy optics. And yes, the optics were carved with gravity deformation in mind as well.

      > And the other bad thing is, well, the further something is away from you, the tighter the tolerances needed in order to resolve that object, so an error as tiny as it is makes for very blurry images.

      Anyone who works with telescope mirrors (even ones costing 0.001% of the cost of the Hubble mirror) knows all about gravity sagging; it is ever present, and even my 13.1" mirror requires carefully designed supports. The degree of sag with the 3-point support should have been (and probably was) a calculated, pre-known quantity, and that it was sagging far more than expected should have led to an investigation as to why that was if the cultural environment had supported proper review. So no, it was not a reasonable explanation.

      --
      Starships were meant to fly, Hands up and touch the sky - Nicky Minaj
  3. Afraid to speak up about problems by afeeney · · Score: 5, Insightful

    The article mentions that the contractor was afraid to bring up problems.

    That, plus the mentality from management that people who bring up problems are "troublemakers," "negative," "not team players," etc. (because they've put too much of their ego or political capital into a project) has got to be responsible for more disasters, large and small, than any other deadly combination.

    I worked for a large nonprofit that blew money on doomed projects as though money grew on trees. Each time, it started with somebody, usually a contractor or somebody else who stood to gain from it, flattering the leadership that this was huge and visionary and would make or save them millions. Then the organizational mind control started, where everybody was saying that it was the greatest thing ever. Then the flawed project management started. Then when the cracks were obvious, people who pointed them out were vilified as naysayers. It was only the lower-downs who said anything because to rise, one had to be a "team player," and the organization was hierarchical enough that lower-downs were ignored. Then denial that there were problems, together with tossing more money at it (including adding more people to a software project at the last minute because that always works). Then even when the leadership [sic] team [sic] all realized there were problems, they all waited until the person responsible for the project was willing to concede defeat, because in a political environment, nobody wants to confront somebody who might retaliate.

    Those elements are the inevitable recipe for disaster for any project, but it's fear that drives virtually all of them. Fear of not looking good (note that the Congresscritter didn't yell about wasting taxpayer money, she yelled about being made to look bad), loss aversion, fear of admitting a mistake, fear of speaking up.

    Pellerin was brave enough to do something technically illegal and scrape up the funds for servicing it.

    That is what a leader does.

  4. Re:Interesting read by buchner.johannes · · Score: 4, Informative

    I wanted to joke about the PhDs getting drunk at their desks, but there are a couple of gems in the text:

    "I saw this guy, Richard Feynman, who was a review board member, take a piece of rubber O-ring and put it in his icy water on television, and showed that it stiffened up. So immediately I said, 'Oh, that's the technical problem, they didn't do the O-ring well.'"

    "That was nuts," Pellerin says. "These guys understood the O-ring, but I put that story in my head because technical people look for technical answers. I never read the conclusion of [the review board] report that said it was a social shortfall."

    We see this very clearly when discussing evoting.

    Then towards the end there is an interesting analogy of the Shuttle accidents with a Korean airline company having an extreme crash rate, referring to people put under too much pressure, and irrational .

    "There's a bunch of research I've come across in this work, where people say that the social context is a 78-80 per cent determinant of performance; individual abilities are 10 per cent. So why do we make this mistake? Because we spend all of these years in higher education being trained that it's about individual abilities."

    It's actually a good read for people interested in managing.

    --
    NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
  5. Re:Interesting read by Anne_Nonymous · · Score: 4, Funny

    Mars Impactor was designed to hit the Red Planet at 600 mph, but because of The French, it hit the planet at 1000 kph instead, ruining the mission.

  6. Wonderful article. by NoahsMyBro · · Score: 2

    I don't remember the last time I so thoroughly enjoyed and appreciated an article. To the original poster, and the /. editors that approved this - thanks. This one was a winner.

    1. Re:Wonderful article. by neurogeneticist · · Score: 3, Funny

      Totally agree. This is actual news for nerds who are interested in how to effectively manage, and be managed, by other nerds. Or, we could go back to arguing whether Autism is a fake diagnosis, based on, you know, our skill with Java. I kid, I kid. Sort of.

    2. Re:Wonderful article. by Overzeetop · · Score: 5, Interesting

      This is the core of real engineering work, and it's one of the reasons I loved working at NASA under great management. I mostly squandered the opportunities I had there, and yet I still learned more from that time than anything else in my career. I actually started there working for a brilliant optics guy who was at Perkin Elmer during the Hubble years. Later, my direct supervisor went on to play a key role in the servicing mission, and (last I heard) was part of the JWST team.

      Later, I worked in private industry for the team that (essentially) discovered the hole in the ozone layer. We got into it verbally from time to time, but I really respected his knowledge of the physics we were involved in. I once joked about getting fired if the part I was working on failed. He looked me right in the eye and said, "Oh, I won't fire you. I'll make you stay here and fix it." I smile a bit every time I think about that meeting.

      --
      Is it just my observation, or are there way too many stupid people in the world?
    3. Re:Wonderful article. by angry+tapir · · Score: 2

      Hey, it's the author of the article here - thanks so much for your kind words. I was pretty happy with how it turned out!

  7. Re:Interesting read by operagost · · Score: 3, Insightful

    The basis of the units is irrelevant; consistency in their use is what matters. Unless you're able to tell me that the length of a path travelled by light in a vacuum in 1/299,792,458 of a second is directly related to landing a probe on Mars.

    --

    Gamingmuseum.com: Give your 3D accelerator a rest.
  8. Re:Interesting read by I_am_Jack · · Score: 2

    It was because people weren't paying attention. Had there been a second set of eyes regarding the specs for the control system, and then a review to make sure the programming was correct (it was the control system that had inaccurate data), there would have been no issue. There is the misplaced notion that somehow S.I. is more accurate than Imperial. It isn't. You could create anything to be the basis for a standard of measurement. As long as your measurements (and methods thereof) are reproducible, and your instruments of measurement are calibrated to a traceable standard, all is good. While it would be convenient for the US to adopt S.I. wholly, it doesn't make measurements in Imperial units any less accurate. It does make instrument manufacturers richer, though, as companies working with both systems have to buy two sets of instruments, two sets of standards, etc.

  9. Re:Interesting read by bws111 · · Score: 5, Insightful

    You should read the article, because your comment is exactly the kind of thing he is talking about. Technical people who think they have found a technical problem, therefore the solution is to correct that problem. If the problem was really that the measurement system the US uses is 'wrong', then how can the US have so many successful space missions? The problem is not that there are multiple measurement systems, or that one is somehow superior to the other. The problem is that the teams did not communicate successfully - not a technical problem at all. And don't say 'well, if the stupid US would use the same system as the rest of the world it wouldn't be a problem', because that just shows you completely missed the point. The point is that there was ineffective communication - a leadership problem - not simply a technical problem.

  10. Re:Interesting read by MrFlibbs · · Score: 4, Interesting

    I attended an astronomy conference a year ago that included a presentation from a NASA guy on the Mars rovers. He had a few disparaging things to say about Lockheed-Martin, including blaming them for the Mars Climate Orbiter failure. He said their contract included a statement to recalibrate the thruster in the metric system but they failed to do so. (Of course, he neglected to mention that NASA was managing the project and failed to catch the error.) He also said one of the rovers drove by the heat shield (built by Lockheed-Martin) from the rover landing and there was a big disagreement over examining the heat shield up close to see how well it held up. Lockheed-Martin wanted the data but wanted to keep it secret on the grounds it was a proprietary design. NASA said all their data is public so it's either we drive by without looking, or we take a look and release all the data. They eventually did the latter.

    One more thing -- the same conference included a presentation by a professional astronomer who had overseen the building of an observatory in Chile. He had disparaging things to say about NASA -- that their cost estimate was 10X over what he eventually spent on the project. Guess it all depends on your point of view.

  11. Re:Interesting read by jo_ham · · Score: 2

    That's exactly my point. NASA *does* use SI units... just not consistently. It has suppliers that use non-SI units. Those units are also self-inconsistent in some cases (for example, the size of a gallon).

  12. Re:Interesting read by jo_ham · · Score: 4, Informative

    NASA specified SI.

    Supplier did not supply SI, since it bases its measurements on US system.

    Problems.

    Yes, it was a communication and management error, but not entirely. It has been standard in scientific settings to use SI units for years and years. Failure to use them *especially when specifically outlined by the design brief* is not just a "communications problem" - it's a fundamental error in the product that was delivered unfit for purpose.

  13. Re:Interesting read by bws111 · · Score: 2

    Well, you did miss the point. It was ENTIRELY a communications problem. Saying 'NASA specified SI' does not mean there was effective communication. That's like saying just because a teacher gave a lecture all the students have learned the information. How do you know the information was received and understood? How do you know that information was received and understood by every single person working on the project?

    It is obvious that somewhere along the line there was a breakdown in communication. It could have been between NASA and the supplier, it could have been internal to the supplier, but somewhere communication failed, and that failure is what led directly to the disaster. And the poorest excuse for bad communication is 'you should have known'.

    Note that this failure could have just as easily occurred even if SI was being used. What if one party thought they were getting a rate specified in cm/sec, but the other party thought it was supposed to be in mm/sec? All SI units, just as spectacular a failure. Didn't an Ariane rocket blow up because of something very similar to this?

  14. The "anti-individual abilities agenda" by paradigm82 · · Score: 3, Insightful

    I think the article was in some ways flawed. It gave a good description of how the error occurred. Then it moved on to a huge tirade against the focus on "individual abilities" which it blames for the whole error. Firstly, even taking the description of how the error occurred at face value, it is not at all clear that the error had anything to do with a focus on "individual abilities". On the contrary, it seems this was just an instance of really poor management that - due to cost overruns - pushed their employees to work harder, to the point that they lost their focus on quality and maybe even started cutting corners in the fabrication process. This has absolutely nothing to do with a focus on "individual abilities". However, let me address the "anti-individual abilities agenda" anyway.

    The anti-individual abilities agenda is routinely promoted by managers, project managers and other people engaged in the management layers (management consultants, business schools etc.). The motive is pretty clear: Many bosses don't like admitting that the success of their project comes down to individual abilities of a few core members on the project. After all, what is the value of management then, they ask? It's like the tail wagging the dog.

    However, this is just denying reality. I can firmly say that on any project of major size I worked on, there were a few people, 5-10% of those on the project, running the show. This in itself is not very surprising; what is surprising is the fact that these 5-10% were not centered at the top of the pyramid. Rather, they were evenly spread out over all 'layers' from 'highest to lowest'. These people (by virtue of their skills and dedication to the project, something that is often lacking with the project management itself!) automatically assume a role of authority whether management likes it or not. It's simply the only way to get things done. Let's face it, on any project there's going to be a lot of 9-5'ers that don't really care. They are never the ones driving the car, nor should they be. It's the 5-10% who have both the ability to and the interest in getting the job done that count. Those that dream about the project at night and who feel their personal honour is at stake in making it succeed. Also, as Fred Brooks noted in 'The Mythical Man-Month', some (sub)projects are like surgery. You need one highly skilled person to be in charge and carry out the job, and the rest of the team members are really just accessories of that person. Their contribution can be important of course, but at the end of the day, all choices and responsibility reside with the 'surgeon'.

    I think the lesson to be learned from these observations is that management needs to accept that this is the structure that projects will generally fall into, no matter what they do. The job of management is to get the best result out of it. On projects with poor management that creates obstacles for progress and makes lots of bad choices (this often happens on politically infested projects as well as on projects where management doesn't have a clue about the technical aspects), often the project finds a way to completely bypass management. Decisions by management may be outright ignored, or important decisions are never brought up to this level but are just made behind the scenes. This is a very dangerous situation since important decisions may not be properly reviewed and may not even be known by all stake-holders. While most decisions taken may have been correct, it takes just one bad decision to jeopardize the project, and problems related to this kind of "skunkworks decisions" tend to surface very late where they may cause huge problems, sometimes disasters.

    The job of management is to embrace the individual abilities, and to listen carefully (but of course not uncritically) to arguments brought forward, no matter if it is from a project manager or a "lowly" techie. They need to make a decent effort to try to understand what they are talking about, even if the explanations are not always clear and even if it can sometimes be highly technical.

  15. Re:Interesting read by Solandri · · Score: 3, Insightful

    Contrary to popular belief, the mixup was not an SI vs English units problem. The problem was that the numbers were passed from Lockheed to NASA without units. Without the actual units jotted down after the numbers, the Lockheed people knew the units were lb-f. The NASA people assumed the units were Newtons.

    It's an important distinction because the same error can happen even if you work entirely in SI units. If I write down a number in kilonewtons but fail to write down the units, and you assume it is just newtons, we end up with the same problem. I've seen this happen countless times in the lab and while tutoring, with kids plugging grams into an equation when they're supposed to be using kg. (Which BTW is one stupid thing about SI units - really confuses the kids that the base unit for everything else has no prefix, but the base unit for mass is a kilo-gram.) Fortunately, forcing them to write down the units after every number usually takes care of this problem.

    In science and engineering, any time you see a number without units, your immediate reaction should be to ask the person who provided the numbers what the units are. (Actually you should be ripping him a new one for failing to write down the units, dimensionless numbers excepted.) Never assume the units, always ask.
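
    The "never assume the units" rule above can also be enforced at a software interface. What follows is only a minimal sketch, not a description of how any NASA or Lockheed system worked: the `Force` class, the `thruster_force_newtons` function, and the set of supported units are all hypothetical names chosen for illustration. The only outside fact used is the standard definition 1 lbf = 4.4482216152605 N.

```python
from dataclasses import dataclass

# Conversion factors to newtons.
# 1 lbf = 4.4482216152605 N follows from the standard definitions
# of the pound (0.45359237 kg) and standard gravity (9.80665 m/s^2).
_TO_NEWTONS = {"N": 1.0, "kN": 1000.0, "lbf": 4.4482216152605}


@dataclass(frozen=True)
class Force:
    """A force that always carries its unit (hypothetical helper)."""
    value: float
    unit: str

    def __post_init__(self):
        # Refuse a missing or unrecognised unit instead of guessing one.
        if self.unit not in _TO_NEWTONS:
            raise ValueError(f"unknown or missing force unit: {self.unit!r}")

    def to_newtons(self) -> float:
        return self.value * _TO_NEWTONS[self.unit]


def thruster_force_newtons(force: Force) -> float:
    """Accept only unit-tagged forces; a bare number is rejected."""
    if not isinstance(force, Force):
        raise TypeError("force must be a Force with explicit units, not a bare number")
    return force.to_newtons()
```

    With this shape, `thruster_force_newtons(Force(1.0, "lbf"))` converts explicitly, while `thruster_force_newtons(10.0)` fails loudly instead of being silently read as newtons. The same check catches the kilonewton-versus-newton mixup, which is the point made above: the safeguard is the written unit, not the choice of unit system.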

  16. Re:Interesting read by jo_ham · · Score: 2

    But that's the point - they are a standard (and ever since their inception, the standards body has been looking for ways to define them based on invariable things, rather than on arbitrary things like a mass of platinum/iridium alloy, or a metal rod that is a certain length).

    SI as a system for standardising units across science is not controversial, or even new - the fact that a multi million dollar space mission can fail so spectacularly because one supplier was using imperial units is just unforgivably moronic. They can blame it on "communication errors" all they want - the fact is that the people who built the thing didn't even *look* at the units on the design brief - they just assumed everything was in imperial units.

  17. The REAL real hero by Baby+Duck · · Score: 2

    The REAL real hero would be the advisor against naming your child Story.

    --

    "Love heals scars love left." -- Henry Rollins