Slashdot Mirror


Design, Hardware, Software Errors Doomed Japanese Hitomi Spacecraft (scientificamerican.com)

Reader Required Snark writes: The Japanese space agency JAXA said its recently launched X-ray observation satellite Hitomi has been destroyed. After a successful launch on February 17, contact with the satellite was lost on March 28. Of the 10-year expected life span, only three days of observations were collected. Preliminary inquiry points to multiple failures in design, hardware and software. After the launch it was discovered that the star tracker stabilization didn't work in a low magnetic flux area over the South Atlantic. When the backup gyroscopic spin stabilization took control, the spin increased instead of stopping. An internal magnetic limit feature in the gyroscope failed, causing the spin to get worse. Finally, a thruster-based control started, but because of a software failure the spin increased further. The solar panels broke off, leaving the satellite without a long-term power supply. It seems that untested software had been uploaded for thrust control just before the breakup. This is a major loss for astronomical research. Two previous attempts by Japan to launch a high-resolution X-ray calorimeter had also failed, and the next planned sensor of this type is not scheduled until 2028 by the ESA. Just building a replacement unit would take 3 to 5 years and cost $50 million, without the cost of a satellite or launch.

19 of 101 comments

  1. Is that all? by Dan+East · · Score: 4, Funny

    Design, Hardware, Software Error

    Oh, is that all?

    --
    Better known as 318230.
    1. Re:Is that all? by phrostie · · Score: 3, Interesting

      it sounds like everyone is starting from scratch every time a project like this is built.
      regardless of success or fail, wouldn't it be best for everyone to release the engineering and software so that the next one is an improvement over what went before.
      it also might reduce the life cycle of the next project.

      just my .01999 USD

    2. Re:Is that all? by Anonymous Coward · · Score: 2, Informative

      On the first launch of the Ariane 5 rocket, it used parts of the control software of the Ariane 4, a very reliable rocket with a success rate of more than 97%. The launch ended with the destruction of the rocket 37 seconds into the flight due to an arithmetic overflow. It had not been taken into account that the bigger rocket would cause bigger values in the control software.

    3. Re:Is that all? by Eunuchswear · · Score: 2

      On the first launch of the Ariane 5 rocket, it used parts of the control software of the Ariane 4, a very reliable rocket with a success rate of more than 97%. The launch ended with the destruction of the rocket 37 seconds into the flight due to an arithmetic overflow. It had not been taken into account that the bigger rocket would cause bigger values in the control software.

      It was great: they used software coded in Ada that detected the overflow and raised an exception, disabling the faulty part. The work was then taken over by the backup system which, being identical, did exactly the same thing. Whoops.

      --
      Watch this Heartland Institute video
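
The failure mode described in the two comments above can be sketched in a few lines. This is only an illustration, not the Ariane flight code: the names `to_int16`, `GuidanceUnit` and `run_with_backup` are hypothetical, and the sketch assumes a checked conversion of a floating-point reading into a signed 16-bit integer, with two identical units running the same code.

```python
INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(Exception):
    """Raised when a value does not fit the target integer type."""


def to_int16(value: float) -> int:
    """Checked conversion: raise instead of silently wrapping around."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value} does not fit in a signed 16-bit integer")
    return result


class GuidanceUnit:
    """Hypothetical unit that shuts itself down on a conversion error."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.active = True

    def process(self, reading: float) -> int:
        try:
            return to_int16(reading)
        except OperandError:
            self.active = False  # the unit disables itself
            raise


def run_with_backup(units, reading):
    """Try each active unit in order; return None if every unit fails."""
    for unit in units:
        if not unit.active:
            continue
        try:
            return unit.process(reading)
        except OperandError:
            continue  # fall through to the next (identical) unit
    return None


units = [GuidanceUnit("primary"), GuidanceUnit("backup")]
small = run_with_backup(units, 1000.0)   # within range, primary handles it
large = run_with_backup(units, 40000.0)  # out of range for every unit
```

Because the backup runs the same conversion on the same input, it fails for the same reason as the primary. Redundancy of this kind protects against independent hardware faults, not against a shared design error.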
    4. Re:Is that all? by Eunuchswear · · Score: 2

      I would also have thought that there would be simulators for most of this crap.

      A KSP add-on?

      --
      Watch this Heartland Institute video
  2. 3 days of data? by JustAnotherOldGuy · · Score: 3, Insightful

    Only got 3 days of data? Damn, that's gotta hurt.

    Also, the "Design, Hardware, Software Error" bit is funny in a way...I mean, what else was left to screw up? This was like the Trifecta of Fuckups.

    --
    Just cruising through this digital world at 33 1/3 rpm...
  3. Software uploaded before breakup. by fahrbot-bot · · Score: 3, Funny

    It seems that untested software had been uploaded for thrust control just before the breakup.

    See what happens when you don't disable the GWX settings.

    --
    It must have been something you assimilated. . . .
  4. I feel sorry for that guy... by Anonymous Coward · · Score: 5, Informative

    From the TFA

    Dan McCammon, an astronomer at the University of Wisconsin–Madison, helped to design and build Hitomi’s premiere scientific instrument, an X-ray calorimeter that measures the energy of X-ray photons with exquisite precision. He has been working on the technology for more than three decades, flying versions of it on the ASTRO-E mission, which failed on launch in 2000, and the Suzaku spacecraft, in which a helium leak rendered the instrument useless weeks after its 2005 launch.

  5. Re:It's only when the backups kick in... by mrbester · · Score: 2

    If only. All they'd need to do in that case is "reverse the polarity"

    --
    "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
  6. Re:Good by Midnight+Thunder · · Score: 2

    Space is dead. It's a radiation-blasted vacuum. Nobody is going to live there. Ever. Get over it, Space Nutters. We should kill all astrophysicists and burn all scifi books. Like in Europe.

    Europe got bored of that and the sport is now found elsewhere in the world. I for one welcome space nutters, since they give us something else to talk about :) I would burn the trolls but, not considering myself a violent person, will accept making a sport of them.

    --
    Jumpstart the tartan drive.
  7. Those are not software and hardware errors -- by vmaxxxed · · Score: 5, Interesting

    Those are called political and budget pressure by managers who have no clue on engineering ---

    Software uploaded without testing? There is no way they could have gotten this far without testing. I am sure there is no engineer in Japan that does not test thoroughly. Actually Japanese code is famous for being of the best quality -

    This was caused by politics, bureaucracy and plain bad management.

    1. Re:Those are not software and hardware errors -- by gweihir · · Score: 2

      Indeed. And very likely by a culture of "not contradicting the boss". An engineer that is unwilling to "contradict the boss" is a bad engineer, no matter what other skills he has. Of course, many bosses simply get rid of the "naysayers" and foster a culture of "can do". The results are invariably what we see in this story, although many managers manage to conceal that they were responsible for quite a while and sometimes forever. If the damage is huge, it is very rarely the engineers that have screwed up.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    2. Re:Those are not software and hardware errors -- by gweihir · · Score: 2

      I most certainly do not want to be the engineer responsible for a spectacular failure. Of course, the software field has far too many "engineers" and many of them bad in other ways, which makes the problem worse. But while I work on a level where I not only can speak up but am required to speak up, I can understand the person that decides to keep quiet.

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  8. Dating an Engineer by fahrbot-bot · · Score: 5, Funny

    It seems that untested software had been uploaded for thrust control just before the breakup.

    Note to self: Don't ask your girlfriend questions you don't want the answers to - again.

    --
    It must have been something you assimilated. . . .
  9. Re:Open source satellite software? by jc42 · · Score: 3, Interesting

    It was probably running Linux, first mistake.

    Nah; it was probably running ITRON. It may well have included a POSIX library, but that wouldn't qualify it as a version of linux, even if some linux code is included there.

    I haven't actually bothered to dig up the info, but that's what anyone acquainted with how such things are done in Japan would guess for a situation with serious RT requirements. Maybe it'd be interesting to investigate, to get an idea whether the OS and system libraries might have had anything to do with the failures.

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  10. Cretinization of engineering by gweihir · · Score: 2

    This is just one of the more spectacular examples. I have heard of managers of large software teams that "do not believe in testing", I have seen Internet-reachable critical software that got a security evaluation only after deployment, because it was finished only a few days before deployment, and quite a few more things of similar utter incompetence. My guess is that the people responsible for these completely ridiculous screwups are "managers" that think they know how it all works (while being clueless), and that have eliminated all resistance to their views by firing anybody actually competent.

    This is a dangerous and completely unacceptable regression. Humanity needs to be good at engineering if it is to have a future.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  11. Re:Suggestion to JAXA by joe_frisch · · Score: 2

    You don't want to knee-jerk it. Who approved the upload of untested software, and why? There could be a valid reason - say a fatal bug discovered in the existing code and no way to change the launch schedule. It could be budget pressure - simply not enough money to test. It could be plain incompetence.

  12. Root cause analysis? by drwho · · Score: 3, Insightful

    I'd like to see a more thorough investigation of this set of incidents. That means no one involved gets to skip out by Seppuku. One of the problems with having a number of backup systems is that people tend to think "well, if it breaks, there's a backup system" - not realizing that each time a backup system is added, complexity is added, and that overall reliability goes down, instead of up. I don't know if over-reliance on backup systems, and failure to manage complexity, was the cause here, but it's the only thing other than "bad luck" or "sabotage" that can explain this disaster from a country which has many talented engineers.
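
The point about backups and complexity can be made concrete with a rough calculation. The model below is an assumption for illustration only, not anything from the JAXA inquiry: `r_unit` is the reliability of one unit, `r_switch` is the chance that the switchover to a backup works, and `p_common` is the chance of a failure that takes out all units at once (for example a shared design or software error introduced by the added complexity). Failures are otherwise treated as independent.

```python
def system_reliability(r_unit: float, n_units: int,
                       r_switch: float = 1.0,
                       p_common: float = 0.0) -> float:
    """Probability the system works under the assumed model.

    The system works if no common-cause failure occurs AND either the
    primary works, or the switchover works and at least one of the
    (n_units - 1) backups works.
    """
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    any_backup_works = 1.0 - (1.0 - r_unit) ** (n_units - 1)
    independent_part = r_unit + (1.0 - r_unit) * r_switch * any_backup_works
    return (1.0 - p_common) * independent_part


single = system_reliability(0.95, 1)
ideal_pair = system_reliability(0.95, 2)
realistic_pair = system_reliability(0.95, 2, r_switch=0.9, p_common=0.06)
```

With these made-up numbers, an ideal backup raises reliability above the single unit, while a backup that brings an imperfect switchover and a 6% common-cause failure chance ends up below the single unit. Whether redundancy helps therefore depends on how much shared failure risk it introduces.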

  13. IBM 9000 by drwho · · Score: 4, Funny

    "Well, I don’t think there is any question about it. It can only be attributable to human error. This sort of thing has cropped up before, and it has always been due to human error."