Slashdot Mirror


Crowdfunded, Solar-powered Spacecraft Goes Silent

Last week saw the successful launch of the Planetary Society's LightSail spacecraft, the solar-powered satellite that runs Linux and was crowdfunded on Kickstarter. The spacecraft worked flawlessly for two days, but then fell silent, and the engineering team has been working hard on a fix ever since. They've pinpointed the problem: a software glitch. "Every 15 seconds, LightSail transmits a telemetry beacon packet. The software controlling the main system board writes corresponding information to a file called beacon.csv. If you're not familiar with CSV files, you can think of them as simplified spreadsheets—in fact, most can be opened with Microsoft Excel. As more beacons are transmitted, the file grows in size. When it reaches 32 megabytes—roughly the size of ten compressed music files—it can crash the flight system." Unfortunately, the only way to clear that CSV file is to reboot LightSail. It can be done remotely, but as anyone who deals with crashing computers understands, remote commands don't always work. The command has been sent a few dozen times already, but LightSail remains silent. The best hope may now be that the system spontaneously reboots on its own.

58 of 366 comments

  1. Seriously? by Anonymous Coward · · Score: 5, Insightful

    I’m usually the first to defend others when some bug like this makes it through testing. Hindsight always being 20/20, only takes one bug amongst a million good bits of code, etc. But this just seems like something that even basic testing should have caught.

    Did they not run this thing on the ground for a few weeks? That’s just basic testing, especially for something that is going to be inaccessible for a while. Also that some critical bit of processing relies on stuff being written (and then presumably read back from) a csv file is very worrying.

    This sounds like some very shoddy work.

    1. Re:Seriously? by Mr+D+from+63 · · Score: 5, Insightful

      Testing might have found it, but I'd say that regardless of testing they should assume something bad will happen with the software and have a mechanism in place to force reboot & update on a locked up system. Maybe they thought they did. It's a shame if they can't get it fixed.
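
A minimal sketch of the kind of force-reboot mechanism this comment asks for: a software watchdog that the main loop must "kick" periodically, and that fires a reset action when the kicks stop. The class name, the timeout, and the `on_expire` callback are all hypothetical, and nothing here reflects LightSail's actual flight software. Real spacecraft usually rely on a hardware watchdog timer that works even when the processor is frozen.

```python
import time


class Watchdog:
    """Software watchdog: calls on_expire if kick() is not called in time."""

    def __init__(self, timeout_s, on_expire, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.on_expire = on_expire      # hypothetical reset hook
        self.clock = clock              # injectable so the logic can be tested
        self.last_kick = clock()

    def kick(self):
        """Called by the healthy main loop to prove it is still alive."""
        self.last_kick = self.clock()

    def check(self):
        """Called from an independent timer; fires the reset hook on timeout."""
        if self.clock() - self.last_kick > self.timeout_s:
            self.on_expire()
            self.last_kick = self.clock()  # restart the countdown after reset
            return True
        return False
```

The point of the design is that `check()` runs independently of the code being supervised, so a hang in the main loop cannot also silence the recovery path.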

    2. Re:Seriously? by harperska · · Score: 5, Informative

      One report I read made it sound like they were aware of the bug for a while. It's possible that they had to launch with an old version of the software because the patch wasn't ready yet, and being a secondary payload on a launch you have no say whatsoever as to the launch date. They probably expected to be able to upload the patch after launch, but the log filled up faster than expected.

      That being said, it is shoddy programming to blindly write to a log on a resource-constrained embedded platform (or any platform, really. Just especially so on something like this), so somebody definitely goofed. All I am saying is that it probably was caught by testing, but couldn't be fixed in time due to various constraints. It was a dumb move on the developer's part to not do enough diligence and to rely too heavily on QA in the first place.

    3. Re:Seriously? by NotInHere · · Score: 4, Informative

      Their current plan is to wait for charged particles to affect the electronics and force a reboot.

      Spacecraft are susceptible to charged particles zipping through deep space, many of which get trapped inside Earth’s magnetic field. If one of these particles strikes an electronics component in just the right way, it can cause a reboot. This is not an uncommon occurrence for CubeSats, or even larger spacecraft, for that matter. Cal Poly’s experience with CubeSats suggests most experience a reboot in the first three weeks; I spoke with another CubeSat team that rebooted after six.

    4. Re:Seriously? by itzly · · Score: 5, Funny

      Their current plan is to wait for charged particles to affect the electronics and force a reboot.

      Watchdogs are for wimps. Real designers use supernovas in a distant galaxy to reset their boards.

    5. Re:Seriously? by fahrbot-bot · · Score: 2

      But this just seems like something that even basic testing should have caught.
      Did they not run this thing on the ground for a few weeks?

      It was tested by the same guys that tested the Boeing 787 for only 247 days ...

      --
      It must have been something you assimilated. . . .
    6. Re:Seriously? by mnooning · · Score: 5, Insightful

      As a retired QA guy, I can tell you that checking that no files can grow without bound is standard fare. Same with exercising all code for long periods of time, as you pointed out. That means there was not a single experienced QA guy on the team.

      By the way, CSV was the gold standard for many years. Given the tight compactness/memory budget that space projects have, CSV with its small footprint might well be the logical choice.

    7. Re:Seriously? by ihtoit · · Score: 3, Informative

      It's not so much the capacity of local storage. You also have to consider that this is a system which is unlikely to be touched by a human being ever again, so whatever goes up has to be physically resilient. Super-compact flash storage such as micro/SD would be out; I'd go a couple of generations back and use Compact Flash with slightly lower capacity to take advantage of larger dies. This is why NASA went on a shopping trip very recently for Pentium I and Pentium Pro chips for space systems: they are, by virtue of their architecture, fairly hard against the environment. Back to topic: a cursory search around and it appears that it's an issue with the kernel, sysvinit and/or php, all of which at some point or another default shared and allocated memory spaces for various purposes (including scripting and logging) to 32MB.

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
    8. Re:Seriously? by Kester1964 · · Score: 2

      NAND Flash is probably not seen as a reliable technology for use in a satellite, so they went with the much lower density but higher reliability of NOR Flash

    9. Re:Seriously? by penandpaper · · Score: 4, Funny

      issue with the kernel, sysvinit and/or php

      So, what you are saying is that I can blame systemd? Or did I miss it and systemd is our savior?

      I am confused and unsure how to be outraged by this. I am going to go eat ice cream.

    10. Re:Seriously? by g0tai · · Score: 2

      Unfortunately you can't trust electronics in space (see the other headlining article about cubesats rebooting from stray particles every 6 weeks).

      If one of those particles hits your flash chip, given the size the dies are now, it's potentially going to affect a large number of cells and corrupt the filesystem quite badly.

      Electronics in space have to be uber radiation resistant, this is why it's still 100MHz (etc) stuff that's being used and not the latest GHz stuff, because, reliability in an adverse environment! :-)

    11. Re:Seriously? by funwithBSD · · Score: 4, Funny

      32MB is all that anybody with a satellite would ever need.

      --
      Never answer an anonymous letter. - Yogi Berra
    12. Re:Seriously? by gstoddart · · Score: 5, Informative

      No, csv sure as hell is NOT a Microsoft format.

      Comma-separated values is a data format that pre-dates personal computers by more than a decade: the IBM Fortran (level G) compiler under OS/360 supported them in 1967.

      This has nothing whatsoever to do with Microsoft, as much as you seem to want to blame them.

      --
      Lost at C:>. Found at C.
    13. Re:Seriously? by luther349 · · Score: 2

      You would think they would have test-run the software for uptime stability for weeks or months. Then again, this being Linux, one crashing piece of software should not cripple the system. I think they have a bigger issue.

    14. Re:Seriously? by RavenLrD20k · · Score: 5, Funny

      Oh...you mean the 'C-x M-c M-supernova' command shortcut in Emacs, right?

    15. Re:Seriously? by ihtoit · · Score: 2

      space is not the issue, it's the way the kernel and/or apps handle memory.

      Think of it like running win32 on a 64-bit chip with 8GB of RAM. It's nice having 8GB of RAM but Windows can't actually address it - it's a 32-bit kernel which means it can only address 4GB.

      Some logging processes particularly those configured to write to volatile memory are constrained by default configurations (in eg. php) which *allocate* memory space for scripting in 32MB segments. This can also include the scripts themselves and the actual log file, but it can also *allocate* one segment for the script and another 32MB for the log. You could have 4GB of total usable memory, all that means to the user/developer is that there is potential for 125 discrete pre-allocated memory segments.

      Fill that 32MB *allocation* and you run the risk of causing a page out of range error. If this happens in kernel space, well, that's a showstopper. If it happens in userspace, it can be anything from a minor annoyance (nice or kill the process then restart the system, all of which is doable remotely) or a complete showstopper (three fingered salute or in extreme cases, the hard power cycle. Which you ain't doing from low Earth orbit).

      --
      Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
    16. Re:Seriously? by macs4all · · Score: 4, Insightful

      Testing might have found it, but I'd say that regardless of testing they should assume something bad will happen with the software and have a mechanism in place to force reboot & update on a locked up system. Maybe they thought they did. It's a shame if they can't get it fixed.

      Speaking as an embedded developer, this is completely inexcusable.

      Not having a Watchdog, PLUS not making the limited-filesize log file "roll-over", is clearly Amateur-Hour stuff. Who wrote this code, anyway? An eight year old???

      Next we're going to hear that they bricked it with a software update, because they didn't think they needed to checksum the uploads, or provide enough RAM to hold the updated code before they re-flashed the OS, or something similar.

      Pathetic. They deserve to lose their spacecraft.

      Fortunately, if extraterrestrials discover the floating hulk of this abomination, they will (rightly) conclude that there is no intelligent life worth exploiting on this planet, and will decide not to enslave us...

    17. Re:Seriously? by war4peace · · Score: 4, Funny

      Nah, a Microsoft OS would reboot much sooner.
      Punishment for using Linux: now they'll have to wait for a decade to see a reboot.

      --
      ...gis sdrawkcab (usually not responding to ACs; don't bother posting as AC)
    18. Re:Seriously? by unrtst · · Score: 2

      By the way, CSV was the gold standard for many years. Given the tight compactness/memory budget that space projects have, CSV with its small footprint might well be the logical choice.

      We're talking about telemetry beacon data written once every 15 seconds. CSV is NOT the ideal format for that, and is nowhere near compact. Naive CSV parsers are trivial, but also break very very easily (e.g. embedded newlines in a quoted field; quotes in a quoted field; mixed quotes; etc). Also, while CSV can be read in a text editor, it doesn't format nicely there and can be difficult to read, so human readability is low; add to that the fact that humans are unlikely to be logging in directly and reading the file directly, and it being in plain text is pretty much useless. A fixed format binary file would be FAR more compact, easier to parse via a program, trivial to convert to CSV if needed, and really has no downsides besides users not being able to double click it and open it in excel/oocalc.

      Using a binary file would also allow more efficient access. There were comments implying that they were sucking the whole thing into memory at some point, which isn't needed for CSV either (unless a stray quote got in there and the parser didn't have a max record length limit), but it's certainly easier to jump to a specific record if you have fixed length records (which, being telemetry data, should be entirely possible).

      None of that really matters though. The file grew in size, and wasn't getting truncated or rotated - that's broken by design. Waiting for reboot is crazy (and implies that this was going to a tmpfs in memory, which is all the more reason to use a more compact format).
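
To make the fixed-length binary record idea in this comment concrete, here is a small sketch using Python's standard `struct` module. The field layout (a timestamp, two sensor readings, and a mode byte) is invented for illustration; the thread does not say what LightSail's beacon actually contains.

```python
import struct

# Hypothetical beacon layout: uint32 timestamp, two float32 readings, uint8 mode.
RECORD = struct.Struct("<IffB")   # little-endian, no padding: 13 bytes


def pack_beacon(timestamp, volts, temp_c, mode):
    return RECORD.pack(timestamp, volts, temp_c, mode)


def read_beacon(buf, index):
    """Jump straight to record `index` without parsing anything before it."""
    return RECORD.unpack_from(buf, index * RECORD.size)


# 100 beacons, 15 seconds apart, packed back to back.
log = b"".join(
    pack_beacon(1432500000 + 15 * i, 7.5, 20.0 + i, 1) for i in range(100)
)

# The first record written as a CSV line, for a size comparison.
csv_row = "1432500000,7.5,20.0,1\n"
```

Because every record has the same size, record 42 starts at byte `42 * RECORD.size`, which is the random-access property the comment mentions. The CSV line for the same values is longer than the 13-byte packed record, though how much is saved in practice depends entirely on the real field set.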

    19. Re:Seriously? by Tablizer · · Score: 4, Funny

      which is unlikely to be touched by a human being ever again

      Something slashdot readers can relate to

    20. Re:Seriously? by Holi · · Score: 2

      Turn in your geek card now. Your membership has been revoked for that comment.

      --
      Sorry, teleporters just kill you and then make a copy. A perfect, soul-less copy.
    21. Re:Seriously? by amicusNYCL · · Score: 4, Insightful

      Not having a Watchdog, PLUS not making the limited-filesize log file "roll-over", is clearly Amateur-Hour stuff. Who wrote this code, anyway? An eight year old???

      It's not even who wrote it, it's who designed it. Reading the summary actually made me angry that there is a group of people out there somewhere with the ability to build, launch, and track a satellite but without the common sense to recognize that they're creating a system that will grow infinitely in size without a mechanism to clear that data out. Does the satellite have unlimited storage space available? No? Then how about designing a way to monitor and clear the data other than saving it in /tmp?

      Pathetic. They deserve to lose their spacecraft.

      They definitely do. And no amount of descriptions of a CSV file meant for a grade school kid, or saying that 32MB is about the size of 10 songs, is going to minimize the schadenfreude that I'm feeling. Such a basic design error and they never even bothered to run tests for a significant period of time before putting the damn thing in space.

      Way to go, LightSail team. I dub thee LightFail.

      --
      "Our two-party system is like a bowl of shit looking at itself in a mirror." - Lewis Black
    22. Re:Seriously? by Megane · · Score: 2

      They're waiting for reboot because it froze the system completely. TFA says that the manufacturer of their "avionics board" had fixed this bug but it wasn't in the one that went up. So most likely it was a driver bug. A crash or lock-up in kernel space is a lot more problematic than just filling up a filesystem. And apparently they had scheduled an upload of the fix, but the satellite crashed right before the comms window. So now instead of a solar sail, they have a solar brick.

      --
      #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    23. Re:Seriously? by nitehawk214 · · Score: 3, Informative

      These guys did not launch a satellite; ULA did. Basically, LightSail simply took a ride on an Atlas 5 that was deploying the X-37B and was thrown out as a secondary payload. Pretty much anybody can do that. CubeSats are often made by college students.

      Also, describing CSV and measuring files in songs makes me want to punch Bill Nye, and I love Bill Nye.

      --
      I'm a good cook. I'm a fantastic eater. - Steven Brust
  2. CSV by Anonymous Coward · · Score: 5, Insightful

    I know the average IQ at /. has gone down over the years, but I think the explanation of what a CSV file is is slightly too much dumbing down.

    1. Re:CSV by ArcadeMan · · Score: 5, Insightful

      I think the "32 megabytes—roughly the size of ten compressed music files" part is even more insulting.

    2. Re:CSV by gstoddart · · Score: 4, Insightful

      Honestly, I'm surprised they didn't try to define space, Linux, and solar.

      This sounds like someone failed to run a bench test where the system was up and running for an extended period of time.

      Which strikes me as utterly bizarre.

      --
      Lost at C:>. Found at C.
    3. Re:CSV by Megane · · Score: 4, Interesting

      To be fair, that was copypasta from TFA. And they carefully omitted the next sentence: "The manufacturer of the avionics board corrected this glitch in later software revisions. But alas, LightSail’s software version doesn’t include the update."

      That still doesn't excuse a problem that would have been found by bench-testing the thing for a few days before sending it up. Nor does it excuse constantly appending one file to store data in an unattended system. Also, anything that JPL sends up has a backup channel that can push that little red button on the main computer. All they can do now is hope for cosmic rays to reboot it randomly. At least it's in LEO and not zipping off into interplanetary space.

      In the meantime, the team is looking at several fixes to work around the software vulnerability once contact is reestablished. One is a Linux file redirect that would send the contents of the troublesome beacon.csv file to a null location, a sort-of software black hole. Lab testing on this fix has been promising—over a gigabyte of beacon packets have already been sent into nothingness without a system freeze.

      Well, isn't that special. Now they test it. So if they can just link it to /dev/null, did they really even need that data? It's always fun to cause a mission to fail by recording data that wasn't even needed.

      --
      #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    4. Re:CSV by ArhcAngel · · Score: 5, Funny

      Extended testing was one of the stretch goals. Sadly they never reached that tier.

      --
      "A person is smart. People are dumb, panicky dangerous animals and you know it." - K
    5. Re:CSV by pr0fessor · · Score: 4, Funny

      Bill Nye the Science Guy is their CEO so...

  3. How embarrassing by Tyrannosaur · · Score: 3, Insightful

    You'd think that something as small as 32MB would have been tested before they launched the thing... It doesn't sound like it takes very long to fill up 32MB either

  4. Re:Should have used apps! by atouk · · Score: 4, Funny

    How much v Could a LightSail see If a LightSail could c s v

  5. Systems Administration 101 by plopez · · Score: 4, Insightful

    Roll your log files. I smell a DevOps debacle.

    --
    putting the 'B' in LGBTQ+
    1. Re:Systems Administration 101 by prefec2 · · Score: 3, Insightful

      Any competent software designer and developer should have known the basic rules of embedded systems. One of them: do not use dynamic memory (and files are much the same). If you need space, all the space you need must be determined at compile or design time. BTW, why store all this data on the device? This should have been implemented (if at all) as a round-robin database. Yes, that overwrites old data, but who cares? If you need all the data, you should have calculated the amount for the complete mission and reserved enough memory for that. And why did they use a CSV file? Are they physicists?
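
A rough sketch of the round-robin storage idea from this comment: the file is preallocated to its final size, and once every slot is used, new records overwrite the oldest ones. The `RingFile` class, its header layout, and the slot counts are assumptions made up for this example, not anything from the LightSail software.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<I")   # index of the slot the next record goes into


class RingFile:
    """Fixed-size record store: once full, new records overwrite the oldest."""

    def __init__(self, path, record_size, slots):
        self.path, self.record_size, self.slots = path, record_size, slots
        if not os.path.exists(path):
            with open(path, "wb") as f:      # preallocate the full size once
                f.write(HEADER.pack(0) + bytes(record_size * slots))

    def append(self, record):
        if len(record) != self.record_size:
            raise ValueError("record has the wrong size")
        with open(self.path, "r+b") as f:
            (slot,) = HEADER.unpack(f.read(HEADER.size))
            f.seek(HEADER.size + slot * self.record_size)
            f.write(record)
            f.seek(0)
            f.write(HEADER.pack((slot + 1) % self.slots))  # wrap around

    def size_on_disk(self):
        return os.path.getsize(self.path)


path = os.path.join(tempfile.mkdtemp(), "beacon.ring")
ring = RingFile(path, record_size=8, slots=4)
for i in range(10):
    ring.append(struct.pack("<Q", i))
```

After ten appends into four slots the file is still 36 bytes (a 4-byte header plus four 8-byte slots), and it holds only the most recent four records. The storage requirement is fixed at design time, which is the property the comment argues for.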

  6. Mebibyte is an idiotic term by Anonymous Coward · · Score: 2, Insightful

    and you are an idiot for using it.

    1. Re:Mebibyte is an idiotic term by David_Hart · · Score: 3, Insightful

      Just because you don't like the term doesn't make it wrong. Hijacking SI prefixes and changing their meaning is wrong and has led to countless problems.

      And historical meanings shouldn't be changed simply so that marketing speak can be used to sell less at the same price.

      I love how 1 MB of RAM is 1048576 bytes but 1 MB of storage is now 1000000 bytes of storage, simply because the hard-drive industry decided that they could make more money by using the same term, change the historical meaning in the computing industry from base-2 to base-10 (essentially downsizing the actual storage), and charging the same amount.

      Either convert totally to GiB, MiB, etc. for everything computer related or stick with the old convention. It's when you are mixing the two in a particular context (i.e. computers) where you run into problems.

    2. Re:Mebibyte is an idiotic term by ArcadeMan · · Score: 4, Informative

      I wish people would stop thinking that hard drive manufacturers are the "source" of this so-called "problem". Digital communication speeds never used base 2, clock speeds didn't either.

      People are simply stuck on terms like "Mebibyte" because they either don't want to accept the fact that mega is an SI prefix or because they don't like how the IEC units sound. Get over it.

  7. tachyon eddy by The+Grim+Reefer · · Score: 2

    It came across a tachyon eddy and is at warp speed on its way to the Cardassian homeworld.

  8. Re:UAT by itzly · · Score: 4, Insightful

    Well, how do you test it before you're happy ? If the beacon is 40 bytes, and transmitted every 15 seconds, it would take half a year before you fill up 32 MB. That's a long time for testing.

    This is the kind of mistake you shouldn't even make in the first place.
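
The estimate in this comment can be checked with a few lines of arithmetic. The 15-second interval and the 32-megabyte limit come from the summary and the 40-byte beacon from the comment; treating "32 megabytes" as 32 MiB is an assumption. The second function works backwards from the two days of flawless operation mentioned in the summary, assuming the file started empty and the crash coincided with reaching the limit.

```python
BEACON_INTERVAL_S = 15            # from the summary
LIMIT_BYTES = 32 * 1024 * 1024    # assuming "32 megabytes" means 32 MiB


def days_to_fill(bytes_per_beacon):
    """Days until the file reaches the limit at one beacon per interval."""
    beacons = LIMIT_BYTES / bytes_per_beacon
    return beacons * BEACON_INTERVAL_S / 86400


def bytes_per_beacon_for(days):
    """Row size that would fill the file in the given number of days."""
    beacons = days * 86400 / BEACON_INTERVAL_S
    return LIMIT_BYTES / beacons


print(round(days_to_fill(40)))          # 146
print(round(bytes_per_beacon_for(2)))   # 2913
```

So 40-byte beacons would take roughly 146 days, a little under the half year quoted above. Under the stated assumptions, a two-day fill would imply something closer to 2.9 KB written per beacon.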

  9. Re:Professional Level FUBAR by Anonymous Coward · · Score: 2, Interesting

    I say professional because NASA screwed up a few years back with a probe to Mars when two systems attempted to communicate. One "spoke" in Kilometers, the other "Miles".

    Actually this particular failure wasn't as obvious of an oversight as you may think. The reason it happened was because in an existing system one particular set of parameters were logged in miles since they weren't responsible for flight control (which NASA mostly uses metric for). Later on portions of this design were reused and an engineer decided to use the originally non-essential values as a feed into the navigation system.

    The problem in this case is when you have something large and complex (a space craft) and a large organization with many projects (NASA/JPL) the younger generation tends to just rely on what's in place without doing the research they should.

    That being said, there were many times this particular error could have been caught on the ground and wasn't, and that's a process failure. The "process" should have caught it.

    Now get off my lawn!

  10. Re:Is this a joke? by nedlohs · · Score: 2

    No, they mean MB, because even though they've crowdfunded a tiny satellite launch they are still not as autistic as you.

  11. Re:What the computer needs is ... by plopez · · Score: 5, Insightful

    No. They need programmers and sysadmins who know what they are doing. E.g. roll log files and/or put logs on a non-critical partition. Systems Administration 101 for systems where memory and disk space are at a premium. It was a rookie mistake.

    --
    putting the 'B' in LGBTQ+
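
For readers who have not seen log rolling in practice, here is a short sketch of the advice above using `RotatingFileHandler` from Python's standard `logging.handlers` module. The 1 KiB cap, the two backups, and the row contents are illustrative values only; `beacon.csv` is the file name from the summary, written here to a temporary directory.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "beacon.csv")   # file name from the summary

# Cap the live file at 1 KiB and keep two rotated copies (illustrative values).
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=2
)
handler.setFormatter(logging.Formatter("%(message)s"))

beacon_log = logging.getLogger("beacon")
beacon_log.setLevel(logging.INFO)
beacon_log.propagate = False          # keep rows out of the root logger
beacon_log.addHandler(handler)

for i in range(1000):
    beacon_log.info("%d,7.5,20.0,1", i)   # made-up beacon row
handler.close()

# Total space used by the live file plus its rotated copies.
total = sum(
    os.path.getsize(os.path.join(log_dir, name)) for name in os.listdir(log_dir)
)
```

However many rows are written, the directory never holds more than the live file plus two backups, so total usage stays under three times `maxBytes`.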
  12. Re:UAT by Anonymous Coward · · Score: 3, Informative

    First off.. LightSail isn't a NASA mission.. it's a low budget cubesat and cubesats tend to trade risk for rigor.

    NASA does run stuff for days/weeks/etc in testing. And you'll note that the Mars rover flash file system thing was able to be recovered from, thanks to smart people at JPL realizing that you always need a way to recover. This is not necessarily the case for cubesats, often built by enthusiastic grad students whose hair is not yet grey from living through near and actual disasters in flight projects: them young-uns just don't know any better.

    As a practical matter, "running for weeks on the ground" isn't practical: As an experienced software developer, I'm sure you know how real projects are always running tight for time: and a space mission where the launch date is determined well in advance can't just say "oh, I guess we'll slip the release a few weeks". You're building the spacecraft and verifying that everything works as well as you can: you verify that you can wiggle all the interfaces, you verify that the debugger/backdoor capabilities that will allow you to recover work; you verify the watchdogs. And you get what test time you can, before you ship to launch.

    Don't forget that for a lot of the testing, you reset the system state to a known starting point (that means wiping the non-volatile memory).

    And then you test, if you can, during the 8-9 months the spacecraft is on the way to Mars (which is WHY Spirit had the issue: they got a lot more test time on the software in flight than they had during the 3 year buildup of the spacecraft on the ground; log files got bigger, etc.)

  13. We don't need no stinkin' testing by petes_PoV · · Score: 4, Funny

    when some bug like this makes it through testing

    Testing? what testing? If it compiles, it works. Every hacker knows this.

    I have to say, when I read that the spacecraft ran Linux and had died, I naturally assumed that someone had left the auto-update enabled and it was busy trying to apply about 50 million kernel patches.

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
    1. Re:We don't need no stinkin' testing by thegarbz · · Score: 2

      Really? If a system dies after sending it a reboot command I would blame systemd.

  14. At least the post used "hope" as a noun by jpellino · · Score: 2

    and not as a verb. Using "hope" as a verb in spaceflight hasn't always gone very well in the past.

    --
    "Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
  15. Mirab, his light-sails unfurled! by Anonymous Coward · · Score: 4, Funny

    Shaka, when the walls fell

  16. For those who don't know what Linux is... by tekrat · · Score: 2

    Coming up next on Slashdot... Linux is an operating system, kinda like Windows or Mac OS, but built by a bunch of neckbeards, and uses about the same amount of space as 10 compressed music files. Some versions use less, some use more depending upon how it's configured.

    Wow; I think it's time to move on from Slashdot. Taco would be spinning in his grave, assuming he was dead.

    --
    If telephones are outlawed, then only outlaws will have telephones.
  17. Re:UAT by itzly · · Score: 2

    You write your test so that it sends the 40 bytes to the csv file every 10 milliseconds instead of every 15 seconds.

    The moment that you think of doing that, is the moment that you realize that the file will grow too big.
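
The accelerated test described here can be sketched without any sleeping at all: replay a week of beacons back to back, measure the growth, and extrapolate. The writer function, the 40-byte row, and the one-week window are assumptions for illustration; the 15-second interval and the 32-megabyte limit (taken as 32 MiB) come from the summary.

```python
import os
import tempfile

BEACON_INTERVAL_S = 15
LIMIT_BYTES = 32 * 1024 * 1024


def naive_write(path, row):
    """The pattern under test: append forever, never truncate or rotate."""
    with open(path, "ab") as f:
        f.write(row)


def soak(write_fn, path, row, simulated_days):
    """Replay `simulated_days` of beacons back to back; return final size."""
    beacons = simulated_days * 86400 // BEACON_INTERVAL_S
    for _ in range(beacons):
        write_fn(path, row)
    return os.path.getsize(path)


path = os.path.join(tempfile.mkdtemp(), "beacon.csv")
size = soak(naive_write, path, b"x" * 40, simulated_days=7)
growth_per_day = size / 7
days_until_limit = LIMIT_BYTES / growth_per_day
```

A test like this runs in about a second and turns "the file grows" into a concrete number of days until the limit, which can then be compared against the planned mission length.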

  18. The most interesting spacecraft. by foradoxium · · Score: 2

    We don't normally test our spacecraft systems, but when we do, we do it after launch.

  19. Re:UAT by grimmjeeper · · Score: 4, Informative

    Speaking as an engineer working on software that is on the Orion spacecraft, I can say that rigorous testing is budgeted into the project from the beginning because it helps to avoid most of the problems like this. The testing that goes on with flight software is orders of magnitude more than you find for a traditional commercial product. You have to. The consequences of failure are, obviously, a lot more significant.

    That being said, it's impossible to catch every single possible bug, especially as systems get more and more complex. But there are strategies that help reduce your risk. For example, you don't just run off to kernel.org and throw the latest stable release on a board. You pick operating systems that are maybe a bit harder to use (i.e. limited in what they can do) but are far better suited to real-time embedded work. And you certainly don't blindly append to a file without verifying that you're not going to overflow your space. And you always have an automated recovery plan for any dynamically allocated space in the event of an overflow.

    This kind of failure is caused by amateurs making amateur mistakes. It was caused by application programmers who don't understand the consequence of failure in a constrained environment where you can't just click a mouse to restart the program. It was caused by poor planning and a lack of understanding of the environment in which they were designing. This was caused by hiring coders instead of experienced engineers. It was caused by trying to do it cheap rather than spending the money to do it right. They got what they paid for.

  20. Updated Summary W/ Tech Terms Explained by cve · · Score: 5, Funny

    Last week [a week is approximately the amount of time between new 'Keeping up with the Kardashians' episodes] saw the successful launch of the Planetary Society's LightSail spacecraft, the solar-powered satellite that runs Linux [Linux is like Windows for smart people] and was crowdfunded on Kickstarter [Kickstarter is a place to buy digital watches]. The spacecraft worked flawlessly for two days, but then fell silent, and the engineering team has been working hard on a fix ever since. They've pinpointed the problem: a software [software is like what you download from the app store] glitch. "Every 15 seconds, LightSail transmits a telemetry beacon packet [a telemetry beacon packet is like a tweet]. The software controlling the main system board writes corresponding information to a file called beacon.csv. If you're not familiar with CSV files, you can think of them as simplified spreadsheets—in fact, most can be opened with Microsoft Excel. As more beacons are transmitted, the file grows in size. When it reaches 32 megabytes—roughly the size of ten compressed music files [32 MB is also approximately the size of 13 iPhone 6 selfies]—it can crash the flight system [the satellite's twitter feed blows-up]." Unfortunately, the only way to clear that CSV file is to reboot LightSail [like holding down the power and home buttons on your iPhone at once -- don't try this unless instructed by someone at the Genius Bar]. It can be done remotely, but as anyone who deals with crashing computers understands, remote commands don't always work [like when Siri plays Billy Ray instead of Miley]. The command has been sent a few dozen times already, but LightSail remains silent. The best hope may now be that the system spontaneously reboots on its own [like when you drop your phone in the pool and it still works].

  21. Re:Whew! I nearly funded that one... by camperdave · · Score: 4, Funny

    Meanwhile, at Planetary Society's headquarters...

    Well, Jason. What have you got to say?
    Well, Mr Nye...
    Doctor! It's Doctor Nye.
    But I thought those were honorary degrees.
    It is DOCTOR Nye. Say it! SAY IT!
    Y..Yes. D..D..Doctor Nye.
    So, what happened to our bird, Jason?
    As you know, um... Doctor Nye... We used a kickstarter campaign to fund the satellite's development and testing.
    Get to the point, Jason.
    We ran out of funds. If we had one more donor, we would have been able to complete the final testing.
    So we lost the satellite and now face public humiliation because one anonymous person was too cowardly to donate?
    Yes. Um.. Doctor Nye. That's about the size of it.
    Well, Jason. That fellow had best pray that he and I never cross paths. You may go.

    --
    When our name is on the back of your car, we're behind you all the way!
  22. Re:Embedded and dynamic memory by chuckugly · · Score: 2

    Today, though, dynamic memory allocation is a reasonable thing. Granted you want to make sure it can't fail, and that "out of memory" is handled appropriately.

    I don't completely disagree but you might watch the CPPCON 2014 presentation on the Curiosity rover for some insights into how the industry actually does things. One thing I noticed right off; rad hardened hardware is way behind the latest thing from Intel.

  23. Better example by tomhath · · Score: 2

    Actually, NASA had a "file system full" problem on one of the Mars rovers, almost exactly the same problem that LightSail has. Fortunately they were able to fix it remotely.

  24. Re:Why Linux!?! by 0100010001010011 · · Score: 2

    This is something that should have been done with any number of RTOSes. FreeRTOS is a good start, I prefer ChibiOS/RT.

  25. Re:Embedded and dynamic memory by ihtoit · · Score: 2

    some people still do code for the Z80 in 16KB of memory.

    It is still in widespread use from robotic control systems to hardened consoles to law enforcement (it's the processing unit in portable breathalysers). Some modern mobile phones (some Ericsson models) still use the Z80. Some musical synthesisers use the Z80 in realtime voice processing. The Harvard Zed SBC uses a Z80 core.

    --
    Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel

  26. Re:UAT by Maow · · Score: 2

    I'll never understand how groups (Especially NASA) can spend millions, or even BILLIONS on projects like these and not even complete the sorts of rudimentary testing that those of us in the professional software fields have to do every day.

    This is not a NASA project, so you've made a stunningly basic error in your first sentence. Not looking too good for attention to detail for someone "in the professional software field".

    Regardless, if you want to see how NASA does software, or are even remotely interested in how true mission-critical software gets written, you can't find a more interesting story than this one on the creation of the space shuttle software:

    The right stuff kicks in at T-minus 31 seconds.

    As the 120-ton space shuttle sits surrounded by almost 4 million pounds of rocket fuel, exhaling noxious fumes, visibly impatient to defy gravity, its on-board computers take command. Four identical machines, running identical software, pull information from thousands of sensors, make hundreds of millisecond decisions, vote on every decision, check with each other 250 times a second. A fifth computer, with different software, stands by to take control should the other four malfunction.

    But how much work the software does is not what makes it remarkable. What makes it remarkable is how well the software works. This software never crashes. It never needs to be re-booted. This software is bug-free. It is perfect, as perfect as human beings have achieved. Consider these stats: the last three versions of the program, each 420,000 lines long, had just one error each. The last 11 versions of this software had a total of 17 errors. Commercial programs of equivalent complexity would have 5,000 errors.

    This software is the work of 260 women and men based in an anonymous office building across the street from the Johnson Space Center in Clear Lake, Texas, southeast of Houston. They work for the "on-board shuttle group," a branch of Lockheed Martin Corp.'s space mission systems division, and their prowess is world renowned: the shuttle software group is one of just four outfits in the world to win the coveted Level 5 ranking of the federal government's Software Engineering Institute (SEI), a measure of the sophistication and reliability of the way they do their work. In fact, the SEI based it [sic] standards in part from watching the on-board shuttle group do its work.

    The group writes software this good because that's how good it has to be. Every time it fires up the shuttle, their software is controlling a $4 billion piece of equipment, the lives of a half-dozen astronauts, and the dreams of the nation. Even the smallest error in space can have enormous consequences: the orbiting space shuttle travels at 17,500 miles per hour; a bug that causes a timing problem of just two-thirds of a second puts the space shuttle three miles off course.
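
    As a quick sanity check on the "three miles off course" figure quoted above, the arithmetic can be worked through in a few lines. The numbers come straight from the quoted passage; the function name `miles_off_course` is only illustrative.

```python
SECONDS_PER_HOUR = 3600


def miles_off_course(speed_mph, timing_error_s):
    """Distance travelled (in miles) during a timing error at constant speed."""
    miles_per_second = speed_mph / SECONDS_PER_HOUR
    return miles_per_second * timing_error_s


# 17,500 mph is about 4.86 miles per second; two-thirds of a second at that
# speed comes to a little over three miles.
print(round(miles_off_course(17_500, 2 / 3), 2))  # 3.24
```

    So the article's round number holds up: about 3.24 miles.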

    Some of my favourite parts begin with the following quote:

    The process can be reduced to four simple propositions:

    1. The product is only as good as the plan for the product. At the on-board shuttle group, about one-third of the process of writing software happens before anyone writes a line of code. NASA and the Lockheed Martin group agree in the most minute detail about everything the new code is supposed to do — and they commit that understanding to paper, with the kind of specificity and precision usually found in blueprints. Nothing in the specs is changed without agreement and understanding from both sides. And no coder changes a single line of code without specs carefully outlining the change. Take the upgrade of the software to permit the shuttle to navigate with Global Positioning Satellites, a change that involves just 1.5% of the program, or 6,366 lines of code. The specs for that one change run 2,500 pages, a volume thicker than a phone book. The specs for the current program fill 30 volumes and run 40,000 pages.

    That is how one writes software. NASA cannot be beaten when lives matter.
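
    The quoted article mentions four identical machines that "vote on every decision", with a fifth standing by. A toy sketch of majority voting with a fallback might look like the following. It is purely illustrative, assumes outputs are simple hashable values, and says nothing about how the real shuttle software implemented its redundancy; `majority_vote` and `decide` are made-up names.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value a strict majority of the outputs agree on, or None if there is none."""
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def decide(primary_outputs, backup_output):
    """Use the primaries' majority answer; fall back to the backup when they cannot agree."""
    agreed = majority_vote(primary_outputs)
    return agreed if agreed is not None else backup_output
```

    With four primaries, one faulty machine is simply outvoted (`decide(["go", "go", "go", "hold"], "hold")` gives `"go"`), while a two-against-two split hands the decision to the backup.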