Crowdfunded, Solar-powered Spacecraft Goes Silent
Last week saw the successful launch of the Planetary Society's LightSail spacecraft, the solar-powered satellite that runs Linux and was crowdfunded on Kickstarter. The spacecraft worked flawlessly for two days, but then fell silent, and the engineering team has been working hard on a fix ever since. They've pinpointed the problem: a software glitch. "Every 15 seconds, LightSail transmits a telemetry beacon packet. The software controlling the main system board writes corresponding information to a file called beacon.csv. If you're not familiar with CSV files, you can think of them as simplified spreadsheets—in fact, most can be opened with Microsoft Excel. As more beacons are transmitted, the file grows in size. When it reaches 32 megabytes—roughly the size of ten compressed music files—it can crash the flight system." Unfortunately, the only way to clear that CSV file is to reboot LightSail. It can be done remotely, but as anyone who deals with crashing computers understands, remote commands don't always work. The command has been sent a few dozen times already, but LightSail remains silent. The best hope may now be that the system spontaneously reboots on its own.
I’m usually the first to defend others when some bug like this makes it through testing. Hindsight always being 20/20, only takes one bug amongst a million good bits of code, etc. But this just seems like something that even basic testing should have caught.
Did they not run this thing on the ground for a few weeks? That’s just basic testing, especially for something that is going to be inaccessible for a while. Also, that some critical bit of processing relies on stuff being written to (and then presumably read back from) a CSV file is very worrying.
This sounds like some very shoddy work.
I know the average IQ at /. has gone down over the years, but I think the explanation of what a CSV file is is slightly too much dumbing down.
You'd think that something as small as 32MB would have been tested before they launched the thing... It doesn't sound like it takes very long to fill up 32MB either
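A rough back-of-the-envelope check on how fast the file fills, using only the figures in the summary (one beacon every 15 seconds, about two days of operation, a 32 MB limit). It assumes the file started empty at boot and grew by the same amount per beacon; the per-beacon size that falls out is an inference, not a published number.

```python
# Figures taken from the summary; the derived per-beacon size is only an estimate.
BEACON_INTERVAL_S = 15              # "Every 15 seconds"
UPTIME_S = 2 * 24 * 60 * 60         # "worked flawlessly for two days"
LIMIT_BYTES = 32 * 1000 * 1000      # 32 MB (decimal); use 32 * 1024**2 for MiB


def beacons_sent(uptime_s: int, interval_s: int) -> int:
    """Number of beacons transmitted during the uptime."""
    return uptime_s // interval_s


def bytes_per_beacon(limit_bytes: int, beacons: int) -> float:
    """Average bytes each beacon must add for the file to hit the limit."""
    return limit_bytes / beacons


beacons = beacons_sent(UPTIME_S, BEACON_INTERVAL_S)
print(beacons)                                         # 11520
print(round(bytes_per_beacon(LIMIT_BYTES, beacons)))   # 2778
```

So under those assumptions each beacon would be adding somewhere under 3 kB to the file.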
How much v Could a LightSail see If a LightSail could c s v
Roll your log files. I smell a DevOps debacle.
putting the 'B' in LGBTQ+
I'll never understand how groups (especially NASA) can spend millions, or even BILLIONS, on projects like these and not even complete the sort of rudimentary testing that those of us in the professional software fields have to do every day. OK, this computer's going into space and going to run for days/months/years... whatever... so hey, maybe we should boot it up while it's still on the ground and see if it'll run for a couple of months without crashing first?
One of the Mars rovers had the same problem. It worked fine, but after a week or two it died because of a flash bug... they'd never tested it on Earth for a week straight prior to launching a billion-dollar piece of hardware?!?! What's wrong with these people? This is rudimentary stuff. You test it prior to launch for a long period of time. Then box it up and don't touch it. If you make any changes, re-test.
and you are an idiot for using it.
It came across a tachyon eddy and is at warp speed on its way to the Cardassian homeworld.
... the ability (small code here) to power cycle and come back up in maintenance mode where it doesn't do anything on its own except receive diagnostic commands.
The computer also needs a sibling for fail-over.
There may be reasons those were left out that I would agree with.
I sure hope they can get this puppy lined out.
It little behooves the best of us to comment on the rest of us.
cat beacon >> beacon.csv
instead of....
cat beacon > beacon.csv
oops.
Do not look at laser with remaining good eye.
They launched it with that basic of a software bug that they already knew about? How about edit out the first line of the CSV file when you add another one and maintain a max length? Or write a backup code where if it fails to reboot, close and delete the file without rebooting.
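The "drop the oldest line when you add a new one" idea can be sketched like this. It is only an illustration: `append_capped` and its parameters are hypothetical names, and nothing here reflects the actual LightSail flight software.

```python
import os
import tempfile


def append_capped(path: str, row: str, max_rows: int) -> None:
    """Append one row, then keep only the newest max_rows rows."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    try:
        with open(path, "r", encoding="utf-8") as f:
            rows = f.read().splitlines()
    except FileNotFoundError:
        rows = []                      # first beacon: no file yet
    rows.append(row)
    rows = rows[-max_rows:]            # discard the oldest rows beyond the cap
    # Write to a temporary file and rename it over the original, so a crash
    # in the middle of the write cannot leave a half-written log behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("\n".join(rows) + "\n")
    os.replace(tmp_path, path)
```

Rewriting the whole file on every beacon is wasteful, which is one reason rotation or a circular buffer is the more usual choice, but it does keep the file size bounded.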
Actually this particular failure wasn't as obvious an oversight as you may think. The reason it happened was that in an existing system one particular set of parameters was logged in miles, since they weren't responsible for flight control (which NASA mostly uses metric for). Later on, portions of this design were reused and an engineer decided to use the originally non-essential values as a feed into the navigation system.
The problem in this case is when you have something large and complex (a space craft) and a large organization with many projects (NASA/JPL) the younger generation tends to just rely on what's in place without doing the research they should.
That being said, there were many times this particular error could have been caught on the ground and wasn't, and that's a process failure. The "process" should have caught it.
Now get off my lawn!
No, they mean MB, because even though they've crowdfunded a tiny satellite launch they are still not as autistic as you.
Should have used Windows; it reboots all the time! Dumb asses!
I say professional because NASA screwed up a few years back with a probe to Mars when two systems attempted to communicate. One "spoke" in Kilometers, the other "Miles".
That's absolutely not like what happened.
. . . have you tried turning it off and on again? :-)
This is why they should use Windows instead of Linux to run the thing. Windows would have rebooted by now.
Ok, so they have a problem when going over 32 000 000 bytes.
Get free satoshi (Bitcoin) and Dogecoins
SpaceX can retrieve it long enough to hit the reboot button...
when some bug like this makes it through testing
Testing? what testing? If it compiles, it works. Every hacker knows this.
I have to say, when I read that the spacecraft ran Linux and had died, I naturally assumed that someone had left the auto-update enabled and it was busy trying to apply about 50 million kernel patches.
politicians are like babies' nappies: they should both be changed regularly and for the same reasons
If it makes you feel better, sure think that the rest of the world is as autistic as you.
and not as a verb. Using "hope" as a verb in spaceflight hasn't always gone very well in the past.
"Win treats sysadmins better than users. Mac treats users better than sysadmins. Linux treats everyone like sysadmins."
Shaka, when the walls fell
s. Duh.
How much is that in library of congress?
Please, I'm no nerd, I don't know this "technology" stuff.
6 Shakespeares... or
16.5 gzip-Shakespeares... or a whopping
22.6 bzip2-Shakespeares.
The Bard fares well by the Burrows-Wheeler algorithm for his works are so oft-repeated he even runs on and repeats himself. "...So all my best is dressing old words new, Spending again what is already spent" as RLE (run length encoding) and "To smother up the English in our throngs, If any order might be thought upon..." as MTF (Move to Front) Transform. "We render you the tenth; to be ta'en forth! Before the common distribution at your only choice... as encode to Huffmans and selection of the sweetest table, and "Spare your arithmetic; never count the turns. Once, and a million!... symbol usage stored as sparse array.
Here is a brief video clip showing the moment the LightSail team browsed the log file to discover the error.
<blink>down the rabbit hole</blink>
They couldn't afford to pay Will Smith for that many sequels
Just two systems that do the same thing linked to the same antenna that operate in such a way that they're both not going to develop the same problem at the same time... and such that one can upload software patches to the other.
I believe this is the way a lot of the deep space probes were set up. They have a primary computer and a diagnostic computer. And while the main system or the diagnostic system may drop, they don't drop at the same time. The team on Earth can figure out what is going on in one of the systems and instruct the other to fix it.
That is my understanding of how many of these systems work.
I've decided to stop wasting my time responding to AC trolls/sockpuppets... so if you want a response from me... login.
For those of you who are too young to have heard of it, In-A-Gadda-Da-Vida at 256 kbps is 32,727,315 bytes.
Pink Floyd's Meddle at 320 kbps is 56,465,408 bytes.
Disc jockeys would spin one of these when they wanted to have a 17 to 22 minute break.
"Transparent" is a shit show that trades on every stereotype going. A man in drag is NOT a transsexual.
Coming up next on Slashdot... Linux is an operating system, kinda like Windows or Mac OS, but built by a bunch of neckbeards, and uses about the same amount of space as 10 compressed music files. Some versions use less, some use more depending upon how it's configured.
Wow; I think it's time to move on from Slashdot. Taco would be spinning in his grave, assuming he was dead.
If telephones are outlawed, then only outlaws will have telephones.
A satellite running Linux is contingent upon a spontaneous reboot to function again? Great, now we'll never hear from that satellite again.
Clearly, the plan should have been to run the device on Windows 98. That way, it would only be out of commission for 49.7 days.
32MB ought to be enough for anybody...
We don't normally test our spacecraft systems, but when we do, we do it after launch.
Last week [a week is approximately the amount of time between new 'Keeping up with the Kardashians' episodes] saw the successful launch of the Planetary Society's LightSail spacecraft, the solar-powered satellite that runs Linux [Linux is like Windows for smart people] and was crowdfunded on Kickstarter [Kickstarter is a place to buy digital watches]. The spacecraft worked flawlessly for two days, but then fell silent, and the engineering team has been working hard on a fix ever since. They've pinpointed the problem: a software [software is like what you download from the app store] glitch. "Every 15 seconds, LightSail transmits a telemetry beacon packet [a telemetry beacon packet is like a tweet]. The software controlling the main system board writes corresponding information to a file called beacon.csv. If you're not familiar with CSV files, you can think of them as simplified spreadsheets—in fact, most can be opened with Microsoft Excel. As more beacons are transmitted, the file grows in size. When it reaches 32 megabytes—roughly the size of ten compressed music files [32 MB is also approximately the size of 13 iPhone 6 selfies]—it can crash the flight system [the satellite's Twitter feed blows up]." Unfortunately, the only way to clear that CSV file is to reboot LightSail [like holding down the power and home buttons on your iPhone at once -- don't try this unless instructed by someone at the Genius Bar]. It can be done remotely, but as anyone who deals with crashing computers understands, remote commands don't always work [like when Siri plays Billy Ray instead of Miley]. The command has been sent a few dozen times already, but LightSail remains silent. The best hope may now be that the system spontaneously reboots on its own [like when you drop your phone in the pool and it still works].
It's a low-orbit cubesat; this debris will not stay in orbit for long (a few months if they don't deploy the sail, if I remember correctly).
Spacecraft C2 is a place for an RTOS, not something as big and kludged together as Linux. Hell, it shouldn't be running on a virtual memory machine at all.
Kids these days.
It's a Pi? Then I'm pretty sure it will spontaneously reboot for them in pretty short order...
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
Meanwhile, at Planetary Society's headquarters...
Well, Jason. What have you got to say?
Well, Mr Nye...
Doctor! It's Doctor Nye.
But I thought those were honorary degrees.
It is DOCTOR Nye. Say it! SAY IT!
Y..Yes. D..D..Doctor Nye.
So, what happened to our bird, Jason?
As you know, um... Doctor Nye... We used a kickstarter campaign to fund the satellite's development and testing.
Get to the point, Jason.
We ran out of funds. If we had one more donor, we would have been able to complete the final testing.
So we lost the satellite and now face public humiliation because one anonymous person was too cowardly to donate?
Yes. Um.. Doctor Nye. That's about the size of it.
Well, Jason. That fellow had best pray that he and I never cross paths. You may go.
When our name is on the back of your car, we're behind you all the way!
It's called "logrotate", and shame on you if you started the logger and didn't configure it first.
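For anyone who hasn't used log rotation: the same idea is available inside a program as well as through the logrotate tool. A minimal sketch using Python's standard `RotatingFileHandler`; the function name, logger name and size limits are made up for illustration, and the actual flight software is not known to work this way.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_beacon_logger(path: str, max_bytes: int, backups: int) -> logging.Logger:
    """Return a logger whose file is rolled over before it can reach max_bytes."""
    logger = logging.getLogger(f"beacon:{path}")    # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False                         # keep beacons out of the root logger
    if not logger.handlers:                          # don't stack handlers on repeat calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(logging.Formatter("%(message)s"))
        logger.addHandler(handler)
    return logger
```

With `backups=2` the handler keeps `beacon.csv`, `beacon.csv.1` and `beacon.csv.2` and deletes anything older, so total disk use stays at roughly three times `max_bytes` no matter how long the system runs.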
Bill Nye was a TV guy who got sucked into a science gig where he became famous. He apparently is under the delusion this makes him a top-notch scientist. I hope his role in the project was Fundraiser and not Chief Engineer.
Play Command HQ online
Pi? Chances are it's trying to reboot every time it gets the command, but the SD card is corrupt.
(name withheld by request)
Sure, follow those old embedded rules when we coded for Z80s and 16k of memory. And we counted cycles by hand.
Today, though, dynamic memory allocation is a reasonable thing. Granted you want to make sure it can't fail, and that "out of memory" is handled appropriately. This is non trivial, but hopefully, you have a generalized approach which can be rigorously tested, and then reused.
Hardware is cheap, software development is expensive: today it is much more appropriate to throw hardware resources at the problem and allow the software people to be more efficient. Particularly in a spacecraft, where flight hardware, while expensive compared to consumer stuff, is still cheap compared to people who are developing software for that flight hardware.
The other thing is that processors are MUCH faster today, so on-the-fly bounds checking is reasonable: before pushing stuff on the stack, check to see if room is available. And, oh my gosh, what about array bounds checking to prevent buffer overflow? We're not coding for the 1MHz 1802 we used on Galileo any more.
The "totally deterministic" model of embedded must go, if we are to advance: it's harder to do correctly, but design for soft failure and recovery is a much, much better solution in the long run.
LightSail's problem, though, might be that the development team wasn't aware of the need for care. Cube-sat projects are full of aero-astro majors who have learned system engineering, and assume that data and spec sheets accurately and fully reflect the behavior of the devices they are using, because that's what they were taught.
The point is that the original article was written for the general public. And while most good engineers understand the difference between 32 MB and 32 MiB, the distinction is lost on the general public. Hell, for the most part, most computer savvy people don't care about the distinction in general. Only the most anal-retentive, pedantic types get their knickers in a twist over it.
Sure, if I am designing something I want accurate specifications so that it removes ambiguity when I have to write the code. But outside of that, I will use MB and MiB interchangeably, knowing full well that they are not exactly the same. But when you're having general discussions, you're rarely doing anything more than discussing an order of magnitude so you don't have to be precise all the time.
The few purists I have run into who piss and moan about the difference between MB and MiB in general use are some of the most annoying types of people I've ever run into. They're worse than any SJW trying to push their agenda down your throat. They're not just overly pedantic. They're passive aggressive about it. And that's what really makes people dislike them.
Bottom line, MB and MiB are interchangeable in the modern vernacular. Unless you're talking about the details of an engineering specification, it just doesn't matter. Please develop those interpersonal skills and suppress the urges to "correct" people. It will go a long way to improving your ability to win friends and influence people.
Actually, NASA had a "file system full" problem on one of the Mars rovers, almost exactly the same problem that Lightsail has. Fortunately they were able to fix it remotely.
The article is about overflowing a file system. It doesn't matter if it's 32MB or 32MiB. The file system would still have overflowed in the same way. The nit picky detail about exactly how big it was is immaterial to the entire discussion. The failure point is that the system was designed in such a way that the file was able to fill up the file system (regardless of how big it was) which resulted in bricking the whole system. Now can we stop wasting time debating a retardedly pedantic point that only autistic people care about but has nothing to do with the actual failure?
CSV predates *Microsoft* by at least twelve years.
Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
guess what happens when you fill an NT system drive which has a dynamic swap file on it?
Genius.
The best hope may now be that the system spontaneously reboots on its own.
If your best hope is a combination of divine intervention and spontaneous Artificial Intelligence, I think you are royally fucked.
Help! I'm a slashdot refugee.
csv is next to plain text as it gets. Why would you want to be complicating shit?
You don't need a scientist for this. You need an engineer. Scientists are people that deal with spherical cows in circular orbits. Engineers know not to send a cow in space.
Unless I misunderstood the mission, the payload isn't coming back, so having a log file for post-mission review is meaningless. If they want to log anomalies, or commands, or telemetry, why aren't they sending it back? Either a continuous stream, or regular or on-command bursts. In either case, there still would be no need to retain it in a file; you simply dump the buffer once it's been transmitted and start from zero. Am I missing something?
The CSV file does not get culled at a specific point before failure. The application keeps blindly appending to it regardless of size. It would continue to grow until the fail point. It doesn't make one god damn bit of difference if the fail point is at 32MB or 32MiB.
Is that such a hard idea for you to understand?
Let me put it another way. Try driving your car at full speed towards a wall without stopping. Does it matter if the wall is 10 yards away or 10 meters away? I guarantee that when you're on the news, it won't make any difference.
Too bad Canonical can't let go of it. But there will be purists who try to hold onto it at all costs, firmly entrenched in the idea that they are right and everyone else is wrong, even if no one else really cares.
they reset spontaneously and could save this mission...
Oh, but that's the best part. There apparently is a watchdog, but it only trips after four or five weeks by (presumably unchanged) default, and it's completely independent (rather than being reset regularly by a signal from a properly-operating system). This for a mission that wasn't even supposed to last two weeks. The good news is that the orbit could last for as long as six months with the sail un-deployed.
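For contrast, the usual pattern is a watchdog that the healthy system has to "kick" regularly; if the kicks stop, the watchdog forces a reset. A toy sketch of that logic (class and method names are hypothetical, timestamps are passed in explicitly to keep it testable, and a real watchdog would be an independent hardware timer rather than code on the same processor):

```python
class Watchdog:
    """Requests a reset if the main loop stops kicking it within timeout_s."""

    def __init__(self, timeout_s: float, start_s: float = 0.0) -> None:
        self.timeout_s = timeout_s
        self._last_kick_s = start_s

    def kick(self, now_s: float) -> None:
        """Called by the main loop each time it completes a healthy cycle."""
        self._last_kick_s = now_s

    def should_reset(self, now_s: float) -> bool:
        """True once more than timeout_s has passed since the last kick."""
        return now_s - self._last_kick_s > self.timeout_s


wd = Watchdog(timeout_s=60)
wd.kick(now_s=30)               # main loop is alive at t = 30 s
print(wd.should_reset(80))      # False: only 50 s since the last kick
print(wd.should_reset(100))     # True: 70 s of silence, force a reboot
```

With a timeout measured in seconds or minutes instead of weeks, a hung flight computer would be restarted long before the mission window closed.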
But I can't decide if it should be modded FUNNY or INSIGHTFUL.
The software controlling the main system board writes corresponding information to a file called beacon.csv. If you're not familiar with CSV files, you can think of them as simplified spreadsheets—in fact, most can be opened with Microsoft Excel. As more beacons are transmitted, the file grows in size. When it reaches 32 megabytes—roughly the size of ten compressed music files—it can crash the flight system."
Eng 101. Resources are not infinite. Didn't anyone think about cycling logs? Or treating it as a circular buffer? What happened to capacity testing? Or better yet, catastrophe testing, as in: what happens when the system runs out of space? This does not look like data that is critical to keep. Critical to capture, yes, but not critical to keep. Most on-board systems, embedded systems and/or systems with minimal resources use a circular buffer to capture control events for these reasons.
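A circular buffer for events is only a few lines in most languages. A minimal in-memory sketch using `collections.deque` with a fixed `maxlen`; the `EventRing` class is a made-up name for illustration, and an embedded system would more likely use a preallocated array:

```python
from collections import deque


class EventRing:
    """Fixed-capacity event log: once full, each new event evicts the oldest."""

    def __init__(self, capacity: int) -> None:
        self._events = deque(maxlen=capacity)

    def record(self, event: str) -> None:
        self._events.append(event)     # deque drops the oldest entry when full

    def snapshot(self) -> list:
        """Events currently held, oldest first."""
        return list(self._events)


ring = EventRing(capacity=3)
for n in range(5):
    ring.record(f"beacon {n}")
print(ring.snapshot())   # ['beacon 2', 'beacon 3', 'beacon 4']
```

Memory use is fixed by `capacity` regardless of uptime, which is the whole point.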
This is not a web site project, but a freaking spaceship. I can see clueless developers making these kinds of mistakes in web/enterprisey systems (I know, I've seen it). I couldn't have imagined this on a much more critical type of system... but then we have the Ariane 5 incident.
Unfortunately, the only way to clear that CSV file is to reboot LightSail.
A control system should by default reboot itself and clear its non-critical logs when running out of space, or at worst, keep running without logging the events. This is so trivial to test; did the system and software engineers never see a use case that captures this scenario?
It can be done remotely, but as anyone who deals with crashing computers understands, remote commands don't always work.
They don't always work if you don't test for them exhaustively... and they are not hard to test... and their continuous testing should be a priority at every release/test cycle. The engineers in this project are far more intelligent than I am, I'm sure of it. But man, this specific problem, I'm like "dude, wtf?"
Actually, I wanted to make a joke about the maximum file size overflow problem and I specified MiB to avoid arguments about 33554432 vs 32000000 bytes in the first place...
Apple also uses them and last time I checked, they still sold a metric* shitload of devices every day.
* I did that on purpose just to annoy people who can't acknowledge SI prefixes.
IMHO there aren't many good engineers who aren't good scientists.
Except HPUX, AIX, and a handful of other Unixes and Linux flavors, so you'd better know how to recognize and translate betwixt them and modern storage arrays that don't, else your 40TB Oracle DB will run into read/write errors at around 36TB, and you'd damned well better have a hot failover plan, or you'll be updating your resume.
If you run an enterprise database and you don't have a hot failover plan regardless of the actual size you think it is able to handle, you should consider a voluntary career path adjustment before it becomes involuntary.
Nothing like adding a filesize check into the save script so you don't fill up your filesystem and crash it. That would have cost them what two lines of code?
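Roughly what such a check could look like, as a sketch only: `append_if_room` and the `MAX_BYTES` cap are hypothetical, and what to do when the write is refused (rotate, truncate, or just skip logging) is a separate design decision.

```python
import os

MAX_BYTES = 1_000_000   # hypothetical cap, well below whatever the file system can hold


def append_if_room(path: str, line: str, max_bytes: int = MAX_BYTES) -> bool:
    """Append line to path unless doing so would push the file past max_bytes."""
    data = (line + "\n").encode("utf-8")
    current = os.path.getsize(path) if os.path.exists(path) else 0
    if current + len(data) > max_bytes:
        return False                   # refuse the write; the caller decides what next
    with open(path, "ab") as f:
        f.write(data)
    return True
```

The check itself really is about two lines; the rest is plumbing.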
That's like building a nuclear weapon with no off switch. Who does that?
Did they christen this spacecraft? Did they name it the USS Eve, perhaps?
ln -s /dev/null beacon.csv
Okay, so they tried turning it off and back on, but did they check to make sure it's plugged in?
And you can easily extract data from it with a simple set of Unix commands like this! :
cat beacon.csv | tail -n 80| head -n 10
If it had been a government-run spacecraft, we would hear screams about wasted money and inefficiency. Had it been foreign-government run, we would hear about nation X's decline or lack of experience.
This is a perfect scam:
a) come up with a seemingly plausible idea obscured by high tech and/or science;
b) get investors to contribute to seeing a prototype created
c) have the prototype fail for any number of plausible reasons
d) profit
e) repeat until the profits cease to be worth the effort
The solar sail is a theoretically flawed idea, as achievable as perpetual motion. But it is a cool, elegant concept ... and people can be convinced to buy in to it. The best part is, nobody outside the scammers can verify success or failure. Perfect ...
"Consensus" in science is _always_ a political construct.
Have gnu, will travel.