SSD Failure Temporarily Halts Linux 3.12 Kernel Work

Really? by koan · 2013-09-11 07:52 · Score: 5, Insightful

No backup?

--
"If any question why we died, Tell them because our fathers lied."

Re:Really? by gagol · 2013-09-11 07:56 · Score: 4, Insightful

I found spinning rust to at least give some clues prior to a crash and burn. I would say, single ssd is not ready for anything critical, in my opinion. Worst case scenario, you can always get the platters transferred in a good drive and recover from there (pricey, but cheap if data is valuable enough).

--
Tomorrow is another day...
Re:Really? by Anonymous Coward · 2013-09-11 07:56 · Score: 1, Funny

Given the utter arrogance of Mr. Torvalds, it doesn't surprise me at all... Why bother with backups, when you're God, right ? /s
Re:Really? by Anonymous Coward · 2013-09-11 07:58 · Score: 1

you know what he said ...
Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)
Re:Really? by Anonymous CowWord · 2013-09-11 07:59 · Score: 5, Funny

Haven't you heard?
"Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)" - Linus Torvalds[1]
1: https://groups.google.com/forum/#!msg/linux.dev.kernel/2OEgUvDbNbo/bTk-VE1zrnYJ

--

Disclaimer: My opinions are my own and do not, in any way, reflect the opinions of my employer or university.
Re:Really? by Anonymous Coward · 2013-09-11 07:59 · Score: 2, Funny

Ask Obama!
He's got a backup...
Re:Really? by Anonymous Coward · 2013-09-11 08:01 · Score: 5, Informative

No backup?
http://lkml.indiana.edu/hypermail/linux/kernel/1309.1/01690.html
I long ago gave up on doing backups. I have actively moved to a model
where I use replacable machines instead. I've got the stuff I care
about generally on a couple of different machines, and then keys etc
backed up on a separate encrypted USB key.
So it's inconvenient. Mainly from a timing standpoint. But nothing more.
Linus
Re:Really? by Anonymous Coward · 2013-09-11 08:04 · Score: 1

RTFM ;-). Looking at the thread:

I long ago gave up on doing backups. I have actively moved to a model
where I use replacable machines instead. I've got the stuff I care
about generally on a couple of different machines, and then keys etc
backed up on a separate encrypted USB key.
I only periodically back up the stuff I really care about (pictures, music, movies) and generally have a majority or all on laptop and desktop.
Re:Really? by Zero__Kelvin · 2013-09-11 08:05 · Score: 1

No. Not really. He has thousands of them all over the internet. How else do you think he is going to finnish the job with his laptop (excuse the pun.)

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by SJHillman · 2013-09-11 08:06 · Score: 5, Funny

Maybe Linus doesn't consider Linux to be critical...
Microsoft sure as hell doesn't seem to find Windows to be critical.
Re:Really? by Anonymous Coward · 2013-09-11 08:06 · Score: 5, Insightful

I used to think that too, until I had a mechanical hard drive experience controller failure without warning. Single drive is not ready for anything critical, regardless of the storage mechanism.
Re:Really? by Zero__Kelvin · 2013-09-11 08:09 · Score: 1

So you've never had a hard disk controller failure then?

" Worst case scenario, you can always get the platters transfered in a good drive and recover from there"
What makes you think you can't take FLASH devices and access them in a similar way to platters? Just like with platters, you won't be able to access data on any damaged portions but unlike with platters it is unlikely that the platters will trash the read/write heads of the new drive.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by pubwvj · 2013-09-11 08:12 · Score: 1

There are automated solutions that keep hourly backups. Even a day's worth may be worth saving. Provided you do important work and don't just surf the web. :)
Re:Really? by sjames · 2013-09-11 08:13 · Score: 1

The world is his backup. No code seems to be lost, just temporarily not where he wants it to be.
Re:Really? by stewsters · 2013-09-11 08:14 · Score: 4, Funny

Yeah, I wonder if anyone has ever told him about git. Too bad he didn't back it up. Now we will have to start a new Linux kernel.

Sarcasm Intended.
Re:Really? by Joce640k · 2013-09-11 08:21 · Score: 1

No backup?
Didn't he write some source code control system or other to prevent this...?

--
No sig today...
Re:Really? by pubwvj · 2013-09-11 08:23 · Score: 5, Funny

Ah, even Jesus saves. ;-)
Re:Really? by MightyYar · 2013-09-11 08:24 · Score: 1

Crashplan even runs on Linux (and Windows and Mac, and with some nudging FreeBSD). Dropbox (or Sparkleshare if you want to stay Open Source). My goodness, the carelessness here is crazy.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:Really? by Anonymous Coward · 2013-09-11 08:24 · Score: 1

oh the irony
there are 2 kinds of people who work with data: those who have lost data and those who will lose data. do you know where your backups are?
Re:Really? by chuckinator · 2013-09-11 08:27 · Score: 4, Interesting

Seconded. I've had a RAID1 mirror on my primary workstation at home for roughly... 4 years. I had one of those "oh, drat, my drive is starting to click, and we all know what that means..." moments and barely had time to back up the /home partition to an external machine while I went hardware shopping. Since that event window closed, that configuration has saved my butt twice. One time, the mirrored pair started to go after kinetic shock from moving to a new residence, and it didn't even stress me out to wait for a new pair from my online vendor of choice. I don't know what happened the second time, but I'm guessing that some bad components on the mobo were dirtying the 5V and 3.3V power rails into the drive connector because the whole rig decided to go kaput shortly after in a way that forced an upgrade to the latest CPU socket du jour mobo. Thankfully, I was already budgeting for new guts for that rig due to performance demands.
Re:Really? by tlhIngan · 2013-09-11 08:27 · Score: 5, Informative

I found spinning rust to at least give some clues prior to a crash and burn. I would say, single ssd is not ready for anything critical, in my opinion. Worst case scenario, you can always get the platters transferred in a good drive and recover from there (pricey, but cheap if data is valuable enough).
Sudden SSD failure is actually not really a failure that's detectable. Good SSDs have tons of metrics available through SMART including media wear indicators that tell you impending failure long before it happens.
But when an SSD suddenly dies, it's generally because the controller's FTL tables got corrupted. For high performance drives, it's remarkably easy to do as performance is #1, not data safety. There's nothing wrong with the disk or the electronics.
The FTL (flash translation layer) is what maps a sector the OS uses to the actual flash sector itself. If it gets corrupted, the controller has no way of accessing the right sectors anymore and things go tits up. It's even worse because a lot of metrics are tied to the FTL, including media wear, so losing that data means you can't simply erase and start over - you're completely hooped as the controller cannot access anything.
If you want to think of it another way, treat it like the super block on a filesystem, and the filesystem tables. Now imagine they get corrupt - the data is useless and recovery is difficult, even though the underlying media is perfectly fine. It's possible to hose it so badly that recovery is impossible.
For speed, FTL tables are cached - and modern SSDs can easily have 512MB-1GB of DDR memory just to hold the tables. Of course, you can't write-through changes since the tables themselves need to be wear-levelled on the flash media.
One of the iffiest times for this comes when an SSD is power cycled - pulling the power on an SSD can cause corruption because the tables may be in the middle of an update. But things like firmware bugs and other things can easily corrupt the table as well (think a stray pointer scribbling over the table RAM). A good SSD often has extra capacitance onboard to ensure that on sudden power failure, there is enough backup power to do an emergency commit to flash. This protects against power cycling, but firmware bugs can still destroy the data.
Of course, SSDs without such features mean the firmware has to be extra careful. And sometimes, such precautions can miss a point in time where you cannot pull the power at all.
It's sort of reminiscent of that Seagate failure that resulted in a log file reaching a certain size disabling the drive - the data and media were perfectly fine, it's just that the firmware crapped out.
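For anyone who wants the FTL point in concrete terms, here is a toy sketch in Python. It is not how any real controller works; the class name, the page count and the random placement standing in for wear levelling are all made up for illustration. It only shows why the data can be physically intact and still unreachable once the mapping table is gone.

```python
import random


class TinyFTL:
    """Toy flash translation layer: maps logical sectors to physical pages.

    Hypothetical model for illustration only; real controllers are far
    more involved (garbage collection, ECC, cached tables, and so on).
    """

    def __init__(self, num_pages, seed=0):
        self.flash = [None] * num_pages         # raw NAND pages
        self.table = {}                         # logical sector -> physical page
        self.free = list(range(num_pages))
        random.Random(seed).shuffle(self.free)  # stand-in for wear levelling

    def write(self, sector, data):
        page = self.free.pop()                  # always write to a fresh page
        self.flash[page] = data
        self.table[sector] = page               # any older page is now stale

    def read(self, sector):
        return self.flash[self.table[sector]]


ftl = TinyFTL(num_pages=16)
for sector, word in enumerate(["the", "data", "is", "still", "there"]):
    ftl.write(sector, word)

# With the table intact, sectors come back in logical order.
in_order = [ftl.read(s) for s in range(5)]

# Simulate a corrupted mapping table: the pages are untouched,
# but nothing records which page belongs to which sector.
ftl.table.clear()
raw_pages = [p for p in ftl.flash if p is not None]
```

After the `clear()`, `raw_pages` still holds every word, just in physical order with no record of which sector each one was. Scale that up to millions of pages and you get the recovery problem described above.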
Re:Really? by Deemus · 2013-09-11 08:29 · Score: 1

I'm sure the NSA has a copy somewhere. :)
Re:Really? by Anonymous Coward · 2013-09-11 08:31 · Score: 1

you know what he said ...
Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)
Exactly. Nothing much was lost here. He doesn't develop that much himself - he mostly pulls from his maintainers. So no loss, only a delay while he plays around with the failed drive. Delays don't matter much in his world - in a business setting, the drive would be replaced in an hour.
Re:Really? by jimbolauski · 2013-09-11 08:34 · Score: 2, Funny

So you've never had a hard disk controller failure then?

" Worst case scenario, you can always get the platters transfered in a good drive and recover from there"
What makes you think you can't take FLASH devices and access them in a similar way to platters? Just like with platters, you won't be able to access data on any damaged portions but unlike with platters it is unlikely that the platters will trash the read/write heads of the new drive.
I don't know what you're talking about, it's very easy to desolder a couple hundred pins on a board, then install a new chip and resolder the new chip back in. That's just as easy as popping off the back of the HD, removing a couple of screws and pulling out the platter.

--
Knowledge = Power
P= W/t
t=Money
Money = Work/Knowledge so the less you know the more you make
Re:Really? by You're All Wrong · 2013-09-11 08:34 · Score: 5, Informative

Are you attempting to claim the prize for the person with the least understanding of the Distributed Source Code Control System in use?

There was absolutely no code on his system that wasn't on between dozens and thousands of other systems depending on its age.

Just read TFA: "I had pushed out _most_ of my pulls today". His "pulls" are code that is *elsewhere*. He's just a conduit (and gatekeeper) between a few dozen elsewheres and a server with a fat pipe. And by the construction of the system, it really shouldn't matter how those pulls are ordered. (If there'll be a merge conflict one way round, there'll be a merge conflict in other permutations too.)

--
Your head of state is a corrupt weasel, I hope you're happy.
Re:Really? by jedidiah · 2013-09-11 08:34 · Score: 1

> Good SSDs have tons of metrics available through SMART
Only the good ones? Utter trash in terms of spinning rust have a ton of useful metrics available through SMART. Simplifies the selection process quite a bit when everything is suitable rather than just some select models.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:Really? by h4rr4r · 2013-09-11 08:36 · Score: 1

Make backups, lots of them. Rsync to a raid array from your laptop as well. Which should not be your only backup.
SSDs are fine.
Re:Really? by MightyYar · 2013-09-11 08:39 · Score: 1

I like the response to him:

And I won't trust a single USB thumb drive to hold my most important
stuff. And how do you hold onto family pictures and such? It's
amazing how much crap can accumulate, but also how important it can be
to have good backups that are remote. If the house burns down, don't
matter how many machines the stuff is spread across if it's not local.
I've been a good little monkey when it comes to backups and I've STILL been bitten. Once, lightning got both my main drive and my backup - it took me a week to recover little bits from each drive and get it all pieced back together. Another time my backup drive mount failed (it was basically a rubber band... bad design) and fell onto my running main drive, killing both and requiring another major effort for recovery. At that point, I started rsyncing (or similar, like Unison) across the network to another machine so that there was some physical distance.
Then Unison noticed some file differences between the backup and the main drive... my hard drive (or maybe OS) was silently corrupting files! And in my family pictures directory, no less. By this point, I'd had enough. Now I run Crashplan (which keeps file history and checks hashes on every file OFFSITE) on all of my computers. I also run the native backup program (Windows Backup, TimeMachine, etc) in parallel (to a local fileserver running ZFS, which has a number of data integrity features). So far, no disasters and a successful recovery or two.
I'm a dumbass compared to Linus... no idea what he's thinking.
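The silent corruption I mentioned is the kind of thing a plain hash comparison between the main copy and the backup will flag. A rough sketch in Python, standard library only; the function names are mine, and this is not how Unison or Crashplan do it internally, just the general idea:

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(primary, backup):
    """Return relative paths that are missing from backup or differ in content."""
    primary, backup = Path(primary), Path(backup)
    mismatched = []
    for src in sorted(primary.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(primary)
        dst = backup / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            mismatched.append(str(rel))
    return mismatched
```

Note that a mismatch only tells you the two copies differ, not which one went bad. That is why keeping file history (or a filesystem with checksums) matters as well.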

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:Really? by h4rr4r · 2013-09-11 08:39 · Score: 1

The fanboy for a proprietary product is even nuttier. He has git; he wrote it. If you really must have "backup" software just use bacula.
Re:Really? by gmuslera · 2013-09-11 08:40 · Score: 1

Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)
Torvalds, Linus (1996-07-20).
Re: Really? by h4rr4r · 2013-09-11 08:40 · Score: 1

You can do the same thing with linux. Nothing stopping you.
Bacula can do bare metal backups just as well, and that is just my favorite. Lots of options out there.
Re:Really? by MightyYar · 2013-09-11 08:43 · Score: 1

I'm quite familiar with git - I think it is great and use it preferentially. Even if he didn't lose any code, he lost _work_. He also lost other people's work. If he had Crashplan running, he could just scurry over to his laptop, do a quick restore into a directory of his choosing, and pretty much carry on where he left off. Depending on the size of the backup, we might be talking about a few minutes.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:Really? by MightyYar · 2013-09-11 08:46 · Score: 1

I'm quite happy with Crashplan, but it is obviously not the only versioning backup system. Sparkleshare is an open source Dropbox clone with a git backend.
I'm not sure why being a fanboy of good software is a problem - you are a fanboy of open software. Did that hurt? No? Now you know how I feel.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:Really? by Talderas · 2013-09-11 08:47 · Score: 3, Funny

Apparently Linus.

--
"Lack of speed can be overcome. In the worst case by patience." --Znork
Re:Really? by Guspaz · 2013-09-11 08:51 · Score: 4, Informative

What makes you think you can't take FLASH devices and access them in a similar way to platters?
Because on most SSDs, the data is encrypted, and on all SSDs, the pages are in an effectively random order. If you've lost the controller, you've lost both the encryption keys and the table that enables a logical platter-style presentation of the pages. No amount of soldering is going to fix those problems.
Re: Really? by MightyYar · 2013-09-11 08:53 · Score: 2

On a drive somewhere. It can be on an NAS or something locally attached.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:Really? by rssrss · 2013-09-11 08:55 · Score: 1, Informative

I thought NSA backed up all our drives.

--
In the land of the blind, the one-eyed man is king.
Re:Really? by Anonymous Brave Guy · 2013-09-11 08:55 · Score: 1

Spinning drives fail all the time, but at least you can hear the click of death starting.
The trouble these days is that different types of hard drives have very different characteristics for when they park their heads. A Click of Imminent Death and a Crunch of Routine Head Parking on an enterprisey disk with a long wind-down delay can sound disturbingly similar. (Either that or almost every hard drive in the various machines in the rack in my office really is about to fail, even though none of them report anything disturbing via SMART etc. That would suck.)

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:Really? by Wonko the Sane · 2013-09-11 08:56 · Score: 1

I thought SLC was faster than MLC, at the expense of lower storage density?
Re:Really? by Anonymous Brave Guy · 2013-09-11 08:57 · Score: 2, Insightful

Only wimps use tape backup. Real deities just upload their important stuff on FTP and let the rest of the universe mirror it.

--
If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
Re:Really? by pubwvj · 2013-09-11 08:58 · Score: 1

"There was absolutely no code on his system that wasn't on between dozens and thousands of other systems depending on its age."
There is a big difference between having a backup of your system and copies of bits and pieces scattered around the Universe. With a system backup you're up and running again in minutes to hours. With what you're proposing 'bits and pieces scattered around the Universe' you must expend effort reassembling your machine's soul from the bits and pieces. That takes, better said, wastes, time and effort.
Re:Really? by a_claudiu · 2013-09-11 08:59 · Score: 1

Come on, even the NSA has bugs from time to time.
Re:Really? by Zero__Kelvin · 2013-09-11 09:00 · Score: 2

The new drive has a new controller. Where do you think the controller stores all the data it needs to decrypt? Hint: It is in the FLASH devices. I am not saying this will work 100% of the time, since the damaged part might be the component that stores the needed information, but again, that is no different than a platter scenario. There is a reason why data recovery services don't guarantee success with platter based media.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by djdanlib · 2013-09-11 09:07 · Score: 2

It would be great if they would mention stability features on the box, or at least in the marketing material. But they don't. It always looks like this: MEMORY! It's quiet! SATA-II maximum bandwidth of 3.0 Gbps! Speed up your desktop! Look at the rebate! Millions of hours MTBF! Low power usage!
Re:Really? by ArchieBunker · 2013-09-11 09:08 · Score: 1

When flash drives fail they fail suddenly and catastrophically. I've always been able to recover data from platter drives.

--
Only the State obtains its revenue by coercion. - Murray Rothbard
Re:Really? by Anonymous Coward · 2013-09-11 09:10 · Score: 1

If you are running a critical system (which would include a system that is apparently essential to maintaining the Linux kernel code) you put in a RAID storage system at the very least. ECC RAM and redundant power supplies would be a good idea also. None of this is particularly expensive these days.
In terms of mechanical drives vs SSD, mechanical drives are more reliable and have fault recovery options that are not available for SSD drives. Even if the drive's controller fails, one can still take a mechanical drive to a recovery shop which can either replace the controller or remove the physical platters and read from them. It is expensive, so one only does this if the data is critical, but it is possible. I am not aware of any recovery shop that can perform similar operations for an SSD drive.
Hopefully Mr. Torvalds has learned something from this...
Re:Really? by michrech · 2013-09-11 09:11 · Score: 5, Informative

That's just as easy as popping off the back of the HD removing a couple a screws and pulling out the platter.
You do that outside of a cleanroom and your data is gone forever.
False -- I've done it on a number of occasions (to drives I didn't care about), and was able to run the drives for months without their covers. I'd still be using the drives if I had need for drives as small as they were (somewhere in the 80GB range)...
Would I use a drive in this state for something critical? No, but saying you immediately lose the data if you pull a drive cover is just flat wrong.

--
bork bork bork!
Re:Really? by jellomizer · 2013-09-11 09:22 · Score: 1

Didn't work for Linux

--
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Re:Really? by jimbolauski · 2013-09-11 09:23 · Score: 1

A simple fish tank can be turned into a clean room, so that's not a big deal. Trying to desolder 100 pins spaced 0.01" apart and then resolder them is another matter: unless you have a soldering robot with 0.1 mil precision it is impossible, you can't even buy wire thin enough to do it by hand.

--
Knowledge = Power
P= W/t
t=Money
Money = Work/Knowledge so the less you know the more you make
Re:Really? by Capt.DrumkenBum · 2013-09-11 09:25 · Score: 1

If the data is valuable, then back it up. A raid may not be enough to save you, as last week taught me.
rsync can be your friend.

--
If I were God, wouldn't I protect my churches from acts of me?
Re: Really? by Cyberax · 2013-09-11 09:31 · Score: 5, Funny

You've misspelled 'NSA'...
Re:Really? by Anonymous Coward · 2013-09-11 09:34 · Score: 2, Insightful

Microsoft also sure as hell wouldn't have a single hard drive failure interrupt their patch submission process (yes, it is internal but they have a tree of lab builds, team builds, and "winmain" with a well defined RI - reverse integration process for moving patches in) and their build process. Actually - I don't think anyone would allow a single drive failure to do this. It seems, well, stupid. What was Linus smoking?
Re:Really? by PlusFiveTroll · 2013-09-11 09:54 · Score: 1

That is a common method of failure in flash, but not the only one. There is also the failure mode where the drive goes read only and you can recover data from it to a new drive.
Always been able to recover platter drives? Maybe you've been very lucky then. Many drives fail in ways that are expensive to recover from (taking the guts out and putting them in a new case, or replacing the drive controller) or impossible (the hard drive platter shatters, which happens in notebook drives).
Re:Really? by Luckyo · 2013-09-11 10:07 · Score: 2

Urban legend. Clean environment inside drives and in the lab that does data extraction from damaged drive is to maximize performance/chance of recovery.
Hard drive itself can run just fine in dirty environment for a while. It will wear down much faster as it's not designed for such operation, but it will very likely remain operational for weeks at the very least.
Re:Really? by cheater512 · 2013-09-11 10:07 · Score: 1

I have nearly the opposite experience.
I have had 3 RAID 5 arrays over the last 8 or so years (10 drives in total). The newest one was bought this year.
Oddly enough not a single disk (all WD Green drives) has failed out of any of them even after running 24/7.
One set of disks has been spinning non-stop for over 4 years.
I've retired the oldest one but only because using 4 SATA ports to get 1TB isn't practical any more.
Re:Really? by Luckyo · 2013-09-11 10:09 · Score: 1

That isn't a failure mode, that is a wear out mode. NAND flash naturally wears itself out into the state where its state can no longer be changed by controller. The drive will remain perfectly readable because reading process is different from writing process.
Re:Really? by cheater512 · 2013-09-11 10:12 · Score: 1

If you are paying megabucks to recover critical data then yep I'm sure it is in a cleanroom.
But a hard drive can operate just fine with its cover open. It doesn't kill the drive at all.
Probably not healthy to do, but it does work without any problems.
Re:Really? by Luckyo · 2013-09-11 10:15 · Score: 1

SLC writes a single cell rather than multiple cells at once. This adds to drive life due to the way NAND flash is written (one cell/set of cells at a time). MLC drives have a far lower useful life expectancy than SLC, and TLC has far lower useful life expectancy than MLC.
SLC drives are usually far more expensive because it's more expensive to make them, and as a result they tend to be built from high quality parts resulting in better performance.
Re:Really? by cheater512 · 2013-09-11 10:16 · Score: 1

You are correct. SLC is faster than MLC. No clue what the parent is on about.
Re:Really? by mhotchin · 2013-09-11 10:20 · Score: 1

...Gretzky gets the rebound, he shoots, he SCORES!!!
Re:Really? by Guspaz · 2013-09-11 10:24 · Score: 1

Um, no, it doesn't. That would be dumb, as well as pointless, since the entire point of SSD encryption is to prevent the flash from being directly accessed like that. SSD controllers store the keys internally, not on the external flash.
Re:Really? by Dunbal · 2013-09-11 10:25 · Score: 1

Exactly. I've had the bad luck of having many platter hard drives fail on me. Before I stopped buying Seagate...and after they bought Maxtor. Those drives were so crappy... anyway I digress. Any single drive is prone to failure. That's why computer savvy people back stuff up often. I regularly back up my 512GB SSD to my 3TB hard drive. It really doesn't take that long, especially if you're just copying the data.

--
Seven puppies were harmed during the making of this post.
Re:Really? by Zero__Kelvin · 2013-09-11 10:25 · Score: 1

I take it you have never actually tried it. A hard drive will not function for long once the seal is broken outside of a clean room environment.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by Guspaz · 2013-09-11 10:52 · Score: 2

Perhaps you're unaware that most modern SSDs these days do controller-level AES encryption of all data? Intel's drives do (as do any others based on Sandforce controllers), Samsung's newer ones do, Crucial's newer ones do... and the keys are stored in the controller, not the NAND.
It's kind of odd for you to say I'm on drugs for saying things that are on the spec sheets of the drives themselves...
Re:Really? by Miamicanes · 2013-09-11 10:55 · Score: 2

> What makes you think you can't take FLASH devices and access them in a similar way to platters?
Sandforce controllers enforce mandatory AES encryption that can't be disabled, using a key that can't be recovered or set to a known value. So if your controller decides to quit allowing you to access your data, unsoldering the chips won't do you any good, because the values you read from them might as well be random noise.
Re:Really? by Miamicanes · 2013-09-11 11:01 · Score: 1

Exactly. What good is a drive with a hundred million hours before physical failure if it commits data-suicide every 4-12 weeks? Maybe I'm just weird, but I don't *care* whether a drive (like the OCZ Vertex 2, for example) technically isn't "broken", and "only" has to be "securely erased" to temporarily make it usable again for a few weeks until its next episode of eternal amnesia. If it happens once, it's a random fluke. If it happens twice in six months, the drive is fucked and useless for its intended purpose.
Re:Really? by MSG · 2013-09-11 11:08 · Score: 1

Can you provide a reference for the Seagate failure you mentioned? Offhand, it almost sounds like you're thinking of the Samsung laptop UEFI bug:
http://mjg59.dreamwidth.org/22855.html
Re:Really? by makomk · 2013-09-11 11:51 · Score: 1

It affected Seagate 7200.11 drives, I know because one of my hard drives was affected by the bug. Seagate eventually started fixing people's drives for free after outsiders figured out it was a software bug that locked out access and it hit the tech news sites - prior to that their data recovery service was charging thousands of dollars to get people's data back. I can only imagine they realised people were going to ask pointed questions about how exactly their drives wound up with a "bug" that reliably held people's data to ransom for $$$, and that some of those people might be uniformed and have warrants.
Re:Really? by cheater512 · 2013-09-11 12:10 · Score: 1

I haven't tried swapping platters, no. However, I have run an older hard drive (I wouldn't exactly do it with a good drive) with its lid off in a standard room with no precautions.
It was plugged in to a computer and I could read and write to it no issues.
Hell I even prodded the heads (being careful not to make them crash) and it mucked around to find its position again before going back to working normally.
Re:Really? by Zero__Kelvin · 2013-09-11 12:17 · Score: 1

Thanks for the information. I'll be sure to avoid Sandforce drives. Do you know of any other drives developed by incompetent engineers that I should avoid?

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by Guspaz · 2013-09-11 12:18 · Score: 2

In the case of most drives, the key they ship with is randomly generated at the factory, unless you enable ATA passwords in your BIOS, which will prompt a new key to be generated, secured by that password. This is the typical behaviour; encrypt everything by default using the built-in key, and most support various external interfaces for securing that. Some even support eDrive, which integrates with BitLocker. Anandtech has a nice article about that:
http://www.anandtech.com/show/6891/hardware-accelerated-bitlocker-encryption-microsoft-windows-8-edrive-investigated-with-crucial-m500
Re:Really? by SoftwareArtist · 2013-09-11 12:26 · Score: 1

<flamebait>
If only he had been using a Mac! Then he would have had Time Machine automatically backing up all his work every hour. Maybe he should consider switching.
</flamebait>
(duck!)

--
"I'm too busy to research this and form an educated opinion, but I do have time to tell everyone my uninformed opinion."
Re:Really? by bmo · 2013-09-11 12:26 · Score: 2

And not only that, but the only "cleanroom" you need is a box with a lid, gloves, and a filtration system you can sometimes pull from a dead vacuum cleaner (the kind with a hepa filter).
If you're handy, you can build one of these under 50 bux.
--
BMO
Re: Really? by ndogg · 2013-09-11 12:45 · Score: 1

Same with Deja Dup, which comes installed on Ubuntu. It actually works rather beautifully.

--
// file: mice.h
#include "frickin_lasers.h"
Re:Really? by inflex · 2013-09-11 12:49 · Score: 1

You can still take the flash chips out and get the data off and reassemble it all if required ( yes, it's a major PITA, but it's something that is done routinely ). The data doesn't just vanish, in fact it's quite a problem trying to make data vanish in a hurry.
Re:Really? by citizenr · 2013-09-11 13:50 · Score: 1

No, key is stored inside controller chip, it is only available to manufacturer and licensed recovery companies (using manufacturers backdoor) BUT that key is LOST if controller chip dies.

--
Who logs in to gdm? Not I, said the duck.
Re:Really? by citizenr · 2013-09-11 13:52 · Score: 1

Your claim that most drives do this by default is ludicrous
and yet it is true

--
Who logs in to gdm? Not I, said the duck.
Re:Really? by citizenr · 2013-09-11 13:53 · Score: 2

Yes, the mythical read-only mode that NO ONE could trigger while doing wear leveling tests on SSDs - they ALWAYS DIE suddenly and without warning.

--
Who logs in to gdm? Not I, said the duck.
Re:Really? by MightyYar · 2013-09-11 13:56 · Score: 1

I don't think 10 good drives with an average retirement age of about 3 years is terribly exceptional. Good luck for sure, but not "against all odds" stuff. I've only had 1 drive (coincidentally a WD Green) go bad on me in that time period, if you exclude physical damage. And even then, the WD Green committed suicide by parking its head every 5 seconds for a year. I didn't realize this was occurring until my server started sending me SMART alerts. The replacement has the patch so that it won't do that :) I have some drives that are probably 9 years old.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:Really? by citizenr · 2013-09-11 13:56 · Score: 1

no, they all write whole sectors at a time
slc holds 1 bit in a cell
mlc holds 2
tlc holds 3 bits
more bits = smaller difference in charge between values = less margin of error = almost an order of magnitude fewer write cycles
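As a rough illustration of the margin argument above: an n-bit cell has to distinguish 2^n charge levels, so the gap between adjacent levels shrinks quickly. The sketch below assumes an idealized cell with evenly spaced levels across a normalized voltage window. The function names are made up for illustration, and real NAND endurance depends on far more than this ratio.

```python
def charge_levels(bits_per_cell):
    """Number of distinct charge states a cell must be able to hold."""
    return 2 ** bits_per_cell


def level_spacing(bits_per_cell):
    """Gap between adjacent levels as a fraction of the full window.

    Assumes evenly spaced levels, which is an idealization.
    """
    return 1.0 / (charge_levels(bits_per_cell) - 1)


for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3)):
    print(f"{name}: {charge_levels(bits)} levels, "
          f"spacing {level_spacing(bits):.3f} of the window")
```

Under these assumptions the spacing drops from the whole window (SLC) to a third (MLC) to a seventh (TLC), which is the "less margin of error" part of the chain.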

--
Who logs in to gdm? Not I, said the duck.
Re:Really? by BrokenHalo · 2013-09-11 14:13 · Score: 1

I wonder why he did that. He would have to know better, surely. I use a single drive in my laptop, but (a) I don't trust it, (b) the sky won't fall in if it fails, and (c) I keep lots of offline backups anyway.
Re:Really? by ArchieBunker · 2013-09-11 14:25 · Score: 1

Every drive that has gone bad on me had early warning signs. Some showed SMART errors in the event log and others made the head recalibration noise while they were operating. But yeah I have been very lucky. My last drive to fail was a 1.5TB Seagate and it was nearly full.
I'm in need of more storage space but Seagate is garbage now and WD had such a bad reputation in the 90s that it still makes me hesitate.

--
Only the State obtains its revenue by coercion. - Murray Rothbard
Re:Really? by omnichad · 2013-09-11 14:49 · Score: 1

Hope you've run WDIDLE on all those Green drives. As the other commenter said.
http://www.ngohq.com/news/19805-critical-design-flaw-found-in-wd-caviar-green-hdds.html
WD Greens are intentionally sabotaged against running in a RAID, but I still use them. And they work OK for slow storage.
Re:Really? by mysidia · 2013-09-11 15:09 · Score: 1

SSD controllers store the keys internally, not on the external flash.
SSD Key storage

2. How is this key stored internally? Is it itself encrypted? Using what algorithm?
a. We do not disclose the internal details of how the key is stored in order to prevent security issues, but we do disclose it is stored in the flash memory.
Re:Really? by mysidia · 2013-09-11 15:12 · Score: 1

Perhaps you didn't know that drives do one thing and only one thing. You feed them a block of data and an address and tell it to write it, or you feed them an address and tell it to read it.
You should do more research, as many SSDs are self-decrypting.
I do not like this one bit, but it /does/ address a potential issue with SSDs in regards to data destruction (Tools such as Gutmann wipe, or a DBAN bootable CD do not work properly with an SSD; because repeatedly writing to the same "sector" does not provide any assurance of data destruction with an SSD, unlike with a magnetic drive)
Re:Really? by mysidia · 2013-09-11 15:18 · Score: 1

That isn't a failure mode, that is a wear out mode. NAND flash naturally wears itself out into the state where its state can no longer be changed by controller. The drive will remain perfectly readable because reading process is different from writing process.
That would be nice.... in practice it wears itself out into a state, where some bits get stuck in the wrong position, or fade to another state, and the rate of bit errors begins to increase exponentially
In other words; with SSD devices, wear out does not affect only writes. With a possible exception of some controllers, which might have a safety mechanism to anticipate when wear-out should begin to occur, and jot something on the controller or blow a fuse in order to shut off the write capability, before the data is damaged.
Re:Really? by mysidia · 2013-09-11 15:23 · Score: 1

I thought real deities just ran it through a stego algorithm or Base64 encoder, and posted it on Slashdot? :)
Re:Really? by __aajfby9338 · 2013-09-11 15:28 · Score: 1

trying to desolder 100 pins spaced 0.01" apart then resoldering them, unless you have a 0.1 mill precision soldering robot it is impossible, you can't even buy wire thin enough to do it by hand.
Nonsense. Packages with exposed pins on the sides will typically have at least twice that pin pitch (0.5mm or larger), and they are certainly hand-solderable. Even if your wire solder is larger than the pin. In rework, this kind of stuff is done by humans, not by robots.
You will want a binocular microscope, available from eBay. And a GOOD soldering iron with good tips, such as a Metcal; also available from eBay, though you'll hunt a while to find one cheap. And some liquid flux to control surface tension and heat transfer. If the pins are not exposed, such as on a BGA, then you'll need a $100 hot air station. I've done this sort of stuff myself, and I don't have nearly as much skill as a good rework technician. At work, I even routinely perform rework involving soldering wires to individual 0.5mm pitch IC or connector pins. It's tricky with 30 gauge wire, since the wire is about as wide as the pin... so that's why we bought a spool of 38 gauge wire, which makes it pretty easy to do.
This sort of stuff takes a bit more practice than through-hole soldering, and it requires different equipment and techniques. But I could do it at home if I had to, where I've set up my bench with a pair of used Metcal irons and a binocular scope. I don't have a hot air station at home yet, but that's a $100 problem to solve when I need to do it.
Re:Really? by cheater512 · 2013-09-11 15:36 · Score: 1

Thanks for that. Checked my drives and they all had it set to 8s, but the older set had a low park number and the newer set had a rather high park number.
Odd.
I wouldn't exactly call it slow storage. In RAID5 I hit 220 MB/s easy with 3 drives.
Re:Really? by libtek · 2013-09-11 15:40 · Score: 1

Linus is obviously NSA... ...LULZ!

--
Unequivocally the realest of the realz...
Re:Really? by BrokenHalo · 2013-09-11 16:48 · Score: 1

Whereas lesser mortals just let the NSA store it for us...
Re:Really? by mdielmann · 2013-09-11 16:57 · Score: 1

Which just makes me wonder why he doesn't get on that FTP and get his data back? Wasn't loaded to FTP yet? Poor backup policy!

--
Sure I'm paranoid, but am I paranoid enough?
Re:Really? by gweihir · 2013-09-11 17:09 · Score: 1

Indeed. That is why I have all critical work on ssd+hdd hybrid RAID1. Linux makes that easy.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Really? by Guspaz · 2013-09-11 18:05 · Score: 1

LSI, who make the Sandforce chipsets, says otherwise:
http://www.lsi.com/technology/duraclass/Pages/Automatic-Encryption.aspx

Fuse-based OTP (one time programming memory) for unique master key
Re:Really? by bemymonkey · 2013-09-11 19:13 · Score: 1

Yeah, that was my first thought too. Especially in a desktop... why not at least RAID1 + nightly scheduled backups to a spinning drive?
Shouldn't tell him he's a fucking asshole and should die a horrible death for not having a backup? Sounds like the kind of thing he'd respond to :D
Maybe throw a chair at him for good measure...
Re:Really? by bemymonkey · 2013-09-11 19:14 · Score: 1

bleh... shouldn't *someone* tell him...
Re:Really? by semi-extrinsic · 2013-09-11 20:09 · Score: 1

Another trick is if you have a small-to-midsize bathroom, run hot water in the shower until the room is all foggy, and then wait until the humidity drops. That way almost all dust particles in the air will have been removed together with the water.

--
for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
Re:Really? by gagol · 2013-09-11 20:10 · Score: 5, Informative

This is more like a MS employee workstation crash. The Linux infrastructure is not hosted on Linux home machines, and is replicated around the world. I was simply pointing out my favorable opinion of slow spinning disks... not blaming Linus or whatever, shit happens.

--
Tomorrow is another day...
Re:Really? by TheRaven64 · 2013-09-11 22:52 · Score: 1

Controller failures are usually easy to fix. The controller on most disks is on a board on the outside that can be unscrewed and replaced. If you find another drive (often cheap, second hand) of the same make and model you can switch them across. I wouldn't recommend doing anything with the drive other than dumping the data when it's in this state, however...
That said, I completely agree. Stuff that is on a single drive is stuff that you don't care about.

--
I am TheRaven on Soylent News
Re:Really? by Eunuchswear · 2013-09-11 23:02 · Score: 1

No RAID-1?
(Yes, RAID is not backup, but RAID is better for hardware failures).

--
Watch this Heartland Institute video
Re:Really? by msobkow · 2013-09-11 23:19 · Score: 1

Jesus saves...
Moses passes...
And Mohammad tips it in to win the cup!
The crowd goes wild!!!

--
I do not fail; I succeed at finding out what does not work.
Re:Really? by gwstuff · 2013-09-12 00:36 · Score: 1

Like everything else he delegates backup to the community. He releases everything open source, the community backs it up for him.
Re:Really? by fabien.bouleau · 2013-09-12 00:37 · Score: 1

No backup?
No git push? I daresay...
Re:Really? by jsepeta · 2013-09-12 00:45 · Score: 1

maybe "free" linux should be user-$upported so Linus can afford a backup drive.

--
Remember kids, if you're not paying for the service, YOU ARE THE PRODUCT THAT IS BEING SOLD.
Re:Really? by Zero__Kelvin · 2013-09-12 01:14 · Score: 1

OK. I take back what I said about you and stand corrected. In fact I apologize and feel a bit foolish. It is, in fact, the SSD designers that are on drugs. In my defense I wouldn't have believed it if I didn't read it for myself, frankly, because encryption of the data by the controller chip, as the default, is a phenomenally stupid thing to do. I can only assume this is market driven, and that the engineers know how stupid it is but are doing it anyway at the insistence of marketing.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by Zero__Kelvin · 2013-09-12 01:19 · Score: 1

Data whitening serves no purpose. If you are thinking of wear leveling, one has nothing to do with the other. There is absolutely no advantage to encrypting the data from a design perspective. Also, you missed the point, which is that having encrypted data where you cannot know the key as the data owner is phenomenally stupid.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
Re:Really? by multi+io · 2013-09-12 01:34 · Score: 1

No backup?
Drive failure? Sounds more like a RAID1/5 thing. Someone should tell Linus about mdadm.
Re:Really? by Eunuchswear · 2013-09-12 02:20 · Score: 1

Jesus saves, but Keegan scores on the rebound.

--
Watch this Heartland Institute video
Re:Really? by petermgreen · 2013-09-12 02:29 · Score: 1

AIUI Linux development works on the principle of "everything goes through Linus". What level of attention Linus gives it depends on how much he trusts the person sending it, but every change to the main tree needs to be merged by him and then pushed to the public servers. I presume the requests to re-send pull requests were because he either wasn't making backups or his backups of the status of pull requests were out of sync with the backups of the repo.
So it's like an employee workstation crash for the one employee that has the skills and authority to make your development "official".

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:Really? by Marxist+Hacker+42 · 2013-09-12 04:28 · Score: 1

What is so hard with "replace the drive, get a new pull from GIT"? This should have been a minor inconvenience- maybe a 4 hour delay at best.

--
SJW: a person who perceives an injustice, and while correcting it, commits a greater injustice.
Re: Really? by basecastula+ · 2013-09-12 07:23 · Score: 1

Still love clonezilla.
Re:Really? by metaforest · 2013-09-12 13:48 · Score: 1

[snip] trying to desolder 100 pins spaced 0.01" apart then resoldering them, unless you have a 0.1 mill precision soldering robot it is impossible, you can't even buy wire thin enough to do it by hand.
Bullshit. Full stop.
Re:Really? by Miamicanes · 2013-09-12 16:22 · Score: 1

The craziest thing about Sandforce is the way their marketing department is somehow able to spin some of the worst design flaws in the history of computing into alleged sales points.
Their controller chips are the worst crock of steaming shit and snake oil to have disgraced the computer industry in *years*.
If The Onion's staff got together with a few sixpacks of beer, a pizza or two, and a mission to make up something funny for the "Tech" section of their next April Fool's Day edition, it would read like a Sandforce datasheet.
Let's be honest... if Wired Magazine got a press release proclaiming that Sandforce SSD controller chips maximized Windows performance by making users re-install it from scratch every 3-6 months, their editorial staff would have to call Sandforce's marketing department to confirm that it was real, and wasn't just a bored member of the Syrian Electronic Army playing practical jokes on them again...
Re: Really? by TCM · 2013-09-12 23:17 · Score: 1

You'd be surprised how many people do backups but never test restore.

--
Of course it runs NetBSD. BTC: 1NT7QvbetmANwaMzhpVL6
Re:Really? by Guspaz · 2013-09-13 04:56 · Score: 1

To be fair, Intel's Sandforce-based SSDs have been quite reliable, as have the ones used in Apple products. Are you sure you're not meaning "OCZ" when you say "Sandforce"?
Re:Really? by columbus · 2013-09-13 05:10 · Score: 1

Whoosh.
https://groups.google.com/forum/#!msg/linux.dev.kernel/2OEgUvDbNbo/bTk-VE1zrnYJ

--
friends don't let friends teleport drunk
Re:Really? by zwarte+piet · 2013-09-13 10:42 · Score: 1

Yeah, why didn't he just set his systemclock back to just before the ssd failure and copy everything.
Re:Really? by Ravaldy · 2013-09-17 05:27 · Score: 1

SSDs are worth the money and the risk. Just be smart and either have your projects on a RAID 1 HD array or make sure you have backups. Considering who this guy is and the experience he brought to the community, it surprises me that his work would not be stored on a NAS or file server of some sort.
Also, does he not use source control for his projects? If so, does he not check-in the work as each milestone is reached?
Re:Really? by girlintraining · 2013-09-21 04:23 · Score: 1

I found spinning rust to at least give some clues prior to a crash and burn.
You know, I find this attitude to be both prevalent, and strange for supposed IT experts. Most of your computer doesn't run on "spinning rust". CPUs, memory, motherboards, power supplies... nobody says the lack of noise they make when they die (unless you count the screams of the souls that are released with the smoke) is a problem... but somehow, when it comes to SSDs, the "I can hear it dying" argument comes up. A lot.
I suspect this is a psychological attachment, with a healthy helping of overvaluation of personal experience instead of objective data. The weird part? When you point it out, geeks tend to dismiss it as "Well, they just aren't as good" as though 'goodness' was some kind of objective measure. I find this all the time amongst otherwise perfectly rational IT people: The belief that because the solution isn't perfect, it is therefore wrong, while ignoring the fact that the current solution they're supporting is also not perfect.
But the fact is, SSDs are many multiples faster than regular old "spinning rust" and more reliable. Ask any major manufacturer what their average warranty RMA rate is on their SSDs versus any other manufacturer's RMA rate on regular old "spinning rust". You'll find that SSD manufacturers regularly offer 3 and 5 year warranties. You're lucky to get a 90 day return policy on spinning rust bought off Amazon.
Now, all that said, dig into the data and you will find some new failure conditions that spinning rust doesn't have. For example, sudden power loss can cause a temporary loss of capacity, which will show up as bad sectors, in many SSDs. Very few IT professionals are aware of this; Or the fix: Physically disconnecting it for at least an hour, then wiping it (SATA command, not OS) and restoring the data. Many will RMA a drive claiming 'bad sectors' when there's nothing physically wrong with the drive... it's just buggy firmware.
Everyone points to write-exhaustion; The overly-focused on issue of repeated writes eventually 'wearing out' the drive. But guys... the average cycles here are 3,000 to 5,000 per cell. If you are writing 10GB a day to your drive, then a tiny 80 GB SSD will take 18.7 years before it gives up the ghost; Or about 68.5 TB of data written to the disk. If you opted for a 160GB drive, kick that out to 37.5 years. And that's for it to start showing physical loss of storage capacity.
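The arithmetic behind those figures can be sketched as follows. The 10 GB/day and 68.5 TB numbers come from the post itself. The function names are hypothetical, decimal units (1 TB = 1000 GB) are assumed, and the "implied derating" is simply the ratio between a naive capacity-times-cycles bound and the quoted total, not a figure the post states.

```python
DAYS_PER_YEAR = 365.25


def years_until_worn(total_writes_tb, daily_writes_gb):
    """Years needed to write total_writes_tb at daily_writes_gb per day.

    Assumes 1 TB = 1000 GB.
    """
    return total_writes_tb * 1000 / daily_writes_gb / DAYS_PER_YEAR


def naive_endurance_tb(capacity_gb, cycles_per_cell):
    """Upper bound on total writes: every cell used for its full cycle
    count, with no write amplification or spare-area overhead."""
    return capacity_gb * cycles_per_cell / 1000


quoted_tb = 68.5                          # total quoted in the post for an 80 GB drive
naive_tb = naive_endurance_tb(80, 3000)   # low end of the quoted cycle range

print(f"{years_until_worn(quoted_tb, 10):.2f} years at 10 GB/day")
print(f"naive bound {naive_tb:.0f} TB, "
      f"implied derating {naive_tb / quoted_tb:.1f}x")
```

The quoted lifetime follows directly from the quoted total, and doubling the total for a 160 GB drive doubles the lifetime. The gap between the naive bound and the quoted total would have to come from something like write amplification, which is an assumption here rather than anything stated in the post.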
The problem with SSDs is not electrical. It is not physical. It is entirely software. The firmware on many of these drives is buggy and this is covered up by the SATA / AHCI interfaces, which were designed for spinning rust, and thus have no direct way to signal the myriad of weird firmware glitches.
The electrical/physical part of SSDs is proven tech. It doesn't go bad, not under the usage conditions that the average computer user will put them into. And yes, I know, you don't think of yourself as average... but you are, ok? Even you, Mr. Programmer, Mr. Video Editor, and Mr. Super Linux Power User ZOMFG I Built My Own Raid In Mom's Basement. All of you are the 'average' case. The only time I've heard of mechanical drives being preferred is in usage conditions where data is being constantly written out -- such as a monitoring system like the Large Hadron Collider that collects terabytes upon terabytes of data, which is then processed and flushed, many times a day. SSDs would be bad in that environment. But unless you're building your own LHC in the garage... SSDs will work just fine.
That said... I have considered writing to OCZ and Intel and asking them if they could make their SSDs make the same noises as mechanical drives. There's a proven psychological value in this; Just like how your cell phone camera is programmed to emit a shutter snap sound... despite shutters not being around since the 80s. Because there are a lot of people that apparently need reassurances that their computer be making noise in the corner for them to feel good about its performance and reliability. It may be too soon for geeks to live with silent computers.

--
#fuckbeta #iamslashdot #dicemustdie

Sounds like by Anonymous Coward · 2013-09-11 07:52 · Score: 1

He should listen to Steve Gibson and run Spin Rite.

Re:Sounds like by nucrash · 2013-09-11 08:16 · Score: 1

I was going to post something to this effect. I think the big issue is being in the middle of work more than anything else. The problem is if Linus is still using a Mac, Spinrite 6.0 won't work. Spinrite 6.1 will though it's in beta.

--
Place something witty here

Eggs, Basket by Sneakernets · 2013-09-11 07:55 · Score: 5, Funny

That's all that Ballmer needs to stop Linux? Just find Torvalds' SSD?

--
"No freeman shall ever be debarred the use of arms." -- Thomas Jefferson

Re:Eggs, Basket by CastrTroy · 2013-09-11 07:58 · Score: 2, Insightful

Makes me wonder what would happen to Linux development if Torvalds was to get hit by a bus, or be incapacitated in some way. Is kernel development that reliant on one person that a single laptop breaking brings everything to a halt?

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Eggs, Basket by sjames · 2013-09-11 08:16 · Score: 1

Only in the short term. In the bus scenario, another leader would be chosen by the developers. There are several good choices there.
Re:Eggs, Basket by CastrTroy · 2013-09-11 08:18 · Score: 1, Troll

Yeah, but who takes control after he dies? Linux is already fragmented enough with all the distributions, I would hate to think what would happen when Linus dies or gets tired of programming, and then a bunch of companies/people decide to fork the kernel, because they all want to be in control of the "official" kernel.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Eggs, Basket by bobbied · 2013-09-11 08:20 · Score: 2

Who needs that? You can always take the last source release and start your own build.
This guy only controls the Linux Kernel by convention (and because it is convenient). Anytime he is unable or unwilling to keep the kernel development going, any number of others can step up and take over.
It will be interesting to watch when it happens though. I suspect that unless Torvalds appoints a successor and willingly hands over the keys the Linux Kernel will fracture into 3 or 4 major branches. Even if he does appoint someone or some organization to take over there is a risk of the kernel fracturing into multiple efforts.

--
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
Re:Eggs, Basket by You're+All+Wrong · 2013-09-11 08:41 · Score: 2

His laptop breaking brought about 0.0001% of the actual work on Linux to a halt, if that. Every Linux developer continued developing as normal. Every code reviewer continued reviewing code as normal. Every subsystem maintainer kept maintaining their subsystem as normal. Every automatic test build robot kept automatically doing build tests as normal. People who desperately needed the patches that Linus was going to push out, if they really were that desperate, would have just pulled them from linux-next, or the relevant subsystem maintainer's tree, or, *most likely*, would already have them!

--
Your head of state is a corrupt weasel, I hope you're happy.
Re:Eggs, Basket by Lord+Apathy · 2013-09-11 09:44 · Score: 1

Hey hey people, it's just a dead SSD... You people are getting ready to bury the head penguin with it.

--
Supporting World Peace Through Nuclear Pacification
Re:Eggs, Basket by CanHasDIY · 2013-09-11 09:51 · Score: 1

the head penguin
Spurred a thought: How do you think ol' Torvalds would react if people started referring to him as Tux?

--
An enigma, wrapped in a riddle, shrouded in bacon and cheese
Re:Eggs, Basket by blackiner · 2013-09-11 13:23 · Score: 1

I seem to recall him saying that Greg Kroah-Hartman already does pretty much everything he does, and could take over at any time if he were unable to continue.
Re:Eggs, Basket by MrDoh! · 2013-09-11 21:51 · Score: 1

I think it's not just a risk but pretty much certain. There'll be multiple forks for a bit, but long LONG term, that might be a good thing. People will make/test their code against the big ones, incompatible forks will die off. Then there'll be re-unification, or significant effort to keep compatibility "so... this app works on all versions APART From Microsoft Linux? hmm....". No, I'm not worried about 'The Great Fork'. It'll be tears and gnashing and wailing of teeth at first, but eventually it'll settle down and things will be better for it to have occurred.

--
Waiting for an amusing sig.
Re:Eggs, Basket by Ash+Vince · 2013-09-11 23:21 · Score: 1

Makes me wonder what would happen to Linux development if Torvalds was to get hit by a bus, or be incapacitated in some way. Is kernel development that reliant on one person that a single laptop breaking brings everything to a halt?
Actually the result would probably be pretty much the same in the short term. Greg Kroah Hartman or someone would send an email out letting everyone know the terrible news and asking everyone who has sent a patch to Linus for merging to resend it to him as it was problematic to get it out of Linus's email archive. Things would stop for a day or however long it took him to receive all the patches and check that each one had been merged.
Then in a day or so everything would be back to normal and Greg would start insulting people for minor transgressions :)

--
I dont read /. to RTFA, I read /. to offend people in ignorance.
Re:Eggs, Basket by bobbied · 2013-09-12 02:04 · Score: 1

You may be right that we would eventually settle down to a single version as unsupported branches die off, but I think it would likely end up being the same situation we have with Linux distributions. We have two or three leading distributions with groups of advocates for each. In fact, I'd wager that the current Linux distributions would simply assume control over their kernel branch as time moved on and then arrange for kernel innovation to suit their unique focus.
Personally, I'm inclined to think that would be a good thing for Linux for a number of reasons. The chief reason is the personality foibles of the current kernel manager. I think that having such a temperamental guy controlling the Linux Kernel has held it back on some fronts. It certainly has limited acceptance and development of otherwise worthy features, just because he decided it was not what he wanted. (To be fair, he has also squashed things we really are better off not having. I just think that it would be better if we had more than one authority making such decisions.)

--
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
Re:Eggs, Basket by Lord+Apathy · 2013-09-12 17:17 · Score: 1

Well I'm not sure since Tux isn't his name. But he is the head penguin, which is more of a title than a name. Other titles would be the Big Penguin Kahuna, Chief Penguin in Charge, and my favorite One Penguin to Rule Them All.

--
Supporting World Peace Through Nuclear Pacification

Next project - backups! by ruiner13 · 2013-09-11 07:55 · Score: 2

Maybe Linus needs to create a backup program like he did when he wanted a better version control system and created git? Also, why is the only copy of the changes on his local workstation and not a server with redundancy? This seems rather amateurish.

--

today is spelling optional day.

Re:Next project - backups! by geek · 2013-09-11 08:03 · Score: 5, Funny

Allow me to channel Linus Torvalds a minute:
"What do you mean there wasn't a backup disk? Fucking kill yourself with a pipe wrench. I hate you, your mother was a whore and your dad was the neighbors dog. People like you make me sick."
Re:Next project - backups! by PRMan · 2013-09-11 08:34 · Score: 4, Insightful

It's comments like these that make me wish Slashdot mods could go to 10 instead of 5. Nicely done.

--
Peter predicted that you would "deliberately forget" creation 2000 years ago...
Re:Next project - backups! by Digicrat · 2013-09-11 09:15 · Score: 1

Or perhaps he'll enhance GIT with a way to automatically sync/push working changes to a remote 'backup' repository or temporary/private branch.
From the description, it sounds like he was in the midst of a large merge. So of course everything on his system is version controlled ... the changes just haven't been committed yet.
Re:Next project - backups! by mynamestolen · 2013-09-11 09:17 · Score: 1

I'd mod you up if I had points

--
work in progress

Linus said something... by IMarvinTPA · 2013-09-11 07:57 · Score: 3, Interesting

Linus said "So I don't want to necessarily blame the harddisk, since it's just ten
days since I upgraded the rest of my machine, after it worked years in
the previous one. That just makes me go "hmm". As far as I know, all
the fans etc were working fine, but.."

There's his problem: "after it worked years in the previous [machine]."

His SSD died a natural death of old age.

IMarv

--
Trusting software vendors is no smarter than trus

Re:Linus said something... by kwalker · 2013-09-11 08:27 · Score: 2

That's not how drives die of old age. A sudden and permanent drive failure like what is described is almost always a controller failure. When mechanical drives die of old age, they generally develop bad sectors and read-errors accumulate on the platter, but you can still read from the un-damaged areas. When SSDs die, those worn-out sectors go read-only or begin throwing similar read/write errors, depending on the firmware.
After having a 40GB IBM Deathstar suddenly go down in flames, and dozens of "salvage my data!" calls from friends and family, I don't trust any single drive of any age or provenance. ALWAYS have backups.

--
... And so it comes to this.
Re:Linus said something... by couchslug · 2013-09-11 09:12 · Score: 1

Why anyone working on critical software would bother reusing (rather than copying) old hard drives is a mystery to me.
They are cheap enough to throw away, even at a grand a pop for the 400GB Intels.

--
"This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
Re:Linus said something... by citizenr · 2013-09-11 13:59 · Score: 2

His SSD died a natural death of old age.
IMarv
there is NOTHING natural about a drive that disappears without notice with all of your data.

--
Who logs in to gdm? Not I, said the duck.
Re:Linus said something... by MobyDisk · 2013-09-11 15:06 · Score: 1

His SSD died a natural death of old age.
But this isn't the pattern for how SSDs die of old age. When they age, they start to run out of sectors to relocate data, and they report SMART errors indicating the wear leveling is running out of room. Then they enter a read-only state where you can read data for eternity, but they refuse to allow writes.
Re:Linus said something... by sonamchauhan · 2013-09-11 15:18 · Score: 1

Or more likely, his new motherboard chipset (or new drivers), caused the problem.
For example:
http://en.wikipedia.org/wiki/Trim_(computing) ...
The RST (Rapid Storage Technology Option Rom ) and drivers are only allowing Trim to pass to the controller onto the drive in Intel 7 series chipsets using driver versions post 11.2.0.0.
Re:Linus said something... by jones_supa · 2013-09-11 23:30 · Score: 1

But how many SSDs actually implement that strategy properly?
Re:Linus said something... by greg1104 · 2013-09-12 04:23 · Score: 1

Old age can cause controller failures, usually due to the capacitors wearing out. That's becoming increasingly common now that better SSDs are adding larger "supercaps" and similar battery mechanisms, for clean shutdown when power drops.
I've never seen or heard a credible report of an SSD that has a well-implemented read-only mode when it starts to fail. What actually happens when you look at endurance experiments is that the drives stop retaining data for very long when they wear down. Wear out the sectors enough, power off the machine, and the data will be gone by the next day. No manufacturer yet handles it as gracefully as a solid read-only mode would.
Re:Linus said something... by greg1104 · 2013-09-12 04:27 · Score: 2

That SSDs go read-only as they wear down is a myth. I've never seen a single credible report of it happening. If you read real wear-down tests, what actually happens is that the drives stop retaining data when powered off as they get very old. Not a single drive tested there failed gracefully at the end.
Re:Linus said something... by MobyDisk · 2013-09-12 08:34 · Score: 1

Very interesting! I've been wondering about that.
Re:Linus said something... by toddestan · 2013-09-12 14:08 · Score: 1

Approximately none of them. The drives don't usually check to see if the data wrote correctly, so they don't actually realize that the data didn't write when this happens. Except, thanks to wear-leveling mechanisms, the data you wrote was probably scattered all over the media, and some of it wrote, and some of it didn't. So now you have massive data corruption. To add to the hilarity, the drive often will be, behind the scenes, moving data around due to things like wear leveling and TRIM, so as long as the drive is powered, even if you aren't doing anything with it, you can still watch as your data 'decays' away.

Platters of spinning rust by MrEricSir · 2013-09-11 07:57 · Score: 1

The one (personal) thing storage-related that I'd like to re-iterate is that I think that rotating storage is going the way of the dodo (or the tape). "How do I hate thee, let me count the ways". The latencies of rotational storage are horrendous, and I personally refuse to use a machine that has those nasty platters of spinning rust in them.

Bet you regret knocking those platters of spinning rust now, don't you Mr. Torvalds?

--
There's no -1 for "I don't get it."

Re:Platters of spinning rust by PortHaven · 2013-09-11 08:04 · Score: 1

No the real question is why he didn't simply have two mirrored SSDs.
Re:Platters of spinning rust by hawguy · 2013-09-11 08:42 · Score: 1

No the real question is why he didn't simply have two mirrored SSDs.
Yeah.. i was wondering the same thing... My work has far less impact than his (and I'm certain that I get paid far less), but my computer has a pair of SSD's in RAID-1, and I make snapshots to an external hard drive every hour (keeping daily snaps for a month, weekly snaps for a year), *and* I use a cloud backup service that stays relatively in real-time sync.
SSD's are so cheap that there's little reason not to RAID them on a desktop machine unless you don't value your data. They take up so little space that even if you don't have a free hard drive cage, you can squeeze it in somewhere.
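A retention scheme like that (dailies for a month, weeklies for a year) is simple to script. A rough Python sketch under stated assumptions: the function name is made up, the "keep everything from the last 24 hours" rule is an assumption that the post doesn't spell out, and "newest snapshot per calendar day / per ISO week" is just one reasonable way to pick the survivors.

```python
from datetime import datetime, timedelta

def snapshots_to_keep(snapshots, now):
    """Return the set of snapshot timestamps to retain.

    Assumed policy (only the daily/weekly part comes from the post):
      - every snapshot from the last 24 hours
      - the newest snapshot of each calendar day for 30 days
      - the newest snapshot of each ISO week for 365 days
    """
    keep = set()
    newest_per_day = {}
    newest_per_week = {}
    for ts in sorted(snapshots):
        age = now - ts
        if age < timedelta(0):
            continue  # ignore timestamps from the future
        if age <= timedelta(hours=24):
            keep.add(ts)
        if age <= timedelta(days=30):
            newest_per_day[ts.date()] = ts  # later ts overwrites earlier
        if age <= timedelta(days=365):
            newest_per_week[ts.isocalendar()[:2]] = ts  # (ISO year, ISO week)
    keep.update(newest_per_day.values())
    keep.update(newest_per_week.values())
    return keep
```

Anything not in the returned set is a candidate for deletion; the actual deleting is left out on purpose.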
Re:Platters of spinning rust by Overzeetop · 2013-09-11 08:55 · Score: 1

Actually with a continuous backup system, you may only be out several minutes of work. AutoCAD, which I work in, autosaves every 5 minutes, and I work on a local-with-cloud-backup that backs up as files are changed. Worst case, I'm out 7-8 minutes of work.
The delay is, of course, if the SSD craps out that the work environment is gone and has to be rebuilt from scratch or from backup (which, honestly, can take hours even in the best scenario).

--
Is it just my observation, or are there way too many stupid people in the world?
Re:Platters of spinning rust by hawguy · 2013-09-11 09:30 · Score: 1

Actually with a continuous backup system, you may only be out several minutes of work. AutoCAD, which I work in, autosaves every 5 minutes, and I work on a local-with-cloud-backup that backs up as files are changed. Worst case, I'm out 7-8 minutes of work.
The delay is, of course, if the SSD craps out that the work environment is gone and has to be rebuilt from scratch or from backup (which, honestly, can take hours even in the best scenario).
In theory the cloud backup should only be minutes out of date, but I'm not willing to rely on it entirely since there are lots of things that can delay the cloud backup - internet problem/congestion between me and the cloud provider, hardware/software problem/maintenance at the cloud provider, spending hours transferring a 60GB datafile that I accidentally created and don't really need to be backed up, etc. That's why I do hourly snapshots to a local disk, since I'm certain that those will succeed, and rely on the cloud backups more for disaster recovery.

RAID? by Anonymous Coward · 2013-09-11 07:59 · Score: 1

Surely he's not working on a single drive system?

No RAID? No backup? by Nick · 2013-09-11 08:00 · Score: 2, Funny

Was he too busy treating people horribly to audit his DR procedures?

--
Fuck Ajit Pai

Re:No RAID? No backup? by samjam · 2013-09-11 08:30 · Score: 4, Funny

His SSD gave up out of shame for all the threats and abuse it had been forced to witness

--
blog.sam.liddicott.com
Re:No RAID? No backup? by Obfuscant · 2013-09-11 09:04 · Score: 1

Any guesses whether the controller for the SSD had an ARM processor with hidden busses involved?

Hmmm.... by gr0wler · 2013-09-11 08:01 · Score: 1

....I hear rsync is lovely this time of year....

Intel? by stkris · 2013-09-11 08:03 · Score: 1

So Linus got bitten by the same Intel SSD bug that bit me and many others?

Re:Intel? by stkris · 2013-09-11 08:29 · Score: 3, Informative

More info here: http://goran.krampe.se/2013/01/02/ssd-nightmare/
"So power cycling can apparently trigger this - and the disk for some odd reason (self protection?) decides to decapitate itself and set accessible cylinders down to 16 instead of 16384."

why this news? by Laxori666 · 2013-09-11 08:04 · Score: 4, Insightful

Why is this news... is this our version of People magazine, where instead of hearing about all the details of the Kardashians' lives, we hear about every email or event that happens to Linus?

Re:why this news? by Bill,+Shooter+of+Bul · 2013-09-11 08:25 · Score: 1

A hard drive failure is proof that he's a jerk?
Interesting personality test you have there...
My mouse died last week, does that mean that I'm a bad person, or am I just a lush?

--
Well.. maybe. Or Maybe not. But Definitely not sort of.
Re:why this news? by rsborg · 2013-09-11 09:04 · Score: 1

Why is this news... is this our version of People magazine, where instead of hearing about all the details of the Kardashians' lives, we hear about every email or event that happens to Linus?
Did you just call Linus a Kardashian? This impacts timing on the next release of Linux, which impacts many companies and geeks. News for nerds doesn't mean it doesn't involve actual people.

--
Make sure everyone's vote counts: Verified Voting
Re:why this news? by Princeofcups · 2013-09-11 10:26 · Score: 2

Why is this news... is this our version of People magazine, where instead of hearing about all the details of the Kardashians' lives, we hear about every email or event that happens to Linus?
It shows that the best or at least most respected in the business can still be stupid when it comes to simple things like backups. Seriously, there is no reason in this day or age to lose more than a couple of transactions if you are careful. Someone kick Linus in the ass for being so sloppy and lazy.

--
The only thing worse than a Democrat is a Republican.
Re:why this news? by Fwipp · 2013-09-11 10:41 · Score: 1

oops, mismodded.
Re:why this news? by BitZtream · 2013-09-11 14:15 · Score: 1

Well, if you are a lush, that could certainly be a reason your mouse died. How many drinks have you spilled on it?

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
Re:why this news? by BitZtream · 2013-09-11 14:17 · Score: 1

Any company that desperate for a patch can get it from the subsystem maintainer that has it already anyway. No code was lost, and he hasn't been the primary source of code in years; he just curates it. I use "just" only because of the conversation context; it's no small task.

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
Re:why this news? by Bill,+Shooter+of+Bul · 2013-09-11 14:49 · Score: 1

Not sure. The puddle it was sitting in was 65 proof. So I'm not sure if that was a Manhattan gone wrong, or scotch done right.

--
Well.. maybe. Or Maybe not. But Definitely not sort of.
Re:why this news? by bill_mcgonigle · 2013-09-11 17:58 · Score: 1

nah, look at the comments - most people are debating the SSD Hot/Crazy Scale and arguing about the best backup strategies.

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Re:why this news? by jsepeta · 2013-09-12 00:47 · Score: 1

lol, that's pretty funny. for how Linus has recently been tearing off people's heads lately, i had just assumed his problems were more "Get offa my lawn" kind of squawking.

--
Remember kids, if you're not paying for the service, YOU ARE THE PRODUCT THAT IS BEING SOLD.

Backups? RAID? by pubwvj · 2013-09-11 08:07 · Score: 1

I find it amazing to consider that he is not working on a redundant and well backed up machine. Where's last hour's backup? Yesterday's backup? Even pig farmers know to back up their data.

Someone flame him... by HockeyPuck · 2013-09-11 08:08 · Score: 1

I'm no kernel maintainer but...

If his workstation is so important why doesn't he mirror the disks?
Back them up regularly?
Run a remote desktop to a server with the above conditions

Re:Someone flame him... by sjames · 2013-09-11 08:23 · Score: 5, Insightful

He has backups all over the world. But like with any backup, you can't actually restore from it until you replace the failed disk.
Re:Someone flame him... by HockeyPuck · 2013-09-11 17:37 · Score: 1

Like I said, why doesn't he run raid1 and mirror his computer's disk?
Linus flames an awful lot of people when they do something stupid by not understanding X or failing to do Y, so someone should flame him for failing to mirror his drive.
Re:Someone flame him... by sjames · 2013-09-11 18:10 · Score: 1

Because the added cost just isn't worth it for the limited number of cases where it solves a problem? It's not like this is one of those problems costing a zillion dollars per second of downtime. (Of course, in many of those cases it isn't actually costing what they claim).
Step back, take a deep breath, and ask yourself "what are the actual consequences of this?"

The obvious fix .. by Anonymous Coward · 2013-09-11 08:10 · Score: 1

Is to send a profanity-laced e-mail to the hard drive. Perhaps then it will see the error of its ways and begin working properly.

Re:Pathetic by Blaskowicz · 2013-09-11 08:10 · Score: 1

Rsync to a second drive with a cron job?

Re:Pathetic by CastrTroy · 2013-09-11 08:11 · Score: 1

First, there's RAID, so that the death of a single drive doesn't make you lose any time. Just replace the dead disk, and go about your work. Everybody likes to say that RAID isn't a backup, but it's better than nothing. If you have RAID, and pair it with nightly backups of everything that's changed, which is pretty easy to set up, then you're pretty well covered. The only important part being that you actually have to be connected to another machine with a reasonably fast connection in order for the backup to be done. For laptops, this often isn't possible, especially if you are on the road. But even then, you can make sure you copy the really important stuff off to a USB hard disk or some other medium. It's better than nothing.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.

Re:Pathetic by Dragonslicer · 2013-09-11 08:12 · Score: 1

If 30 minutes is near enough to real time for you, you could use rdiff-backup on a cron job. That's what I do at home, though I only run it once per day.

He deserves it. by Anonymous Coward · 2013-09-11 08:13 · Score: 1

What kind of an idiot uses a year-old SSD for critical work? Oh, yeah, the kind of idiot who routinely flames other people for trying to get him to make code changes he doesn't like, rather than explaining his position in a calm, rational fashion.

Karma's a bitch, no matter how famous you are.

BREAKING: Development was also held up.... by musth · 2013-09-11 08:13 · Score: 2

...for over an hour when Torvalds had to make an emergency run to Albertson's for some toilet paper and hostility medication.

Welcome to how SSDs fail. by Mike_EE_U_of_I · 2013-09-11 08:13 · Score: 5, Interesting

I've owned several hundred hard drives over the last 30 years. I've never had an active hard drive just blank out. I have had drives that had not been powered for a couple of years refuse to ever come back. But if I did not feel the need to even power the thing on for years, you can imagine how little I cared for what was on it.

In the last four years, I've owned around 20 SSDs. I've had five failures. In every single one, the drive just instantly lost everything. Amazingly, in four of the five cases, the drive still worked fine! It had simply lost all the data on it and believed itself to be a blank drive.

That said, the speed of SSDs makes them worth the risk to me. But I take backups far more seriously than I used to. I need them far more often.

Re:Welcome to how SSDs fail. by RichMan · 2013-09-11 08:25 · Score: 3, Informative

A hard shutdown of high-speed SSD is death. It takes really really good firmware to recover without reinitializing the drive.
The basic SSD "format" is susceptible to damage on power fails in a way that hard drives are not. The mapping and setup tables of the SSD are critical and constantly in flux, unlike a hard drive, where the mapping is only updated when a failure occurs.
SSD drives need internal power fail control so they can gracefully shut down, and firmware that supports it.
Re:Welcome to how SSDs fail. by Bill,+Shooter+of+Bul · 2013-09-11 08:30 · Score: 1

Oh man, that's scary. Any *good* solution? I've heard RAID is a no-no on SSD as it will shorten its life. Maybe regular BTRFS/LVM snapshots exported to a spinning disk?

--
Well.. maybe. Or Maybe not. But Definitely not sort of.
Re:Welcome to how SSDs fail. by CastrTroy · 2013-09-11 08:32 · Score: 1

Personally, I'd rather just spend the money on a boatload of RAM. Modern OSes are good enough at caching that I very rarely find that I'm waiting on my workstation to access the disk. Sure boot-up is slower, but I generally only reboot my machine when there's updates that required it. Once you got most of your working data in memory, everything is snappy anyway. Perhaps if I was dealing with a much larger dataset (editing videos, or lots of different photos), I might see a reason for having an SSD. But as it stands now, I find that my computer responds quite quickly with a mechanical HDD.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
Re:Welcome to how SSDs fail. by Lehk228 · 2013-09-11 08:33 · Score: 1

SSD isn't for your main file storage anyways, install speed sensitive software and OS on SSD for better performance, everything else goes on cheaper and bigger rust

--
Snowden and Manning are heroes.
Re:Welcome to how SSDs fail. by Anonymous Coward · 2013-09-11 08:45 · Score: 2, Interesting

My Crucial SSD has an issue where it craps out if power is unexpectedly removed. I discovered it has an undocumented "repair mode" where you connect it to power but not SATA for about twenty minutes and it repairs itself.
Still scares the crap out of me every time it happens, but I back up important stuff regularly; it's just there for the speed.
Re:Welcome to how SSDs fail. by EmagGeek · 2013-09-11 08:51 · Score: 1

Stop buying OCZ SSDs.
Re:Welcome to how SSDs fail. by RichMan · 2013-09-11 08:54 · Score: 1

see the long post by tlhIngan discussing flash translation layers (FTL) above for why. The thing is the FTL data has to be written from RAM in the SSD to Flash in the SSD at shutdown time.
Good drives have internal supercaps or batteries that allow them enough time to shut down gracefully if the system power fails without warning. This would be invisible to the system. The big thing is "SHOULD". Firmware bugs, short on power cycles, defective caps/batteries and you are back to the big fail scenario.
Re:Welcome to how SSDs fail. by Overzeetop · 2013-09-11 09:02 · Score: 1

Bootup happens so infrequently now (even on W7 I might cold boot once a month) that time is fairly minor. When I switched from spinner to SSD, the difference in minute-to-minute operations was significant, and that's with 24GB of RAM installed. Then again, I work in CAD with large models, and very large printed (PDF) files (Often TIFF scans 30x42@300-600dpi), so I run through a lot more RAM than the typical person.

--
Is it just my observation, or are there way too many stupid people in the world?
Re:Welcome to how SSDs fail. by Silvrmane · 2013-09-11 09:24 · Score: 2

This isn't likely to happen on a laptop. They sort of have a built-in battery backup. :)

--
planet texture maps and more
Re:Welcome to how SSDs fail. by Mike_EE_U_of_I · 2013-09-11 09:38 · Score: 1

LOL, how did you know? Actually, only my very first SSD was OCZ. It did the insta-blank thing something like 5-6 times before I threw it in the trash. Most of my drives these days are Intel drives, and I've only had one of them insta-blank.
Re:Welcome to how SSDs fail. by RichMan · 2013-09-11 10:12 · Score: 1

> Write FTL changes to a log, make changes according to log, mark log as completed. On power up, check log, if it's not completed
You mean you log to the drive you are trying to manage? Where do you put the changes to the FTL due to your writing the log? Back in the log? That is self defeating circular.
In any case maintaining a log results in -
a) amplifying writes, this has horrible performance overheads
b) always something in the not committed log cache, which is the problem in the first place
c) the need to cache stuff before the log gets processed and marked
d) the need to wait for a log flush or search the write cache before you can service a read
SSDs are fast; all this slows them down considerably.
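For anyone following along, here is a toy sketch of the scheme quoted at the top of this comment (write to a log, apply, mark completed, replay on power-up). It is pure in-memory Python with made-up class and method names, not how any real FTL works, and it models none of the costs in (a) through (d). Where the log itself would live on flash is exactly the open question.

```python
class ToyMappingTable:
    """In-memory stand-in for a mapping table protected by a redo log."""

    def __init__(self):
        self.table = {}  # logical block -> physical block
        self.log = []    # entries: {"changes": {...}, "done": bool}

    def update(self, changes, crash_before_apply=False):
        # 1. Record the intended changes in the log first.
        entry = {"changes": dict(changes), "done": False}
        self.log.append(entry)
        if crash_before_apply:
            return  # simulate power loss right after the log write
        # 2. Apply the changes, then 3. mark the entry completed.
        self.table.update(entry["changes"])
        entry["done"] = True

    def recover(self):
        # On power-up, replay every entry that was never marked completed.
        for entry in self.log:
            if not entry["done"]:
                self.table.update(entry["changes"])
                entry["done"] = True
```

Note that in this toy the log simply survives the "crash", which is the very assumption being disputed.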
Re:Welcome to how SSDs fail. by complete+loony · 2013-09-11 11:50 · Score: 1

Adding to the above response, on an SSD even the log needs to be wear levelled, so where do you keep the log of where your log is?

--
09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
Re:Welcome to how SSDs fail. by Dracos · 2013-09-11 12:56 · Score: 2

This describes several of the reasons why I will not buy an SSD any time in the near future. Sketchy reliability, indeterminate longevity, inexplicable data loss. Mirroring a turd just means you have multiple turds. I have a few 10+ year old DeskStar drives that I still use and that have never given me problems.
Re:Welcome to how SSDs fail. by drinkypoo · 2013-09-11 15:03 · Score: 1

Ultrabooks are about compromise. You make tradeoffs for the form factor and battery life. One of those tradeoffs is reliability. Save early, save often.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Welcome to how SSDs fail. by hawk · 2013-09-11 17:02 · Score: 1

>OS on SSD f
Or ZFS on BSD . . .
[*duck*]
hawk
Re:Welcome to how SSDs fail. by WuphonsReach · 2013-09-11 20:07 · Score: 1

RAID is fine on the enterprise / data-center quality SSDs. Granted, you'll be paying $2-$3/GB for those (Intel DC 3500, Intel DC 3700, etc.) instead of under $1/GB.

The data-center SSDs ship with "super-caps" (large capacitors) inside that store enough power for the SSD to do what it needs to do to flush tables to the flash chips when power is lost.

They are also over-provisioned. While a consumer level "128GB" drive will advertise 128GB of capacity, data center drives will only advertise 120GB of capacity, which gives them more "spare" blocks. And you can help this along by only using, say, 100GB of a 120GB SSD. By leaving the last 20GB untouched, you're giving the wear-level algorithm more "unused" blocks to play around with.
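To put rough numbers on that, here is a small Python sketch. It assumes both drives carry exactly 128GB of raw flash (an assumption for illustration; real drives may carry more) and measures spare area as a fraction of the capacity actually in use. The function name is made up.

```python
def spare_fraction(raw_gb, used_gb):
    """Spare flash as a fraction of the capacity actually in use."""
    if used_gb <= 0 or used_gb > raw_gb:
        raise ValueError("used_gb must be in (0, raw_gb]")
    return (raw_gb - used_gb) / used_gb

RAW_GB = 128  # assumed raw flash; real drives may carry more than this

for label, used in [("consumer, 128GB advertised", 128),
                    ("data center, 120GB advertised", 120),
                    ("data center, only 100GB partitioned", 100)]:
    print(f"{label}: {spare_fraction(RAW_GB, used):.1%} spare")
```

Under that assumption the three cases come out to about 0%, 6.7% and 28% spare, which is why leaving the tail of the drive untouched makes such a difference.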

The Intel DC series SSDs are pretty well designed with a very long life.

--
Wolde you bothe eate your cake, and have your cake?
Re:Welcome to how SSDs fail. by klui · 2013-09-11 22:33 · Score: 1

When the notebook is new. Then when the battery dies, lots of people just leave it plugged into the AC and they're back to square one.
Re:Welcome to how SSDs fail. by petermgreen · 2013-09-12 03:17 · Score: 1

Of course many laptops do emergency hibernate when the battery status gets critical.
If emergency hibernate works, it means you don't lose the work you had in progress when the battery fails, but if it triggers too late then I could see it ending up with the drive's power being cut while under heavy write load.

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:Welcome to how SSDs fail. by blackraven14250 · 2013-09-12 09:33 · Score: 1

Totally true. Some drives have internal components designed to let them flush the cache out to the NAND before it dies. If you keep it on a UPS, then you're basically safe whether it has the extra circuitry/capacitors or not.
Re:Welcome to how SSDs fail. by toddestan · 2013-09-12 14:13 · Score: 1

Generally speaking, many RAID controllers don't pass on the trim information to the drive so it can't run its wear leveling mechanisms as optimally as if it wasn't in a RAID. With that said, since this is Linux just use the software RAID, which IMHO is a much better solution than the cheap RAID controllers like the ones built into motherboards.
Re:Welcome to how SSDs fail. by toddestan · 2013-09-12 14:17 · Score: 1

So put it into sleep mode? Sleep mode has more or less just worked now since about 2006 or so. Yes, even on computers assembled from components. I mean, the computer I'm on now was last rebooted 45 days ago, but has only been running for slightly less than 13 days.
Re:Welcome to how SSDs fail. by nightside · 2013-09-13 06:37 · Score: 1

Same here. Had a Vertex 2 croak on me with no warning; it fell into some internal maintenance mode and that was it. The data was still there, of course, but the disk was just refusing to do anything at all. After my experience with the customer support at OCZ, where they told me they didn't even have the software tools to get the disk out of maintenance mode and all they had from the vendor of the controller was a utility that simply did a factory reset on the drive, I had enough. Don't know what would be worse, them lying to me or them really being unable to debug their own damned hardware. Either way, I have had a Samsung ever since and had no further problems.

Re:Pathetic by jabuzz · 2013-09-11 08:22 · Score: 1

They have. You use something like inotify or a DMAPI-enabled file system to generate a queue of things that need to be backed up and constantly run through it. Getting good performance, however, is going to cost $$$.

Re:Pathetic by timeOday · 2013-09-11 08:28 · Score: 1

Looks intriguing, although I see there hasn't been a release in over 4 years. Is it just that good, in your experience?

Re:But I thought Git was great! by swan5566 · 2013-09-11 08:28 · Score: 1

Well without any pushing to a remote repo it's as mortal as any other source control.

--
In debates about Christianity, there are two groups: those looking for answers, and those looking to just ask questions.

Re:None of that mattered, because by Zero__Kelvin · 2013-09-11 08:28 · Score: 4, Informative

That is correct. In fact he wrote the code that is the industry standard and uses it every day. How else do you think he is going to continue completion of the project on his laptop.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Kernel Panic!!!! by Cmdrx · 2013-09-11 08:28 · Score: 4, Funny

Now there's a new meaning for Kernel Panic!

--
I could write something witty for my sig, but instead wrote this...

Re:Pathetic by Viol8 · 2013-09-11 08:29 · Score: 2

You beat me to it. Anyone with a vague clue about unix would have thought of that. Obviously vague clues are a rare thing for the parent poster.

Re:You trust Torvalds after this? by MouseAT · 2013-09-11 08:30 · Score: 1

We're not complaining because we know full well that Linus has backups. It's called the Git source code repository. All he's lost is time, and assuming that he's saved a whole load by not having to run backups, he's probably ahead on average.

He just needs to switch to his other machine, and re-download the sources from Git. Problem solved, system working as intended. Once he gets a new drive for his old machine, it's a case of re-install the OS, re-sync with Git, done. Maybe copy a few config files off his laptop to save time getting his preferences the way he wants them.

Re:Love it! by jedidiah · 2013-09-11 08:31 · Score: 1

Just how exactly does "that disk was too old" constitute defending Linus?

Sounds like he should have known better. That's probably the way most of the rest of us will take it.

--
A Pirate and a Puritan look the same on a balance sheet.

Re:Pathetic by Dragonslicer · 2013-09-11 08:34 · Score: 1

I haven't had any problems with it, but I also haven't tested the backups much. I know I've seen it get updated in Ubuntu more recently than four years ago, though. The lack of frequent updates is probably more because of how simple it is. There might be some more advanced features that it doesn't have.

Re:Pathetic by TheBig1 · 2013-09-11 08:35 · Score: 2

backintime. I am using it, works great, and the restore functions are quite easy to use as well.

Re:Lame by jedidiah · 2013-09-11 08:37 · Score: 1

Clearly he's just a cheap bastard and didn't want to pay for the extra drive to run mirrored.

--
A Pirate and a Puritan look the same on a balance sheet.

Sorry Fortunte 500 company, my SSD died... by JoeyRox · 2013-09-11 08:39 · Score: 1

Having a single point of failure disrupt something as essential as Linux kernel development doesn't instill confidence in the business world. Why are those pull requests not filtered through a separate system running GitHub, and one with some redundancy?

Re:Sorry Fortunte 500 company, my SSD died... by Robotbeat · 2013-09-11 12:11 · Score: 1, Interesting

Linus Torvalds himself is a single point of failure... People who rely on Linux being updated in a timely manner should figure out what the probability of him dying is or suffering a debilitating stroke. Then, calculate if it's worth bribing him not to take part in risky activities, pay for a safer car, etc.
Re:Sorry Fortunte 500 company, my SSD died... by Microlith · 2013-09-11 13:10 · Score: 1

Having a single point of failure disrupt something as essential as Linux kernel development doesn't instill confidence in the business world.
No it doesn't. The business world that utilizes Linux is never running on the bleeding edge, and silly non-events like this don't shake anyone's confidence.
Re:Sorry Fortunte 500 company, my SSD died... by Gallomimia · 2013-09-12 05:00 · Score: 1

In a world filled with NSA, CIA, FBI, Prism, Snowden, Schwartz, and all the other mysterious deaths and unexplained suicides, we should figure how long ago he should have died by those calculations. It's a wonder he hasn't been assassinated yet.

--
Sadly, a Libertarian cannot force his views on another, and freedom cannot spread as does the cancer known as religion.

Backups are your friend by AaronW · 2013-09-11 08:41 · Score: 2

I learned long ago after some close calls to back everything up. In my case for my desktop I store my data on an XFS partition stored on a RAID 5 hard drive array. I also am using Crashplan to back up all of my data, both to a removable hard drive and to the cloud with over 3TB of data backed up. The nice thing about Crashplan is that it continually backs up, taking periodic snapshots so I can restore a previous version of a file if I wish. The main drawbacks of Crashplan are that it runs on Java and can be a memory pig. I pay $6/month for unlimited backup of up to 10 machines and have several computers backed up with them now. With the proper settings on my router I don't even notice all the backup traffic running in the background.

Since I have had sudden SSD failures in the past I also dump my root XFS filesystem weekly onto my RAID array (it takes under a minute to run xfsdump) and incremental backups nightly and those dumps get backed up on the cloud as well.

I have found the XFS tools to be quite good at recovery when things go really bad. When running software RAID 1 I had problems where drives would drop out of the array for apparently no reason and I have had several occasions where while rebuilding the other drive would pop out of the array. Switching to an Areca hardware raid controller with battery backed DRAM ended those problems (besides seeing a big performance improvement).

I have found the RAID controller to work well when drive failure occurs and it even recovered after human error (I accidentally disconnected one of the active drives while it was rebuilding and reconnected it).

I won't use btrfs yet. The last time I tried it about 6 months ago it was quite slow and I have a lot of concerns about the storage filling up due to COW that have not been adequately addressed as far as I could tell. I tried setting it up for a Cyrus IMAP server on an Intel SSD and it was unusably slow just untarring all the files so I ended up going back to XFS.

SSDs are still relatively new. I have had issues with some firmware versions and had one fail catastrophically after only 2 weeks of use. I have also had compact flash and SD devices suddenly fail. My experience is that usually mechanical hard drives give some warning (i.e. SMART) and they tend to last years. I have a server I just retired where the hard drive had 10 years on the clock according to SMART.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.

Re:You trust Torvalds after this? by hawguy · 2013-09-11 08:44 · Score: 2

I don't feel anything but shame for someone losing data in a hard drive crash who has or should have network backups available to them. If this happened to anyone but Linus the majority of the comments would be calling the coder a n00b. If it was Ballmer there would be an absolute riot of anti-MS venom....

I guess the great Linus has fallen into shadow.

As someone who's taken over server administration from very talented developers a number of times, I've found that being a great developer doesn't mean that you're a great sysadmin. Developers may understand conceptually that RAID and backups are important (but sometimes think that RAID is a backup), but that doesn't mean that they actually set them up.

Well.. by nitehawk214 · 2013-09-11 08:53 · Score: 1

I think I speak for everyone here when I say... this

--
I'm a good cook. I'm a fantastic eater. - Steven Brust

He has jumped on his laptop. by ralphaostrander · 2013-09-11 08:54 · Score: 1

And is back at work. Jesus H. Christ, everyone on the planet has a copy.

Does this mean all Linux Development will stop... by Dareth · 2013-09-11 08:54 · Score: 1

Does this mean all Linux Development will stop until they come up with a way to prevent or recover quickly from SSD failures?

--

I only look human.
My mother is a halfling and my dad is an ogre, so that makes me an Ogreling

RAID by Larry_Dillon · 2013-09-11 08:57 · Score: 5, Interesting

I'm not nearly as much of a believer in RAID for the home environment. If you (accidentally) delete something on one drive it's gone from both. Better to buy two drives and do a daily rsync. That way you have a window of opportunity to recover data. Personally, I use rsync without --delete until the 2nd drive starts getting full, then I use the --delete flag to clean up.
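A rough Python sketch of that routine, for the curious. The 90% "getting full" threshold and both function names are assumptions for illustration; only `rsync -a` and `--delete` are real rsync options here.

```python
import shutil

def disk_used_fraction(path):
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total

def rsync_command(src, dst, used_fraction, full_threshold=0.9):
    """Build an rsync argv; add --delete only once the target is nearly full."""
    cmd = ["rsync", "-a"]
    if used_fraction >= full_threshold:
        cmd.append("--delete")  # prune files that no longer exist in src
    # Trailing slash on src: copy its contents, not the directory itself.
    cmd += [src.rstrip("/") + "/", dst]
    return cmd

# Hypothetical usage from a daily cron job (paths are placeholders):
#   import subprocess
#   cmd = rsync_command("/home/me", "/mnt/backup/home",
#                       disk_used_fraction("/mnt/backup"))
#   subprocess.run(cmd, check=True)
```

The command is built as a list and kept separate from running it, so it can be printed and checked before it is ever allowed to delete anything.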

--
Competition Good, Monopoly Bad.

Re:RAID by countach74 · 2013-09-11 09:08 · Score: 1

Likewise, except I keep incremental backups for anywhere from 2 weeks to 2 months and as such, I do use --delete. I was rather shocked that Linus actually lost substantial data.
Re:RAID by michrech · 2013-09-11 09:09 · Score: 2, Interesting

I know I'll probably see negative moderation as a result of what I'm about to post (being as I'm about to talk up WHS2011 in a Linux related thread), however...
I stopped using RAID in any of my systems after I started using WHSv1. WHS2011 has the same feature -- live system backups. If a drive fails, I pop in a new one (of any type/size), boot a CD that came with WHS (essentially a WinPE environment with a recovery software baked in), select my backup (I save 7-10 days -- I forget what it's set to), and in about an hour my system is back to the state of the last backup. WHS is set to perform the system backups between 00:00 - 02:00 every night. The very first system backup is a 'full' backup, the rest are 'diffs'. I've had to use this feature on two of my systems, so far, and both were because of crappy WD drives (OOOOHhhh, I hate that brand soooo much). It came in really handy when I switched both my primary desktop and my laptop from mechanical HDD's to SSD's. I forced a backup, swapped the drives, and then restored...
This way, either my WHS storage pool (based on StableBit's DrivePool product) or my workstation HDD's can fail, and I can easily recover. It's automagic, manageable via a single UI, and because of DrivePool, I can easily increase the storage space at any time (without interrupting other users of the storage pool). /me puts on his asbestos underpants

--
bork bork bork!
Re:RAID by Blackknight · 2013-09-11 09:11 · Score: 2

RAID 1 with a nightly rsync to an off-site server has worked for me for several years now. The remote server runs zfs so I also take weekly snapshots in case I need to restore something older than last night.
Re:RAID by jekewa · 2013-09-11 09:51 · Score: 2

Accidental deletion is a whole different beast. If you accidentally delete something created between rsync copies it's gone for good, too, and rsync can't save you.
Unless your tool does some incremental storage for you. For example, Eclipse saves each save in a local history, including deletions, so you can go back in time even if all you did is change the file (which would also have "not there" impact between rsync copies).
If you need that kind of assurance, you'll need more than rsync or RAID.

--
End the FUD
Re:RAID by Trogre · 2013-09-11 10:09 · Score: 5, Informative

You guys should really look at the --backup and --backup-dir options in rsync.
I use them in conjunction with --delete to always have a "current" copy of the data, along with any old files (i.e. ones that have been updated or deleted) in a separate backup folder, named after the current day of the month.
That way you get a directory structure as follows:
01
02
03
04 ...
31
Current
You can restore the up-to-date set from Current at any time, and if you want to retrieve a file you deleted or over-wrote five days ago, go look in folder 06.
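A minimal sketch of how such a command line might be assembled, in Python. The function name and the /backup paths are hypothetical; --backup, --backup-dir and --delete are the real rsync options mentioned above.

```python
from datetime import date


def rotating_backup_command(src, dest_root, today):
    """Build an rsync argv that mirrors src into Current and moves
    replaced or deleted files into a folder named after the day of the month."""
    day_dir = f"{dest_root}/{today.day:02d}"  # e.g. /backup/06 on the 6th
    return [
        "rsync", "-a", "--delete",
        "--backup", f"--backup-dir={day_dir}",
        src, f"{dest_root}/Current/",
    ]


# Hypothetical source and destination.
print(rotating_backup_command("/home/", "/backup", date(2013, 9, 6)))
```

Since the day folders get reused every month, you would presumably want to empty that day's folder before each run; that step is left out here.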

--
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
Re:RAID by cheater512 · 2013-09-11 10:10 · Score: 1

Have you considered something like Duplicity? Pretty much the same (still uses some of rsync's guts) but compresses and makes it convenient to send off site as well.
Re:RAID by Nevo · 2013-09-11 10:16 · Score: 2

WHS = Windows Home Server
Re:RAID by countach74 · 2013-09-11 10:46 · Score: 1

This is more or less what I do. I forget why I didn't use these specific flags in rsync (the whole process is handled by a Python script I wrote that does a few other things as well).
Re:RAID by Miamicanes · 2013-09-11 10:50 · Score: 5, Interesting

The thing that really sucks about SSDs (at least, Sandforce-based drives) is the fact that 99% of their failures are due to firmware bugs that can be simultaneously triggered on an entire array at once (especially the sleep-related bugs). It's a mode of failure the creators of RAID 1, 5, and 10 never anticipated.
IMHO, the worst thing about SSDs (at least, those with Sandforce controllers) is the fact that they have mandatory full-drive encryption that can't be disabled, using a key you aren't allowed to set or recover, and gets blown away whenever you reflash the firmware. This means, among other things, if the drive's controller gets itself confused:
* You can't reflash data-recovery firmware onto the drive. The act of flashing it would blow away the encryption key and render the data gone forever.
* If the drive decides you're trying "too hard" to systematically extract data from it while it's in a confused state, it'll go into "panic mode" by blowing away the encryption key. If this happens, your data is gone forever AND you have to send the drive back to OCZ or whomever you got it from in order to get it unlocked. For your protection, of course. And Hollywood's. Among other things, dd_rescue/ddrescue can trigger panic mode.
* You can't even do the equivalent of removing the platters from a conventional drive in a clean room and mount them to another drive for reading, because the data on the flash chips is all encrypted, and the key is unrecoverable.
This is BULLSHIT, and it's why I refuse to buy any more SSDs. I, as an end user, should be able to download a utility from somewhere, reflash the drive to firmware that includes an offline recovery mode that simply dumps the flash chip content from start to finish, and either disable the encryption or set it to a key *I* control, so the 99.99999% of the data on the drive that's good when the embedded firmware freaks out can be dumped and recovered offline.
If there's a God, Linus will go NUCLEAR over this, get a few seconds on CNN & other networks to rant about the unreliability of SSDs, and scare enough consumers to hit the industry HARD where it'll hurt the most... their bank accounts.
It might not be possible to make SSDs reliable, but DAMMIT, they should at least be RECOVERABLE. There were goddamn hard drives with recoverable data pulled out of laptops left in safes in the Vistamark hotel when a tower sheared it in half and buried it under flaming rubble, yet a SSD that dies if you so much as look at it the wrong way due to firmware bugs ends up being fundamentally unrecoverable for no hard technical reason.
And yes, I'm bitter about having my hard drive commit suicide for no reason besides Sandforce Business Policy. As long as they keep making controllers that cause drives to self-destruct at the drop of a hat, I'll keep doing my best to talk people out of buying drives tainted by their controller chips. Sandforce sucks.
Re:RAID by Blackknight · 2013-09-11 10:54 · Score: 1

I am far too boring for the NSA to care.
Re:RAID by Nerdfest · 2013-09-11 11:27 · Score: 1

I use CrashPlan. Nothing like off-site backups for the truly paranoid.
Re:RAID by Solandri · 2013-09-11 11:39 · Score: 5, Informative

I stopped using RAID in any of my systems after I started using WHSv1. WHS2011 has the same feature -- live system backups. If a drive fails, I pop in a new one (of any type/size), boot a CD that came with WHS (essentially a WinPE environment with a recovery software baked in), select my backup (I save 7-10 days -- I forget what it's set to), and in about an hour my system is back to the state of the last backup.
There's the operative phrase. RAID is for systems where you can't have or don't want an hour of downtime while restoring from a backup. The R in RAID stands for redundant. As in you can have a failure and keep going.

Note that this is the converse of "RAID is not a backup!" Just like RAID is not a replacement for a backup, a backup is not a replacement for RAID either. They do different things (and if you're smart, you will also backup your RAID). From your own description, you wanted a backup. RAID was never the correct solution for your needs.
Re:RAID by hobarrera · 2013-09-11 11:54 · Score: 1

RAID is for redundancy and availability, rsync is for backups, totally different things.
Re:RAID by nullchar · 2013-09-11 12:04 · Score: 1

Yes, --link-dest is how to make incremental backups with rsync so you don't actually use --delete.
If you deleted a file, then the newest incremental simply won't have it in its directory tree.
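Roughly, each run writes into a fresh dated directory and hard-links unchanged files against the previous one. A sketch in Python that just builds the command; the function name, the /backup layout and the date-named directories are assumptions for illustration, while --link-dest is the real rsync option.

```python
def link_dest_command(src, snapshot_root, new_name, previous_name=None):
    """Build an rsync argv for one incremental snapshot.

    Unchanged files are hard-linked against the previous snapshot, so each
    snapshot looks complete but only changed files take up new space.
    """
    cmd = ["rsync", "-a"]
    if previous_name is not None:
        cmd.append(f"--link-dest={snapshot_root}/{previous_name}")
    # The destination is a new, empty directory, so no --delete is needed:
    # files removed from src simply never appear in it.
    cmd += [src, f"{snapshot_root}/{new_name}/"]
    return cmd


print(link_dest_command("/home/", "/backup", "2013-09-11", "2013-09-10"))
```

Expiring old backups then amounts to removing the oldest snapshot directories.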
Re:RAID by Billlagr · 2013-09-11 12:30 · Score: 1

Nobody is too boring, citizen. Go about your life now.
Re: RAID by afidel · 2013-09-11 12:32 · Score: 1

I use Crashplan with a local copy. I figure enough cloud backup providers have gone bust that it's worth having a local copy if I need it, but in the event my house burns down or gets hit by a tornado I have Crashplan to restore from.

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Re: RAID by Nerdfest · 2013-09-11 12:37 · Score: 1

I do as well. The local copy is for convenience ... the cloud copy is for safety.
Re:RAID by GigaplexNZ · 2013-09-11 13:27 · Score: 2, Insightful

That's just asinine. You should never rely on recovery of data from a broken drive to avoid data loss. Even if you do recover data from a broken HDD you shouldn't trust it hasn't had some form of corruption. Always have a backup. If you have backups, who cares if the drive is recoverable?

Also, don't buy Sandforce SSDs. There are plenty of alternatives that are faster and more reliable.
Re: RAID by MightyYar · 2013-09-11 13:43 · Score: 1

I'm even more paranoid and use Crashplan for the "cloud" and a native backup (Windows Backup, Time Machine, etc) for local backups to a local server. That way, if Crashplan screws up I still have the other backup method going.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
Re:RAID by BrokenHalo · 2013-09-11 13:56 · Score: 1

The R in RAID stands for redundant. As in you can have a failure and keep going.

Or it should be. I have met several people who use RAID-0 who don't seem to have worked out that that level offers no redundancy whatsoever. [Sigh...]
Re:RAID by mysidia · 2013-09-11 14:42 · Score: 1

I'm not nearly as much of a believer in RAID for the home environment. If you (accidentally) delete something on one drive it's gone from both.
This is where copy-on-write snapshots come in, as in zfs snapshot pool/dataset@2013-09-11-0001 type stuff.
Daily rsync still doesn't address latent corruption that gets discovered a long time later --- the data corruption propagates to the backup.
On the other hand.... a filesystem with copy-on-write snapshots and filesystem-level block checksums addresses that.
For bonus points.... use a setup that can do asynchronous replication or transfer of snapshots with minimal network usage... a la zfs send and zfs recv.
Because all the local backups in the world are great for restores --- but horrible for disaster recovery: a natural disaster can take out an entire building, and of course.... all the disk drives in a RAID array.
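A rough sketch of how the snapshot-and-replicate routine could be scripted, in Python. It only builds the commands. The function names, the host name and the target dataset are hypothetical, and reading the snapshot suffix as date plus HHMM is a guess at the naming scheme; zfs snapshot, zfs send -i and zfs recv are the actual subcommands.

```python
from datetime import datetime


def zfs_snapshot_name(dataset, when):
    """Name a snapshot after its timestamp, e.g. pool/dataset@2013-09-11-0001."""
    return f"{dataset}@{when:%Y-%m-%d-%H%M}"


def zfs_snapshot_command(dataset, when):
    """argv for taking the snapshot."""
    return ["zfs", "snapshot", zfs_snapshot_name(dataset, when)]


def zfs_replicate_pipeline(previous, current, remote_host, remote_dataset):
    """Shell pipeline that sends only the changes between two snapshots
    to another machine over ssh (incremental send)."""
    return (f"zfs send -i {previous} {current} "
            f"| ssh {remote_host} zfs recv {remote_dataset}")


older = zfs_snapshot_name("pool/dataset", datetime(2013, 9, 10, 0, 1))
newer = zfs_snapshot_name("pool/dataset", datetime(2013, 9, 11, 0, 1))
print(zfs_snapshot_command("pool/dataset", datetime(2013, 9, 11, 0, 1)))
print(zfs_replicate_pipeline(older, newer, "backuphost", "tank/copy"))
```

The incremental send only works if the receiving side already holds the older snapshot, so the first transfer has to be a full zfs send.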
Re:RAID by omnichad · 2013-09-11 14:45 · Score: 1

Accidental deletion is a much simpler matter on a hard drive. The blocks are marked as free, but the data's really still there. With SSD, you don't have long before TRIM comes along and zeroes out the data.
Re:RAID by mysidia · 2013-09-11 14:53 · Score: 1

There's the operative phrase. RAID is for systems where you can't have or don't want an hour of downtime while restoring from a backup.
That applies to my home workstation. I would rather not have 5 minutes of downtime; with the possible exception of the sort of natural disaster that happens once every 100 or 1000 years or so.
A mirrored pair of drives IS a backup. It's just an operational backup of every bit right beside a partner copy of the bit in another disk drive package.
The backup provided by drive mirroring is useful in many respects for reducing the impact of a hardware failure; however, there are many possible correlated failures that will affect both disk drives simultaneously: or enable a failure to spread.
There are stronger types of backups you should have; and it is most cost effective to make sure you have the strongest kind of backup, a good offsite backup with a fairly good retention period and periodic verification of restorability, as top priority: before you start planning the additional weaker kind of backups you should have --- such as a local backup on separate media (to speed up restore times and recovery, for local mishaps, by allowing you to avoid taking a few hours drive to go visit your offsite location) ------ and then synchronous mirroring, the weakest kind of backup, but with the Shortest possible delay restore time, for the incidents it can help with (The data's already on the media, duh.... you just need to insert an additional drive later to repair the synchronous mirroring arrangement).
Re:RAID by drinkypoo · 2013-09-11 15:06 · Score: 3, Insightful

You're right in that you should never rely blah blah blah, but he's right in that you should be able to attempt recovery. And he's more right, because he never said you shouldn't make backups.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:RAID by bingoUV · 2013-09-11 16:54 · Score: 1

before you start planning the additional weaker
Yes, so that having difficult work to do first causes people to postpone the whole backup project until a data loss. Just to protect against once in a millennium events. I heard somewhere that the perfect is the enemy of the good.
Or, adopt the data protection scheme with the best cost-benefit ratio first - which for most people is a local backup. And then worry about once in a millennium events.

--
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Re:RAID by fnj · 2013-09-11 17:04 · Score: 3, Informative

Why not do it right?
Re:RAID by bemymonkey · 2013-09-11 19:19 · Score: 4, Insightful

So... stay the fuck away from Sandforce controllers? This has been common knowledge for years...
Re:RAID by semi-extrinsic · 2013-09-11 19:43 · Score: 1

rsync is designed to send things off site (with ssh), and it does compression. At least if you mean compression for transmission over the network.

--
for i in `facebook friends "=bday" 2>/dev/null | cut -d " " -f 3-`; do facebook wallpost $i "Happy birthday!"; done
Re:RAID by JImbob0i0 · 2013-09-11 20:15 · Score: 1

I've gone with BTRFS drives in a RAID1 profile for data storage with a daily (keep past three days) subvolume snapshot (COW so minimal space used for it) for the accidental deletion events.
Re:RAID by TheRaven64 · 2013-09-11 22:54 · Score: 1

The R in RAID-0 does stand for redundant. As in 'the person who made the decision to use RAID-0 on our infrastructure has just been made...'

--
I am TheRaven on Soylent News
Re:RAID by cheater512 · 2013-09-11 23:30 · Score: 1

Duplicity does at-rest compression, and also lets you retrieve a backup from days ago even if you back up daily.
Re:RAID by oernii · 2013-09-12 01:02 · Score: 1

or just use rsnapshot
Re:RAID by QBasicer · 2013-09-12 01:59 · Score: 2

I have two Linux machines and two NASes.
The first Linux machine, my laptop, rsyncs itself to the other Linux machine and to a QNAP NAS that's in RAID5.
The second Linux machine (desktop) backs itself up to the QNAP as well.
The DNS323 gets backed up to the QNAP NAS and to the desktop Linux machine.
The QNAP NAS gets backed up once a quarter to an offsite location.
I figure in my plan, I have enough redundancy and backup that I can recover from most failures.

--
x86, oh yes, I'm pro.
Re:RAID by QBasicer · 2013-09-12 02:05 · Score: 1

Offtopic, but Vistamark Hotel? I can't seem to find anything on that. Just curious.

--
x86, oh yes, I'm pro.
Re:RAID by steveg · 2013-09-12 08:40 · Score: 1

Same here. And although I do have a subscription and back up to Crashplan's servers, I also back my home machines up to a USB drive on my work computer and to a friend's machine across the continent.
That, in my opinion, is the really nice thing about Crashplan -- the ability to do offsite backup for free if you can round up other people willing to share space and bandwidth.

--
Ignorance killed the cat. Curiosity was framed.
Re:RAID by kiatoa · 2013-09-12 11:54 · Score: 1

WHS sounds interesting but a bit complicated. Also, obviously not available on Linux. After many years of trying (in no particular order) rsync, unison, bup, ugarit, btrfs raid, afs (that was the worst install ever) and probably a few others I found Moosefs and for my situation it seems to work pretty well. My requirement is that I can slap a few components together for a new system, boot from USB stick and be up and running in under 15 minutes of my time (the install process may well run much longer but I can get other things done). I have a script to install Moosefs and install all the packages I normally require. Moosefs replicates my data across multiple machines keeping N copies of each file where N is: /home = 3, /mythtv = 1, photos = 3 and so forth. Machines and drives seem to break every year or so and it is annoying but easy to build the new machine and install everything needed. This has been a low stress solution because the machines are very minimally configured. No LVM, no volumes etc. Just two partitions and a default install. Oh, and because I use fossil every time I commit my changes are auto synced off site which lowers my stress/fear from losing data even more. I use bup (will switch to ugarit when I get time since I'm a scheme fanboy) to keep offsite backups as I don't feel comfortable with my data in the cloud (adjusts tin hat carefully). Oh, and mythtv seems to be just fine writing/reading shows to moosefs. If I have simultaneous recording, watching and commercial flagging I do get stuttering but I can live with that. Moosefs has a feature where you can use a remote server for storing a copy of your data without slowing down access but I haven't tried it yet.

--
90% of the wealth is in 2% of the pockets. Bummer to be in the majority.
Re:RAID by Miamicanes · 2013-09-12 14:21 · Score: 1

Oops. Brain fart. It was the "Vista International" (at least, until Marriott bought it at some point after I was in Middle School). It's what happens when you grow up in a state like Florida with hotels literally everywhere, and your brain ends up using lossy compression to (somewhat) remember them. Occasionally, you'll remember a hotel whose name is partly right, and partly random fragments of other hotel names you've seen over the years.
Re:RAID by przemekklosowski · 2013-09-12 15:51 · Score: 1

Why not LVM or btrfs snapshots?
Re:RAID by gottabeme · 2013-09-12 18:57 · Score: 1

Houses burn down and are burglarized much more often than that.

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID by gottabeme · 2013-09-12 19:02 · Score: 1

Here's a handy tool to watch for bitrot: http://onethingwell.org/post/20775845211/chafifi

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID by bingoUV · 2013-09-13 00:18 · Score: 1

Citation needed. Which would be pedantic and yet my point stands. Do backup with better cost benefit ratio first.

--
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Re:RAID by Jon_S · 2013-09-13 03:01 · Score: 1

Or use backuppc (http://en.wikipedia.org/wiki/BackupPC). Does it all for you with a nice web-based front end.
Pain in the ass to set up, but once it is up and running, works flawlessly. Saved my ass a few times.
Re:RAID by fnj · 2013-09-13 06:16 · Score: 1

I imagine he feels he doesn't need the trivial wrapper around rsync and can just drive it manually?
Yet his solution is nowhere near as sophisticated as rsnapshot's. Rsnapshot is far from a trivial wrapper.
Re:RAID by gottabeme · 2013-09-13 07:37 · Score: 1

You're demanding a citation for my assertion that houses are burglarized and burn down more often than once a millennium? ...
I'll play the same card against you: Citation needed that local backups are more cost effective. I think most people are better served with automatic, online, remote backups like CrashPlan, Dropbox, etc. for the simple reason that no further interaction is required. The user doesn't have to maintain backup hardware, replace it when it fails, copy old backups to new hardware, deal with obtuse backup software settings, etc.
I've helped people with this issue before, and they often end up with 5 different local backup schemes on 5 different devices--some internal, some external--using 5 different kinds of backup software, and none of them are up-to-date or even working properly anymore. On top of that, they install something like QuickBooks, which by default wants to backup its own data with its own backup app, which expires yearly.
Then I come in, install Dropbox and/or CrashPlan, and they don't have to mess with it anymore. It just works. On the rare occasion they need to restore data, it being remote and online is not a big deal--it would probably take them twice as long to cobble together a coherent restore set from all their "local backups." And just as important, they won't get bitten by a failing backup drive just when they need to restore, because they haven't been monitoring the SMART data on their external USB drive.
In short, I think your point falls flat on its face for the large majority of computer users.

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID by bingoUV · 2013-09-13 19:44 · Score: 1

Ok, I see 3 issues with your post :
1. Incorrect or at least hugely unsubstantiated :

You're demanding a citation for my assertion that houses are burglarized and burn down more often than once a millennium
I have lived for 10 years in a city whose population has increased from 4.5 million 10 years ago to 6 million now. There have been precisely 4 big incidents of fire - one in a very low tech market, one in an office, 2 in residential houses. The office fire was all about smoke - building + furniture survived but basement fire released such smoke that some people were suffocated / jumped to death. Out of about a million houses in 10 years. Expectation value of a particular house being destroyed in fire - less than once in a million years. NOT taking into account people burning their cake in an oven in an attempt to bake it - as I strongly recommend not keeping backup hard disks inside an oven in use.
Burglaries are more common, but complete burglaries are nearly unheard of. Burglars are scared - so they quickly grab cash, gold, and sometimes small electronics that they understand - like a mobile phone. So yes, I am not sure data loss due to burglaries or fire are more than once a millennium events.
The post I was replying to already admitted those events happen once in 100 or 1000 years, so if you argue about whether it is 1000 or 100 years, it is pedantic in that regard. Without even the rigour of citation expected of a good pedant. My saying "once in a millennium events" was merely a way to refer to the events that parent post was referring to while admitting they happen in 100 to 1000 years.
2. Strawman, or at least misunderstanding the gist of my post :

I'll play the same card against you: Citation needed that local backups are more cost effective.
I was arguing that the policy of "safest backup first", to the extent of protecting against once a millennium events (admitted in the post I was replying to) results in no backup at all due to people postponing difficult activities. This is human nature, and your misunderstanding it will not make it better. You don't even assert that this is not how human nature works, so your argument is addressing the strawman in that regard.
3. While arguing about remote backup being "cost effective", you completely ignored cost? Short term memory loss? Calculate the cost of dropbox account as compared to a USB drive per year and get back to me. The cost comparison is so laughable that I guess I don't need to compare but just mention that you forgot to take the primary motivation into account. It is like going for shopping and forgetting to shop and coming back home, but maybe it happens to you due to a medical condition. Hope it doesn't happen too frequently.
You also didn't take into account the upload bandwidth that would be required for online backups that most people don't have. And I know multiple people who depended on MegaUpload for backup.
Again, I remind you of an aspect of human nature - the following don't happen together :
1. spending on cloud backup,
2. even after MegaUpload and Snowden
3. by a non-geek
4. before a data loss actually happens
5. In spite of internet bandwidth and data limits
Reduce the cost drastically by asking them to get a USB drive and back up manually or with Time Machine or Windows Backup, and within a few months it can happen much more easily, even though it protects against fewer damage vectors.

--
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Re: RAID by nobodie · 2013-09-14 02:57 · Score: 1

I do a daily backup to a second 2 TB hard disk, so I could lose a few hours. But then I am doing nothing important and unreproducible. Oh, and then monthly I do a backup to an external drive that I keep in a separate location. That way even weather or fire or zombies would have to strike two separate locations to get everything.

--
Subversion of spatial scale luxury decoration ideas.
Re:RAID by Twinbee · 2013-09-14 03:18 · Score: 1

If you took 5 minutes to look at reviews on Amazon, you'd see the latest Intel, Samsung and Crucial drives are much higher rated. 5 minutes of research can spare you up to 100x in pain and time when something goes wrong.

--
Why OpalCalc is the best Windows calc
Re:RAID by mysidia · 2013-09-14 07:33 · Score: 1

Computers are expensive electronics, and have significant resale value. Crooks will happily go after Televisions, Computers, Monitors, iPads, and other similar big-ticket items.
People have laptops stolen with loss of data all the time. This is especially common at airport security checkpoints, or when a laptop has been left in a vehicle, or briefly left unattended in a public place such as a student union or a library. I know 2 different longtime friends who had laptops stolen, and I know of some people who lost their hard drive data to water damage when there was a flood; I also know of a case where a hot water leak dripped water from a second floor onto a computer lab and ruined a whole bunch of machines together.
See this article
Computer Theft and Computer Loss 15% of households annually experience burglary or theft according to the Bureau of Justice. While statistics are not available for what was stolen, when a home is burglarized, a computer is a likely target.
Re:RAID by gottabeme · 2013-09-15 03:40 · Score: 1

Ok, I see 3 issues with your post :
1. Incorrect or at least hugely unsubstantiated :

You're demanding a citation for my assertion that houses are burglarized and burn down more often than once a millennium
I have lived for 10 years in a city whose population has increased from 4.5 million 10 years ago to 6 million now. There have been precisely 4 big incidents of fire - one in a very low tech market, one in an office, 2 in residential houses. The office fire was all about smoke - building + furniture survived but basement fire released such smoke that some people were suffocated / jumped to death. Out of about a million houses in 10 years. Expectation value of a particular house being destroyed in fire - less than once in a million years.
1. Nice strawman. "Big incidents of fire" are not the point at hand. The issue is property being damaged or destroyed by fire.
Or if you honestly believe that in a city of 6 million people with 1 million "houses" (and using that term rather than more accurate ones suggests you haven't done any research) that property has only been damaged by fire 4 times, you must be delusional or lying. Go look up some actual records and come back to me. I live in a town of around 20,000 and I hear fire trucks from just the closest fire station going to calls nearly every day. Even if half of those were not fire-related, and even if only half of fire-related calls were actual fires, that would still be 91 fires per year, and if only half of those resulted in actual property damage, that'd still be 45 cases per year. And that doesn't even count the other fire stations in town. Your city is 300 times larger than mine. It doesn't take a genius to do the math. Seriously man, if fires were so rare, we wouldn't have such stringent building codes. What are you thinking?
2. It's absurd to even make an argument about "a particular house being destroyed in fire less than once in a million years." That's just silly. Be realistic and make a rational argument--at least, if you want anyone to take you seriously.
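The halving estimate above is easy to check. A quick sketch in Python; treating "nearly every day" as 365 calls a year is a simplification, and scaling linearly by the 300x population ratio is an assumption added here, not a claim about real fire statistics.

```python
def halving_estimate(calls_per_year=365):
    """Halve the yearly call count three times, as in the estimate above:
    fire-related calls, then actual fires, then fires with property damage."""
    fire_related = calls_per_year / 2
    actual_fires = fire_related / 2
    with_damage = actual_fires / 2
    return fire_related, actual_fires, with_damage


fire_related, actual_fires, with_damage = halving_estimate()
print(int(actual_fires))   # the "91 fires per year" figure
print(int(with_damage))    # the "45 cases per year" figure

# Assumption: damage cases scale linearly with population (city ~300x the town).
print(with_damage * 300)
```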

Burglaries are more common, but complete burglaries are nearly unheard of. Burglars are scared - so they quickly grab cash, gold, and sometimes small electronics that they understand - like a mobile phone. So yes, I am not sure data loss due to burglaries or fire are more than once a millennium events.
1. Another irrelevant strawman. What even is a "complete burglary"? Does that include the refrigerator? the toilet? the house itself?
2. News flash: most criminals--especially smash'n'grab types--are not smart. They don't care whether they "understand" what they steal--they aren't interested in using what they take. They only care if it has resale value--and all computers do. They'll happily snap up your laptop, desktop, and any external drives that may be attached--or they'll snap up your laptop bag and everything in it. Another, albeit less common, type of criminal is one interested in identity theft, and those types will most definitely want to steal your data, so they'll take anything that looks like a data storage device.
3. Again, I live in a small town, and I can find reports of petty theft and break-ins in the paper anytime I look. A month ago the gas station down the street was held at gunpoint by three men--and this is a small town with no more than average crime! And your city is 300 times larger. You're deluding yourself.

The post I was replying to already admitted those events happen once in 100 or 1000 years, so if you argue about whether it is 1000 or 100 years, it is pedantic in that regard. Without even the rigour of citation expected of a good pedant.
Allow me to be more pedantic for you:
* 100 -> 1000 is an order of magnitude--it is significant.
* What does it even mean to say "those events happen once in 100 or 1000 years"? Those events

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID by funky_vibes · 2013-09-15 12:51 · Score: 1

The whole SSD market is another OEM-style catastrophe where you have a few chipmakers who want SSD drive manufacturers to basically just use their reference designs and sell the same stuff under many brand names.
Which means the manufacturer doesn't fully understand how their devices operate, and can only offer limited support.
The FTL layer itself is a failure, as long as it pretends to be a block device. Because there are plenty of common failure modes that cannot be accounted for if the OS doesn't understand flash.
When time passes flash blocks will wear out and fail.
If the flash blocks run out, what should the FTL do?
Overwrite existing data? Return corrupt data? Fail to write, leaving the system in an inconsistent state?
None of these behaviours make sense from a block device point of view, and put the system in an inconsistent state no matter how you look at it.
Not to mention the wasted memory.
Let's say you have an 80GB flash drive, it may in reality contain 120GB of flash memory. It must fail to function when the amount of memory decreases below 80GB of perfectly good flash memory.
Actually, the situation might be better if there was a filesystem built in to the drive. You just fetch and write to file names over serial. Query disk usage, file size etc.
But let's not go this way, we already tried it in the 80s.
An SSD is very expensive, why can't it have pluggable memory modules?
And why can't it offer direct access to the flash bus?
A proper flash file system can offer storage throughout the whole lifetime of the flash memory, and easily verify writes and increase performance way beyond what SSDs with FTL can offer. If tied to a memory bus, we can use execute in place to save RAM and energy.
Re:RAID by bingoUV · 2013-09-15 17:12 · Score: 1

In context of your reply, my post can be said to have 2 parts:
1. Do the more cost-effective (your words) backup first.
2. Local backup is more cost-effective.
If you are saying that dropbox is more cost-effective and hence should be done first, you agree with the first part, where I am also saying more cost-effective backup should be done first. There, you are addressing a strawman that doesn't agree with this.
As for your contention against the second part, that local backup is more cost-effective, unfortunately you forgot to mention anything concrete about cost AGAIN. Empathy about your shopping life stops me from ridiculing.
PS : I am sure your forgetfulness is aided significantly by the fact that cost factor utterly destroys your argument for dropbox being more cost-effective.

--
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Re:RAID by gottabeme · 2013-09-19 03:32 · Score: 1

1. This "first" fallacy. As we discussed, online backups sometimes take a while to complete the initial backup. You can do this while continuing to make local backups to external media, and you should. So this whole point is silly and moot.
2. Cost effectiveness. Ok, you want concrete costs. Here we go: I can buy a 1 TB external drive for about $100 on NewEgg. It has a 3 year warranty. So that's about $33/year for 1 TB.
But that's only one drive. Dropbox and CrashPlan, just two examples I'm very familiar with, have redundancy. So let's buy 3 of those external drives. That's $300, raising the cost to $100/year. Note that I've also increased the chance of having a drive failure and having to replace one before 3 years is up.
But I've got all those drives in one place. So let's put one in a safe-deposit box at the bank. I honestly don't know how much that costs. Let's say $15/month. $180/year. (If this figure is way off, please correct me.) Now we're up to $280/year total cost.
But now I have to drive to the bank every so often and swap drives. That's a big hassle, and most people will not keep up with this regularly. (Again, average users.) Time is also "money."
Also, now I have less redundancy for current data, unless I buy a fourth drive to keep at home. That would raise the yearly cost to about $310/year.
On top of all that, I have to manually monitor the drives, maintain them, make sure their filesystems are intact, free space not run out, etc. I also have to make sure they are in a secure environment where kids or pets or minor environmental disasters like leaky roofs won't damage them. And average users don't even know what a filesystem is or what SMART is.
And I also have to guard against theft. If I don't swap drives with my safe deposit box regularly, I risk losing all my recent data, having to restore from an old backup.
On top of all that, if any drive fails prematurely or is stolen or otherwise damaged, I have to replace it at my expense. This will almost surely happen now and then, even if not every 3 years, so that further increases the yearly cost beyond $280/310.
Or...
I could use CrashPlan. For $60 a year I get unlimited capacity. They handle redundancy. It's off-site. They fix and replace hardware at no extra charge. They guard against theft and environmental problems. I don't have to manually swap drives. The software is automatic and guards carefully against corruption and bitrot. So far it's 20-25% the cost of the local, manual backup strategy.
The tradeoff is having to fit the data through a thin pipe over the Internet. Depending on how much data you have and how fast your Internet connection is, this can be a minor issue or a major one. It won't suit everybody's needs.
And for most users, 1 TB is way overkill anyway, so other providers like Dropbox can also be effective and even simpler.
So for the average user, it seems quite obvious to me that the most cost-effective solution is online backup. And one can throw in an external drive, too, for local backups, and still save a ton of money vs. going all local.
So I've countered your assertions and ridicule with data and logic. I think this "utterly destroys your argument."
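For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes the yearly figures from the numbers quoted above. The prices are the 2013 estimates from this post (the safe-deposit fee is explicitly a guess), and treating the 3-year warranty as the drive's useful life is a simplifying assumption.

```python
# Rough yearly-cost comparison using the figures quoted in the post.
# All prices are the poster's 2013 estimates, not current quotes.

DRIVE_PRICE = 100             # USD for one 1 TB external drive
WARRANTY_YEARS = 3            # assumption: treated as the drive's useful life
SAFE_DEPOSIT_PER_MONTH = 15   # USD, the poster's guess
CRASHPLAN_PER_YEAR = 60       # USD, unlimited plan as quoted


def local_cost_per_year(num_drives: int, offsite: bool) -> float:
    """Yearly cost of num_drives external drives, optionally plus a bank box."""
    drives = num_drives * DRIVE_PRICE / WARRANTY_YEARS
    box = SAFE_DEPOSIT_PER_MONTH * 12 if offsite else 0
    return drives + box


three_drives = local_cost_per_year(3, offsite=True)  # three drives plus the box
four_drives = local_cost_per_year(4, offsite=True)   # one extra drive kept at home

print(f"3 drives + bank box: ${three_drives:.0f}/year")
print(f"4 drives + bank box: ${four_drives:.0f}/year")
print(f"CrashPlan as share of 3-drive cost: {CRASHPLAN_PER_YEAR / three_drives:.0%}")
print(f"CrashPlan as share of 4-drive cost: {CRASHPLAN_PER_YEAR / four_drives:.0%}")
```

Replacement drives, time spent swapping, and bandwidth costs are left out, so this only covers the line items with a stated price.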

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID by bingoUV · 2013-09-19 16:54 · Score: 1

1. This "first" fallacy. As we discussed, online backups sometimes take a while to complete the initial backup. You can do this while continuing to make local backups to external media, and you should. So this whole point is silly and moot.
One has to set up the backup systems. Sign up for dropbox. Choose plans. Directories to back up. Schedule, if not automatic, depending on bandwidth costs at particular times.
Or set up the local backup - get local hard drives. Choose and configure backup program.
This setup is what causes postponing for people. And the majority of costs. And this is not what can be done in parallel - unless you can delegate it to minions. In which case it ceases to be "home" use for most people.
So you can misinterpret the post and call it a fallacy. Or you can put effort in understanding what I posted. Your wish.

Dropbox and CrashPlan, just two examples I'm very familiar with, have redundancy. So let's buy 3 of those external drives. That's $300
It is a BACKUP. To be used when original data is lost. By using 3 disks for backup, you are again preparing for a situation when 4 disks crash SIMULTANEOUSLY. Again a once-a-century, if not once-a-millennium, event.
And I already said local backup, while cheaper, saves from fewer vectors, but being much much more cost effective should be done first, and will not encourage postponement. So the rest of your post is based on a flawed reading of my posts.
This might help.

--
Bingo Dictionary - Pragmatist, n. A myopic idealist.
Re:RAID by gottabeme · 2013-09-19 18:42 · Score: 1

1. This "first" fallacy. As we discussed, online backups sometimes take a while to complete the initial backup. You can do this while continuing to make local backups to external media, and you should. So this whole point is silly and moot.

One has to set up the backup systems. Sign up for dropbox. Choose plans. Directories to back up. Schedule, if not automatic, depending on bandwidth costs at particular times.
Or set up the local backup - get local hard drives. Choose and configure backup program.
This setup is what causes postponing for people. And the majority of costs. And this is not what can be done in parallel - unless you can delegate it to minions. In which case it ceases to be "home" use for most people.
So you can misinterpret the post and call it a fallacy. Or you can put effort in understanding what I posted. Your wish.
Yeah, it's so much more complicated to sign up for Dropbox and put your files in ~/Dropbox than it is to shop for a disk, order it, wait for it, unpack it, set it up, and then install it, choose backup software, set it up... And you complain about me not putting effort into understanding what you posted.

It is a BACKUP. To be used when original data is lost. By using 3 disks for backup, you are again preparing for a situation when 4 disks crash SIMULTANEOUSLY. Again a once-a-century, if not once-a-millennium, event.
You're missing the point, again. Blah blah...reading comprehension...blah blah...hypocrisy...blah blah... It's actually fairly common for disks of the same make and model to fail near the same time, anyway.
But the point is that online backups and local backups are like comparing apples and oranges, because in order to give the same benefits as online backup, you'd have to spend 4-5 times as much for local backup. Otherwise, one burglary, house fire, or kid who trips over the cable, and your backups are useless.

And I already said local backup, while cheaper, saves from fewer vectors, but being much much more cost effective should be done first, and will not encourage postponement. So the rest of your post is based on a flawed reading of my posts.
Again, this "first" fallacy. Besides, by the time an average user has shopped for, bought, installed, and set up a local backup system, he could have already backed up many if not most of his files to an online backup.

This might help.
I showed that online backup is more cost-effective. Instead of admitting you were wrong, or showing how I'm wrong, you're simply asserting that you're right. Again, instead of arguing rationally with the "concrete data" I provided when you demanded it, you fall back on ridicule--as you've done in every reply you've made. Is it even possible for you to have a rational argument without resorting to childishness? I rest my case.

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID by bingoUV · 2013-09-19 21:17 · Score: 1

Yeah, it's so much more complicated to
That is where most of the postponement happens. That it costs money, and a realization of ongoing expenditure to maintain a backup, helps the postponement.

It's actually fairly common for disks of the same make and model to fail near the same time, anyway
And why would you get disks of the same make?

But the point is that online backups and local backups are like comparing apples and oranges
Completely agreed.

because in order to give the same benefits as online backup, you'd have to spend 4-5 times as much for local backup
And I showed that at much much lower cost, local backups can protect against events that can reasonably be expected in a lifetime by everyone. Lots of people have spent lifetimes without getting their houses burgled or burnt. Reaching adulthood without at least a hard drive failure / accidental file deletion / virus corruption is nearly impossible (assuming data storage from birth, of course).
So the same benefits are NOT NECESSARY to START backing up. Which is what my original post in this thread was about; its parent post was suggesting to not even START backing up until you have an all-comprehensive backup protecting against once-in-(100, 1000)-year events.

I showed that online backup is more cost-effective.
By dishonestly using bad practices (all drives of the same make? WTF?). By cunningly changing the objective (a 3-disk failure in the same month is a once-in-3888-years event, given an average drive fails in 3 years) to saving against once-in-a-millennium events rather than reasonably expected events.
And I was arguing about STARTING with local even if remote appears expensive.
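The 3888-year figure can be reproduced with a short calculation. The sketch below assumes, as the figure implicitly does, that each drive fails in any given month with probability 1/36 and that drives fail independently of one another. That independence assumption is exactly what the other side of this thread disputes (same make and model, shared power supply), so treat the result as an upper bound on how rare the event is.

```python
from fractions import Fraction

MONTHS_PER_YEAR = 12
mean_life_months = 3 * MONTHS_PER_YEAR  # "an average drive fails in 3 years"

# Simplifying assumption: each drive fails in any given month with
# probability 1/36, independently of the other drives.
p_month = Fraction(1, mean_life_months)

# Chance that three particular drives all fail in one specific month.
p_all_three = p_month ** 3

# Expected number of months until such a month occurs, converted to years.
years_between_events = (1 / p_all_three) / MONTHS_PER_YEAR
print(years_between_events)
```

Any correlation between failures (a power surge, a bad batch, a house fire) raises the joint probability well above the cube used here.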

--
Bingo Dictionary - Pragmatist, n. A myopic idealist.

Linus needs a personal sysadmin... by FridayBob · 2013-09-11 09:01 · Score: 1

... to protect him from this kind of calamity, for example by using mirrored disks (even with SSDs) whenever possible and by ensuring that regular backups are made of all of his important data. Of course, privacy-wise it would be better if he did it himself, but apparently it's just not a priority. I've long found it curious that so many excellent programmers are comparatively inept when it comes to looking after their own machines and data.

Re:Linus needs a personal sysadmin... by jellomizer · 2013-09-11 09:28 · Score: 1

But he is a software developer. You know Software Developers are excellent system admins by default!
That is why everyone asks the software developer how to fix their PC.

--
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
Re:Linus needs a personal sysadmin... by Gallomimia · 2013-09-12 04:56 · Score: 1

LAZY. It's called lazy.

--
Sadly, a Libertarian cannot force his views on another, and freedom cannot spread as does the cancer known as religion.

And that's why.... by stox · 2013-09-11 09:03 · Score: 2

I have a mirrored set of SSD's on all my important machines, and RAID 6 for bulk storage.

Unlike Linus, I can't afford to lose work.

--
"To those who are overly cautious, everything is impossible. "

Re:And that's why.... by mjwx · 2013-09-11 16:37 · Score: 1

I have a mirrored set of SSD's on all my important machines, and RAID 6 for bulk storage.
Unlike Linus, I can't afford to lose work.
Then pray none of those RAID controllers ever fails.

Unless you mean to say that you keep proper backups.

--
Calling someone a "hater" only means you can not rationally rebut their argument.
Re:And that's why.... by Ash+Vince · 2013-09-11 23:35 · Score: 1

I have a mirrored set of SSD's on all my important machines, and RAID 6 for bulk storage.
Unlike Linus, I can't afford to lose work.
It's worth reading his post to find out he hasn't actually lost any kernel work. He just was finding it too painful to retrieve the pull requests sent to him from his email archive so just asked people to resend them.
This is another perk of being the boss: if something is difficult (and exceedingly boring) you can just ask an underling to do it for you. In this case he is relying on the fact that most people are probably waiting with bated breath for their pull request to appear in the latest development version of the kernel, so they are probably quite happy to resend it.
Having been through a major recovery process recently, I can tell you that you would be amazed at how much work this entails even when you have backups. Him asking for help is no bad thing.

--
I dont read /. to RTFA, I read /. to offend people in ignorance.

Re:Pathetic by Cassini2 · 2013-09-11 09:05 · Score: 1

rsync to the backup drive with the --backup switch. That way if the file system decides to overwrite a bunch of key files with 0 length files, your backup keeps the originals. It might take you a while to restore, but at least the originals are still there.
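A minimal sketch of that idea in Python, building the rsync command line rather than running it. `--archive`, `--backup`, and `--backup-dir` are real rsync options; the paths and the dated directory name are hypothetical and only there for illustration.

```python
import shlex


def rsync_backup_command(source: str, dest: str, backup_dir: str) -> list[str]:
    """Build an rsync command that keeps replaced files instead of discarding them."""
    return [
        "rsync",
        "--archive",                   # recurse and preserve permissions, times, links
        "--backup",                    # keep the old copy of any file that gets replaced
        f"--backup-dir={backup_dir}",  # put those old copies in a separate directory
        source,
        dest,
    ]


# Hypothetical paths: adjust to your own source and backup drive.
cmd = rsync_backup_command(
    "/home/me/",
    "/mnt/backup/current/",
    "/mnt/backup/changed-2013-09-11",
)
print(shlex.join(cmd))
```

To actually run it, pass `cmd` to `subprocess.run(cmd, check=True)`. Using a fresh dated `--backup-dir` per run means a batch of files clobbered with zero-length versions ends up with its originals in that run's directory.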

The backup hierarchy:
- Natural disasters occur rarely.
- Hardware failures occur infrequently.
- Software failures occur more frequently.
- User failures occur often.
- The unexpected happens all the time.

Satan! How Convenient! by PoconoPCDoctor · 2013-09-11 09:05 · Score: 1

http://www.wired.com/wiredenterprise/2012/10/linus-torvalds-hard-disks/

--
"Let us raise a standard to which the wise and honest can repair" - George Washington

Controller failure by ArchieBunker · 2013-09-11 09:07 · Score: 1, Informative

So buy a new drive with the same rev boards and swap them out. Problem solved.

--
Only the State obtains its revenue by coercion. - Murray Rothbard

Good ol' Linus and his aversion to backups by Anonymous Coward · 2013-09-11 09:18 · Score: 2, Insightful

According to a speech of his, that's how Linux got started. He accidentally wiped his MINIX partition.

Interpreting SMART results by dargaud · 2013-09-11 09:23 · Score: 1

I run smartctl regularly to check on my disks (SSD or spinning) but I find the info difficult to interpret. Is there a service where I can upload the result and have it distilled to: fine OR dying?
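Short of a service, a rough local filter is possible. The Python sketch below reads the text printed by `smartctl -H -A` and reduces it to one word. The rules are an assumption, not an official interpretation: a failed overall health result or an attribute marked FAILING_NOW counts as "dying", and a nonzero raw value on a few sector-related attributes counts as "watch", a third state added here. Attribute names vary by vendor, especially on SSDs, and the two-line sample is hand-written, not captured from a real drive.

```python
import re

# Attributes whose raw value should normally stay at zero (assumed list).
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def health_verdict(report: str) -> str:
    """Reduce `smartctl -H -A` text output to 'fine', 'watch', or 'dying'."""
    overall = re.search(r"overall-health self-assessment test result:\s*(\S+)", report)
    if overall and overall.group(1) != "PASSED":
        return "dying"

    verdict = "fine"
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, when_failed, raw = fields[1], fields[8], fields[9]
        if when_failed == "FAILING_NOW":
            return "dying"
        if when_failed != "-":
            verdict = "watch"  # crossed its threshold at some point in the past
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            verdict = "watch"
    return verdict


# Hand-written stand-in for real output, for illustration only.
sample = """SMART overall-health self-assessment test result: PASSED
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       8
"""
print(health_verdict(sample))
```

A drive can still die without warning while reporting "fine", so this complements backups and does not replace them.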

--
Non-Linux Penguins ?

Re:You trust Torvalds after this? by hawguy · 2013-09-11 09:23 · Score: 3, Insightful

As someone who's taken over server administration from very talented developers a number of times, I've found that being a great developer doesn't mean that you're a great sysadmin. Developers may understand conceptually that RAID and backups are important (but sometimes think that RAID is a backup), but that doesn't mean that they actually set them up.

And as a sysadmin, I'm tired of hearing that. RAID1,5,6,10,Z is a backup. It's not an archive. An archive is what you go to when you want the old version. A backup is generally one of two things:
1) Something that lets you keep chugging through a failure (raid5, a backup generator with automatic cut-over, etc)
2) A standby spare (tape, NAS/usb drive, secondary location with desks/computers/etc.)

RAID (other than 0) is absolutely a backup. It's not the perfect backup but it is a backup. What it is NOT is an archive - last night's/week's/month's/quarter's data.

No, RAID is *not* a backup. RAID's only purpose is to improve reliability/uptime by letting you ride through hardware failures, but it does nothing to protect you from all of the rest of the things that can destroy your data, like file corruption, fat fingering a "rm -rf / home/someuser", a virus, a website hack attack, etc. That's what your backups are for. You can call them archives if you like, but don't call RAID a "backup" because it's not. Depending on what the problem is and when you discover it, you may need to go back through several archives before you find the data you're looking for.

Re:None of that mattered, because by Zero__Kelvin · 2013-09-11 09:24 · Score: 4, Insightful

". Now people have to redo a lot of effort, because he was too lazy or arrogant to install one of the many effortless backup systems available."

That is a ridiculous statement. Work is lost every time a drive fails unless it happens to fail immediately after a backup. Full backups take lots of time. If you understood git better you would realize that a lot less work is lost the git way than with old school backups. I'm sure that every time Linus does a successful merge he pushes it to a git repo elsewhere. All history is in the git logs. I am certain the work he lost is minimal, and is much less than if he was relying on nightly backups and the failure happened near the end of the work day. Just the effort of trying to determine what was done and what has been lost would be far more time consuming without git.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Re:Pathetic by MightyYar · 2013-09-11 09:25 · Score: 1

I'm starting to look like a shill in the comments for this story, but what the hell. I've been happy with CrashPlan. They charge you to store on the cloud, but it's free to back up to your own computers, or friends' computers. It also runs a hash against everything it backs up, so data integrity is supposedly "assured". Another commenter has been going on about Bacula, but I've never tried it.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:BREAKING: Development was also held up.... by Atzanteol · 2013-09-11 09:25 · Score: 1

I don't care what the mods say - this is funny as hell.

--
"Ignorance more frequently begets confidence than does knowledge"

- Charles Darwin

Reduce flash rewrite wear with noatime by redelm · 2013-09-11 09:27 · Score: 2

This might be an [electrolytic] capacitor failure or some other component-level magic-smoke release. There is also the dreaded, much-discussed "wear" from re-writing flash memory -- worse than you think because blocks of 64 KB [typically] have to be erased and re-written to change any byte therein.

Linus, of all people, ought to know his kernel has options to minimize the re-writes, many of them developed to optimize laptops (like delaying writes). Another thing is to mount partitions (/etc/fstab anyone?) with `noatime` as an option (maybe `nodiratime` too). Un*x and other Linux-like systems by default will re-write the access time for any disk inode read. Turning it off reduces disk write load (and seeks on slow disks). I've had it off for over ten years and have not noticed any malperformance, although there are rumored to be some problems, somewhere.
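As a sketch of the /etc/fstab change, the Python helper below adds `noatime` to the options field (the fourth column) of a single fstab line. The device name and filesystem in the example are hypothetical, and the helper normalizes column spacing to single spaces, so review the result before writing it back to a real fstab.

```python
def add_mount_option(fstab_line: str, option: str = "noatime") -> str:
    """Return fstab_line with `option` appended to its mount options, if missing."""
    stripped = fstab_line.strip()
    # Leave blank lines and comments untouched.
    if not stripped or stripped.startswith("#"):
        return fstab_line
    fields = stripped.split()
    # An fstab entry needs at least: device, mount point, type, options.
    if len(fields) < 4:
        return fstab_line
    options = fields[3].split(",")
    if option not in options:
        options.append(option)
    fields[3] = ",".join(options)
    return " ".join(fields)


# Hypothetical entry for illustration.
line = "/dev/sda1 / ext4 defaults 0 1"
print(add_mount_option(line))
```

The new option takes effect at the next mount of that filesystem.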

Re:None of that mattered, because by Zero__Kelvin · 2013-09-11 09:46 · Score: 1

"Hasn't Apple's Time Machine sort of set the bar a little higher on frequency of backup? Even if his working directory just sat in a Dropbox (Sparkleshare if you prefer) folder, he'd be better off."

It is pretty clear you haven't thought this through. What happens to the performance of git, or any software for that matter, when the data it accesses is somewhere else on the internet? How much work do you suppose is lost every day doing it that way?

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Re:You trust Torvalds after this? by sexconker · 2013-09-11 09:46 · Score: 1

As someone who's taken over server administration from very talented developers a number of times, I've found that being a great developer doesn't mean that you're a great sysadmin. Developers may understand conceptually that RAID and backups are important (but sometimes think that RAID is a backup), but that doesn't mean that they actually set them up.

And as a sysadmin, I'm tired of hearing that. RAID1,5,6,10,Z is a backup. It's not an archive. An archive is what you go to when you want the old version. A backup is generally one of two things:
1) Something that lets you keep chugging through a failure (raid5, a backup generator with automatic cut-over, etc)
2) A standby spare (tape, NAS/usb drive, secondary location with desks/computers/etc.)

RAID (other than 0) is absolutely a backup. It's not the perfect backup but it is a backup. What it is NOT is an archive - last night's/week's/month's/quarter's data.

So much fucking wrong.

RAID provides redundancy. It is not a backup.

A spare (hot or cold) is not a backup. It's a spare.

When you're talking about DATA, a backup is a COPY of that DATA on a physically-separate device. Ideally, all backups should be UNPOWERED and MOVED THE FUCK AWAY after being created. For my home use I don't do either of these, but for work we have 3 or 4 layers of backups and separation depending on how you want to count it.

Re:You trust Torvalds after this? by gagol · 2013-09-11 09:50 · Score: 1

It's not like the git repo is hosted on his home machine...

--
Tomorrow is another day...

Where's the problem? by Trogre · 2013-09-11 09:54 · Score: 1

So an SSD in a desktop computer died. So what? Just run the array in degraded mode until the damaged drive is replaced.

Who uses hard drives (SSD or otherwise) in desktops in anything other than a RAID configuration? In Linux where software RAID-1 is trivial there's even less of an excuse.

In a laptop I can understand it, since there's often only space for one drive and even then you expect everything on it to be volatile.

I don't get it.

--
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife

Re:You trust Torvalds after this? by Score+Whore · 2013-09-11 09:57 · Score: 1

You're confusing the English with technology. A backup hamburger -- in case the Hamburglar steals my first one -- is not a backup of my data. A backup is a short term copy of my data. An archive is a long term copy of my data. Neither one changes after they've been created. A good backup has left the building. RAID, UPSes, generators, etc. are availability measures, not backups.

Re:None of that mattered, because by jedidiah · 2013-09-11 10:03 · Score: 1

> Hasn't Apple's Time Machine sort of set the bar a little higher on frequency of backup?

No, not really.

--
A Pirate and a Puritan look the same on a balance sheet.

Re:Really? Naa by Psyko · 2013-09-11 10:07 · Score: 5, Interesting

trying to desolder 100 pins spaced 0.01" apart then resoldering them, unless you have a 0.1 mill precision soldering robot it is impossible, you can't even buy wire thin enough to do it by hand.

SMT rework by hand isn't rocket science, but takes more tools than the average garage has.

Desoldering you use a custom tip for that socket/package type (one tip per package & they're not cheap). It's essentially a metal ring that heats the solder on all the pins at once. In the center of the assembly is a vacuum probe. You heat all the pins, melting all the solder & hit the button on the handpiece to suction the chip up off the board. Then clean up the pads on the board. Careful with the heat because you don't want to lift pads off the board; if you do then you have to either fix them, or make new pads. And then if you manage to trash a via (conductivity path to a different board layer), then you've got to drill out a new one and you have to use an ESD-safe conductive drill with a resistance cutoff. You put a clip from the drill in contact with the layer you're trying to get to, drill down and when the drill tip makes contact with that layer the drill turns off because the circuit is complete. But it still sucks and if you don't know how all the board layers are put together you may end up trashing a trace a couple layers into the board and wrecking the whole thing.

Soldering it down you do this. Align all the chip legs on the pads. Then you can either run a small bead of solder paste across all the pins or use a wave soldering tip (small cup, uses surface tension to hold the solder in place) and drag the tip over all the pins. Heat on the pin & pad draws the solder down into the joints. If you put too much solder you might have to vac it back up and redo it if you've made bridges etc. Alignment is key, and keeping the part in position is key. I used to try and avoid using glue underneath because that made it difficult to get it back off if you needed to down the road.

Doing hand rework on that kind of stuff the hardest thing for me was dealing with smt chip caps, little bastards will crack if you heat em too fast, so you have to get a temp regulated hot plate, heat em up slow, then pick and place em quick with tweezers/needlenose & solder em down quick.

--
01:36AM up 426 days, 2:46, 1 user, load average: 0.14, 0.11, 0.05

Re:None of that mattered, because by MightyYar · 2013-09-11 10:17 · Score: 1

You'll notice that I singled out frequency. :)

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

RAID != Operating System by dutchwhizzman · 2013-09-11 10:17 · Score: 5, Interesting

You have a software feature in a server OS that supports certain client OSes to do backups to the server. RAID may be a software feature, but even if it's "software raid", you often have BIOS bootable raids that even work with one of the drives missing. This essentially means that you can work OS agnostic on a lower level than "I have a backup system that works". For Linux, you can have a backup system too that will restore from a LiveCD/USB stick and stores on a remote server. The same amount of time roughly will be needed to backup and restore, differential, incremental, full backups, the works. The solution you are providing is really nothing comparable to RAID. It's fundamentally different because it works on a totally different layer, doesn't prevent downtime and it's not OS agnostic. RAID should prevent downtime, making working backups should prevent data loss. Maybe WHS is the shizniz, you rock for making actual backups, but other than that, your post is totally offtopic in this context and doesn't even begin to solve a problem that Linus was facing with his desktop.

I'm not modding you down, even though I have mod-points, but I'm telling you exactly why I think you shouldn't have posted this. I hope you learned something from it and in the future will implement both backups and RAID when unscheduled downtime is important. Maybe you would even implement a system that works for all relevant OSes in the environment you have to do it for, without relying on a single vendor that offers a closed source product. It's a risk that means you'll have to support their product and licencing and other requirements until the data isn't relevant anymore, even after you have migrated to a competing product.

--
I was promised a flying car. Where is my flying car?

Re:RAID != Operating System by jhol13 · 2013-09-11 14:01 · Score: 1

I use a ZFS (OpenIllumos) server with raidz2. All user data is in the server. With snapshots that is a beautiful system for my needs (I also do offsite backups, but not that often). AND WOULD HAVE SOLVED THE PROBLEM.
I already have had a situation where two disks were failing, but that was entirely my fault. When the first disk failed I thought "it can handle the second failure easily, no need to rush". It did without any data loss[1], but for obvious reasons I should have replaced the disk immediately. But then this gave me time to move from 500G disks to 2T disks (i.e. replace all disks), quadrupling storage. The transition was unbelievably simple: replace one disk (first the failing disks), run "zpool replace" and let it resilver, rinse and repeat.
I would never use HW RAID, for my needs it would be a very bad idea. Now if the motherboard dies, I just buy a new one - any brand, no need to have an identical raid controller.
[1] I know this for sure as everything is checksummed in ZFS
Re:RAID != Operating System by Electricity+Likes+Me · 2013-09-11 14:36 · Score: 1

ZFS in the Linux Kernel now!
Re:RAID != Operating System by AmiMoJo · 2013-09-12 00:17 · Score: 2

RAID is not at all comparable or better than a network backup. It's not even a backup. If your PSU dies or there is a power surge you could easily lose both drives. If something hoses the filesystem you can't just roll back to yesterday's image.
A network backup onto a separate machine, preferably protected with a UPS, is far more secure and likely to save you from a variety of hard and soft disasters.

--
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
Re:RAID != Operating System by gottabeme · 2013-09-12 18:54 · Score: 1

No, it's out-of-tree. Important distinction.

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID != Operating System by Electricity+Likes+Me · 2013-09-13 05:33 · Score: 1

There's a motion on the ZoL lists currently to figure out what would be required to get it into the tree without technically violating CDDL. I personally hope it eventually happens, since I think if ZFS got more users we might finally get enough momentum to clear its problems (shrinking filesystems, reshaping arrays i.e. block pointer rewrite).
Re:RAID != Operating System by gottabeme · 2013-09-13 07:38 · Score: 1

I hope so too! I'd like to try it, but I'm not interested enough to deal with out-of-tree modules, especially for workstations and laptops. In the meantime, Btrfs and Tux3 look promising.

--
"Those who consume the bulk of goods are those who make them. We must never forget this secret of our prosperity."
Re:RAID != Operating System by Wolfrider · 2013-09-14 04:07 · Score: 1

--It will prolly never be in the main kernel tree due to licensing issues, but:
http://zfsonlinux.org/

--
.
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??

Will never work with modern drives by dutchwhizzman · 2013-09-11 10:20 · Score: 5, Informative

Modern drives for the last five years at least, have calibration factors for platter/head packs on the EEPROM on the controller board. If you swap boards, the board most likely won't be able to read the data on the disk, since it's not calibrated to the head/platter kit.

--
I was promised a flying car. Where is my flying car?

Re:Will never work with modern drives by mysidia · 2013-09-11 15:04 · Score: 2

Modern drives for the last five years at least, have calibration factors for platter/head packs on the EEPROM on the controller board.
If it's an EEPROM chip; then that means... in principle, you could capture a dump of the EEPROM content from the old board, and then erase the EEPROM on the replacement board, and use a programmer to reload the old content
Re:Will never work with modern drives by GuB-42 · 2013-09-12 01:17 · Score: 2

If the EEPROM is not damaged and is a discrete component (usually an 8-pin SOIC) you can unsolder it from the old board and put it on the new board. It's not that difficult.
Re:Will never work with modern drives by QBasicer · 2013-09-12 02:19 · Score: 1

Great idea, but I'd be way too lazy to do something like that. Nor do I have the equipment.

--
x86, oh yes, I'm pro.
Re:Will never work with modern drives by petermgreen · 2013-09-12 10:08 · Score: 2

Which is why you take the EEPROM off the original controller board and put it on the new board. In my experience the EEPROM is in an 8-pin package with a 1.27mm pin spacing (either a SOIC or a similar sized leadless package). Pretty easy to pop off with hot air (and yes I have done this several times).

--
note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
Re:Will never work with modern drives by QuesarVII · 2013-09-12 16:54 · Score: 1

A $15 Radio Shack soldering iron is all you need. I've done this and it worked to recover 4 drives that all had controller board failures (customer system) from a fried power supply.
Re:Will never work with modern drives by QBasicer · 2013-09-17 02:18 · Score: 1

Curious, how do you wire up a programmer?

--
x86, oh yes, I'm pro.
Re:Will never work with modern drives by QuesarVII · 2013-09-17 02:52 · Score: 1

I didn't, I just moved the chip from the failed board to the working one.

Re:None of that mattered, because by MightyYar · 2013-09-11 10:23 · Score: 1

Huh? Dropbox is a local folder. A daemon monitors the folder for changes and then uploads those changes to the Dropbox servers. It keeps a version history for each file, and so you can restore to a point in the past. In my experience, it does not use much in the way of system resources. Sparkleshare does the same thing, but to a server of your choosing and it is completely open source - and in fact based upon git.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:You trust Torvalds after this? by PlusFiveTroll · 2013-09-11 10:33 · Score: 1

RAID is NOT a backup.

RAID does only a few things...

Depending on level, allows you to survive $X number of disks failing.

Allows you to 'stripe' a file system across multiple drives for increased performance or size.

It does not protect against any other problem, such as RAID card failure, filesystem corruption, rm -rf, or any number of other problems that can occur.
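
To make the "survive a disk failing" part concrete, here is a toy sketch of single-parity striping in the style of RAID-5. It is an illustration under simplifying assumptions (one stripe, equal-sized blocks, hypothetical function names), not how any real RAID implementation is written.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (RAID-5 style)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Any one block can be rebuilt from the others, which is the redundancy. Note that parity is computed over whatever was written: if a file is deleted or corrupted, the array faithfully protects the deleted or corrupted state, which is why this is not a backup.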

Re:None of that mattered, because by Zero__Kelvin · 2013-09-11 10:34 · Score: 1

So you are saying why not use a wrapper around git. Well, we've really come full circle now haven't we. Me: git already solves this problem. You: why not use git to solve it! (Sparkleshare) Now, let me tell you why asynchronous use of git is a bad idea. git is making changes on the drive, and commits are atomic. You can easily set up git hooks to automate the process synchronously. There is no reason, and no advantage to, using a separate git repo and storing your whole git repo in a git repo. It is at best going to slow things down, and at worst it will hose your repo. I can never decide if it is hilarious or sad that people think they could teach Linus a thing or two about software development. The likelihood that it is true is very close to 0x00;
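
For reference, the kind of git hook mentioned above might look something like this. It is only a sketch: it assumes a second remote named `backup` has already been added to the repository, and the file would be installed as an executable `.git/hooks/post-commit`. The remote name and helper names are hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical post-commit hook: push every new commit to a second repo."""
import subprocess
import sys

# Assumption: a remote added beforehand with `git remote add backup <url>`.
BACKUP_REMOTE = "backup"


def push_command(remote: str) -> list:
    """Build the git command that mirrors all local refs to `remote`."""
    return ["git", "push", "--mirror", remote]


def main() -> int:
    # Runs synchronously right after the commit is recorded. The commit
    # already exists at this point, so a failed push only prints a warning.
    result = subprocess.run(push_command(BACKUP_REMOTE))
    if result.returncode != 0:
        print("warning: backup push failed", file=sys.stderr)
    return 0

# In the installed hook file, finish with: sys.exit(main())
```

`git push --mirror` also deletes refs on the remote that no longer exist locally, so it suits a dedicated backup remote rather than a shared one.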

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Re:You trust Torvalds after this? by PlusFiveTroll · 2013-09-11 10:36 · Score: 1

I agree with everything you said, other than "RAID's only purpose". RAID also allows you to create volumes of much larger sizes than the original disks and allows greater read/write throughput and IOPS by treating many disks as a single device.

Linus is a newb ... by drnb · 2013-09-11 10:39 · Score: 1

Linus needs a personal sysadmin to protect him from this kind of calamity ...

All newbs do. And newbs don't usually learn to appreciate backups until they suffer a catastrophic loss. This just proves, yet again, that brilliance in one domain does not imply brilliance in any other domain. Of course the clueless don't recognize this and think a very successful person must have all the answers, when in fact they are equivalent to newbs in many domains.

I'm sorry. by Virtucon · 2013-09-11 10:39 · Score: 1

This is retarded. In this day and age to have somebody the "master of the universe" for doing merges on one machine is not only retarded but it's bad practice. Can you imagine a large company shipping a release and suddenly having to stop because a system or a drive failed? People would be answering up the food chain in a hurry. Shit, forget all of the best practices in terms of backup and redundancy to protect your business in the "real" world that would apply here but we have to account for Linus' imaginary world where he rails on and on over crap that really doesn't matter and pontificates about how "dumb" other people are.

What we have here is a Nelson moment... To all that have been belittled by his rude, arrogant nature in the past you have the right to say "Ha! Ha!"

--
Harrison's Postulate - "For every action there is an equal and opposite criticism"

Re:I'm sorry. by BitZtream · 2013-09-11 22:09 · Score: 1

So when you're composing an email, do you do it on two machines at once? How about when you login? Do you type each character of your username and password on two different machines?
You're an idiot.

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager

Re:You trust Torvalds after this? by hawguy · 2013-09-11 10:41 · Score: 1

I agree with everything you said, other than "RAID's only purpose". RAID also allows you to create volumes of much larger sizes than the original disks and allows greater read/write throughput and IOPS by treating many disks as a single device.

Yeah, sorry, I was speaking only in the context of redundancy, and misspoke when I declared that the *only* purpose of RAID is redundancy.

Brilliance is domain specific ... by drnb · 2013-09-11 10:46 · Score: 1

Maybe Linus needs to create a backup program like he did when he wanted a better version control system and created git? Also, why is the only copy of the changes on his local workstation and not a server with redundancy? This seems rather amateurish.

Brilliance in one domain does not imply brilliance in any other domain. Only the clueless think very successful people have all the answers. Only the delusionally arrogant think they have all the answers. Only the foolish think it can't happen to them.

Yep, Linus was acting as a true newb if anything more than a small number of hours worth of work was lost. Especially so given the known characteristics of SSD drives. Not very far removed from the person who types in a word processor for hours without saving.

Re:None of that mattered, because by MightyYar · 2013-09-11 11:41 · Score: 1

There is no reason, and no advantage to, using a separate git repo and storing your whole git repo in a git repo.

Except, you know, that it would be backed up if the drive died.

I can never decide if it is hilarious or sad that people think they could teach Linus a thing or two about software development.

I don't think "running a backup application" is part of "software development", and it's clearly something he needs help with. The man is clearly not as infallible as you seem to think he is. His mistake now requires work from other people to clean up.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

NSA tinfoil hat time! by phoenix_orb · 2013-09-11 12:14 · Score: 1

I hope that all of these resubmitted patches are exactly the same as they were before. I would hate to see this used as a vector to add a backdoor into the kernel.

--
Blah Blah Blah.

Re:NSA tinfoil hat time! by BitZtream · 2013-09-11 22:08 · Score: 1

Right, because he would suddenly not see it this second time around, but magically they wouldn't try it originally.

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager

Re:None of that mattered, because by Zero__Kelvin · 2013-09-11 12:15 · Score: 2

"I don't think "running a backup application" is part of "software development""

Do you work for me, because if you do, you're fired.

"The man is clearly not as infallible as you seem to think he is. His mistake now requires work from other people to clean up."

Wow. I mean seriously. Wow. You are just hell bent on being a moron. You offered "solution" after "solution" that is no solution at all, and seem to have missed the first line of his post: "I had pushed out _most_ of my pulls today, so realistically I didn't lose a lot of work." I have to assume you don't work in the real world, because if you did you would realize how truly asinine you sound.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Rant by Cus · 2013-09-11 12:59 · Score: 1

Who's going to be the one to rant at *him* on the mailing list for this?

Re:Really? Naa by Psyko · 2013-09-11 13:35 · Score: 1

Err... nearly no one does contact rework professionally anymore, and even half-serious hobbyists nowadays at least have an Aoyue and a high-temp vac pickup.
Then just clean the site, apply mini stencil, squeegee paste, remove stencil, place new part, reflow.

Yeah, personally I haven't done SMT rework in about 15 years, Aoyue sure has brought the price down on rework stations, that's less than I paid for my Metcal, and that's just a basic iron. I don't want to remember how much we paid for some of the larger Metcal & Pace hand rework systems back in the day.

--
01:36AM up 426 days, 2:46, 1 user, load average: 0.14, 0.11, 0.05

Re:None of that mattered, because by MightyYar · 2013-09-11 13:39 · Score: 1

You offered "solution" after "solution" that is no solution at all,

You can say that all you want, but if he were running Crashplan, he'd have lost maybe 20 minutes of his working day. Switch to one of his other computers, restore his working directory to a new directory and keep on truckin'. Instead he's struggling to recover data from a dead drive. Dropbox or Sparkleshare wouldn't even require a restore step - just walk over to the other computer and pick up where you left off. Do you dispute this fact?

and seem to have missed the first line of his post: "I had pushed out _most_ of my pulls today, so realistically I didn't lose a lot of work."

And you seem to have missed the part where he's been trying to recover the dead drive and told people that they will need to re-push their changes over the next couple of days. Does that sound like a trivial amount of work lost to you? Days of recovery?

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:None of that mattered, because by TangoMargarine · 2013-09-11 13:56 · Score: 1

Because doing a git push takes so much effort? They should still have all the code in their local repos.

--
Unity? Screw that: XFCE. Slashdot Beta? Screw that: SoylentNews. Australis? Screw that: Pale Moon. UX developers DIAF

Re:None of that mattered, because by TangoMargarine · 2013-09-11 13:57 · Score: 1

Okay yes, I have to admit git can get gnarly. But if they do it in the right order it should theoretically be fine.

--
Unity? Screw that: XFCE. Slashdot Beta? Screw that: SoylentNews. Australis? Screw that: Pale Moon. UX developers DIAF

Re:None of that mattered, because by TangoMargarine · 2013-09-11 14:01 · Score: 1

Do you work for me, because if you do, you're fired.

Wow. I mean seriously. Wow. You are just hell bent on being a moron.

if you did you would realize how truly asinine you sound.

Pot: Kettle.

--
Unity? Screw that: XFCE. Slashdot Beta? Screw that: SoylentNews. Australis? Screw that: Pale Moon. UX developers DIAF

Re:You trust Torvalds after this? by BitZtream · 2013-09-11 14:12 · Score: 1

You need to learn about ZFS, as it does protect against RAID card failure (you don't use one), filesystem corruption (multiple levels of checksumming AND verification), and rm -rf (hourly snapshots).

What else do you have?

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager

Re:Pathetic by BitZtream · 2013-09-11 14:14 · Score: 1

And that is still nothing like Time Machine.

Time Machine is not just a copy of your latest files; it's snapshots of your machine. Let it 0-byte a bunch of files, then just go find the last snapshot before they were 0 bytes and restore.
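
The "find the last snapshot before they were 0 bytes" step can be sketched as below. This assumes a hypothetical layout of one directory per snapshot whose names sort chronologically; it is not Time Machine's actual on-disk format, and `last_good_copy` is an illustrative name.

```python
from pathlib import Path
from typing import Optional


def last_good_copy(snapshots_root: Path, relative_path: str) -> Optional[Path]:
    """Return the newest snapshot copy of a file that is not zero bytes.

    Assumes one directory per snapshot under `snapshots_root`, named so
    that sorting by name sorts by time (for example 2013-09-11-1400).
    Returns None if no snapshot holds a non-empty copy.
    """
    for snapshot in sorted(snapshots_root.iterdir(), reverse=True):
        candidate = snapshot / relative_path
        if candidate.is_file() and candidate.stat().st_size > 0:
            return candidate
    return None
```

A plain mirror of the latest state has only one "snapshot", so once the 0-byte file has been copied over there is nothing older to fall back on.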

--
Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager

Re:None of that mattered, because by MightyYar · 2013-09-11 14:31 · Score: 1

I'm sure all of the data still exists somewhere - git is very nice for that. However, there is time and work involved with corralling all of it and getting back to where he was. Had he been using even the most brain-dead, consumer-level backup schemes like Dropbox he wouldn't have lost any work at all. If you read through those kernel mailing lists, the git pushes often seem quite eventful :)

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

What do the manufacturers say? by MobyDisk · 2013-09-11 15:10 · Score: 1

This is always the story with SSDs. They suddenly die, even though all the literature from the manufacturers says this is not a mode of failure that ever happens. Supposedly, the drives enter a read-only state where they can only be read, because they are out of space to wear-level. Has anyone ever had a drive actually do this? I want Linus to send this drive back, and get an answer from the SSD manufacturer as to what happened.

Re:What do the manufacturers say? by DarkXale · 2013-09-11 22:44 · Score: 1

What? It's exactly the type of failure that -always- happens. It's wear-based failure that doesn't normally happen, since most SSDs have enough endurance to last a typical user a hundred years or more. The electronics (controller chip), however, don't have that endurance, and it's the electronics that cause these sudden deaths. HDDs are as susceptible to them as SSDs are. The difference is that HDDs also encounter mechanical failures, so when an HDD dies, it's not as likely to be from this problem.
Re:What do the manufacturers say? by MobyDisk · 2013-09-12 08:32 · Score: 1

Right. We are in agreement.

This is always the story with SSDs. They suddenly die
and

Supposedly, the drives enter a read-only state where they can only be read, because they are out of space to wear-level. Has anyone ever had a drive actually do this?
So far, I know of no one who has had a drive die from lack of wear leveling, which is why I want to know what really happened.

Re:Really? Naa by __aajfby9338 · 2013-09-11 15:35 · Score: 1

Doing hand rework on that kind of stuff, the hardest thing for me was dealing with SMT chip caps. Little bastards will crack if you heat 'em too fast, so you have to get a temp-regulated hot plate, heat 'em up slow, then pick and place 'em quick with tweezers/needlenose & solder 'em down quick.

Maybe your iron doesn't have good enough thermal control, and/or is at too high a temperature? I used to swear by Weller irons, until I was coerced into trying a Metcal. I was amazed at what a difference a really good iron makes. I routinely hand-solder down to 0402 size components without any problems at all. I often have to rework 0201 sized ones, and those are harder. But with a good iron, appropriate use of flux, good minimum-size tips that you don't use for anything larger than 0201, treat carefully, and definitely don't let anybody else borrow, it's not that bad. When I need to tune up an RF path at work, I'll end up changing the same few 0402 or 0201 passives a dozen times or more, without ever cracking a component or lifting a pad.

Oh, and one other thing: If the board was built with lead-free solder, wick that crap off and do your rework with proper 63/37 tin/lead solder! It'll make better joints, and it melts at a lower temperature.

Re:You trust Torvalds after this? by mjwx · 2013-09-11 16:34 · Score: 1

No, RAID is *not* a backup, RAID's only purpose is to improve reliability/uptime by letting you ride through hardware failures,

This, RAID is high availability and in no way a backup.

Why?

Because if data gets corrupted or accidentally deleted, it gets corrupted or accidentally deleted across the entire array. RAID will keep you going if a disk dies (most of the time) but it won't recover data. Also remember RAID is vulnerable to the RAID controller dying (had that happen to an Adaptec years ago; it was a good thing we had backups).

--
Calling someone a "hater" only means you can not rationally rebut their argument.

...what the fuck by Arancaytar · 2013-09-11 17:13 · Score: 1

I know I'm less than careful about backups myself, and only push an incremental backup to cloud storage about once a week, but then again my laptop is not a single point of failure for the development of the largest operating system in the world.

(As for desktop machines, it's pretty trivial to set up a RAID-1 and not have to rely on periodic backups at all.)

Re:...what the fuck by Arancaytar · 2013-09-11 17:14 · Score: 1

(And I swear I meant to type kernel back there.)

Re:Really? Naa by jkflying · 2013-09-11 18:30 · Score: 1

Why are you using an iron? Use hot air and an oven, it makes SMT a completely different game. Just make sure you use lots of flux.

--
Help I am stuck in a signature factory!

Re:Really? Naa by __aajfby9338 · 2013-09-11 18:38 · Score: 1

Old habits die hard, I guess. I rework existing boards a lot more often than I do assembly. For changing out discrete resistors/caps/inductors, a pair of good irons works very well. An iron is also preferable for tacking wires onto test points.

Re:None of that mattered, because by JImbob0i0 · 2013-09-11 20:43 · Score: 1

If you read the detail you would see he lost a grand total of three commits ... that he then took from LKML and promptly reapplied to his new instance.

Modernization needed by DrYak · 2013-09-11 22:27 · Score: 2

"Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)" - Linus Torvalds[1]

Pfff... That's soooo last century!

Let me fix that for you, Mr. Torvalds
"Only wimps use tape backup: real men just upload their important stuff on git, and let the rest of the world clone it"
Now that sounds more typical for the current decade.

Oh, and for the MasterCard-Ads like finish:
"For everyone else, there's the NSA."

----

The funniest part is that he is the actual author of the git scm system which served him as backup this time.

--
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]

Re:Pathetic by Builder · 2013-09-11 22:28 · Score: 2

rsync is nothing at all like time machine.

Look a little harder into the versioning aspects of time machine and let me know how to make rsync do that.

And when you've done that, please let me know how to get rsync to manage space and remove the oldest backup first when a full backup cannot complete.

Re:Pathetic by Builder · 2013-09-11 22:29 · Score: 1

Not sure how to raid my laptop without voiding the warranty, opening it up and replacing the DVD drive with a second SSD...

Re:Pathetic by Builder · 2013-09-11 22:31 · Score: 1

I looked into crashplan early this year and decided I didn't want it because they can't guarantee that my data will be safe from US authorities.

With some of what we've learned this year, I'm glad I chose not to do that ...

Until the day.... by DrYak · 2013-09-11 22:35 · Score: 1

Until the day when an obscure bug* triggers a cascade of events flagging you as a "Pedo-Terrorist Pirate Pronographer".

*: Or a fly lands in a typing machine

--
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]

git by Captain_Chaos · 2013-09-11 22:42 · Score: 1

If only he'd been using a centralised version control system... ;-)

Not one but two single points of failure by ThatsNotPudding · 2013-09-11 23:36 · Score: 1

Not just Linus himself, but also his sole hard drive!

This is what concerns me about Linux: a single egg-basket so threadbare as to be nearly see-through.

SSDs die too by 1s44c · 2013-09-11 23:37 · Score: 1

Just so this is totally clear - SSDs die too!

Now that we all know, we can see the point of RAID-1 and/or frequent backups in any critical system.

Re:You trust Torvalds after this? by 1s44c · 2013-09-11 23:39 · Score: 1

PlusFiveTroll was dead on when he said RAID isn't a backup, but ZFS isn't just RAID; it's a whole lot more. ZFS>RAID.

Re:Pathetic by MightyYar · 2013-09-12 00:02 · Score: 1

How can anyone make such a claim and be taken seriously? A warrant will always let the government have your data.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:None of that mattered, because by MightyYar · 2013-09-12 00:09 · Score: 1

Where does it say that? I see in the summary and linked article where it says he's trying to recover the drive and if he can't then people will need to re-submit over the next couple of days.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:You trust Torvalds after this? by jsepeta · 2013-09-12 00:49 · Score: 1

RAID is not a backup BUT it can help when you absolutely must recover data. For mission critical work, you need both.

--
Remember kids, if you're not paying for the service, YOU ARE THE PRODUCT THAT IS BEING SOLD.

Re:None of that mattered, because by Zero__Kelvin · 2013-09-12 01:29 · Score: 1

"You can say that all you want, but if he were running Crashplan, he'd have lost maybe 20 minutes of his working day. "

How many minutes of his working day did he lose?

" Instead he's struggling to recover data from a dead drive."

Where did you get that idea?

"And you seem to have missed the part where he's been trying to recover the dead drive and told people that they will need to re-push their changes over the next couple of days. Does that sound like a trivial amount of work lost to you? Days of recovery?"

Oh my God! They will each have to execute a single push command! The horror!

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

Re:None of that mattered, because by MightyYar · 2013-09-12 02:52 · Score: 1

How many minutes of his working day did he lose?

I have no idea - what is your estimate? I can only go by the information in the linked article, that it will take a few days to sort out:
"Torvalds is attempting to recover the dead hard disk but at the moment it doesn't appear easy and subsystem maintainers who have outstanding pull requests may need to re-submit their requests in the coming days."

Oh my God! They will each have to execute a single push command! The horror!

Making other people pay for your own mistakes is bad karma.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Single drive? by Leofcwen · 2013-09-12 04:25 · Score: 1

Considering the importance of what he does, why didn't he use a multi-drive RAID or ZFS system?

Three Words to choke upon by Gallomimia · 2013-09-12 04:45 · Score: 2

Rack Mounted Server. I just gotta know, why is all this mission-critical operational stuff taking place on a workstation with workstation grade hardware and no backups or raids? Everyone's talking about oh raid at home isn't good, just use backup drives. Look: This is LINUX. If there's need for additional hardware and compile farms, people will probably donate. To have a single SSD failure cause so much calamity for any project, least of all *THE* open source project, is just embarrassing. Worse than swearing at your devs on a mailing list read by the whole world.

--
Sadly, a Libertarian cannot force his views on another, and freedom cannot spread as does the cancer known as religion.

Re:Pathetic by CastrTroy · 2013-09-12 04:59 · Score: 1

Lenovo has some laptops that support changing out the optical drive for a second hard drive. They have some really nice stuff. I'm considering them for my next laptop purchase.

--

Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.

Re:You trust Torvalds after this? by g1zmo · 2013-09-12 05:21 · Score: 1

RAID is a backup scheme for hardware, not for data.

--
I have found there are just two ways to go.
It all comes down to livin' fast or dyin' slow. -REK, Jr.

Re:Both RAID+Back-Up by hawguy · 2013-09-12 05:27 · Score: 1

No, RAID is *not* a backup, RAID's only purpose is to improve reliability/uptime by letting you ride through hardware failures, but it does nothing to protect you from all of the rest of the things that can destroy your data, like file corruption, fat fingering a "rm -rf / home/someuser", a virus, a website hack attack, etc. That's what your backups are for, but you can call them archives if you like, but don't call RAID a "backup" because it's not. Depending on what the problem is and when you discover it, you may need to go back through several archives before you find the data you're looking for.

And yet... If your drive fails just before your scheduled "Backup" starts then if it was part of a redundant RAID then guess what? Your RAID just saved that data yes? It acted as a back-up for at least an entire day's work where-as your official "Backup" did nothing for you in regards to that data.

So yes, RAID "by itself" is not a reliable back-up system in every case. But then, neither is back-up software a 100% reliable back-up system in every case. Clearly both together are actually required in order to have a truly effective back-up system, not just back-up software by itself.

That's why you should tune your Recovery Point Objective appropriately. There are failures that can take out an entire server and/or RAID array, so if you can't stand a day of lost data, you should be making backups more frequently.

RAID is not part of a backup strategy, only backups are a backup strategy. RAID protects you only against a particular type of hardware failure (i.e. hard drive failure) and is part of your high availability strategy. Don't count on RAID to replace a sane RPO for your backups.

Backups and online? by Lord+Chaos+EOG · 2013-09-12 05:53 · Score: 1

I have multiple backups in separate locations and online redundancy (saving 50 past versions of each file, in case of damage, deletes or errors). One offsite backup is a passive one (in case of a systemwide hack) that cannot be accessed from the outside through the net. And this is just for my personal stuff... I'd expect an important person like Linus Torvalds to have a minimum of a basic backup or at least a NAS or online storage.

Poor, poor drive by Anonymous Coward · 2013-09-12 11:49 · Score: 1

I don't want to know the tirade that drive is receiving right now.

Re:None of that mattered, because by MightyYar · 2013-09-12 15:24 · Score: 1

I know that - I use git. It's great, but it is not a complete computer backup system. By its nature, it does a good job keeping copies of Linux all over the world, so that even when the very smart but also apparently very cocky maintainer drops the ball, it only costs the project a few days. However, had he run any of the very, very simple to setup backup applications available out there, we would not have even heard about his SSD dying. No one should know that Linus's drive died - it should be a non-event.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

some models of HD are bad, brand irrelevant by raymorris · 2013-09-12 15:36 · Score: 1

FYI, in a study Google did of thousands of drives, they found that while certain models of drive were good and some bad, all manufacturers had similar failure rates. Western Digital makes some good models and some bad ones, as do all of the other manufacturers.

That said, I run 40 drives at a time. In my environment, at least, HGST (formerly Hitachi) has had the best track record for me.

backup technology has progressed, slightly by SuperBanana · 2013-09-12 18:52 · Score: 1

That is a ridiculous statement. Work is lost every time a drive fails unless it happens to fail immediately after a backup. Full backups take lots of time. If you understood git better [SNIP]

Full backups? LOL, son.

It's not a ridiculous statement. Our backup system backs up 350+ machines every 15 minutes by default, as long as they have a working network connection, anywhere in the world that can reach our server; the client works by watching what files are changed, and periodically (every 2-3 days) doing a full scan in case it missed something. We dialed it back to once an hour based on user feedback - people felt an hour was more than acceptable in terms of lost productivity. We retain those revisions for about a week, and they're progressively pared down. Restores take seconds and are self-service, as is adding another machine to your account.

Furthermore, we use IMAP for email, so even if your workstation or laptop dies in a big puff of smoke, your email isn't lost. 1995 called, wants to know why the fuck Linus is apparently using POP3.

If I had a dollar for every prick developer who thinks they know how to do IT, I'd be rich (and a lot saner.) Programmers are the worst to support by far because they have absolutely zero humility. Everyone else generally either asks how something should be done, or at least has the graciousness to ask if what they have in mind will work. Programmers just charge ahead and assume they know what they're doing because they've got a Mythbuntu box and a Linux NAS box at home...

--
Please help metamoderate.

Re:backup technology has progressed, slightly by Zero__Kelvin · 2013-09-13 01:50 · Score: 1

You don't seem to be able to get your story straight. You say that backups happen periodically and that there is an acceptable loss of work, but go on to complain that Linus only pushes his changes periodically and due to this he has lost work, which is entirely unacceptable.

If you want to know why programmers are the worst it is because you are over-generalizing. You are also trying to enforce a false dichotomy. For example I am both a programmer and a seasoned system administrator. I have certainly encountered programmers who were clueless as well as those who were highly competent, just as I have met highly competent IT support professionals and others who were more like you. Ones who actually think that being an IT drone makes them more knowledgeable than all the programmers. Ones who can't get their story straight, and can't learn from people who are much smarter than they are because they already think they have all the answers about a domain in which they have exactly zero experience.

". Programmers just charge ahead and assume they know what they're doing because they've got a Mythbuntu box and a Linux NAS box at home..."
This statement makes you seem as clueless as you no doubt are about technology. You can't properly solve a problem you don't understand, and you have criticized a person you don't know with absolutely zero knowledge of the situation. I have had to deal with plenty of incompetents like you over the years, and thank my lucky stars I don't have to deal with you now. Plonk.

--
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun

ironic biggest hater of ignorance... by Anonymous Coward · 2013-09-12 21:38 · Score: 1

find it ironic that the biggest and loudest hater of ignorance himself was victim. if anyone else in that circle had that happen it would be message after message of "what a stupid cocksucker you are". what a moron/asshole

Re:Really? Naa by TheGratefulNet · 2013-09-13 01:46 · Score: 1

short hint: search for 'chip quick' (mouser and similar places sell it).

its like solder that stays liquid for a long time and lets you remove chips without special equip.

(you're welcome).

--
"It is now safe to switch off your computer."

Re:Really? Naa by funky_vibes · 2013-09-15 13:05 · Score: 1

Nowadays I actually prefer SMD over thru-hole... with one exception, BGA.
Hand reworking of them is seldom rewarding, especially the now more common FBGA.

And, It's likely that most of the chips on a device like an SSD are FBGA.

The pitch really isn't within the reach of human precision, so you *need* a pick & place machine.
You also need an IR heater, and X-ray to check the connections.

SandForce SSDs out there have been clear of firmwa by RyanEvans5885 · 2013-09-15 18:32 · Score: 1

We can all hear the frustration in your message for having lost data, and for that I am sorry you are experiencing that. You made some interesting statements that I have not seen myself. All the current SandForce SSDs out there have been clear of firmware bugs like this for more than a year from what I see. Mine have all been fine. Are you running with some old FW maybe? The problems I see are more often physical issues that are unrelated to the controller. It is important to select an SSD manufacturer with high quality manufacturing, and don't just go for the cheapest solution. Intel uses SandForce almost exclusively now, so I find it hard to believe what you are describing is 100% true.

Re:Pathetic by Builder · 2013-09-18 02:58 · Score: 1

I'm fine with information being gained under a warrant because then I know someone is looking at me.

With the US warrantless searches and gag orders where warrants are used, I can't be told that I'm being investigated. That doesn't sit well with me, so I'm avoiding US based services for new stuff and moving my existing services out of the USA.

Re:Pathetic by MightyYar · 2013-09-18 04:41 · Score: 1

Ah, now I see your point. Crashplan has an option where you can supply the encryption key. That means the government would have to come to you (or compromise the Crashplan app on your computer) in order to snoop your stuff.

--
W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.

Re:Pathetic by Builder · 2013-09-18 09:08 · Score: 1

I wasn't aware of that... That makes crashplan much more appealing actually. Thanks - I'll read up on it!
