Can SSDs Be Used For Software Development?
hackingbear writes "I'm considering buying a current-generation SSD to replace my external hard disk drive for use in my day-to-day software development, especially to boost the IDE's performance. Size is not a great concern: 120GB is enough for me. Price is not much of a concern either, as my boss will pay. I do have concerns on the limitations of write cycles as well as write speeds. As I understand, the current SSDs overcome it by heuristically placing the writes randomly. That would be good enough for regular users, but in software development, one may have to update 10-30% of the source files from Subversion and recompile the whole project, several times a day. I wonder how SSDs will do in this usage pattern. What's your experience developing on SSDs?"
I'm using the Intel SSD and I think it's great - fast and silent. Will it last? I'd argue you never know about any particular model of hard drive or SSD until a few years after it is released. On the other hand, I'd also argue it doesn't matter much. Say one drive has a 3% failure rate in the 3rd year and another has a 6% rate. That's a huge difference percentage-wise (100% increase). And yet it's only a 3% extra risk - and, most importantly, you need a backup either way.
The main difference is a good SSD is much, much faster than any hard drive. If discussions about the topic don't give that impression, it's only because people fixate on sustained transfer - where there is still some competition between slower SSDs and hard drives - rather than seek time, which is often more important, and where SSDs blow the doors off hard drives. To me, suddenly widening the biggest bottleneck in PC performance for the first time in a couple decades is pretty exciting.
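To put rough numbers on the seek-time point, here is a back-of-the-envelope sketch. Every figure in it (file count, file size, latencies, throughputs) is an assumption picked for illustration, not a measurement of any particular drive, and `total_read_time` is just a hypothetical helper name.

```python
def total_read_time(n_files, avg_file_bytes, access_latency_s, throughput_bytes_per_s):
    """Estimate seconds to read n_files small files one after another.

    Each file costs one access (seek) latency plus its transfer time.
    """
    per_file = access_latency_s + avg_file_bytes / throughput_bytes_per_s
    return n_files * per_file


# Illustrative assumptions only, not measurements of any real drive.
N_FILES = 20_000            # many small source/class files
AVG_FILE_BYTES = 8 * 1024   # 8 KiB each

# Assumed hard drive: ~10 ms per access, 100 MB/s sustained.
hdd = total_read_time(N_FILES, AVG_FILE_BYTES,
                      access_latency_s=0.010,
                      throughput_bytes_per_s=100e6)

# Assumed SSD: ~0.1 ms per access, 200 MB/s sustained.
ssd = total_read_time(N_FILES, AVG_FILE_BYTES,
                      access_latency_s=0.0001,
                      throughput_bytes_per_s=200e6)

print(f"hard drive: {hdd:.1f} s, SSD: {ssd:.1f} s, ratio: {hdd / ssd:.0f}x")
```

Under these assumptions, doubling the hard drive's sustained transfer rate barely moves its total, because nearly all of the time is spent in per-file access latency. That is why comparing drives on sustained transfer alone undersells SSDs.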
I use SSDs in both of my development systems. The first went into my work system, and after seeing the improvement I decided I would never use spinning-platter technology again.
The biggest performance gains are in my IDE (IntelliJ). My "normal" sized projects tend to link to hundreds of megs of JAR files, and the IDE is constantly performing inspections to validate the code is correct. No matter how fast the processor, you quickly become IO-bound as the computer struggles to parse through tens of thousands of classes. After upgrading to SSD, I no longer find the IDE struggling to keep up.
I ended up going with SSD after reading this suggestion for increasing IDE performance. The general gist: the only way to improve the speed of your programming environment is to get rid of your file access latency.
Before we start, let me make a prediction: You never asked about the MTBF of your hard disk, right...?
http://www.intel.com/design/flash/NAND/mainstream/
a) When Intel says "new level of ... reliability", maybe it means they thought about this problem when they designed the drive.
b) When they say "NAND flash", maybe it means they're not using the cheapest MLC memory as mentioned in that scary Wikipedia article.
c) When their datasheet says "Minimum useful life of five years, assuming 20 GB/day of writing", maybe they got those numbers from real engineers, with degrees.
d) When their datasheet also says, "Should the host system attempt to exceed 20 GB writes per day by a large margin for an extended period, the drive will enable the endurance management feature to adjust write performance, this feature enables the device to have, at a minimum, a five year useful life", maybe they were really really paranoid about saying 'five years' because they know people will start class-action lawsuits if it doesn't work out.
So, um, how this even got greenlighted in 2009 is beyond me. It's like 1999 called wanting its flash-myths thread back.
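For anyone who wants the datasheet figure as a write budget, here is a quick sketch of the arithmetic. The 20 GB/day and five years come from the quote above. The 120 GB capacity is the size the submitter mentioned, used here as an assumption, and the amount written per rebuild is a made-up placeholder you would replace with your own measurement.

```python
GB_PER_DAY = 20          # from the datasheet quote above
YEARS = 5                # from the datasheet quote above
DRIVE_GB = 120           # assumption: the capacity the submitter asked about
GB_PER_REBUILD = 1.0     # hypothetical: data written by one full rebuild


def lifetime_writes_gb(gb_per_day, years, days_per_year=365):
    """Total data written over the rated life, in GB."""
    return gb_per_day * days_per_year * years


def full_drive_overwrites(total_written_gb, drive_gb):
    """How many times the whole drive could be rewritten within that total."""
    return total_written_gb / drive_gb


def rebuilds_per_day(gb_per_day, gb_per_rebuild):
    """Full rebuilds per day that fit inside the daily write budget."""
    return gb_per_day / gb_per_rebuild


total = lifetime_writes_gb(GB_PER_DAY, YEARS)
print(f"rated lifetime writes: {total} GB")
print(f"full-drive overwrites: {full_drive_overwrites(total, DRIVE_GB):.0f}")
print(f"rebuilds per day in budget: {rebuilds_per_day(GB_PER_DAY, GB_PER_REBUILD):.0f}")
```

The point is only that "several recompiles a day" has to write a lot of data before it gets anywhere near the rated daily figure. Measure your own build before trusting the placeholder.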
The English language has syntax, too. It concerns things like proper placement and use of apostrophes.
Disagree. This problem went away for the most part.
First, performance isn't nearly the problem it used to be. We no longer use the kind of hardware that needs the programmer to squeeze every last drop of performance out of it. In fact, we can afford to be massively wasteful by using languages like Perl and Python, and still get things done, because for most things, the CPU is more than fast enough.
Second, we're not coding as much in C anymore. In C I could see this argument: a lazy programmer writes bubble sort or something equally dumb because waiting half a second on his hardware isn't such a problem for him. But most of this has been abstracted away these days. Libraries and high-level languages contain highly optimized algorithms for sorting, searching, and hashing. It's rare to need to code your own implementation of a basic data structure.
Third, the CPU is rarely the problem anymore; I/O is. Programs spend most of their time waiting for user input, the database, the network, or in rare cases, the hard disk. A lot of code written today is a shinier version of things written 20 years ago that would run perfectly fine on a 486. Also, for web software the performance of the client is mostly meaningless, since the heavy lifting is server-side.
Also, programming has a much higher resource requirement than running the result. People code on 8GB boxes because they want to: run the IDE, the application, the build process with make -j4, and multiple VMs for testing. On Windows you're going to want to test your app on XP and Vista, on Linux you may need to try multiple distributions. VMs are also extremely desirable for testing installers, as it's easy to forget to include necessary files.
I'd say that giving your developer a 32-core box would actually be an extremely good idea, because multicore CPUs have massively caught on, but applications capable of taking advantage of them are few. Since writing threaded code is not the lazy option and actually takes effort, giving programmers reasons to write it sounds like a very good idea to me.
Cheaper drives (which mgmt is sure to require) have 1,000 write cycles (assuming the worst). For certain high-traffic files, that means (assuming 30 writes in a day) a whole 33 days of use.
If that were true, an SSD couldn't run a Linux mail server for a small business for more than a couple of minutes, thanks to the various log files.
1) The maximum write cycles for a block was around 10,000 in 1994. And about 100,000 in 1997. But in 2009 you think 1000? No. It's currently in the millions, even for the cheap SSDs.
2) Look up wear levelling.
3) Look up the MTBF of an SSD vs. a spinning-platter drive.
I've seen studies that have calculated that modern drives could write continuously at maximum speed for 50+ years before exhausting wear levelling and hitting write cycle limits.
The odds of it failing from something else long before then are much greater. Getting a mere 5+ years of life and easily beating your average spinning-disk hard drive is a no-brainer.
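The two estimates being argued over here can be put side by side with a rough sketch. The 1,000 cycles and 30 writes a day are the parent's worst-case numbers. The 120 GB capacity and 80 MB/s sustained write speed are assumptions for illustration, and the model assumes ideal wear levelling with no write amplification, so treat the output as an upper bound rather than a prediction.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def days_until_worn_out_no_leveling(cycles_per_block, writes_per_day):
    """Worst case from the parent post: every write hits the same block."""
    return cycles_per_block / writes_per_day


def years_of_continuous_writing(capacity_bytes, cycles_per_block, write_bytes_per_s):
    """Ideal wear levelling: writes are spread evenly over the whole drive."""
    total_writable_bytes = capacity_bytes * cycles_per_block
    return total_writable_bytes / write_bytes_per_s / SECONDS_PER_YEAR


# Parent's scenario: one hot block, no levelling at all.
print(f"no levelling: {days_until_worn_out_no_leveling(1_000, 30):.0f} days")

# Assumed drive: 120 GB, written flat out at 80 MB/s, perfectly levelled.
CAPACITY_BYTES = 120e9
WRITE_BYTES_PER_S = 80e6
for cycles in (10_000, 100_000, 1_000_000):
    years = years_of_continuous_writing(CAPACITY_BYTES, cycles, WRITE_BYTES_PER_S)
    print(f"{cycles:>9,} cycles per block: {years:.1f} years of nonstop writing")
```

The result scales linearly with the per-block cycle rating, so the multi-decade figure only holds if the rating really is near the high end. Even at lower ratings, nobody writes to a development drive at full speed around the clock.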
This is why it's almost pointless to ask a question on Slashdot. You get 100s of replies in a 50/50 distribution of random tech-word ramblings and flat out useless contempt, leaving you feeling stupid and your question unanswered.
The whole "millions" thing may be true for SLC parts. MLC parts (which are much cheaper) have much lower write counts. The best MLC flash I'm aware of is only rated for a million write cycles. Thousands or tens of thousands is more typical for MLC flash parts. Write amplification makes this even more fun, since it means that a write of one disk block can require rewriting many, many blocks that otherwise would not have been written. If the wear leveling algorithm is optimal, then it's a moot point. If the wear leveling is nowhere near optimal, you can create artificial workloads that will burn out a few cells on the flash part in hours, which is a bit problematic. There is no clear-cut answer for this sort of question, unfortunately, at least not with the current crop of MLC tech.
Consider a log-structured filesystem, perhaps....
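To make the "optimal vs. nowhere near optimal" distinction concrete, here is a toy simulation. It is a sketch under heavy assumptions: a tiny 64-block device, a 1,000-cycle limit, one erase per write, and no write amplification. The function names are made up for illustration and do not describe how any real controller or filesystem works.

```python
def writes_until_first_failure(n_blocks, cycle_limit, pick_block):
    """Count writes until some physical block has used up its erase budget.

    pick_block(write_index) returns the physical block for that write.
    """
    erase_counts = [0] * n_blocks
    writes = 0
    while True:
        block = pick_block(writes)
        if erase_counts[block] >= cycle_limit:
            return writes  # this write would exceed the block's limit
        erase_counts[block] += 1
        writes += 1


def rotate_over(k):
    """Hypothetical placement policy: cycle writes over the first k blocks."""
    def pick(write_index):
        return write_index % k
    return pick


N_BLOCKS = 64        # toy device size (assumption)
CYCLE_LIMIT = 1_000  # pessimistic per-block rating (assumption)

# k=1: a hot file pinned to one block. k=4: poor levelling.
# k=64: log-style placement that walks across the whole device.
for k in (1, 4, N_BLOCKS):
    n = writes_until_first_failure(N_BLOCKS, CYCLE_LIMIT, rotate_over(k))
    print(f"rotating over {k:>2} blocks: {n} writes before the first worn-out block")
```

In this toy model the lifetime grows in direct proportion to how many blocks the writes are spread over, which is the whole argument for wear leveling and for log-style layouts. Write amplification would shrink every one of these numbers by whatever factor the device actually incurs.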
A similar argument was used in World War II to keep bolt action sniper rifles in use in some countries instead of 'upgrading' to 'auto-loading' rifles. With bolt action, after shooting, you had to physically lift the bolt, cock it in place, and push it down again before you could fire another shot.
The argument was, if the snipers knew they couldn't fire again immediately, they would be more careful lining up and aiming that first shot. With an 'auto-loading' rifle, you could keep your eye in the scope and fire off more rounds.
It seems quite obvious that, if you're in the field, the seconds after that first shot are very important. If you need to take your eye away from the scope and spend the time reloading the chamber, the outcome could be completely different than if you were able to fire off a few rounds immediately.
A good sniper would have lined that first shot up carefully no matter what rifle they were using, in the same way a good programmer will write efficient, elegant algorithms no matter what machine they're using. You'd only have to 'limit' your programmers if you think they're bad programmers. If a supervisor is thinking along these lines, they've already hired bad programmers and are setting both themselves and their team up for failure. The faster the machines, the less time wasted. You don't need forced limits reminding them about efficiency, because any decent programmer will already be thinking about it.