The interesting thing about computers is that assumptions need to be continually adjusted as new realities come into being.
This year, we're seeing a serious convergence:
multi-core is the norm now, even on lower-end computers;
you assume 64-bit unless someone says otherwise;
4 gigs of ram is now the bare minimum, and 6 gigs is the new 2 gigs;
anything less than half a terabyte is "huh? are you kidding?"
I have to admit, seeing that Wallyworld flier with a 17.3" laptop, dual core, 64 bit, 6 gigs of ram, 640 gig hd, 1600x900 screen - that's a game changer.
Most of the people I know would never have to delete a file if they kept it for 5 years. That seriously changes the way you work with a computer. And if it gets "full", less than $100 gets you a second internal drive. You could throw a gig a day onto it every day for 3 years and never delete anything. Empty the recycling bin? What recycling bin?
So what do you do with all that? One obvious thing is to get rid of your swap file - you no longer need it. 6 gigs not enough? Swap out the two 1-gig chips for 2 more 2-gig chips to bring it up to 8 gigs.
What else can you do? A ram disk for your temporary files is kind of obvious... it would speed up a lot of things. So would a massive disk cache - much more than a non-volatile-ram SSD.
Building file versioning directly into the file system, rather than having applications manage it, is also an option. I don't mean journaling file systems, but true versioning, so that you can see the changes, fork off any previous version to a new file, revert changes, etc.
Cheap ram, multiple cores, and oodles of disk space - it's not the same "user space" it was even a couple of years ago. What are you going to do when an 8-core machine with a nice display, 16 gigs of ram, and 6 TB of disk space is at the same price point? By then, you're going to not just expect, but NEED the OS to be able to file things by a simple set of rules, since it's going to be like a closet that you throw everything into, and it magically hangs everything up in the right place.
Eventually, we'll go all solid-state, but rust has a good decade left in it - or more. After all, what if we didn't have to spin it to read/write tracks? No more head crashes...
That doesn't make a difference. I just manually checked my /lib and there are only 2 files smaller than 4k that are not symlinks - one's a chk file and one is a shell script to call gcc.
The same for /usr/lib - loads of files in the 6-figure size. There's also plenty of smaller files for html documentation (I have about 5,000 packages installed on this box), but they don't get read when you load a program library...
Also, according to your own statements, any file over 512 bytes that was contiguous wouldn't benefit from a faster IOPS for the 2nd and subsequent sectors.
Take a file that's read in one shot at 20k. That's 40 sectors (5 blocks) that have been merged into one read. A drive that does that 100 times a second will show pretty much the same performance as an SSD reading those same files that's rated at 4000 IOPS - because in the SSD case, 39 out of 40 (more than 97%) of all reads aren't random.
So take my desktop, which normally runs with 4 drives. If all 4 drives have a crappy 100 random IOPS, that's still a combined 400 IOPS capacity. Now throw in that same 40x multiplier, and they can compete with an SSD that does 16,000 IOPS.
Of course, the HD situation is actually better than theory, because the drives are separated by workload, so there's more likelihood that files will not be fragmented (for example, a piece of a log file being written to a block between two pieces of an ISO), and this is what we see on /var/log, where the average continuous read is 70 blocks (560 sectors). So only 1 out of 560 sectors is a "random read". That's less than 2 tenths of 1 percent.
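To make the arithmetic above easy to check, here is a minimal Python sketch of the same back-of-envelope model. The function names are made up for illustration, and the model is the simplification used in this post (only the first sector of a contiguous read counts as "random", every later sector is free), not a benchmark.

```python
SECTOR_BYTES = 512  # sector size used throughout the post


def sectors_per_read(read_bytes, sector_bytes=SECTOR_BYTES):
    """Number of sectors covered by one contiguous read."""
    return read_bytes // sector_bytes


def sector_equivalent_iops(random_iops, read_bytes, drives=1):
    """Sectors delivered per second if every random I/O pulls in a
    whole contiguous read, summed over independent drives."""
    return random_iops * sectors_per_read(read_bytes) * drives


def random_fraction(sectors_in_run):
    """Share of sectors in a contiguous run that needed a seek (the first one)."""
    return 1 / sectors_in_run


if __name__ == "__main__":
    print(sectors_per_read(20 * 1024))                      # 40
    print(sector_equivalent_iops(100, 20 * 1024))           # 4000
    print(sector_equivalent_iops(100, 20 * 1024, drives=4)) # 16000
    print(f"{random_fraction(560):.3%}")                    # 0.179%
```

The model deliberately ignores transfer time and queueing, so treat the output as the upper bound the argument relies on rather than a measured figure.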
But let's forget all that and look at the near future.
People want their computers to work for them. They don't want to waste time organizing files, moving files, deleting files to free up space. With terabyte drives now pretty much the norm, people are going to end up with file systems that are completely free of fragmentation. After all, if you never delete anything, and there's always enough free space to save a file without fragmenting it when you edit it (and you keep track of "holes"), what's the problem?
People keep their systems for 3 to 6 years. It takes a LOT for the average user to fill up even a pair of 320 gig hard drives, and today's cheapie (sub-$500) laptops are shipping with 6 gigs of ram and 640 gigs on a single drive.
It's going to be like gmail - never delete a file. When your laptop gets full, buy another one with 4 times the storage for less.
4 years ago, this computer had twin 250 gig hds... and they were getting kind of full, because of multiple backups on each drive. Quad 320 gig hds, well, there's still lots of space... even with multiple copies of important stuff on multiple drives.
One of my friends bought a new PC last Christmas - quad core, 640 gig hd - and he added 3 x 1TB hard drives, for /, /var, and /home. He can add a gig of data a day, 5 days a week, for the next decade, never delete a thing, and not fill it up.
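The "never fill it up" claim is simple division, sketched below in Python. The helper name is hypothetical, a year is rounded to 52 weeks, and a "gig" is treated as a plain decimal gigabyte, so the results are approximate.

```python
def years_until_full(capacity_gb, gb_per_day, days_per_week=7):
    """Years needed to fill capacity_gb at a steady rate, assuming
    a 52-week year and that nothing is ever deleted."""
    gb_per_year = gb_per_day * days_per_week * 52
    return capacity_gb / gb_per_year


if __name__ == "__main__":
    # Three added 1 TB drives, a gig a day, 5 days a week: about 11.5 years
    print(round(years_until_full(3000, 1, days_per_week=5), 1))
    # A 640 gig laptop drive plus a second 640 gig drive, a gig every day: about 3.5 years
    print(round(years_until_full(1280, 1), 1))
```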
That's the future we're looking at - massive storage will change people's habits. People already can't file things properly, so they'll leave "finding stuff" to the computer.
And when their box gets "full up", they'll just go buy another one twice as fast and 4x as much storage for half the price, and keep on chugging along.
SSDs will have to be able to compete on price, because otherwise, user laziness and convenience trumps speed when it comes to the whole value proposition.
Besides, admit it, the thought of 4 terabytes in a laptop sometime in the next couple of years is just... WOW! You want it just as much as I do :-)
And with file systems that large, we can integrate file versioning right into the OS, same as VMS. Think of it - outline.txt:1, outline.txt:2, etc. Simple, no special tools or programming required, available to all programs, and easy to do a diff between any two versions.
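For a feel of what numbered versions would look like, here is a rough user-space sketch in Python. It only imitates the idea with ordinary files named outline.txt:1, outline.txt:2 and so on; a real implementation would live in the file system itself. The function names and the ":" separator are assumptions for illustration (VMS itself writes the version after a semicolon), and the separator has to be a character your file system allows in names.

```python
import difflib
import os
import re


def next_version_path(path, sep=":"):
    """Return path + sep + N, where N is one higher than any existing version."""
    directory = os.path.dirname(path) or "."
    base = os.path.basename(path)
    pattern = re.compile(re.escape(base + sep) + r"(\d+)")
    versions = []
    for name in os.listdir(directory):
        match = pattern.fullmatch(name)
        if match:
            versions.append(int(match.group(1)))
    return f"{path}{sep}{max(versions, default=0) + 1}"


def save_version(path, text, sep=":"):
    """Write text as a brand-new version; older versions are never touched."""
    target = next_version_path(path, sep)
    with open(target, "w", encoding="utf-8") as handle:
        handle.write(text)
    return target


def diff_versions(old_path, new_path):
    """Unified diff between two saved versions."""
    with open(old_path, encoding="utf-8") as handle:
        old_lines = handle.readlines()
    with open(new_path, encoding="utf-8") as handle:
        new_lines = handle.readlines()
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile=old_path, tofile=new_path)
    )
```

Calling save_version("outline.txt", ...) twice in an empty directory leaves outline.txt:1 and outline.txt:2 side by side, and diff_versions on the pair shows what changed. Nothing is ever overwritten, which is exactly why this only makes sense when disk space is effectively free.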
That's where the future is, and for the next while, it's not doable with SSDs.
The interesting thing is that not one of your files fits within one sector (512 bytes), so, since linux does a very good job of making sure files stay contiguous, even a one-block chunk (4k) is going to result in a read of 8 contiguous sectors.
Now that swap is fast becoming obsolete (gotta love Wallyworld starting a price war with FutureSh*t by selling 6gigRam/640gigHD laptops for $498) it's going to be cheaper just to buy a few extra gigs of real ram and make a ram disk.
IOPS count when making random reads/writes, and the most often to occur is for swap - after all, it's catch as catch can - there's no way to predict it - and it's called "thrashing" for a reason.
On sequential reads, when you don't have to move from track to track, and the file is contiguous, there is no advantage to SSDs. Even Intel admits as much (which is why they emphasize random reads/writes).
But even if every 4k block is fragmented (you'd have to work hard to do that), you still end up reading 8 sectors at a time for each 4k block, so you still end up with less of an advantage to SSDs than you'd think.
Throw in a couple of huge (640gig or 750gig) hard drives, so the user NEVER has to delete anything, and all new files are written in one nice long data stripe, a couple of gigs for a ram disk (which is WAY faster than any SSD on the planet that is using non-volatile ram - flash memory cannot compete with system ram) and the balance tilts back to the "old tech."
Much as I don't exactly like Microsoft, they're finally getting it right - pre-loading the most used stripes off the drive into ram so they can be mapped into the right address space when the user wants them, instead of being loaded from disk. Write all that to one long data stripe for the next reboot, and your next boot has no random disk seeks. You just read that whole chunk into ram - almost like resuming after hibernating.
No question about it - if money and capacity were non-issues, the debate would be moot. However, when a terabyte of SSD costs 30x what a terabyte of rust-spinning bit-bucketry costs, for most people it will be an issue.
People want storage over speed. Jobs has it wrong. A computer with massive amounts of storage (so you never have to delete anything) is what people want. It's why so many like gmail - they never have to delete an email again.
SSDs aren't in that price range yet where most people can stuff a terabyte into a laptop.
For laptops, the drives are fast enough that two do the job nicely. It's one reason servers are moving to laptop HDs.
But I have a better deal for you - instead of a SSD, why not invest the money in more real ram, and make a big ramdisk. That will be MUCH faster than any SSD on the planet, so you will get the same speed bump with a much smaller "disk" size.
Ever try running apps from a ramdisk? It's not fast - it's CRAZY fast.
So throw in an extra 2 gig to upgrade that $498 6 gig laptop to 8 gig of ram, devote 4 gig to a ramdisk, and go crazy. Just remember to copy the files to hd before shutting down :-) Faster, cheaper, quicker than an SSD - what's not to like?
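The "copy the files to hd before shutting down" step is the part people forget, so here is a minimal Python sketch of it. The function name and the example paths are placeholders, not anything your distro ships; it needs Python 3.8 or later for the dirs_exist_ok flag, and you would still have to hook it into your own shutdown routine.

```python
import shutil
from pathlib import Path


def flush_ramdisk(ramdisk_dir, backup_dir):
    """Copy everything under ramdisk_dir onto persistent storage.

    Existing files in backup_dir with the same name are overwritten.
    Returns the number of regular files present in backup_dir afterwards.
    """
    source = Path(ramdisk_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source, target, dirs_exist_ok=True)
    return sum(1 for item in target.rglob("*") if item.is_file())


if __name__ == "__main__":
    # Hypothetical mount point and backup location; adjust for your own box.
    print(flush_ramdisk("/mnt/ramdisk", "/home/me/ramdisk-backup"))
```

Restoring at boot is the same call with the two arguments swapped.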
So what - you only log in between reboots anyway... oops sorry, you're running Windows... my bad. :-)
Seriously, will you be that much more productive now that you have no time to grab a cup of coffee and say good morning to your co-workers? I doubt it.
And you could always help yourself by killing iTunes. Or at least make it run faster
2 Grant of license. (1) Oracle grants you a personal, non-exclusive, non-transferable, limited license without fees to reproduce, install, execute, and use internally the Product a Host Computer for your Personal Use, Educational Use, or Evaluation. “Personal Use” requires that you use the Product on the same Host Computer where you installed it yourself and that no more than one client connect to that Host Computer at a time for the purpose of displaying Guest Computers remotely. “Educational use” is any use in an academic institution (schools, colleges and universities, by teachers and students). “Evaluation” means testing the Product for a reasonable period (that is, normally for a few weeks); after expiry of that term, you are no longer permitted to evaluate the Product.
I'm sorry, but for me, a keyboard without a number pad is not a "full-sized keyboard" any more than an ice cream sundae without the toppings is an ice cream sundae. And that ENTER key leaves a lot to be desired.
Now I understand that, to get everything to fit, and to keep things cheap, they can't have different keyboards for the larger models... but that's a design flaw, the same as the "mighty mouse".
I don't disagree - it's just that today's consumer is being "educated" to never have to delete anything again - ever.
About 6-7 years ago, a laptop with a 50-60 gig hd and a half-gig of ram for $999 was considered a good deal. But 20 gigs of music and a dozen games would put a serious dent in it.
Around 3-4 years ago, a laptop with 200 to 250 gig hd and two gigs of ram for $800 was considered a good deal.
Today, $498 buys you a 17" laptop with a 640 gig hd, 6 gigs of ram, and a 1600x900 screen. They'll never delete anything again. When the drive gets full (it will), they'll just stuff a second one in the second drive bay.
When THAT gets full, quad cores with 12 gigs of ram, and 3 tb of disk space will be what, $399?
People can't manage large quantities of data. That's the real reason online email is popular - you never delete anything. It's why consumers want huge hard drives - they never have to worry about cleaning it up.
Jobs knows this - it's one reason iPads aren't upgradeable - you don't want to mess with your files? Fine - buy a new iPad with more features and a bigger storage capacity.
Of course, that backfires when the same consumer sees that, for the same $500 that bought an iPad with 16 gigs of space where you had to be careful not to fill it up, you can buy a laptop with 40x the space... and double its storage at a future date for under $100, no external drive required...
Think of it - add a gig of data every day for 3 years without outgrowing it or deleting anything... that's what consumers want. Brain-dead convenience. Organizing all that data? "That's the computer's job." And they're absolutely, 100%, right - it *is* the computer's job.
So someone trying to sell them on a 32gig SSD that costs more than a 640 gig hd... they don't care if it takes a few seconds more to boot. They don't care if their word processor opens up a second or two quicker. They want storage. It's why gmail was such a success.
They also are getting older, so they want bigger displays. And they want others to be able to share those displays without having to crowd in. And their hands are getting older and maybe a bit arthritic, so they don't want the "one size fits all - if you're from Lilliput" keyboard on Apple laptops.
When you have all your data in 512-byte sectors and the majority of the time in normal usage your data is NOT sequentially laid out
Start with a false assumption... most modern file systems are smart enough to lay out data sequentially.
And most consumers I know want more disk space, not less. They don't want to bother ever having to clean up their drives. That's why on large drives, file fragmentation is not such an issue, even on dumb file systems (like the one you mentioned :-) Nobody ever deletes anything. Don't believe it? Go look at their desktop. It gets CRAZY! :-)
As for boot times, if you're only booting once a month (most people just suspend), who cares?
My office machines stay on 24/7 because one is also hosting the company wiki and a few test databases, and another is an iCrap that basically anyone can grab and use if they need to - it's used to hold a local copy of a bunch of files that we have backed up on a raid. Better you screw up the local copy:-)
My home desktop and my laptop get turned off, but this is for energy consciousness. Boot times there are also irrelevant - I turn them on, do a few things, then come back to them to check the weather, etc. They could take 10 minutes to boot, and it wouldn't make much of a difference.
A linux disk block isn't a hd physical block, it's a logical block. Your 17.11 blocks/read is actually 17.11*4k per read. (dumpe2fs gives 4k per logical disk block on my machines - on a smaller drive, you might have set this to 1k, but bigger logical blocks give better performance overall on large files).
So that's 70k average per read.
Quick way to tell - "ls -l", and see what your directory size is - it should be either 1k or 4k.
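If you'd rather ask the OS than eyeball a directory listing, the Python sketch below does the same check and the same multiplication. The helper names are made up; os.statvfs is Unix-only, and the value it reports is the block size the file system advertises, which I'm assuming matches what dumpe2fs shows on a typical ext file system.

```python
import os


def fs_block_size(path="/"):
    """Block size in bytes reported for the file system holding path."""
    return os.statvfs(path).f_bsize


def bytes_per_read(blocks_per_read, block_size):
    """Average bytes moved per read, given an average run length in blocks."""
    return blocks_per_read * block_size


if __name__ == "__main__":
    size = fs_block_size("/")
    print(size)
    # 17.11 blocks of 4096 bytes is about 70,083 bytes per read
    print(round(bytes_per_read(17.11, 4096)))
```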
My point was I wouldn't buy a mac. I'd buy (did buy) a non-mac 17" with a real keyboard and a real number pad.
You'd think that for the price, Apple wouldn't be so cheap as to use the smaller keyboard on their biggest model. Then again, Apple isn't about functionality. We saw that with the Mighty Mouse, the Hockey Puck Mouse, AntennaGate, the iMac keyboards - one of the guys at work was complaining about the poor design of his current ipod - it's like nobody actually tests these things in the real world, just in the Steve Jobs Multiverse with Non-Optional Reality Distortion Field.
You are forgetting:
- Rotational latency
- Directory access
- File fragmentation
- Swapping
Nope.
If you're still using swap, it's time you got a new computer. Swap is like disco - it's dead. And you don't need ATIME, so mount the fs with the noatime option.
File fragmentation - get a modern OS. Even a fragmented file will have large chunks that are laid out sequentially, so even in those scenarios, sequential read/write speed is more important than raw IOPS.
Directory entries are cached.
Rotational latency is almost non-existent - and IS non-existent for sequential reads. I think you meant track-to-track latency, which is also greatly reduced by large drives with large data stripes, big hardware caches, and splitting your data and apps among multiple drives.
Note - these are drives, not partitions. Any read on /home doesn't cause a seek away from the open log file on /var, or pollute that drive's cache.
Now throw in multiple cores...
Your single SSD is now an (expensive) bottleneck in comparison, and a much smaller capacity.
The rust bin buckets are FAR from dead. Not when you can buy 30x the storage, with real-world performance that is just as good if not better, for the same price.
The cheapest ways to give an older machine new life are the same as always - more ram, bigger, faster drives, and a better video card. While an individual SSD will be faster, it will be much smaller for the same price, and not much faster, if at all, when you get into 4 or more drives.
As I pointed out, MY home usage is an exception that tends to back up what I say even more than the regular user. Then again, go into any user's desktop directory... most have LOTS of big files there.
Or do like I did - go look in /lib, where most of your programs actually live. The only files at 4k or under are symlinks and directory entries.
Or go look in someone's document directory. Ever see a word doc under 4k?
Ditto their music...
In other words, for all the stuff people actually DO with computers, the speed of sequential reads and writes of large files is the most important factor, not IOPS.
And now that we no longer need swap files thanks to cheap ram, random IOPS is even less important. Maybe it's time for Intel and AMD to offer cpus that save some complexity and juice by no longer supporting physical swap.
You will almost NEVER read or write just 1 block of data. Use a real-world scenario, instead of doing what you accused me of doing - lying. Go look at how large the files are in /lib. The average is over 100k (the symlinks are small, but you still end up having to read the file it points to).
No operating system in the last 2 decades has had an average file size of under 4k.
And then we have things like Word docs... ever see one of those under 4k? Even an empty one? Your scenario never happens in the real world, where sustained throughput is more important.
BTW - the main advantage of high IOPS is for swapping - if you're still running a swap file, you need to upgrade. Even a cheap $498 laptop comes with 6 gigs of ram standard.
Intel's own stats only claim that SSDs are 4x the speed of hds in actual use - and only 2x the speed of a 15krpm drive. You can increase the effective speed of an array of HDs by removing most of the need to seek from track to track.
No swap, and remember that NOATIME is your friend.
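Here is a quick Python sketch for checking both on a Linux box. It assumes the usual /proc/mounts and /proc/swaps text layouts (noatime being a mount option that shows up in the fourth field of /proc/mounts); the function names are hypothetical, and the parsers take the text as an argument so you can try them on a pasted sample.

```python
def mounts_without_noatime(mounts_text, device_prefixes=("/dev/",)):
    """Mount points of real block devices that are not mounted with noatime."""
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        device, mount_point, _fstype, options = fields[:4]
        if device.startswith(device_prefixes) and "noatime" not in options.split(","):
            missing.append(mount_point)
    return missing


def swap_active(swaps_text):
    """True if /proc/swaps lists anything after its header line."""
    entries = [line for line in swaps_text.splitlines()[1:] if line.strip()]
    return len(entries) > 0


if __name__ == "__main__":
    with open("/proc/mounts", encoding="utf-8") as handle:
        print("no noatime:", mounts_without_noatime(handle.read()))
    with open("/proc/swaps", encoding="utf-8") as handle:
        print("swap in use:", swap_active(handle.read()))
```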
Each drive has its own 32meg cache, and since there's no cache pollution from reads on other drives, is more likely to get a hit. Also, the OS and drive both implement read-ahead as well as the elevator algorithm for head movement, resulting in further improvements in overall read and write speeds.
So even the 4-disk "ordinary" drive setup of today can equal the SSD - and 4 raptors will eat it for lunch.
Think of it - on your single SSD, copying a large file from one directory to another in /home kills it. On this setup, the other 3 drives are totally unaffected. A process that needs to load something from /lib doesn't have to share I/O with your file copy. Neither does the process writing to /var/log. And none of them care that someone is downloading from /srv/ftp.
Throw in a multi-core setup, and you can easily see the advantages over the bottleneck a single SSD has.
And then there's the price. For 1 TB, SSDs are 30x the price. No thanks.
Remove your swap file - you don't need it when even a $500 laptop comes with 6 gigs of ram. There goes the #1 advantage of SSDs - no disk thrashing on swapping.
So now, instead of IOPS, your primary goal is sustained throughput. A 4-drive setup gives you the same read/write throughput as an Intel X25 SSD (which claims 4x the throughput of a regular hd, and 2x the throughput of a fast hd, so the math is really simple), but much more bang for the buck.
Copying a large file in /home no longer affects /srv or /var - and remember, each of the drives has a 32meg hdd cache. Combine that with look-ahead, elevator algorithm head movement, NOATIME, and you have a system where, unlike the single SSD, copying a file in /home to another directory has ZERO effect on the performance of the other drives.
This is great for web servers, because writes to the log file no longer generate much head movement, and reads to serve up data no longer move the heads away from the log file. Throw in that now, each drive is also much more likely to score a cache hit on its particular data than would happen with one big 128meg hd cache, and it's not just a serious competitor to SSDs - if you need faster performance, you can beat that single SSD by a factor of two (Intel's numbers) by switching to 15k rpm HDs.
My point isn't that SSDs are bad - who wouldn't want one - IF...? They have their advantages - but real-life performance on a dollar-for-dollar basis (or even 10-to-one basis) isn't there when you get near a terabyte, and even those $498 laptops and desktops in this weekend's flier have 640gig hds and 6 gigs of ram.
IF they were comparable in price.
IF they were comparable in capacity.
THEN I'd use them. They're neither, so I don't, and I don't see anyone around me making the switch either. Not when each new machine is already so much faster than our previous one. "Good Enough Computing" - I'll spend the savings elsewhere.
The advantage of SSDs is no heads to move, so there's no time lost seeking track-to-track. By sticking /var on its own drive, /srv on its own drive, /home on a third, and / on a fourth, and mounting each file system with NOATIME (after all, if the SSD doesn't use it, why penalize the rust(bit)buckets :-), you greatly reduce the need to seek for reads and writes - and most of the time, your next read will be in the HD cache, so it's the same speed as an SSD (and the HD will also cache your next few writes, so same benefit).
Everything benefits, from compile times to web serving - the drive is no longer the bottleneck - it's the rest of the system. So you end up with much larger capacities, and a compile or file copy in your /home drive/directory doesn't affect the web server on /srv, or the writing of the logs to /var/log/apache2/
The benefit of SSDs to be able to handle task swapping for virtual memory is now pretty much moot - $500 gets you a system with 6 gigs of ddr3. Remove your swap or page file - you don't need it, you don't need the overhead of managing it, and you won't see any disk thrashing - if Firefox or Openoffice or Opera become too much of a hog, they should be killed off and restarted anyway.
So, given the choice between a single SSD and 4 much larger rustbuckets, I pick the rustbuckets - the combined bandwidth exceeds the single SSD, much higher capacity for less money, when I buy bigger ones, the older ones become backups or can be gifted to someone else, and the performance and flexibility of a multi-drive setup is nice.
And yes, hard drives ARE getting faster - combine 4 drives with 32mb disk cache each, a hardware-implemented elevator algorithm, more data per track (so fewer seeks), and today's drives outperform previous drives by a wide margin.
I guess you aren't a developer, if you can get by with only one box.
Compatibility testing, backing up to the second machine, running my own svn/ftp/http/ssh servers (separate from the ones I run on my main machine), keeping personal stuff (email, etc) isolated on one machine, business email on another, these are all valid reasons to have a second computer (or in my case, since nobody else at the office wants to use the iCrap, a third). Having access to an extra machine so I can do a quick fix on a problem, rather than disturb the workflow on my current box, is also convenient.
My important points about 17" laptops, and ones everyone seems to have overlooked, are that
- they almost all come with an empty second drive bay - this makes it VERY easy to set up a dual-boot system
- the screen is large enough to share conveniently - no fumbling with a projector and room lights
- they also tend to run cooler (more free space inside the casing, so the fan comes on less often, so it's less drain on the battery)
- bigger battery
Yes, they weigh a bit more. The extra weight is irrelevant when it's sitting on a desk, which is where I work. I have yet to see anyone spend much time with a laptop actually on their lap, unless it was off/hibernating/sleeping/broken... :-)
Are you aware that all the Apple laptops (even the 11.6" MacBook Air) use the same size keyboard?
That is so sad. A keyboard without a number pad and full-size keys is not a very convenient keyboard. That's one of the reasons I like 17" laptops - the others being the bigger display and the room for a second HD.
I want it for work - not as a fashion accessory. If I want help generating typos, I'll get my dog to lick the keyboard - I don't need to spend several times the price for an undersized one.
my home directory, the average file size is 19,065,740 bytes
Given that you said "average", I presume you mean mean, which is not a good indicator of the most frequently present file. Median would be better, if you want to say "50% of my files are under this size".
No - I said average because I meant average.
Pretty much the only files under 4k in size were symlinks - and they point to much bigger files. The argument in favour of SSDs fails because pretty much any file access, even to a symlink, will end up reading one of those larger files, so sequential read speed definitely becomes a factor in almost every file access.
Just going to multiple hard drives (forget raid5 - that sucks the speed out of a system in comparison to either separate or mirrored drives) will give you all the speed advantages of SSDs, and a lot more storage space, for a lot less money.
So it's irrelevant if the median is 195,000 bytes, and the average is 250,000 bytes - they're BOTH well outside the original poster's claim that the OS uses mostly files under 4k. That has never been true, except perhaps under Microware's OS9 (not the same product at all as the much later Mac OS 9), where modules could be made in a few hundred bytes of position-independent code, and the whole system could fit on two single-sided 160k floppies - quite an achievement for a multi-tasking OS that supported multiple text and graphical terminals.
Most people's home directories are exceptions to this - configuration files tend to be smaller, and so are startup scripts - but the code they call to run is much larger - and it has to be loaded into memory to run, which means it has to be read off the disk - so that 100-byte script can still end up resulting in several megs of disk activity.
It's just that it happens so fast, that people forget all that other activity.
Now tell me, are you going to be more productive with 4 crappy laptops or 1 really good laptop?
I'll certainly be more productive with my current laptop running linux than with any laptop on the planet running osx.
I'll be even more productive with 2 laptops
My home setup is my 17" laptop, and twin 26" screens on my desktop (or sometimes one of the 26" screens running as a secondary on the laptop), so yes, I can use 3 laptops.
My setup at work is my laptop, a desktop with dual monitors, and another desktop. So I can definitely use 4 laptops.
There are plenty of use cases where multiple computers don't just beat a single one - especially if you're a developer, it's pretty much mandatory. So yes, 4 $500 laptops (especially since 17", 4 gigs of ram, etc., is "good enough") are WAY more productive than even the best, most pimped out, mac air. Besides, they don't make a 17" mac air, and I won't work without a full-sized keyboard and a decent-sized screen - 17" is the minimum, even on a laptop.
I see too many people futzing around with smaller screens, and they don't realize how hunched-over they are, trying to see the details on a smaller screen. Extended use of a smaller screen is a health hazard.
my home directory, the average file size is 19,065,740 bytes (there's always a few tarballs sitting there waiting to be filed away...)
/usr/bin - 196,506.23 bytes average - what good is an OS without programs?
/usr/lib - 173,865.68 bytes average - how can you run without libraries?
my download directory - 475,798,512 bytes average (there's a few linux dvd isos in there...)
There are very few places on my system that would have an average size under 1k, unless you want to count symlinks - but they end up resolving to much larger files, so you end up with the same end result - sequential read speed is important, and especially so for large files (there's not much fragmentation on my system since I have LOTS of space).
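Anyone who wants to run the same numbers on their own box can use something like the Python sketch below. It reports both the mean and the median for the files directly inside one directory, following symlinks to the real file (broken links are skipped). The function names are made up for illustration, and it doesn't recurse into subdirectories.

```python
import os
import statistics


def file_sizes(directory):
    """Sizes in bytes of regular files directly inside directory.

    Symlinks are followed, so a link counts as the file it points to.
    """
    sizes = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=True):
                sizes.append(entry.stat(follow_symlinks=True).st_size)
    return sizes


def summarize(directory, small_limit=4096):
    """Mean, median, and count of files under small_limit bytes, or None if empty."""
    sizes = file_sizes(directory)
    if not sizes:
        return None
    return {
        "count": len(sizes),
        "mean": statistics.mean(sizes),
        "median": statistics.median(sizes),
        "under_limit": sum(1 for size in sizes if size < small_limit),
    }


if __name__ == "__main__":
    for place in ("/usr/bin", "/usr/lib"):
        print(place, summarize(place))
```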
-- Barbie
The same for /usr/lib - loads of files in the 6-figure size. There's also plenty of smaller files for html documentation (I have about 5,000 packages installed on this box), but they don't get read when you load a program library ...
Also, according to your own statements, any file over 512 bytes that was contiguous wouldn't benefit from a faster IOPS for the 2nd and subsequent sectors.
Take a file that's read in one shot at 20k. That's 40 sectors (5 blocks) that have been merged into one read. A drive that does that 100 times a second will show pretty much the same performance as an SSD reading those same files that's rated at 4000 IOPS - because in the SSD case, 39 out of 40 (more than 97%) of all reads aren't random.
So take my desktop, which normally runs with 4 drives. If all 4 drives have a crappy 100 random IOPS, that's still a combined 400 IOPS capacity. Now throw in that same multiplier, and they can compete with an SSD that does 16,000 IOPS,
Of course, the HD situation is actually better than theory, because since the drives are separated as to work, there's more likelyhood that files will not be fragmented (for example, a piece of a log file being written to a block between two pieces of an ISO), and this is what we see on /var/log, where the average continuous read is 70 blocks (560 sectors). So only 1 out of 560 sectors is a "random read". That's less than 2 tenths of 1 percent.
But let's forget all that and look at the near future.
People want their computers to work for them. They don't want to waste time organizing files, moving files, deleting files to free up space. With terabyte drives now pretty much the norm, people are going to end up with file systems that are completely free of fragmentation. After all, if you never delete anything, and there's always enough free space to save a file without fragmenting it when you edit it (and you keep track of "holes"), what's the problem?
People keep their systems for 3 to 6 years. It takes a LOT for the average user to fill up even a pair of 320 gig hard drives, and today's cheapie (sub-$500) laptops are shipping with 6 gigs of ram and 640 gigs on a single drive.
It's going to be like gmail - never delete a file. When your laptop gets full, buy another one with 4 times the storage for less.
4 years ago, this computer had twin 250 gig hds ... and they were getting kind of full, because of multiple backups on each drive. Quad 320 gig hds, well, there's still lots of space ... even with multiple copies of important stuff on multiple drives.
One of my friends bought a new PC last Christmas - quad core, 640 gig hd - and he added 3 x 1TB hard drives, for /, /var, and /home. He can add a gig of data a day, 5 days a week, for the next decade, never delete a thing, and not fill it up.
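A quick sanity check on that claim, as a sketch. `years_to_fill` is a name invented here, a terabyte is taken as 1000 gigs, and formatting overhead is ignored.

```python
def years_to_fill(capacity_gb, gb_per_day, days_per_week=7):
    """Years until the storage is full at a steady rate, with nothing ever deleted."""
    gb_per_year = gb_per_day * days_per_week * 52
    return capacity_gb / gb_per_year

# 640 gig drive plus 3 x 1 TB, filling at 1 gig a day, 5 days a week
print(round(years_to_fill(640 + 3 * 1000, 1, days_per_week=5), 1))  # 14.0
```

So even at that rate, the decade passes with room to spare.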
That's the future we're looking at - massive storage will change people's habits. People already can't file things properly, so they'll leave "finding stuff" to the computer.
And when their box gets "full up", they'll just go buy another one twice as fast and 4x as much storage for half the price, and keep on chugging along.
SSDs will have to be able to compete on price, because otherwise, user laziness and convenience trumps speed when it comes to the whole value proposition.
Besides, admit it, the thought of 4 terabytes in a laptop sometime in the next couple of years is just ... WOW! You want it just as much as I do :-)
And with file systems that large, we can integrate file versioning right into the OS, same as VMS. Think of it - outline.txt;1, outline.txt;2, etc. Simple, no special tools or programming required, available to all programs, and easy to do a diff between any two versions.
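To give a feel for it, here is a user-space sketch in Python. It is NOT how VMS or any real file system does it - `save_version`, `diff_versions` and the `sep` parameter are names invented for illustration, and a real implementation would live in the file system so every program gets it for free.

```python
import difflib
import os
import re

def save_version(path, text, sep=";"):
    """Write text as the next numbered version of path, e.g. outline.txt;1, outline.txt;2."""
    directory, name = os.path.split(path)
    directory = directory or "."
    pattern = re.compile(re.escape(name + sep) + r"(\d+)$")
    versions = [int(m.group(1)) for f in os.listdir(directory)
                if (m := pattern.match(f))]
    next_path = os.path.join(directory, f"{name}{sep}{max(versions, default=0) + 1}")
    with open(next_path, "x", encoding="utf-8") as fh:  # "x" refuses to overwrite
        fh.write(text)
    return next_path

def diff_versions(path_a, path_b):
    """Unified diff between any two saved versions."""
    with open(path_a, encoding="utf-8") as a, open(path_b, encoding="utf-8") as b:
        return "".join(difflib.unified_diff(a.readlines(), b.readlines(),
                                            fromfile=path_a, tofile=path_b))
```

Every save costs a full copy in this sketch, which is exactly the kind of thing nobody cares about once the drive is big enough.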
That's where the future is, and for the next while, it's not doable with SSDs.
The interesting thing is that not one of your files fits within one sector (512 bytes), so, since linux does a very good job of making sure files stay contiguous, even a one-block chunk (4k) is going to result in a read of 8 contiguous sectors.
Now that swap is fast becoming obsolete (gotta love Wallyworld starting a price war with FutureSh*t by selling 6gigRam/640gigHD laptops for $498) it's going to be cheaper just to buy a few extra gigs of real ram and make a ram disk.
IOPS count when making random reads/writes, and the most often to occur is for swap - after all, it's catch as catch can - there's no way to predict it - and it's called "thrashing" for a reason.
On sequential reads, when you don't have to move from track to track, and the file is contiguous, there is no advantage to SSDs. Even Intel admits as much (which is why they emphasize random reads/writes).
But even if every 4k block is fragmented (you'd have to work hard to do that), you still end up with an 8-sector-at-a-time read for each 4k block, so you still end up with less of an advantage to SSDs than you'd think ...
Throw in a couple of huge (640gig or 750gig) hard drives, so the user NEVER has to delete anything, and all new files are written in one nice long data stripe, a couple of gigs for a ram disk (which is WAY faster than any SSD on the planet that is using non-volatile ram - flash memory cannot compete with system ram) and the balance tilts back to the "old tech."
Much as I don't exactly like Microsoft, they're finally getting it right - pre-loading the most used stripes off the drive into ram so they can be mapped into the right address space when the user wants them, instead of being loaded from disk. Write all that to one long data stripe for the next reboot, and your next boot has no random disk seeks. You just read that whole chunk into ram - almost like resuming after hibernating.
People want storage over speed. Jobs has it wrong. A computer with massive amounts of storage (so you never have to delete anything) is what people want. It's why so many like gmail - they never have to delete an email again.
SSDs aren't in that price range yet where most people can stuff a terabyte into a laptop.
But I have a better deal for you - instead of an SSD, why not invest the money in more real ram, and make a big ramdisk. That will be MUCH faster than any SSD on the planet, so you will get the same speed bump with a much smaller "disk" size.
Ever try running apps from a ramdisk? It's not fast - it's CRAZY fast.
So throw in an extra 2 gig to upgrade that $498 6 gig laptop to 8 gig of ram, devote 4 gig to a ramdisk, and go crazy. Just remember to copy the files to hd before shutting down :-) Faster, cheaper, quicker than an SSD - what's not to like?
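The "copy the files to hd" step is easy to script. A minimal sketch, assuming the ramdisk is just a directory tree you can read - `flush_ramdisk` and both paths in the comment are hypothetical, so adjust to wherever yours is mounted.

```python
import shutil
from pathlib import Path

def flush_ramdisk(ramdisk_dir, disk_dir):
    """Copy everything under ramdisk_dir onto persistent storage at disk_dir."""
    src, dst = Path(ramdisk_dir), Path(disk_dir)
    # dirs_exist_ok (Python 3.8+) lets a repeated flush overwrite the previous copy
    shutil.copytree(src, dst, dirs_exist_ok=True)
    # Number of files now sitting in the on-disk copy
    return sum(1 for p in dst.rglob("*") if p.is_file())

# Hypothetical paths: a ramdisk mounted at /mnt/ramdisk, saved under the home directory
# flush_ramdisk("/mnt/ramdisk", Path.home() / "ramdisk-backup")
```

Run it from whatever shutdown hook your system offers; the sketch does not delete files that were removed from the ramdisk since the last flush.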
Seriously, will you be that much more productive now that you have no time to grab a cup of coffee and say good morning to your co-workers? I doubt it.
And you could always help yourself by killing iTunes. Or at least make it run faster.
VirtualBox=free.
Have you read Oracle's licensing FAQ?
Didn't see that one coming, did you?
Look at those small files in /lib - they're symlinks to larger files. An attempt to read them results in reading the much larger file.
Now I understand that, to get everything to fit, and to keep things cheap, they can't have different keyboards for the larger models ... but that's a design flaw, the same as the "mighty mouse".
About 6-7 years ago, a laptop with a 50-60 gig hd and a half-gig of ram for $999 was considered a good deal. But 20 gigs of music and a dozen games would put a serious dent in it.
Around 3-4 years ago, a laptop with 200 to 250 gig hd and two gigs of ram for $800 was considered a good deal.
Today, $498 buys you a 17" laptop with a 640 gig hd, 6 gigs of ram, and a 1600x900 screen. They'll never delete anything again. When the drive gets full (it will), they'll just stuff a second one in the second drive bay.
When THAT gets full, quad cores with 12 gigs of ram, and 3 tb of disk space will be what, $399?
People can't manage large quantities of data. That's the real reason online email is popular - you never delete anything. It's why consumers want huge hard drives - they never have to worry about cleaning it up.
Jobs knows this - it's one reason iPads aren't upgradeable - you don't want to mess with your files? Fine - buy a new iPad with more features and a bigger storage capacity.
Of course, that backfires when the same consumer sees that, for the same $500 that bought an iPad with 16 gigs of space that you had to be careful not to fill up, you can buy a laptop with 40x the space ... and double its storage at a future date for under $100, no external drive required ...
Think of it - add a gig of data every day for 3 years without outgrowing it or deleting anything ... that's what consumers want. Brain-dead convenience. Organizing all that data? "That's the computer's job." And they're absolutely, 100%, right - it *is* the computer's job.
So someone trying to sell them on a 32gig SSD that costs more than a 640 gig hd ... they don't care if it takes a few seconds more to boot. They don't care if their word processor opens up a second or two quicker. They want storage. It's why gmail was such a success.
They also are getting older, so they want bigger displays. And they want others to be able to share those displays without having to crowd in. And their hands are getting older and maybe a bit arthritic, so they don't want the "one size fits all - if you're from Lilliput" keyboard on Apple laptops.
Start with a false assumption ... most modern file systems are smart enough to lay out data sequentially.
And most consumers I know want more disk space, not less. They don't want to bother ever having to clean up their drives. That's why on large drives, file fragmentation is not such an issue, even on dumb file systems (like the one you mentioned :-) Nobody ever deletes anything. Don't believe it? Go look at their desktop. It gets CRAZY! :-)
As for boot times, if you're only booting once a month (most people just suspend), who cares?
My office machines stay on 24/7 because one is also hosting the company wiki and a few test databases, and another is an iCrap that basically anyone can grab and use if they need to - it's used to hold a local copy of a bunch of files that we have backed up on a raid. Better you screw up the local copy :-)
My home desktop and my laptop get turned off, but this is for energy consciousness. Boot times there are also irrelevant - I turn them on, do a few things, then come back to them to check the weather, etc. They could take 10 minutes to boot, and it wouldn't make much of a difference.
So that's 70k average per read.
Quick way to tell - "ls -l", and see what your directory size is - it should be either 1k or 4k.
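If you'd rather ask the file system directly, something like this works on Linux and other Unix-likes. It is only a sketch; `block_size` and `directory_size` are names chosen here, and `os.statvfs` is not available on Windows.

```python
import os

def block_size(path="."):
    """Block size in bytes reported by statvfs for the file system holding path."""
    return os.statvfs(path).f_bsize

def directory_size(path="."):
    """Size in bytes that the directory entry itself reports."""
    return os.stat(path).st_size

print(block_size("/"), directory_size("/"))
```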
And I already have two monitors plugged in ... :-)
I have no problems finding uses for 3 computers when working. BTW - even Google supplies their devs with multiple boxes.
You'd think that for the price, Apple wouldn't be so cheap as to use the smaller keyboard on their biggest model. Then again, Apple isn't about functionality. We saw that with the Mighty Mouse, the Hockey Puck Mouse, AntennaGate, the iMac keyboards - one of the guys at work was complaining about the poor design of his current ipod - it's like nobody actually tests these things in the real world, just in the Steve Jobs Multiverse with Non-Optional Reality Distortion Field.
You are forgetting:
- Rotational latency
- Directory access
- File fragmentation
- Swapping
Nope.
If you're still using swap, it's time you got a new computer. Swap is like disco - it's dead. And you don't need ATIME, so use NOATIME when mounting the fs.
File fragmentation - get a modern OS. Even a fragmented file will have large chunks that are laid out sequentially, so even in those scenarios, sequential read/write speed is more important than raw IOPS.
Directory entries are cached.
Rotational latency is almost non-existent - and IS non-existent for sequential reads. I think you meant track-to-track latency, which is also greatly reduced by large drives with large data stripes, big hardware caches, and splitting your data and apps among multiple drives.
Try this:
drive 0: /
drive 1: /srv
drive 2: /var
drive 3: /home
Note - these are drives, not partitions. Any read on /home doesn't cause a seek away from the open log file on /var, or pollute that drive's cache.
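A rough way to check whether your own mount points really are separated, sketched in Python. The function names are invented for this example, and there is a big caveat: `st_dev` only tells you that two paths are on different file systems. Two partitions on the same physical drive will also show up as different, so you still have to confirm by hand that they are separate drives.

```python
import os
from collections import defaultdict

def group_by_device(paths):
    """Group paths by the st_dev of the file system each one lives on."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.stat(path).st_dev].append(path)
    return dict(groups)

def shared_devices(paths):
    """Return the groups of paths that sit on the same file system."""
    return [group for group in group_by_device(paths).values() if len(group) > 1]

candidates = [p for p in ("/", "/srv", "/var", "/home") if os.path.exists(p)]
print(shared_devices(candidates))  # an empty list means no two of them share a file system
```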
Now throw in multiple cores ...
Your single SSD is now an (expensive) bottleneck in comparison, and a much smaller capacity.
The rust buckets are FAR from dead. Not when you can buy 30x the storage, with real-world performance that is just as good if not better, for the same price.
The cheapest ways to give an older machine new life are the same as always - more ram, bigger, faster drives, and a better video card. While an individual SSD will be faster, it will be much smaller for the same price, and not much faster, if at all, when you get into 4 or more drives.
Or do like I did - go look in /lib, where most of your programs actually live. The only files at 4k or under are symlinks and directory entries.
Or go look in someone's document directory. Ever see a word doc under 4k?
Ditto their music ...
In other words, for all the stuff people actually DO with computers, the speed of sequential reads and writes of large files is the most important factor, not IOPS.
And now that we no longer need swap files thanks to cheap ram, random IOPS is even less important. Maybe it's time for Intel and AMD to offer cpus that save some complexity and juice by no longer supporting physical swap.
No operating system in the last 2 decades has had an average file size of under 4k.
And then we have things like Word docs ... ever see one of those under 4k? Even an empty one? Your scenario never happens in the real world, where sustained throughput is more important.
BTW - the main advantage of high IOPS is for swapping - if you're still running a swap file, you need to upgrade. Even a cheap $498 laptop comes with 6 gigs of ram standard.
Intel's own stats only claim that SSDs are 4x the speed of hds in actual use - and only 2x the speed of a 15krpm drive. You can increase the effective speed of an array of HDs by removing most of the need to seek from track to track.
drive 0: /
drive 1: /srv
drive 2: /home
drive 3: /var
No swap, and remember that NOATIME is your friend.
Each drive has its own 32meg cache, and since there's no cache pollution from reads on other drives, is more likely to get a hit. Also, the OS and drive both implement read-ahead as well as the elevator algorithm for head movement, resulting in further improvements in overall read and write speeds.
So even the 4-disk "ordinary" drive setup of today can equal the SSD - and 4 raptors will eat it for lunch.
Think of it - on your single SSD, copying a large file from one directory to another in /home kills it. On this setup, the other 3 drives are totally unaffected. A process that needs to load something from /lib doesn't have to share I/O with your file copy. Neither does the process writing to /var/log. And none of them care that someone is downloading from /srv/ftp.
Throw in a multi-core setup, and you can easily see the advantages over the bottleneck a single SSD has.
And then there's the price. For 1 TB, SSDs are 30x the price. No thanks.
So now, instead of IOPS, your primary goal is sustained throughput. A 4-drive setup gives you the same read/write throughput as an Intel X25 SSD (which claims 4x the throughput of a regular hd, and 2x the throughput of a fast hd, so the math is really simple), but much more bang for the buck.
Copying a large file in home no longer affects /srv or /var - and remember, each of the drives has a 32meg hdd cache. Combine that with look-ahead, elevator algorithm head movement, NOATIME, and you have a system where, unlike the single SSD, copying a file in /home to another directory has ZERO effect on the performance of the other drives.
This is great for web servers, because writes to the log file no longer generate much head movement, and reads to serve up data no longer move the heads away from the log file. Throw in that now, each drive is also much more likely to score a cache hit on its particular data than would happen with one big 128meg hd cache, and it's not just a serious competitor to SSDs - if you need faster performance, you can beat that single SSD by a factor of two (Intel's numbers) by switching to 15k rpm HDs.
My point isn't that SSDs are bad - who wouldn't want one - IF ...? They have their advantages - but real-life performance on a dollar-for-dollar basis (or even 10-to-one basis) isn't there when you get near a terabyte, and even those $498 laptops and desktops in this weekend's flier have 640gig hds and 6 gigs of ram.
IF they were comparable in price.
IF they were comparable in capacity.
THEN I'd use them. They're neither, so I don't, and I don't see anyone around me making the switch either. Not when each new machine is already so much faster than our previous one. "Good Enough Computing" - I'll spend the savings elsewhere.
-- Barbie
Everything benefits, from compile times to web serving - the drive is no longer the bottleneck - it's the rest of the system. So you end up with much larger capacities, and a compile or file copy in your /home drive/directory doesn't affect the web server on /srv, or the writing of the logs to /var/log/apache2/
The benefit of SSDs being able to handle task swapping for virtual memory is now pretty much moot - $500 gets you a system with 6 gigs of ddr3. Remove your swap or page file - you don't need it, you don't need the overhead of managing it, and you won't see any disk thrashing - if Firefox or OpenOffice or Opera become too much of a hog, they should be killed off and restarted anyway.
So, given the choice between a single SSD and 4 much larger rustbuckets, I pick the rustbuckets - the combined bandwidth exceeds the single SSD, much higher capacity for less money, when I buy bigger ones, the older ones become backups or can be gifted to someone else, and the performance and flexibility of a multi-drive setup is nice.
And yes, hard drives ARE getting faster - combine 4 drives with 32mb disk cache each, a hardware-implemented elevator algorithm, more data per track (so fewer seeks), and today's drives outperform previous drives by a wide margin.
Compatibility testing, backing up to the second machine, running my own svn/ftp/http/ssh servers (separate from the ones I run on my main machine), keeping personal stuff (email, etc) isolated on one machine, business email on another, these are all valid reasons to have a second computer (or in my case, since nobody else at the office wants to use the iCrap, a third). Having access to an extra machine so I can do a quick fix on a problem, rather than disturb the workflow on my current box, is also convenient.
My important points about 17" laptops, and ones everyone seems to have overlooked, are that
Yes, they weigh a bit more. The extra weight is irrelevant when it's sitting on a desk, which is where I work. I have yet to see anyone spend much time with a laptop actually on their lap, unless it was off/hibernating/sleeping/broken ... :-)
-- Barbie
Are you aware that all the Apple laptops (even the 11.6" MacBook Air) use the same size keyboard?
That is so sad. A keyboard without a number pad and full-size keys is not a very convenient keyboard. That's one of the reasons I like 17" laptops - the others being the bigger display and the room for a second HD.
I want it for work - not as a fashion accessory. If I want help generating typos, I'll get my dog to lick the keyboard - I don't need to spend several times the price for an undersized one.
my home directory, the average file size is 19,065,740 bytes
Given that you said "average", I presume you mean mean, which is not a good indicator of the most frequently present file. Median would be better, if you want to say "50% of my files are under this size".
No - I said average because I meant average.
Pretty much the only files under 4k in size were symlinks - and they point to much bigger files. The argument in favour of SSDs fails because pretty much any file access, even to a symlink, will end up reading one of those larger files, so sequential read speed definitely becomes a factor in almost every file access.
Just going to multiple hard drives (forget raid5 - that sucks the speed out of a system in comparison to either separate or mirrored drives) will give you all the speed advantages of SSDs, and a lot more storage space, for a lot less money.
So it's irrelevant if the median is 195,000 bytes, and the average is 250,000 bytes - they're BOTH well outside the original poster's claim that the OS uses mostly files under 4k. That has never been true, except perhaps under Microware's OS9 (not the same product at all as the much later Mac OS 9), where modules could be made in a few hundred bytes of position-independent code, and the whole system could fit on two single-sided 160k floppies - quite an achievement for a multi-tasking OS that supported multiple text and graphical terminals.
Most people's home directories are exceptions to this - configuration files tend to be smaller, and so are startup scripts - but the code they call to run is much larger - and it has to be loaded into memory to run, which means it has to be read off the disk - so that 100-byte script can still end up resulting in several megs of disk activity.
It's just that it happens so fast, that people forget all that other activity.
SSDs are currently 30x the price per gig of storage when you get into the terabyte range.
Now tell me, are you going to be more productive with 4 crappy laptops or 1 really good laptop?
There are plenty of use cases where multiple computers not only beat a single one - especially if you're a developer, it's pretty much mandatory. So yes, 4 $500 laptops (especially since 17", 4 gigs of ram, etc., is "good enough") are WAY more productive than even the best, most pimped out, mac air. Besides, they don't make a 17" mac air, and I won't work without a full-sized keyboard and a decent-sized screen - 17" is the minimum, even on a laptop.
I see too many people futzing around with smaller screens, and they don't realize how hunched-over they are, trying to see the details on a smaller screen. Extended use of a smaller screen is a health hazard.
-- Barbie
Loaded the results into a spreadsheet.
my home directory, the average file size is 19,065,740 bytes (there's always a few tarballs sitting there waiting to be filed away ...)
/usr/bin - 196,506.23 bytes average file size - what good is an OS without programs?
/usr/lib - 173,865.68 bytes average - how can you run without libraries?
...
my download directory - 475,798,512 bytes average (there's a few linux dvd isos in there)
There are very few places on my system that would have an average size under 1k, unless you want to count symlinks - but they end up resolving to much larger files, so you end up with the same end result - sequential read speed is important, and especially so for large files (there's not much fragmentation on my system since I have LOTS of space).
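If you want to run the same kind of numbers on your own box without a spreadsheet, here is one way to do it in Python. This is a sketch and not the method used for the figures above; `file_sizes` and `summarize` are names made up here. It skips symlinks instead of following them, and reports both the mean and the median so nobody has to argue about which one was meant.

```python
import os
import statistics

def file_sizes(root):
    """Sizes in bytes of the files under root; symlinks are skipped, not followed."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue
            try:
                sizes.append(os.path.getsize(full))
            except OSError:
                pass  # unreadable, or vanished while we were walking
    return sizes

def summarize(root):
    """Return (count, mean, median) of the file sizes under root."""
    sizes = file_sizes(root)
    if not sizes:
        return 0, 0.0, 0.0
    return len(sizes), statistics.mean(sizes), statistics.median(sizes)

# e.g. summarize("/usr/lib") or summarize(os.path.expanduser("~"))
```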