Slashdot Mirror


Optimizing Linux Systems For Solid State Disks

tytso writes "I've recently started exploring ways of configuring Solid State Disks (SSDs) so they work most efficiently in Linux, in particular Intel's new 80GB X25-M, which has fallen to a street price of around $400 and is thus within my toy budget. It turns out that the Linux storage stack isn't set up well to align partitions and filesystems for use with SSDs, RAID systems, and 4k sector disks. There is also some interesting configuration and tuning that we need to do to avoid potential fragmentation problems with the current generation of Intel SSDs. I've figured out ways of addressing some of these issues, but it's clear that more work is needed before mere mortals can easily and efficiently use next-generation storage devices with Linux."

207 comments

  1. Still too expensive... by Anonymous Coward · · Score: 0

    I really don't care about performance, when they retail for $400. Talk to me when I can get an 80GB one for under $50.

    1. Re:Still too expensive... by von_rick · · Score: 1

      The more people buy it the sooner it will get under $50. But considering the recent financial conditions, people would rather let others buy the SSD so that they can get it for $50 in August 2010. I'm afraid this time it's gonna take longer than that to see a tenfold reduction in storage device costs.

      --

      Face your daemons!

    2. Re:Still too expensive... by the_humeister · · Score: 1

      I've considered getting a large capacity CF card (16 GB or 32 GB) to use as a solid state drive for my laptop. The CF + adapter combination is a lot cheaper than these new SSDs. So why should I get a SSD vs. a CF card?

    3. Re:Still too expensive... by NekoXP · · Score: 4, Informative

      > So why should I get a SSD vs. a CF card?

      10 times better performance and wear-leveling worth a crap.

    4. Re:Still too expensive... by tinkerghost · · Score: 2, Informative

      So why should I get a SSD vs. a CF card?

      Your CF card is going to use the USB interface which maxes out at about 40Mbps as opposed to using an internal SSD's SATAII interface which maxes at 300Mbps. Not quite an order of magnitude, but close.

      On the other hand, if you're going to use an external SSD connected to the USB port, then you wouldn't see any difference between the 2 in terms of speed. Lifespan might be longer w/ the SSD due to better wear leveling, but in either case you're probably going to lose or break it before you get to the fail point.

    5. Re:Still too expensive... by Anonymous Coward · · Score: 5, Informative

      A real SSD has several advantages over using CF cards, but not for the reasons you state.

      With a simple plug adapter, CF cards can be connected to an IDE interface, so speeds won't be limited by interface speed. The most recent revision of the CF spec adds support for IDE Ultra DMA 133 (133 MB/s)

      A couple of additional points, just because I love nitpicking:
      - A USB 2.0 mass storage device has a practical maximum speed of around 25 MB/s, not 40 Mb/s.
      - The so-called SATA II interface (that name is actually incorrect and is not sanctioned by the standardization body) has a maximum speed of 300 MB/s, not Mb/s.
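
To make the Mb/s versus MB/s nitpick concrete, here is a minimal Python sketch of the conversion. The function names and the `bits_per_byte` parameter are made up for illustration; the 10-bits-per-byte case assumes a link that uses 8b/10b encoding, as SATA does, and the 480 Mb/s figure is the USB 2.0 High-Speed signalling rate rather than anything measured.

```python
# Convert between megabits per second (Mb/s) and megabytes per second (MB/s).
# bits_per_byte is an illustrative parameter: 8 for a plain byte count,
# 10 for a link that spends 10 line bits per payload byte (8b/10b encoding).

def mbit_to_mbyte(mbit_per_s, bits_per_byte=8):
    """Return MB/s for a given Mb/s rate."""
    return mbit_per_s / bits_per_byte


def mbyte_to_mbit(mbyte_per_s, bits_per_byte=8):
    """Return Mb/s for a given MB/s rate."""
    return mbyte_per_s * bits_per_byte


usb2_raw = mbit_to_mbyte(480)            # raw signalling rate, before protocol overhead
sata_payload = mbit_to_mbyte(3000, 10)   # 3 Gb/s line rate with 8b/10b encoding
udma133_mbit = mbyte_to_mbit(133)        # Ultra DMA 133 expressed in Mb/s

print(usb2_raw, sata_payload, udma133_mbit)
```

The raw USB figure is only an upper bound; the practical maximum of around 25 MB/s quoted above is what remains after protocol overhead.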

    6. Re:Still too expensive... by couchslug · · Score: 2, Interesting

      If it's an older laptop or the mechanical hard disk died, go for it. Addonics make SATA CF adapters so you are not restricted to IDE CF adapters.

      --
      "This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
    7. Re:Still too expensive... by karnal · · Score: 2, Informative

      Why is this informative? CF with an adapter is NOT USB.

      From my experience, using an adapter puts it on the native interface - notably, with CF, it's easiest to put the device into a machine that has a native IDE (not SATA) interface. CF is pin compatible with IDE.

      Now, in the current offering of SLC/MLC "drives" you can actually get better read/write since they "raid" for lack of a better term the internal chips. I'm using a transcend ATA-4 CF device that gets around 30MB/sec read/write in a machine in my garage; it's an SLC device that isn't their top of the line, but it was more cost-effective.

      So, using the IDE/ATA-4 interface on the CF card, it gets lower CPU utilization than a USB device. Still doesn't hit the 40MB/sec you quoted, but 40MB/sec is a pipe dream on USB in my experience.

      --
      Karnal
    8. Re:Still too expensive... by jimmyhat3939 · · Score: 1

      No doubt. But, I really think that within 5 years you're going to see most laptops using only an SSD.

      --
      Free Conference Call -- No Spam, High Quality
    9. Re:Still too expensive... by Dr.+Ion · · Score: 2, Informative

      Your CF card is going to use the USB interface

      This is Informative?

      CF cards are actually IDE devices. The adapters that plug CF into your IDE bus are just passive wiring.. no protocol adapter needed.

      It's trivial to replace a laptop drive with a modern high-density CF card, and sometimes a great thing to do.

      The highest-performance CF cards today use UDMA for even higher bandwidth.

      HighSpeed USB can't reasonably get over 25MB/sec from the cards using a USB-CF adapter, but you can do better by using its native bus.

    10. Re:Still too expensive... by KingMotley · · Score: 1

      Except that the latest gen SSDs exceed 250MB/sec throughput. If the latest CF spec just added 133MB/sec, then that would be a huge bottleneck in throughput.

    11. Re:Still too expensive... by pla · · Score: 2, Informative

      So why should I get a SSD vs. a CF card?

      CF works passably in WORM-like scenarios, where you basically use it in read-only mode and update it rarely and in big chunks. For random R/W access, CF lacks wear leveling to give it a tolerable life expectancy... Thus you commonly see it used in embedded devices such as routers and dumbterms where you may update the firmware or OS every few months; You don't see it used much in real, live writable FSs.

      It also tends to have rather poor performance, with reads in the sub-5MB/s range and writes taking forever. So again, using a 32MB CF to boot a router, works great; Using a 32GB CF as the system partition for a modern desktop PC (even with some solution to the limited erase lifetime, such as a UnionFS against a ramdisk with commit-on-shutdown), you can expect 10+ minute boot times.

    12. Re:Still too expensive... by Anonymous Coward · · Score: 0

      Your CF card is going to use the USB interface which maxes out at about 40Mbps as opposed to using an internal SSD's SATAII interface which maxes at 300Mbps. Not quite an order of magnitude, but close.

      Actually, that *is* an order of magnitude difference!

    13. Re:Still too expensive... by Mattsson · · Score: 2, Informative

      Your CF card is going to use the USB interface which maxes out at about 40Mbps as opposed to using an internal SSD's SATAII interface which maxes at 300Mbps. Not quite an order of magnitude, but close.

      There are three factual errors in that statement.
      1. CF-cards can be connected directly to the ATA-port via a simple passive connector-adapter and therefore have a theoretical maximum transfer speed of 133MB/s, which translates to roughly 1064Mbps. There are even adapters with room for both a master and slave CF-card in the same shape, size and connector position as a 2.5" ATA drive, specifically made to use CF-cards in laptops.
      2. USB is 480Mbps.
      3. SATA is 3000Mbps

      The big speed-difference between SSD and CF is due to the construction of the devices themselves, not the interface that connects them to the computer.
      A fast CF-card can get you around 40MB/s and at the moment they also top out at 32GB sizes and they're not made to handle long term random write operations.
      A fast SSD can get you all the way to the theoretical maximum of SATA, around 300MB/s, and is available in much bigger sizes.

      --
      /.Mattsson - My native language is not English, so please don't whine over linguistic errors. (That's lame anyway...)
    14. Re:Still too expensive... by drinkypoo · · Score: 2, Insightful

      The modern hot-shit high-speed CF cards have wear leveling and do UDMA transfers, you get a CF to ATA adapter, not CF to USB, and they will outperform most hard disks.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    15. Re:Still too expensive... by DrSkwid · · Score: 1

      You can also get CFSATA adapters. I have a couple here

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    16. Re:Still too expensive... by karnal · · Score: 1

      Yea, I just wanted to stress the fact that it's not USB more than anything; haven't tested personally the CF-->SATA bridges. They work well?

      --
      Karnal
    17. Re:Still too expensive... by cheater512 · · Score: 1

      It's not the volume of supply which is causing the high prices.
      They are inherently expensive to make with today's methods.

    18. Re:Still too expensive... by cheater512 · · Score: 0, Redundant

      Erm he's talking about a CF to IDE adaptor not a CF to USB adaptor.

      CompactFlash uses essentially an IDE pinout.

    19. Re:Still too expensive... by DrSkwid · · Score: 1

      The benchmark comes down to the individual CF card, so works well = works. I had them as a software striped RAID for a while to make write speed > 25Mb/s (firewire dv video speed) and achieved that. Sadly my firewire camera uses the protocol that Linux doesn't so that project is on the shelf.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    20. Re:Still too expensive... by karnal · · Score: 1

      Well I've noticed that I don't have a CF-IDE adapter that does 3.3v, so I'm being limited to ATA-4 instead of ATA-5 on my current card. Have you noticed/read anything similar?

      --
      Karnal
  2. Mere mortals need mroe toy budget by wjh31 · · Score: 4, Insightful

    I think the bigger challenge will be in getting mere mortals to have a $400 toy budget to afford the SSD

    1. Re:Mere mortals need mroe toy budget by KibibyteBrain · · Score: 2, Insightful

      Well, they will obviously go down in price eventually. The real price issue won't be affordability but rather value. Do most consumers out there really want what would seem to average out to a slightly faster drive, or an order of magnitude or two more storage? There have always been fast drive solutions in the past and they have never been very popular, and have quickly become obsolete. Eventually some sort of SSD will take over the market, but I don't believe this sort of compromised experience business model will sell them, unless cloud storage and internet everywhere become mainstream fast.

    2. Re:Mere mortals need mroe toy budget by Average · · Score: 1

      Sure. There are *lots* of considerations beyond speed to want SSDs.

      First is battery life. Batteries suck. Laptops pulling 5 or 6 watts total make that suck more bearable. SSDs are part of that.

      There's also noise. Hard drives have gotten much quieter. But in a dead-silent conference room, I want dead-silence.

      Even form-factor is an issue. A 2.5" cylinder is a notable chunk of a small notebook. 1.8" drives are, generally, quite slow. SSDs can be worked into the design.

    3. Re:Mere mortals need mroe toy budget by piripiri · · Score: 3, Informative

      Sure. There are *lots* of considerations beyond speed to want SSDs

      And SSD drives are also shock-resistant.

    4. Re:Mere mortals need mroe toy budget by amclay · · Score: 1

      Obviously he needs to overclock his SSD. That would be epic.

      --
      It's all fun and games till someone divides by 0. Then it's hilarious.
    5. Re:Mere mortals need mroe toy budget by berend+botje · · Score: 1

      For dead-silence you might be better off with getting a LED backlight. In my laptop I can't hear the hard drive over the whine of the backlight converter.

    6. Re:Mere mortals need mroe toy budget by Anonymous Coward · · Score: 0

      yes

    7. Re:Mere mortals need mroe toy budget by Anonymous Coward · · Score: 0

      News Flash: your backlight converter is busted.

    8. Re:Mere mortals need mroe toy budget by Anonymous Coward · · Score: 2, Interesting

      As other components become less noisy, the "solid state" electronics' acoustic noise becomes audible. It isn't necessarily faulty electronics, just badly designed with no consideration for vibrations due to electromagnetic fields changing at audible frequencies. These fields subtly move components and this movement causes the acoustic noise. Most often it is a power supply or regulation unit which causes high pitched noises. Old tube TV sets often emit noise at the line frequency of the TV signal (ca. 15.6kHz for PAL, ca. 15.7kHz for NTSC).

    9. Re:Mere mortals need mroe toy budget by bcrowell · · Score: 1

      Sure. There are *lots* of considerations beyond speed to want SSDs.

      Another example: I have a tiny NSLU2 network appliance that I use as a music server. In the out-of-the-box configuration, it runs Linux from a ROM, but you can add an external drive via a USB cable and boot Linux off of that. It doesn't have SATA, so that wasn't an option.

      I'm not sure why this guy paid $400 for an 80 GB SSD. I just upgraded my music server to a 64 GB SSD, and it only cost $100. Maybe the one he got is a fancier, faster drive?

      For my application, none of these filesystem performance things are really an issue at all. I'm almost always reading, almost never writing, and the bottleneck for speed, when there is one, is always the ARM CPU.

      It's great if people enjoy tinkering with the latest technology, but the impression I get is that this just isn't the right time to be switching your desktop machine to SSD. Price per GB is going down rapidly, but is still very poor compared to platters. Performance with SSD technology is potentially much better than with platters, but it will probably be a few more years until (a) operating systems are optimized for SSDs, and (b) all the drives on the market are really optimized for performance the way they should be.

    10. Re:Mere mortals need mroe toy budget by berend+botje · · Score: 1

      Thanks for the info. However, it seems most converters are busted, as I can hear them on quite a lot of laptops or tft screens.

      I don't mean to frighten you, but perhaps you should have your ears checked next time you get a physical. If you've spent considerable time around heavy machinery or loud music, it might be you have lost the ability to hear high pitched sounds. As this goes gradually, it isn't generally noticed.

      Really, get it checked out and (when applicable) change your habits regarding exposure to loud sounds.

    11. Re:Mere mortals need mroe toy budget by Hatta · · Score: 1

      You can buy a 32GB SSD for less than $100 today. Is that within the budget of mere mortals?

      --
      Give me Classic Slashdot or give me death!
    12. Re:Mere mortals need mroe toy budget by andreyvul · · Score: 1

      By old, you mean every single CRT TV.
      I've heard the 16kHz whine whenever I mute the sound.
      CRT monitors are exempt because VGA line frequency is > 22 kHz.

      --
      proud caffeine whore
    13. Re:Mere mortals need mroe toy budget by rcw-home · · Score: 1

      I'm not sure why this guy paid $400 for an 80 GB SSD. I just upgraded my music server to a 64 GB SSD, and it only cost $100. Maybe the one he got is a fancier, faster drive?

      Price/GB for SSDs seems to be largely proportional to the number of write operations per second the SSD can handle. Once a handful of manufacturers solve that particular puzzle, I expect prices will drop significantly.

    14. Re:Mere mortals need mroe toy budget by Anonymous Coward · · Score: 1, Insightful

      Well maybe you should check who the story submitter is.
      If he doesn't "have the time to optimize it", we're in deep trouble :-)

    15. Re:Mere mortals need mroe toy budget by beaviz · · Score: 1

      Sure. There are *lots* of considerations beyond speed to want SSDs

      And SSD drives are also shock-resistant.

      But... Are they resistant to shouting?

    16. Re:Mere mortals need mroe toy budget by Anonymous Coward · · Score: 0

      I think the bigger challenge will be in getting mere mortals to have a $400 toy budget to afford the SSD

      Agreed, my whole last computer (with monitor and printer included) cost around $400.

    17. Re:Mere mortals need mroe toy budget by gmuslera · · Score: 1

      Not all SSDs are equal. Why should you pay US$400 for the Intel X25-M if you can get another for under US$100? Check this AnandTech review, which spent a lot of time bashing JMicron JMF602-based SSDs.

    18. Re:Mere mortals need mroe toy budget by DrSkwid · · Score: 1

      I can't hear over 9114Hz you insensitive clod!

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    19. Re:Mere mortals need mroe toy budget by DrSkwid · · Score: 1

      I was born 2 weeks prem, consequently I can't hear over 9114Hz. Which I didn't find out until I was working in a music studio and the other people started shouting at me to turn that feedback off. "What feedback?" was all I could say and turned the amps off.

      And that was the end of that chapter.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    20. Re:Mere mortals need mroe toy budget by DrSkwid · · Score: 1

      Seeing as he has a bit of money he could afford someone to teach him the difference between then and than.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    21. Re:Mere mortals need mroe toy budget by MooUK · · Score: 2, Funny

      Surely, if you can't hear over 9kHz, that makes you the insensitive one?

    22. Re:Mere mortals need mroe toy budget by MooUK · · Score: 1

      You weren't running the karaoke night at a pub I was in the other night, were you?

    23. Re:Mere mortals need mroe toy budget by iminplaya · · Score: 1

      CRT monitors are exempt because VGA line frequency is > 22 kHz.

      Well, that explains all the dogs howling and loitering around outside the house.

      --
      What?
    24. Re:Mere mortals need mroe toy budget by Anonymous Coward · · Score: 0

      So, feel like an idiot yet?

    25. Re:Mere mortals need mroe toy budget by WuphonsReach · · Score: 1

      MLC drives (with only a few tens of thousands of rewrite cycles) are cheap. Most thumbdrives are MLC, and all of the inexpensive SSDs are MLC.

      SLC drives (with hundreds of thousands or millions of rewrite cycles) are expensive. These are basically the server-quality hardware (like IDE vs SCSI used to be).

      That's the big difference in pricing. The other difference is the quality of the internal drive logic.

      --
      Wolde you bothe eate your cake, and have your cake?
  3. Hyperinflation to the rescue by Anonymous Coward · · Score: 1, Funny

    Your government is working towards it.

    1. Re:Hyperinflation to the rescue by Anonymous Coward · · Score: 0

      Same as saying "Not A Chance In Hell", but I do hope it does get cheaper though.

  4. Agreed .. But equally important is ... by Anonymous Coward · · Score: 1, Interesting

    Yes, we do need progress in that area. However, for many of us who require better-than-average data security, the matter of SSD's read/write behaviour makes the devices extremely vulnerable to analyses and discovery of data the owner/author of which believes to be inaccessible to others: 'secure wiping', or lack thereof, is the issue. As I understand it, 'secure wiping' programs fail to do their job on SSDs. It's been reported among 'criminals' that SSDs are a 'forensic analyst's dream come true', and so it must be for corporate spies, etc., who have a yen for theft of private data.

    1. Re:Agreed .. But equally important is ... by ultrabot · · Score: 2, Informative

      However, for many of us who require better-than-average data security, the matter of SSD's read/write behaviour makes the devices extremely vulnerable to analyses and discovery of data the owner/author of which believes to be inaccessible to others: 'secure wiping', or lack thereof, is the issue.

      Obviously you should be encrypting your sensitive data.

      Also, it should be no problem to write a bootable cd/usb that does a complete wipe. Just write over the whole disk, erase, repeat. No wear leveling will get around that.

      --
      Save your wrists today - switch to Dvorak
    2. Re:Agreed .. But equally important is ... by Antique+Geekmeister · · Score: 3, Insightful

      Such tools already exist. Even the venerable "dd if=/dev/zero of=/dev/sda" is extremely efficient at flushing a drive well beyond the ability of any but the most well-equipped recovery services, and it's a lot faster than the "overwrite with zeroes, then ones, then 101010..., then 010101..., then random data" approach used by some people with too much time on their hands and too much paranoia for casual data.

    3. Re:Agreed .. But equally important is ... by Kjella · · Score: 1, Informative

      Also, it should be no problem to write a bootable cd/usb that does a complete wipe. Just write over the whole disk, erase, repeat. No wear leveling will get around that.

      At least for OCZ drives, the user capacity is several gigs lower than the raw capacity, like 120GB versus 128GB. I don't know about your data, but quite a lot can be left in those 8GB. The only real solution is to not let sensitive data touch the disk unencrypted.

      --
      Live today, because you never know what tomorrow brings
    4. Re:Agreed .. But equally important is ... by raynet · · Score: 3, Informative

      Unfortunately flash SSDs usually have some percentage of sectors you cannot directly access; these are used for wear leveling and bad sector remapping. So when you dd with /dev/zero, it is quite possible that some part of the original data is left intact. And there can be quite a lot of those sectors: I recall reading about one SSD that had 32GiB of flash in it but had 32GB available for the user, so 2250MiB was used for wear leveling and bad sectors (it helps to get better yields if you can have several bad 512KiB cells).
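
The 2250MiB figure follows directly from the gap between binary and decimal gigabytes. A small Python sketch of that arithmetic, using the 32GiB raw / 32GB user numbers from the comment above (the function name is hypothetical, and real drives may reserve more or less than this):

```python
# Spare area hidden from the user when raw flash is sized in GiB (2**30 bytes)
# but the advertised capacity is in GB (10**9 bytes).

GIB = 2**30
MIB = 2**20
GB = 10**9


def spare_area_mib(raw_gib, user_gb):
    """Return the flash not addressable by the user, in MiB."""
    return (raw_gib * GIB - user_gb * GB) / MIB


spare = spare_area_mib(32, 32)                       # example drive from the comment
fraction = (32 * GIB - 32 * GB) / (32 * GIB)         # share of raw flash held back

print(f"{spare:.0f} MiB spare, {fraction:.1%} of raw flash")
```

Any data sitting in that reserved region is out of reach of a plain overwrite of the visible block device.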

      --
      - Raynet --> .
    5. Re:Agreed .. But equally important is ... by WNight · · Score: 1

      Agreed. And not just SSDs. Regular HDs remap sectors if they think they're failing. But usually they do so without you noticing a failure, which means that an almost perfectly readable copy of that sector has simply been remapped. No amount of overwriting will ever hit that sector because the drive is sure it's doing you a favor.

      The info is still there, just a few debug commands away.

    6. Re:Agreed .. But equally important is ... by WNight · · Score: 1

      Yes, dd, especially with random data, is pretty much as secure as any commercial product. But they all fail to touch the hidden blocks the drive has remapped because of potential failure.

    7. Re:Agreed .. But equally important is ... by RiotingPacifist · · Score: 1

      then nuke the disk from orbit, this approach is the only way to be sure.

      --
      IranAir Flight 655 never forget!
    8. Re:Agreed .. But equally important is ... by jo42 · · Score: 1

      nuke from orbit

      Problem with that is you might miss. Better to strap the disk right to the nuke and run far away very fast.

    9. Re:Agreed .. But equally important is ... by couchslug · · Score: 1

      " 'secure wiping', or lack thereof, is the issue. "

      The desire to wipe with software instead of the trivial amount of effort to physically smash and/or incinerate the media is the issue.

      Compared to important data, media costs are trivial. Wipe media by destroying it thoroughly and you won't have to wonder about forensic recovery. Drive shredders and the like are spiffy, but a few dollars worth of common hand tools can destroy any drive.

      --
      "This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
  5. No Money, Mo Problems by flakblas · · Score: 1

    I know right? Send some cheddar my way Mr. Gates.

    1. Re:No Money, Mo Problems by larry+bagina · · Score: 3, Funny
      No worries. Once Barack Obama(1) pays for your house and car, he'll pay off your credit card bills.
      1. future generations of Americans
      --
      Do you even lift?

      These aren't the 'roids you're looking for.

    2. Re:No Money, Mo Problems by flakblas · · Score: 1

      Nice.

    3. Re:No Money, Mo Problems by Anonymous Coward · · Score: 0

      It is. From the way Americans spend money they don't have to buy things they want to talk about owning around the water cooler the President is doing exactly what Americans want him to do.

      Of course, the Republicans will shriek, but just what is their platform:

      1. Fiscal Responsibility (historical). You've got to be kidding.

      2. Freedom from overly intrusive government (historical). You've got to be kidding (and, yes, I do know we're "at war".)

      3. Put in place a cadre of puppet sycophants that will establish American government as an Evangelical Theocracy. Bingo.

      Of course, the detractors will trot out a chorus line of sad families in truly desperate straits through no fault of their own. But as much as people might not think so, these cases are the minority. But why should the Republicans suddenly care about collateral damage? They never have before; here or in other countries.

      The President is doing anything he can think of to kick start the economy from the ground up. The stated alternative of 'companies should pay no taxes' and 'banks can falsify assets' is what put us here, aided and abetted by a stupid, selfish populace. Reaganomics works as well as Communism, it's a good thing they are both going to die. America is richer than the ol' CCCP so Reaganomics is taking longer to expire. Probably needs a head-shot.

      Please try and remember that Obama is well right of the Welfare State the Repubs are screaming about. (I do wish he'd get rid of that bitch Pelosi though. She's bi-partisanship's Typhoid Annie.)

  6. Re:Mere mortals need more toy budget by Carrion+Creeper · · Score: 1

    I for one hope he is successful so that when SSDs become more affordable, or even the default, Linux will be nicely optimized.

  7. Is it only linux? by jmors · · Score: 4, Interesting

    This article makes me wonder if any OS is really properly optimized for SSDs. Has there been any analysis as to whether or not Windows machines properly optimize the use of solid state disks? Perhaps the problem goes beyond just Linux?

    --
    The Matrix is real... but I'm only visiting!
    1. Re:Is it only linux? by Jurily · · Score: 2, Informative

      unfortunately the default 255 heads and 63 sectors is hard coded in many places in the kernel, in the SCSI stack, and in various partitioning programs; so fixing this will require changes in many places.

      Looks like someone broke the SPOT rule.

      As for other OSes:

      Vista has already started working around this problem, since it uses a default partitioning geometry of 240 heads and 63 sectors/track. This results in a cylinder boundary which is divisible by 8, and so the partitions (with the exception of the first, which is still misaligned unless you play some additional tricks) are 4k aligned.
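
The divisible-by-8 claim in the quoted passage can be checked with a few lines of Python. This is only a sketch: it assumes 512-byte logical sectors and treats "4k aligned" as "byte offset is a multiple of 4096", and the helper name is hypothetical.

```python
# Check whether cylinder boundaries fall on 4 KiB boundaries for a given
# fake CHS geometry, assuming 512-byte logical sectors.

SECTOR_BYTES = 512
ALIGN_BYTES = 4096


def cylinder_is_4k_aligned(heads, sectors_per_track):
    """True if every cylinder boundary is a multiple of 4096 bytes."""
    cylinder_bytes = heads * sectors_per_track * SECTOR_BYTES
    return cylinder_bytes % ALIGN_BYTES == 0


legacy = cylinder_is_4k_aligned(255, 63)   # 16065 sectors per cylinder
vista = cylinder_is_4k_aligned(240, 63)    # 15120 sectors per cylinder

# The first partition traditionally starts at sector 63, which is why it
# stays misaligned under either geometry.
first_partition_aligned = (63 * SECTOR_BYTES) % ALIGN_BYTES == 0

print(legacy, vista, first_partition_aligned)
```

With 255 heads a cylinder holds 16065 sectors, which is odd, so no cylinder-aligned partition can land on a 4k boundary except by accident; 15120 is a multiple of 8, so every cylinder boundary does.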

    2. Re:Is it only linux? by Anonymous Coward · · Score: 0

      That's not the way advances in compatible systems are achieved. First the new technology has to be fast enough to be a drop-in replacement for its predecessor. After a while, other parts of the system adapt to the changed characteristics of the whole and thereby realize more of the potential which came with the new technology.

    3. Re:Is it only linux? by mxs · · Score: 2, Informative

      Of course it goes beyond just Linux. Microsoft is aware of the problem and working on improving its SSD performance (they already did some things in Vista as the article states, and Windows 7 has more in store; google around to find a few slides from WinHEC on the topic).

      The problem with Windows w.r.t. optimizing for SSDs is that it LOVES to do lots and lots of tiny writes all the time, even when the system is idle (and more so when it is not). Try moving the "prefetch" folder to a different drive. Try moving the system log event files to a different drive. And try to keep an eye out for applications that use the system drive for small writes, extensively (or muck about in the registry a lot). These are the hard parts. The easier parts would be to make sure hibernation is disabled, pagefiles are not on the SSD (good luck in getting Windows to not use pagefiles at all; possible, but painful even if you have a dozen gigs of memory), prefetching is disabled, the filesystem is properly aligned, printer spools are moved elsewhere, etc. With only the things Windows provides, it is painful to attempt to prolong your SSD's life (this is not just about performance; remember that you only have a limited amount of erases until the drive becomes toast).

      There are some solutions; MFT for Windows (http://www.easyco.com/) provides a block device that consolidates many small writes into larger ones and does not overwrite anything unless absolutely necessary (i.e. changes are written onto the disk sequentially; overwriting only takes place once you run out of space). It is very, very costly, but it does its job well. Performance skyrockets, drive longevity improves by an order of magnitude.

      You can also use hacks such as Windows SteadyState; This also streamlines writes (but adds another layer of indirection). Performance improves, but you get to deal with SteadyState-issues. EFT also works (and is less of a GUI-y system, though largely providing the same services even on Windows 2000/XP); you have got to be careful though, if your system tends to lose power or crash, all the changes since the last boot will be lost; EFT can be made to write out all the changes it has accumulated -- but after that, the only way to reenable it is to restart the system.

      Windows is not particularly nice to SSDs when used as a system disk. For a data partition it is not quite as bad (although if you deal with many small writes, you might still run into heaps of trouble). The optimizations related here for Linux are applicable to Windows as well (aligning filesystem blocks to erase-blocks and 4k nand-sectors). You would also want to attempt to move stuff that does lots of small writes to a different (spinning) disk -- system logs, for instance, and most spool directories. You'd also want to make absolutely sure that you do not have access time updates enabled; each of those is, essentially, a write (even if ultimately consolidated).
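
As a rough illustration of the alignment advice above, the following Python sketch tests whether a partition's starting sector sits on a given boundary and finds the next sector that does. The 128 KiB erase-block size is an assumed example value, not a figure from this thread, and the function names are hypothetical; check the drive's documentation for the real number.

```python
# Partition start alignment check, assuming 512-byte logical sectors.

SECTOR_BYTES = 512
NAND_PAGE_BYTES = 4 * 1024        # the 4k sector size mentioned above
ERASE_BLOCK_BYTES = 128 * 1024    # ASSUMPTION: example erase-block size


def is_aligned(start_sector, boundary_bytes):
    """True if the partition's first byte is a multiple of boundary_bytes."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


def next_aligned_sector(start_sector, boundary_bytes):
    """Smallest sector >= start_sector that lies on the boundary."""
    sectors_per_boundary = boundary_bytes // SECTOR_BYTES
    # Round up to the next multiple using integer ceiling division.
    return -(-start_sector // sectors_per_boundary) * sectors_per_boundary


legacy_start = 63                 # traditional first-partition offset
print(is_aligned(legacy_start, NAND_PAGE_BYTES))
print(next_aligned_sector(legacy_start, ERASE_BLOCK_BYTES))
```

On Linux the start sector of a partition can usually be read from `/sys/block/<disk>/<partition>/start`, expressed in 512-byte units, and fed to `is_aligned`.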

    4. Re:Is it only linux? by Anonymous Coward · · Score: 0

      Apparently Vista, Windows 7, and OS X automatically align the partition when formatting a disk. XP needs to be tweaked by hand.

    5. Re:Is it only linux? by NekoXP · · Score: 5, Insightful

      Yeah, hard disk manufacturers.

      Since they moved to large disks which require LBA, they've been fudging the CHS values returned by the drive to get the maximum size available to legacy operating systems. Since when did a disk have 63 heads? Never. It doesn't even make sense anymore when most hard disks are single platter (therefore having 1 or 2) and SSDs don't even have heads.

      What they need to do is define a new command structure for accurately determining the best structure on the disk - on an SSD this would report the erase block size or so, on a hard disk, how many sectors are in a cylinder, without fucking around with some legacy value designed in the 1980's.

    6. Re:Is it only linux? by __aardcx5948 · · Score: 1

      Yes. The soon-to-be-released OCZ Vertex is discussed in this forum, with a poll from an OCZ guy on how the firmware will be optimized... many IO/s or many MB/s? http://www.ocztechnologyforum.com/forum/forumdisplay.php?f=186 Partition alignment is important, as are some registry tweaks. Disable prefetch and search indexing, and probably some other services that are useless and/or just waste the SSD's life span instead of enhancing performance.

    7. Re:Is it only linux? by eharvill · · Score: 1

      Does that include the system drive or just the data drives?

      --
      At night I drink myself to sleep and pretend I don't care that you're not here with me
    8. Re:Is it only linux? by Anonymous Coward · · Score: 1, Interesting

      Somebody please mod the parent up to 5.

      Yeah, hard disk manufacturers.

      Since they moved to large disks which require LBA, they've been fudging the CHS values returned by the drive to get the maximum size available to legacy operating systems. Since when did a disk have 63 heads? Never. It doesn't even make sense anymore when most hard disks are single platter (therefore having 1 or 2) and SSDs don't even have heads.

      What they need to do is define a new command structure for accurately determining the best structure on the disk - on an SSD this would report the erase block size or so, on a hard disk, how many sectors are in a cylinder, without fucking around with some legacy value designed in the 1980's.

      With the drive electronics as complex as they are nowadays you'd think the OS wouldn't need to know much. Just give it a couple of stats to allow the file system to align properly and stop with all this CHS translation.

    9. Re:Is it only linux? by HartDev · · Score: 1

      Do you think that this will be addressed in Linux before it is anywhere else? MS has not been on the ball twice now, with Windows ME and Vista. And Linux is used a lot in servers, which wouldn't really need a solid state drive as much as a laptop would.

      --
      To see a few of my Android apps goto: www.hartwired.com
    10. Re:Is it only linux? by maxume · · Score: 1

      Wear leveling algorithms would have to be nearly brain dead for a few megabytes a minute to kill a disk with 100 gigabytes and 100,000 write cycles (that's billions of minutes, which is thousands of years).
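
      A back-of-the-envelope version of that estimate, as a Python sketch. It assumes ideal wear leveling and no write amplification, both of which are optimistic; `years_to_wear_out` is a name made up for this example:

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60

def years_to_wear_out(capacity_bytes, cycles, write_bytes_per_minute):
    """Years until every cell has used its rated cycles, assuming ideal wear leveling."""
    total_writable = capacity_bytes * cycles          # bytes the flash can absorb
    minutes = total_writable / write_bytes_per_minute
    return minutes / MINUTES_PER_YEAR

GB = 10**9
MB = 10**6
for rate_mb in (1, 5, 10):
    years = years_to_wear_out(100 * GB, 100_000, rate_mb * MB)
    print(f"{rate_mb} MB/min -> about {years:,.0f} years")
```

      At 5 MB per minute this works out to about 2 billion minutes, or roughly 3,800 years. Real-world figures will be lower once small writes and imperfect leveling are taken into account.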

      Parsimony suggests that optimizing for the characteristics of the device is a good idea, but SSD wear isn't something that desktop users even need to think about.

      --
      Nerd rage is the funniest rage.
    11. Re:Is it only linux? by Dr.+Ion · · Score: 1

      A bigger problem is our reluctance to move off 512-byte sectors. Who needs that fine granularity of LBA?

      That's two sectors per kilobyte, dating back to the floppy disk. And we still use this quantum on TB hard disks.

    12. Re:Is it only linux? by mxs · · Score: 1

      Sorry, but you are glossing over something here -- it's not the "megabytes per minute" thing that bothers, it's the "many small writes" thing. Even the very best wear leveling algorithm can't do much about that, unless it uses a write cache (which most SSDs do not; I do not know the exact procedure of the Intel offering (which is ahead of its competitors at the moment), but I would be somewhat surprised if the chip waited overly long to commit). A one-byte write will, in the worst case, cause an entire 128 KB slice to be overwritten. This can quickly become significant, especially considering that the operating system does not necessarily attempt to group writes, likes to overwrite data in sectors, and even sometimes will insist on flushes (think security logfiles) even if you had the foresight to enable write caching -- which still won't help much if you have multiple open files.
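
      To put a number on that worst case, here is a small Python sketch. It assumes a 128 KiB erase block, that each write starts on an erase-block boundary, and that every touched block is rewritten in full; `write_amplification` is a hypothetical helper:

```python
def write_amplification(payload_bytes, erase_block_bytes=128 * 1024):
    """Worst-case ratio of bytes physically rewritten to bytes the host wrote.

    Assumes the write starts on an erase-block boundary and that every
    erase block it touches is rewritten in full.
    """
    blocks_touched = -(-payload_bytes // erase_block_bytes)  # ceiling division
    return blocks_touched * erase_block_bytes / payload_bytes

for size in (1, 512, 4096, 128 * 1024):
    print(f"{size:>6} bytes -> {write_amplification(size):,.0f}x")
```

      A single 512-byte sector write costs 256 times its size under these assumptions, which is why grouping writes matters so much more than the raw megabytes per minute.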

      Desktop users in particular need to think about this stuff. Observe your Windows page file. Observe how it gets used NO MATTER how much free memory you have. (Don't believe it? Open 10 windows. Go away from your computer for 10 minutes. Optionally start a large I/O job (say, copying a few gigs of files), though this is not usually necessary and just serves to illustrate the point more clearly. Switch between those 10 windows. No matter whether you have 512m, 2g, or 10g of free memory, they will have been paged to disk. Yes, it's braindead, and yes, it happens at default settings.)
      There are some diagnostic tools which will tell you the average rate of rewritten slices on SSDs; I don't have it handy atm, but Google will find it (or failing that, the OCZ forums are pretty nifty for this kind of thing, even though their SSDs are somewhat inferior to the Intel offering).

      You might also want to run procmon on your Windows system in an ostensibly idle system (possibly with a few open applications) and see what actually happens at the file system level.

    13. Re:Is it only linux? by tonyr60 · · Score: 4, Informative

      Sun's new 7000 series storage arrays use them, and that series runs OpenSolaris. So I guess Solaris has at least some SSD optimisations... http://www.infostor.com/article_display.content.global.en-us.articles.infostor.top-news.sun_s-ssd_arrays_hit.1.html

    14. Re:Is it only linux? by aminorex · · Score: 1

      There is no major OS that makes anything remotely like an appropriate use of persistent RAM. SSD is one application of persistent RAM, but it's a terrible one, which ignores most of the benefits of persistent RAM. I want to treat flash as hierarchical memory, not as disk. I want the OS to support me not with inconsequential filesystem optimizations, but by implementing cache-on-write with an asynchronous write-back queue for mapped flash memory. I want to map allocated regions of a terabyte flash array into my 64-bit address spaces. It's not as easy as it should be to do these things: You mostly have to roll your own, and integrating with VM basically requires deep kernel magic. The situation stinks on ice.

      --
      -I like my women like I like my tea: green-
    15. Re:Is it only linux? by Hal_Porter · · Score: 1

      There are ATA commands to determine SSD geometry - erase unit size and so on. You can mark sectors as unused too, which helps with wear levelling.

      --
      echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
    16. Re:Is it only linux? by Hal_Porter · · Score: 1

      Well, no you don't. The ATA command set allows multiple sector writes. Most filesystems will use a cluster size that is bigger than one sector. In that case you're very close to having a sector size that isn't 512 bytes.

      --
      echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
    17. Re:Is it only linux? by Anonymous Coward · · Score: 0

      As for other OSes:

      Vista has already started working around this problem, since it uses a default partitioning geometry of 240 heads and 63 sectors/track. This results in a cylinder boundary which is divisible by 8, and so the partitions (with the exception of the first, which is still misaligned unless you play some additional tricks) are 4k aligned.

      Great. Now some manufacturers may optimize SSD drives for the common case of single unaligned partition, while others may optimize for aligned partitions.
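
      The divisibility claim in the quote is easy to check. A Python sketch, assuming 512-byte sectors and, purely for comparison, a 128 KiB erase block (that size is an assumption, not something from the quote):

```python
SECTOR_BYTES = 512

def cylinder_sectors(heads, sectors_per_track):
    """Sectors in one 'cylinder' of a (fake) CHS geometry."""
    return heads * sectors_per_track

def cylinders_are_aligned(heads, sectors_per_track, boundary_bytes):
    """True if every cylinder boundary falls on a multiple of boundary_bytes."""
    cylinder_bytes = cylinder_sectors(heads, sectors_per_track) * SECTOR_BYTES
    return cylinder_bytes % boundary_bytes == 0

# Traditional 255/63 geometry versus the 240/63 geometry described above.
for heads, spt in ((255, 63), (240, 63)):
    print(heads, spt,
          cylinder_sectors(heads, spt),
          cylinders_are_aligned(heads, spt, 4096),
          cylinders_are_aligned(heads, spt, 128 * 1024))
```

      With 240 heads a cylinder is 15120 sectors, a multiple of 8, so cylinder boundaries are 4k aligned. They are still not aligned to a 128 KiB erase block, and with 255 heads (16065 sectors) they are aligned to neither.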

    18. Re:Is it only linux? by maxume · · Score: 1

      I don't contest that there are lots of unnecessary writes. Things can be improved. Still, for a 64 GB drive, the thing will become very replaceable well before it dies from windows disk noise. I guess needing to replace such a key component after a few years is a little bit troublesome, but the replacement will surely last essentially forever (especially if you assume that software improves). Even a drive that is quite full will last several years, especially for people who leave their computer sleeping 2/3 of the time (in the best case, the wear leveling moves static data around, eliminating the free capacity factor, in worse cases, 10 GB still provides for an awful lot of writes).

      I'm not going to run out and spend $400 for a 100 GB disk, but $250 for a 250 GB disk is in sight, and it won't hurt all that bad to spend $300 for a 500GB disk 2 years after that. So, is it a possible source of problems? Yes. Is it a likely, expensive problem? No.

      --
      Nerd rage is the funniest rage.
    19. Re:Is it only linux? by c0t0d0s0.org · · Score: 2, Informative

      You should look at the L2ARC and separate ZIL features of ZFS in Solaris and OpenSolaris. They use the SSD in the way you want.

    20. Re:Is it only linux? by WuphonsReach · · Score: 1

      A bigger problem is our reluctance to move off 512-byte sectors. Who needs that fine granularity of LBA?

      There are already plans afoot to move to a 4096 byte sector size. They started a few years ago, and might have even made /. front page in '07 or '08.

      --
      Wolde you bothe eate your cake, and have your cake?
  8. Ironically I was just going out to buy a small one by earthforce_1 · · Score: 3, Informative

    If I mount /home on a separate drive, (good to do when upgrading) the rest of the Linux file system fits nicely on a small SSD.

    --
    My rights don't need management.
  9. Re:SSD's should have no problem with fragmentation by von_rick · · Score: 3, Insightful

    From economics, let's turn our attention to optimizing this toy of ours. The thing with SSDs is that they don't have a read/write head to worry about. This means that no matter where the data is stored in the device, all we need to do is specify the fetch location and the logic circuits select that block to extract the data from desired location. From what I've heard, the SSDs have an algorithm to actually assign different blocks to store the data so that the memory cells in a single location aren't overused.

    --

    Face your daemons!

  10. Re:SSD's should have no problem with fragmentation by Anonymous Coward · · Score: 0

    Yes, that's true. But the important thing is ensuring that the OS/filesystem breaks the data up into appropriate sized chunks that match up with the block size that the disk controller uses. This has nothing to do with fragmentation.

  11. Forget Disk paradigm by Anonymous Coward · · Score: 0

    Why not use it as a 'permanent' ram. I'll be more than happy with only an enormous Hashmap on it. Just an easy api to handle it.
    Forget about using it as a disk, it's not.

  12. Toy budget by conureman · · Score: 1

    Most of us can't afford to worry about this, but does the Fusion-io suffer from this issue?

    --
    The cost of that cleanup, of course, will be borne by taxpayers, not industry.
  13. What is the specific issue? by DJRumpy · · Score: 1

    Surely it's not the block size. I know nothing about filesystems beyond basics. Windows could specify the block size to be used. I assumed that Linux did the same? I have no idea about OS X either.

    Are there standard block sizes in use for Linux and OS X filesystems? Can they be modified when they are formatted? If so, and the issue really is due to blocksize and fragmentation as a result, this would seem like an easy fix. Linux and OS X already resist fragmentation. I won't speak to MS's efforts there as they state NTFS does, but the implementation seems to be very different in the real world.

    Some of you FS gurus fill us in here. How hard is it to implement something like variable block sizes, or to allow you to specify block size at format time?

    1. Re:What is the specific issue? by badkarmadayaccount · · Score: 1

      If you ask me, they should implement an object storage system in RAM, add super (hyper?) page support down to the swap partition, then optimize it. But, then again, I'm no FS guru either.

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  14. Re:Ironically I was just going out to buy a small by Anonymous Coward · · Score: 0

    I've been doing this for years with CF cards.

    Put the volatile stuff on a spindle, the rest on a CF card.

  15. No. Not Now. Not Ever. I'm Coming For All of You! by Anonymous Coward · · Score: 5, Funny

    > Vista has already started working around this problem, since it uses a default partitioning geometry of 240 heads and 63 sectors/track. This results in a cylinder boundary which is divisible by 8, and so the partitions (with the exception of the first, which is still misaligned unless you play some additional tricks) are 4k aligned. So this is one place where Vista is ahead of Linux…

    Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem. It descended from DIGITAL's ODS2 (On Disk Structure 2) which traces back to the original Five Models (PDP 1, 8, 10, 11 and 12). You see, ODS was written by passionate people with degrees and rich personal lives in Massachusetts who sang and danced before the fall of humanity to the indignant Gates series who assimilated their young wherever possible and worked them into early graves during his epic battle with the Steves before the UNIX enemy remerged after a 25 year sleep and nuked the United States, draining all of its technological secrets to the other side of the world. Gates, realizing what he's done, now travels the universe seeking to rebuild his legacy by purifying humanity while the Steve series attempts to rebuild itself. Some of the original Five are still around, left to logon to Slashdot and witness what's left of the shadow of humanity still in the game as they struggle blindly around in epic circles indulging new and different ways to steal music, art and technology to make up for their lack of creativity long ago bred out of them by the Gates series.

  16. Why pretend these are ordinary disks? by jensend · · Score: 4, Insightful

    SSDs gradually gain more and more sophisticated controllers which do more and more to try to make the SSD seem like an ordinary hard drive, but at the end of the day the differences are great enough that they can't all be plastered over that way (the fragmentation/long term use problems the story linked to are a good example). I know that (at present- this could and should be fixed) making these things run on a regular hard drive interface and tolerate being used with a regular FS is important for Windows compatibility, but it seems like a lot of cost could be avoided and a lot of performance gained by having a more direct flash interface and using flash-specific filesystems like UBIFS, YAFFS2, or LogFS. I have to wonder why vendors aren't pursuing that path.

    1. Re:Why pretend these are ordinary disks? by NekoXP · · Score: 4, Interesting

      Because Intel and the rest want to keep their wear-leveling algorithm and proprietary controller as much of a secret as possible so they can try to keep on top of the SSD market.

      Moving wear-levelling into the filesystem - especially an open source one - effectively also defeats the ability to change the low-level operation of the drive when it comes to each flash chip - and of course, having a filesystem and a special MTD driver for *every single SSD drive manufactured* when they change flash chips or tweak the controller, could get unwieldy.

      Backing them behind SATA is a wonderful idea, but this reliance on CHS values I think is what's killing it. Why is the Linux block subsystem still stuck in the 20MB hard-disk era like this?

    2. Re:Why pretend these are ordinary disks? by Mike+McTernan · · Score: 1

      > and of course, having a filesystem and a special MTD driver for
      > *every single SSD drive manufactured* when they change flash
      > chips or tweak the controller, could get unwieldy.

      Large numbers of flash chips can be supported by the MTD CFI drivers:

      http://en.wikipedia.org/wiki/Common_Flash_Memory_Interface

      Something similar could be done for SSDs too, except they've chosen HDD standards as they are a better fit.

      Mike

      --
      -- Mike
    3. Re:Why pretend these are ordinary disks? by aminorex · · Score: 1

      Same reason it doesn't reasonably support hierarchical persistent RAM: Everybody who wants to do it is too busy with other work.

      --
      -I like my women like I like my tea: green-
    4. Re:Why pretend these are ordinary disks? by gillbates · · Score: 2, Insightful

      Why is the Linux block subsystem still stuck in the 20MB hard-disk era like this?

      As one who had to tune the performance of hard drives at the kernel level, I can say with some authority that the Linux block subsystem is not at all stuck in the 20MB hard-disk era. In fact, everything is logical blocks these days, and it's the filesystem driver and IO schedulers which determine the write sequences. The block layer is largely "dumb" in this regard, and treats every block device as nothing more than a large array of blocks. A properly designed wear-leveling filesystem has no dependencies on the underlying hardware with one exception: block size. But seeing as every Linux filesystem since Ext2 has had the option of creating filesystems with different block sizes, I doubt this is, or ever will be, an issue.

      The only real issue with wear-leveling filesystems is that they don't work well with conventional hard disks, largely due to the fact that with flash, the block access time is pretty much constant no matter where on the drive it is located. Hence, there's no need to schedule based on C/H/S values. Because of this disparity, there won't be ONE TRUE FILESYSTEM in Linux. This might actually be a good thing, if you've ever been privy to the debates over Reiserfs and Ext3...

      The hardware SSD wear-levelling algorithms used by Intel, et al... are nothing special. Yes, they probably do offer higher performance than a general purpose filesystem, but performance is not their reason for existence. They exist largely because the overwhelming majority of consumer devices still use FAT32, which would destroy an SSD without wear-leveling very quickly. Think of how many flash chips are used in cameras, cellphones, thumb drives, etc... Intel had to do this just to access the non-Linux market.

      --
      The society for a thought-free internet welcomes you.
    5. Re:Why pretend these are ordinary disks? by mgblst · · Score: 1

      and of course, having a filesystem and a special MTD driver for *every single SSD drive manufactured*

      If you need to go to the extreme, then you are probably wrong.

      There is a step between no filesystem wear level algorithm, and one for every single drive manufactured.

      That is the obvious solution, have some filesystem algorithms. Let the hardware tell us which one we should use, and use that one. Not such a big problem, is it, happens all the time.

    6. Re:Why pretend these are ordinary disks? by Anonymous Coward · · Score: 0

      "Interesting"? Are you kidding me? That guy didn't even read the article. This is about a partitioning scheme ("MBR"). It has nothing to do with the Linux block layer.

    7. Re:Why pretend these are ordinary disks? by NekoXP · · Score: 1

      Seems you didn't read the article either, or the parent. I was discussing the reason why SSD manufacturers aren't using special MTD drivers anymore, and the reason they don't is because wear-levelling generally gets done in the MTD driver if there is a simplistic flash controller behind it (although you could do it in the controller, that makes the

      The real problem is Linux uses CHS values (fake as they may be) in the block layer. Everywhere. And the partitioning tools do. And the RAID tools do by proxy.

      ext4 has absolutely no idea what the "natural" alignment of the disk blocks is or how they fit in a "cylinder", because there's no decent way to find out based on CHS values which are fixed up and hardcoded inside the block layer.

      You can find out how big a "physical" block is (512, 2048, 2352, 4096..), but 100% of available SSDs return 512 and a bunch of fake CHS all for compatibility's sake. CHS just doesn't work anymore and the compatibility means more fake CHS values are being implemented and dropped on top of the LBA addressing scheme.

      If Linux or any other OS had a way of finding out where the natural alignment stood, then regardless of where the partition is created (thus moving the problem away from some userspace tools which all get updated independently), the filesystem could be created to take advantage of that alignment.

      If the partition is created through "fake" CHS values then performance would suffer if the filesystem isn't aligned. This is the problem right now: filesystems assume that the start of the partition is naturally "cylinder" aligned. With a 128k erase block and CHS "cylinder" alignment with a 4k block size on the filesystem you could be pretty far away from well-aligned. If it knew that it had to align its data structures then it wouldn't have to make assumptions about the partitioning scheme. Let's be honest; there are more partitioning schemes than MBR. What about GPT or RDB? BSD slices?
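
      As a rough illustration of what a filesystem could do if it knew the erase-block size, here is a Python sketch that rounds a cylinder-aligned start up to the next erase-block boundary. The 128 KiB figure is the one assumed above, and `align_up` is a hypothetical helper rather than anything an existing tool provides:

```python
def align_up(sector, boundary_bytes, sector_bytes=512):
    """Round a sector number up to the next multiple of boundary_bytes."""
    sectors_per_boundary = boundary_bytes // sector_bytes
    return -(-sector // sectors_per_boundary) * sectors_per_boundary

ERASE_BLOCK = 128 * 1024   # assumed erase-block size
CYLINDER = 255 * 63        # sectors per fake "cylinder"

# How far is each cylinder-aligned start from the next erase-block boundary?
for cyl in (1, 2, 100):
    start = cyl * CYLINDER
    aligned = align_up(start, ERASE_BLOCK)
    print(f"cylinder {cyl}: sector {start} -> {aligned} (+{aligned - start} sectors)")
```

      Because 16065 is odd and an erase block here is 256 sectors, only every 256th cylinder boundary lands on an erase-block boundary by itself; everything else needs padding like this.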

      http://www.ipnom.com/FreeBSD-Man-Pages/fdisk.8.html

      I love this little snippet;

      If you hand craft your disk layout, please make sure that the FreeBSD slice starts on a cylinder boundary. A number of decisions made later may assume this. (This might not be necessary later.)

      So, it may be necessary or not. BSD slices are not naturally aligned - as defined - on cylinder boundaries; they use sector size only. Some tool such as fdisk tries to handle this for you. But the values the disk and the kernel pass back are just not realistic (255 heads, 63 sectors...) and do not reflect ANY disk.

      Since you can't change the 255/63 value passed in by the disk or hardcoded in the block layer, plus cylinders and heads make zero sense on a flash drive (or a ramdisk or a virtualized block layer) why not a new ATA command set which reports the true natural alignment of the disk, with reasonable values which can be used to optimize performance, well away from the compatibility values, that the filesystem can get (as it gets the sector size) and rely on for the best performing filesystem on that media?

    8. Re:Why pretend these are ordinary disks? by badkarmadayaccount · · Score: 1

      Dude, any idea if any stable ReiserFS drivers will show up? I've been dying to try, but I'm too spooked for my data. Sorry for the OT post.

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  17. Re:SSD's should have no problem with fragmentation by Jurily · · Score: 1

    This means that no matter where the data is stored in the device, all we need to do is specify the fetch location and the logic circuits select that block to extract the data from desired location.

    Which is why you don't need head-optimized I/O schedulers like Anticipatory, which waits a couple of ms after every read to see if there's more from that area, thus saving on seek times.

    SSD's must be optimized differently. For instance, they can't write arbitrary small pieces of data, only whole blocks. Thus, if you want to optimize it, you'd better make sure to write whole blocks at a time if possible, and not have small files cross boundaries if they don't have to.

  18. Re:Mere mortals need more toy budget by conureman · · Score: 1

    I've been wrestling this idea around as a sound studio solution, and it seems that an external storage unit makes the most sense, with a DRAM card for the currently working files. Almost affordable, anyway.

    --
    The cost of that cleanup, of course, will be borne by taxpayers, not industry.
  19. Re:SSD's should have no problem with fragmentation by Anonymous Coward · · Score: 0

    Every mass storage device since cassette tapes reads and writes a whole block at a time.

  20. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by kclittle · · Score: 1

    I have mod points, but cannot find the "Totally Bonkers" mod...

    --
    Generally, bash is superior to python in those environments where python is not installed.
  21. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by Anonymous Coward · · Score: 0

    Don't do drugs, man.

  22. Is 1 Disk Raid the solution? by WittyName · · Score: 0

    Partition the drive into BlockSize/4KB logical disks.
    Make sure the alignment is correct, then RAID these
    into 1 big disk.

    This gives us one usable disk with maybe 128kb clusters.

    Small files would need to share a cluster, but they
    would have done that anyway..

    --
    The law is a weapon of the government, not a protection for the likes of you. Surely you understand that.
  23. Take a look at Maemo . . . by PolygamousRanchKid+ · · Score: 1

    . . . which runs on the Nokia N800/N810 "Internet Tablets" (www.maemo.org). They might have done some tweaking, since this is Linux running on SSDs.

    --
    Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
    1. Re:Take a look at Maemo . . . by DragonTHC · · Score: 2, Interesting

      Don't forget android.

      --
      They're using their grammar skills there.
    2. Re:Take a look at Maemo . . . by kamatsu · · Score: 1

      Dunno about maemo, but android uses flash-optimized yaffs for its storage, and doesn't include support for traditional linux types like ext2 or 3.

    3. Re:Take a look at Maemo . . . by ADRA · · Score: 1

      Maemo and several other embedded systems have been using flash based disk storage for years. The problem is that an SSD isn't a flash storage device, it's a hard-drive interface wrapped around a flash device.

      Since Linux can't see the flash devices themselves, it can't properly implement a flash based hard-drive interface.

      --
      Bye!
  24. repeated re-write issues? by supernova87a · · Score: 1

    when I saw the headline, I was thinking not so much the fragmentation issues, but the repeated re-writing of logs and other small frequently accessed files that SSDs are susceptible to (maximum # of rated read-write cycles). Have there been any developments in that area?

    1. Re:repeated re-write issues? by nedlohs · · Score: 4, Informative

      It will outlast a standard hard drive by orders of magnitude so it's completely not an issue.

      With wear leveling and the technology now supporting millions of writes it just doesn't matter. Here's a random data sheet: http://mtron.net/Upload_Data/Spec/ASIC/MOBI/PATA/MSD-PATA3035_rev0.3.pdf

      "Write endurance: >140 years @ 50GB write/day at 32GB SSD"

      Basically the device will fail before it runs out of write cycles. You can overwrite the entire device twice a day and it will last longer than your lifetime. Of course it will fail due to other issues before then anyway.
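
      For what it's worth, the quoted spec line can be turned into a number of full-device overwrites. A Python sketch using only the three figures from the data sheet quote; the function name is made up for this example, and a 365-day year is assumed:

```python
def implied_full_overwrites(years, gb_per_day, capacity_gb):
    """Full-device overwrites implied by an endurance rating (365-day years)."""
    total_gb_written = years * 365 * gb_per_day
    return total_gb_written / capacity_gb

# ">140 years @ 50GB write/day at 32GB SSD"
overwrites = implied_full_overwrites(140, 50, 32)
per_day = 50 / 32
print(f"{overwrites:,.0f} overwrites in total, {per_day:.2f} per day")
```

      That comes to roughly 80,000 full overwrites, at a bit over one and a half device-fills per day.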

      Can there be a mention of SSDs without this out-dated garbage being brought up?

    2. Re:repeated re-write issues? by A+beautiful+mind · · Score: 4, Informative

      There are a few tricks up the manufacturer's sleeve to make this slightly better than it really is:

      1. large block size (120k-200k?) means that even if you write 20 bytes, the disk physically writes a lot more. For logfiles and databases (quite common on desktops too, think of index dbs and sqlite in firefox for storing the search history...) where tiny amounts of data are modified, this can add up rapidly. Something writes to the disk once every second? That's 16.5GB / day, even if you're only changing a single byte over and over.

      2. Even if the memory cells do not die, due to the large block size, fragmentation will occur (most of the cells will have a small amount of space used in them). There have been a few articles reporting that even devices with advanced wear leveling technology like Intel's exhibit a large performance drop (less than half of the read/write performance of a new drive of the same kind) after a few months of normal usage.

      3. According to Tomshardware, unnamed OEMs told them that all the SSD drives they tested under simulated server workloads got toasted after a few months of testing. Now, I wouldn't necessarily consider this accurate or true, but I sure as hell would not use SSDs in a serious environment until this is proven false.
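
      The arithmetic behind point 1 can be sketched in Python. It assumes each small write forces one whole block to be rewritten and nothing is coalesced; the block sizes are the guessed 120k-200k range plus the commonly quoted 128k:

```python
SECONDS_PER_DAY = 24 * 60 * 60

def daily_flash_writes(block_bytes, writes_per_second=1):
    """Bytes physically written per day if each small write rewrites one whole block."""
    return block_bytes * writes_per_second * SECONDS_PER_DAY

KIB = 1024
GIB = 1024 ** 3
for block_kib in (120, 128, 200):
    total = daily_flash_writes(block_kib * KIB)
    print(f"{block_kib} KiB blocks -> {total / GIB:.1f} GiB/day")
```

      The 16.5GB figure corresponds to the 200k end of that range; with 128k blocks it is closer to 10.5 GiB a day, which is still a lot for a one-byte change per second.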

      --
      It takes a man to suffer ignorance and smile
      Be yourself no matter what they say
    3. Re:repeated re-write issues? by berend+botje · · Score: 2, Informative

      All nice and dandy, but these figures aren't exactly honest. In a normal scenario your filesystem consists in large part of static data. These blocks/cells are never rewritten. Therefore the writes (for logfiles etc) are concentrated on a small part of the disk, wearing it out rather more quickly.

      Having had a few Compact Flash disks wear out in the recent past, I'm not exactly anxious to replace my server disks with SSDs.

    4. Re:repeated re-write issues? by Johnny+Mnemonic · · Score: 1

      I'd expect that wear-leveling algorithms look for that kind of discrepancy, move static files to sectors that are getting heavier use, and start putting heavily written files onto the sectors that previously contained static info. That would be pretty easy to do. At least OS X moves files around according to how often they're being used (but the OS X technology was designed for optimizing platters).

      --

      --
      $tar -xvf .sig.tar
    5. Re:repeated re-write issues? by Jeffrey+Baker · · Score: 1

      This is not "informative" it's "crap" and also "wrong". Modern SSDs move data even when it isn't written. Therefore there is no static data from the flash controller's point of view.

    6. Re:repeated re-write issues? by A+beautiful+mind · · Score: 1
      I hate to reply to my own posts, but I linked to the wrong article for my last claim. This is the correct article and quoting what it says:

      Customers, such as a large OEM we won't mention, have been trying to validate flash SSDs for enterprise applications by looping hardcore I/O loads, and they all failed with write errors after only a few months.

      --
      It takes a man to suffer ignorance and smile
      Be yourself no matter what they say
  25. Re:SSD's should have no problem with fragmentation by v1 · · Score: 5, Interesting

    I don't think this is going to be a significant problem when compared to normal seek time problems.

    Let's say we have 100 KB of data to read. 512 byte blocks would require 200 reads. 4k blocks would require 25 reads.

    For rotating discs: If the data is contiguous, we have to hope that all the blocks are on the same track. If they are, then there is 1 (potentially very costly) seek to get to the track with all the blocks on it. The cost of the seek is dependent on the track it's going to, the track it's on, and whether or not the drive is sleeping or spun down. Otherwise we also get to do another very short seek, which is going to add a bit of time to get to the next adjacent track. Worst case scenario all 200 blocks are on different tracks, scattered randomly on the platter, requiring 200 seeks. Ouch ouch ouch.

    For SSDs: What is important is the number of cells we have to read. Cells will be 4k in size. All seek times are essentially zero. Best case scenario, all data is contiguous, and the start block is at the start of a cell. Read time boils down to how fast the flash can read 25 cells. Worst case scenario is where the data is 100% fragmented, such that all 200 512 byte blocks reside in a different cell, requiring 200 cell reads (an 8-fold increase in time required). There will also be overhead in copying out the 512 byte data from each buffer and assembling things, but this time is negligible for this comparison.
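
    Working the numbers for the 100 KB example, as a Python sketch. The 4 KiB cell size is the figure assumed above, not a measured one, and `reads_needed` is just a helper for this example:

```python
def reads_needed(data_bytes, unit_bytes):
    """Number of unit-sized reads needed to cover data_bytes (ceiling division)."""
    return -(-data_bytes // unit_bytes)

DATA = 100 * 1024   # 100 KB of file data
SECTOR = 512        # logical block size
CELL = 4 * 1024     # assumed flash read unit (a "cell" in the text above)

sectors = reads_needed(DATA, SECTOR)     # logical blocks in the file
best_case = reads_needed(DATA, CELL)     # contiguous and cell-aligned
worst_case = sectors                     # every sector sits in a different cell
print(sectors, best_case, worst_case, worst_case / best_case)
```

    By this arithmetic the best case is 25 cell reads and the worst case 200, a factor of 8, and no seeks are involved in either case.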

    While the 20x time increase (order N) looks significant, it's important to compare the probabilities involved, and just how bad things get. The most important difference between how these two drives react is the space between fragments. In the "worse case' for SSD, 100% fragmentation, is highly unlikely. I don't even want to think about what a spinning disc would do if asked to perform a head seek for 100% of the blocks in say, a 1mb file. The read head would probably sing like a tuning fork at the very least. 2000 cell reads compared to 2000 seeks, the SSD will win handily every single time, even if the tracks on the disc are close.
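
    A quick sketch of that arithmetic in Python. It assumes 1 k = 1024 bytes and treats a "cell" as a 4k read unit, as in this comment; the function and variable names are made up for illustration.

```python
import math

def reads_needed(data_bytes, unit_bytes):
    """Number of unit-sized reads needed to cover data_bytes."""
    return math.ceil(data_bytes / unit_bytes)

DATA = 100 * 1024   # 100 k of data
SECTOR = 512        # classic disk block
CELL = 4 * 1024     # assumed flash read unit

sectors = reads_needed(DATA, SECTOR)        # 512-byte blocks to fetch
best_case_cells = reads_needed(DATA, CELL)  # contiguous and cell-aligned
worst_case_cells = sectors                  # every block in a different cell
slowdown = worst_case_cells / best_case_cells

print(sectors, best_case_cells, worst_case_cells, slowdown)
```

    So the fully fragmented case costs 200 cell reads against 25 for the contiguous case, a factor of 8.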

    If the spacing between fragments is anything near normal, say 30-100k, then there will be some seeking going on with the disc, and there will be some wasted cell reads with the SSD, but having to do an extra one cell read compared with having to do an extra head seek, again the SSD wins hands down. The advantage of the SSD actually goes down as fragmentation goes down, because most fragments are going to cause a head seek, each of which will significantly widen the time gap. Also a spinning disc will read in the blocks much faster than the cells on an SSD.

    I realize the OP was more describing the possibility of "not so much bang for the buck as you are expecting" due to fragmentation, and I know the above hits more on comparing the two than what happens to the SSD, but if you consider the effects of fragmentation on a spinning disc, and then weigh how the impact compares with a SSD, it's easy to see that fragmentation that sent you running for the defrag tool yesterday may not even be noticeable with a SSD. So I'd call this a "non-issue".

    What I'm waiting for is them to invest the same dev time in read speeds as write speeds. SSDs don't appear to be doing any interleaved reads - they're doing it for the writes because they're so slow. Though at this point I wonder if read speeds are just plain running into a bus speed limit with the SSDs?

    --
    I work for the Department of Redundancy Department.
  26. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by Gogogoch · · Score: 1

    Please mod the parent funny; so say we all.

  27. What is different about SSD's? by deathguppie · · Score: 1

    From what I can scrape together quickly off of the Internet (IANASE: I am not a software engineer), the biggest difference seems to be the lack of a need for error checking, disk defrag, etc. Since a normal spinning hdd does not actually delete a file but just removes the markers, the filesystem treats all areas the same and does the same things to both real and non-real data to keep the disk state sane. In an SSD all of this leads to a lot of unneeded disk usage and premature degradation of the drive itself.

    There seems to be more about Data set management but I don't quite understand it.. maybe someone more knowledgeable could explain it?

    --
    once more into the breach
    1. Re:What is different about SSD's? by ADRA · · Score: 1

      Flash devices have the inherent weakness that if you write to the same place in the disk say 10000 times, that part of the disk will stop working.

      It's kind of like a corrupt sector (piece of the disk) on your regular hard drive, but instead of the timer being based on some drive defects or head crashes, it's based on a write timer.

      Why is this a big deal? Say I have a file called foose.txt. I decide that my neat program will open the file, increment a number, then close the file again. It sounds pretty simple, but imagine if this ran 20x a minute. That's 1200 writes an hour, or 28800 writes in one day.

      If I was running this application against a raw flash device, I would have killed that 10000 write flash sector in a single day.

      What the sophisticated management software in SSD's does is notice that I'm writing to that sector too many times and decides to move my written file to somewhere else instead. So, I'm still killing my flash based device by eating up 28800 writes to the device as a whole, but 28800 writes spread over hundreds/thousands of sectors is a lot better than killing a single sector.
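
      To put rough numbers on that, here is a small Python sketch. The 10000-write endurance and the 20 writes a minute are the figures from this comment; the 1000-sector pool is a made-up number, and perfect spreading of writes is an idealisation, not a claim about any real controller.

```python
WRITES_PER_MINUTE = 20
SECTOR_ENDURANCE = 10_000  # writes a single sector survives (figure from the comment)

def hours_until_worn(endurance, writes_per_minute, sectors_shared=1):
    """Hours until the write budget is used up, if the writes are spread
    perfectly evenly over `sectors_shared` sectors."""
    total_writes = endurance * sectors_shared
    return total_writes / (writes_per_minute * 60)

raw_hours = hours_until_worn(SECTOR_ENDURANCE, WRITES_PER_MINUTE)
leveled_hours = hours_until_worn(SECTOR_ENDURANCE, WRITES_PER_MINUTE,
                                 sectors_shared=1000)  # hypothetical pool size

print(round(raw_hours, 1), "hours without wear leveling")
print(round(leveled_hours / 24), "days when spread over 1000 sectors")
```

      On raw flash the sector is gone in a bit over 8 hours; spread over a thousand sectors the same workload takes close to a year to use up the budget.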

      That's why the selective and flexible selection of writing to a flash based disk is so important. Many Linux based flash disk technologies do basically the same thing as SSD does behind the scenes, but since Linux can't see behind the veil which is SSD, we can't use flash file-systems on top of SSD disks. Because of this, I imagine that the author would like Linux devs to better support SSD's by getting non-flash file systems to support SSD better than they are today.

      --
      Bye!
    2. Re:What is different about SSD's? by tytso · · Score: 4, Informative

      Because of this, I imagine that the author would like Linux devs to better support SSD's by getting non-flash file systems to support SSD better than they are today.

      Heh. The author is a Linux dev; I'm the ext4 maintainer, and if you read my actual blog posting, you'll see that I gave some practical things that can be done to support SSD's today just by better tuning parameters given to tools like fdisk, pvcreate, mke2fs, etc., and I talked about some of the things I'm thinking about to make ext4 better at supporting SSD's than it is today.....

    3. Re:What is different about SSD's? by Anonymous Coward · · Score: 0

      You may be a Linux dev but you're certainly new here if you thought anyone would RTFA.

    4. Re:What is different about SSD's? by Anonymous Coward · · Score: 0

      That tired old joke is hideously not-funny these days, you know. Or maybe you don't.

  28. Don't SSD's have a pre-set number of writes? by DJRumpy · · Score: 2, Funny

    I'm just sitting here thinking. Doesn't an SSD have a preset number of writes in it due to its nature?

    Does it really matter if they spread these writes around on the hard drive when the number of writes the drive is capable of doing is still the same in the end?

    To drastically oversimplify, let's say that each block can be written to twice. Does it really matter if they used up the first blocks on the drive and just spread towards the end of the drive partition with general usage rather than jumping all over to try to spread the writes around?

    Am I thinking about this the wrong way? What benefit does it give them to spread the writes around if the total number of writes doesn't change? Doesn't it just further fragment the files with little gain?

    1. Re:Don't SSD's have a pre-set number of writes? by berend+botje · · Score: 2, Informative

      Say you have 100 cells and can write 10 times to each cell.

      Having every cell written to nine times: 100 * 9 = 900 writes and you still have a completely working disk.

      Writing 900 writes to the first couple of cells: you now have 90 defective cells. In fact, as you still have to rewrite the data to working cells, you have lost your data as there aren't enough working cells.
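
      The same toy example as a Python simulation. This is only a sketch: "wear leveling" here is plain round-robin over the cells, which is a simplification and not how any particular controller works, and dead_cells is a made-up name.

```python
def dead_cells(total_writes, num_cells, endurance, level_wear):
    """Count worn-out cells after total_writes single-cell writes."""
    wear = [0] * num_cells
    for i in range(total_writes):
        if level_wear:
            # Idealised wear leveling: rotate through all cells.
            target = i % num_cells
        else:
            # No leveling: keep hammering the first cell that still works.
            target = next((c for c in range(num_cells)
                           if wear[c] < endurance), None)
        if target is None or wear[target] >= endurance:
            break  # nothing writable left
        wear[target] += 1
    return sum(1 for w in wear if w >= endurance)

print(dead_cells(900, 100, 10, level_wear=True))    # every cell at 9 writes
print(dead_cells(900, 100, 10, level_wear=False))   # 90 cells used up
print(dead_cells(1000, 100, 10, level_wear=True))   # all cells fail together
```

      The last line shows the flip side: with perfect leveling the cells all run out at about the same time.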

    2. Re:Don't SSD's have a pre-set number of writes? by Anonymous Coward · · Score: 0

      Not sure why your post is modded funny.

      You are correct - there is a limited count on the number of times you can write to the SSD. Actually, the number of times you can erase is what is limited, but it amounts to the same thing.

      The problem is that most file-systems want to keep important book-keeping data in a fixed location on a disk so that they can find it. That isn't a good idea with SSDs because you will quickly "burn out" that location because of all the read/erase/write cycles on it. When that location is gone, the SSD will be unusable to that file-system.

      Most hardware wear-leveling (the "smarts" inside the SSD) works by fooling the file-system. It will remap locations on the SSD so that all of the SSD wears equally. Even when a filesystem thinks it is repeatedly erasing/writing a single location, the SSD will spread it out across all available locations.

      On the other hand, file-systems that are "flash aware" will do the wear-leveling themselves, even on raw flash chips that don't have hardware wear-leveling.

      The question that arises is which is better - doing wear-leveling in SSD or in the file-system? There are pros and cons to each method.

    3. Re:Don't SSD's have a pre-set number of writes? by DJRumpy · · Score: 1

      So in effect, instead of 'burning' out a specific section of an SSD, they will simply burn out the entire disk at once due to wear leveling? Seems to me they are robbing Peter to pay Paul and the end result is still the same, albeit with far more fragmentation. If you are then forced to defragment your SSD to get the performance back, you are in effect killing your SSD due to all the erase/writes that defragging will cause.

      I think I prefer the slow and progressive method rather than waking up some morning to find that it's burned out once wear leveling can't find any good blocks to erase.

      That leads to another few questions. What about data that is stored in a location that was written to for a final time? (say the 10th erase/write cycle in that block using the analogy above). The data is still retrievable right? It simply can't be written to again?

      I wonder if the wear leveling algorithms will also store files that are read-only in nature in blocks that are close to failure?

      Last but not least, do the OS's turn off the 'last accessed' property that is commonly used across most OS's? Seems that would lead to much more rapid failure.

    4. Re:Don't SSD's have a pre-set number of writes? by MoonBuggy · · Score: 3, Informative

      So in effect, instead of 'burning' out a specific section of an SDD, they will simply burn out the entire disk at once due to wear leveling?

      Technically speaking, yes, the drive is more likely to go from 'all cells functioning' to 'many cells dead' in a relatively short amount of time due to wear levelling, whereas without it the mode of failure would be a more gradual reduction in functioning cells.

      Practically speaking, however, these things support an awful lot of read/write cycles. On the order of a million or more, according to the data I could find. Unfortunately the Intel datasheet for the drive mentioned in the summary doesn't actually include write-cycle data, though.

      A quick and dirty calculation (not taking into account block size, etc.) for drive lifetime is simply (capacity)*(write cycles)/(write speed).

      Imagine a drive with no wear levelling. Say you have a 1GB file, the entirety of which is being continually rewritten to the same 1GB section of the drive. A million read/write cycles means you need to write approximately 1,000,000 GB (that's 1000TB!) to that 1GB section of drive to kill it. Again, somewhat inaccurate in the real world, but good enough for a back of the envelope estimate. Allowing a fairly generous write speed of 100MB/s, writing to that same 1GB area of disk 24/7, would burn it out in around 115 days - about 4 months. In that time, remember, you'll have generated 1000TB of data - that's certainly not insignificant, even for fairly major applications, but it could be done, and you're left with a drive that's got 1GB less capacity than it started with.

      Now consider the same case with wear levelling. Assume for the sake of simplicity it functions perfectly, and ignore block size. On an 80GB drive, continually overwriting that same 1GB file, it will simply cycle through the entire 80GB capacity of the drive repeatedly rather than just hammering the same 1GB section. This means that you suddenly increased the effective lifespan by a factor of 80 (again, not entirely real-world due to the fact that the drive would normally have data filling some of the rest of that 80GB, but sufficient to get the point across). You're now looking at over 25 years of continuous writing, by which time you will have generated 80 petabytes of data.
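
      The back-of-the-envelope formula, (capacity)*(write cycles)/(write speed), as a Python sketch. It uses decimal units (1 GB = 1000 MB) and the figures from this comment: a million cycles and 100MB/s. Block size and other real-world effects are ignored, as above.

```python
def lifetime_seconds(capacity_gb, write_cycles, write_speed_mb_s):
    """Seconds of continuous writing before the region is worn out."""
    total_mb = capacity_gb * 1000 * write_cycles
    return total_mb / write_speed_mb_s

CYCLES = 1_000_000
SPEED_MB_S = 100

# No wear levelling: only the 1GB region takes the writes.
days_no_leveling = lifetime_seconds(1, CYCLES, SPEED_MB_S) / 86400

# Perfect wear levelling: the writes rotate over all 80GB.
years_leveled = lifetime_seconds(80, CYCLES, SPEED_MB_S) / 86400 / 365.25

# Total data written in the levelled case, in petabytes (1 PB = 1,000,000 GB).
petabytes_written = 80 * CYCLES / 1_000_000

print(round(days_no_leveling), "days,", round(years_leveled, 1), "years,",
      petabytes_written, "PB")
```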

      That's why wear levelling is a good thing. Even on a disk that's completely full (not something that happens particularly often, but still worth thinking about) the drive itself has some built in excess capacity to use for wear reduction.

    5. Re:Don't SSD's have a pre-set number of writes? by DJRumpy · · Score: 1

      Excellent. Thanks ;)

    6. Re:Don't SSD's have a pre-set number of writes? by Achromatic1978 · · Score: 1

      Fragmentation is a non-issue on an SSD. Access time is equal, no matter where the blocks are, be they contiguous or disparate.

    7. Re:Don't SSD's have a pre-set number of writes? by Anonymous Coward · · Score: 0

      That leads to another few questions. What about data that is stored in a location that was written to for a final time? (say the 10th erase/write cycle in that block using the analogy above). The data is still retrievable right? It simply can't be written to again?

      I wonder if the wear leveling algorithms will also store files that are read-only in nature in blocks that are close to failure?

      Last but not least, do the OS's turn off the 'last accessed' property that is commonly used across most OS's? Seems that would lead to much more rapid failure.

      It's very hard to determine exactly when a location is near going bad. In most cases it is best to just try a write and see if it 'sticks'. Wear-leveling will mark a bad location (transparently to the file-system) and use another if the write fails.

      I'm not aware of a wear-level scheme that dynamically maps read-only data to locations with a high usage count.

      You're right - the 'last accessed' property increases the number of writes to the SSD. Most file-systems provide a way to turn it off. Some flash-aware file-systems don't even provide it.
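
      On Linux the usual way to turn it off is the noatime mount option. A minimal Python sketch for checking what a mount point currently uses, assuming the usual /proc/mounts layout (device, mount point, type, comma-separated options). The function name and the sample text are made up for illustration.

```python
def atime_mode(mounts_text, mount_point):
    """Return the atime-related option in effect for mount_point,
    or None if the mount point is not listed."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
            for mode in ("noatime", "relatime", "strictatime"):
                if mode in options:
                    return mode
            return "atime (no explicit option)"
    return None

# Illustrative sample, not output from a real system.
SAMPLE = """/dev/sda1 / ext4 rw,noatime,errors=remount-ro 0 0
/dev/sda2 /home ext4 rw,relatime 0 0
"""

print(atime_mode(SAMPLE, "/"))
print(atime_mode(SAMPLE, "/home"))
```

      On a live system you would pass in the contents of /proc/mounts instead of the sample string.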

    8. Re:Don't SSD's have a pre-set number of writes? by tytso · · Score: 2, Informative

      Flash using MLC cells has 10,000 write cycles; flash using SLC cells has 100,000 write cycles, and is much faster from a write perspective. The key is write amplification; if you have a flash device with a 128k erase block size, in the worst case, assuming the dumbest possible SSD controller, each 4k singleton write might require erasing and rewriting a 128k erase block. In that case, you would have a write amplification factor of 32. Intel claims that with their advanced LBA redirection table technology, they have a write amplification of 1.1, with a wear-leveling overhead of 1.4. So if these numbers are to be believed, on average, over time, a 4k write might actually cost a little over 6k of flash write. That is astonishingly good.

      The X25-M uses MLC technology, and is rated for a life of 5 years writing 100GB a day. In fact, if you have 80GB worth of flash, and you write 100GB a day, with a write amplification and wear-leveling overhead of 1.1 and 1.4, respectively, then over 5 years you will have used approximately 3200 write cycles. Given that MLC technology is good for 10,000 write cycles, that means Intel's specification has a factor of 3 safety margin built into it. (Or put another way, the claimed write amplification factors could be three times worse and they would still meet their 100GB/day, 5 year specification.)
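
      For anyone who wants to redo this arithmetic, a small Python sketch. The inputs are the figures quoted above; the function name is made up, and a 365-day year is assumed.

```python
def erase_cycles_used(gb_per_day, years, capacity_gb,
                      write_amp=1.0, wear_overhead=1.0):
    """Average erase cycles per cell after the given workload."""
    host_gb = gb_per_day * 365 * years
    flash_gb = host_gb * write_amp * wear_overhead
    return flash_gb / capacity_gb

worst_case_amp = 128 / 4          # 4k write forcing a 128k erase block rewrite
cost_of_4k_write = 4 * 1.1 * 1.4  # kilobytes of flash written per 4k host write

both_factors = erase_cycles_used(100, 5, 80, write_amp=1.1, wear_overhead=1.4)
overhead_only = erase_cycles_used(100, 5, 80, wear_overhead=1.4)
margin = 10_000 / both_factors    # against the 10,000-cycle MLC figure

print(worst_case_amp, round(cost_of_4k_write, 2))
print(round(both_factors), round(overhead_only), round(margin, 1))
```

      Multiplying in both factors gives roughly 3,500 cycles; applying only the 1.4 factor gives about 3,200. Either way the margin against 10,000 cycles is close to a factor of 3.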

      And 100GB a day is a lot. Based on my personal usage of web browsing, e-mail and kernel development (multiple kernel compiles a day), I tend to average between 6 and 10GB a day. When Intel surveyed system integrators (i.e., like Dell, HP, et al.), the number they came up with as the maximum amount a "reasonable" user would tend to write in a day was 20GB. 100GB is 10 times my maximum observed write, and 5 times the maximum estimated amount that a typical user might write in a day.

      For those of you who are Linux users, you can measure this number yourselves. Just use the iostat command, which will return the number of 512 byte sectors written since the system was booted. Take that number, and divide it by 2097152 (2*1024*1024) to get gigabytes. Then take that number and divide it by the number of days since your system was booted to get your GB/day figure.
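
      The same calculation as a Python sketch. It assumes the sectors-written counter is the tenth whitespace-separated column of a /proc/diskstats line (after major, minor and device name) and that it counts 512-byte sectors; check this against your kernel's documentation. The sample line and the function names are made up for illustration.

```python
SECTORS_PER_GIB = 2 * 1024 * 1024  # 2097152 sectors of 512 bytes

def sectors_written_from_diskstats(text, device):
    """Pull the sectors-written counter for `device` out of diskstats text."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])
    raise KeyError(device)

def gb_written_per_day(sectors_written, uptime_seconds):
    """Average gigabytes written per day since boot."""
    gigabytes = sectors_written / SECTORS_PER_GIB
    days = uptime_seconds / 86400
    return gigabytes / days

# Illustrative sample, not output from a real system.
SAMPLE = "   8       0 sda 1000 0 8000 500 2000 0 41943040 900 0 700 1400"

sectors = sectors_written_from_diskstats(SAMPLE, "sda")
print(gb_written_per_day(sectors, uptime_seconds=2 * 86400))
```

      On a live system the text would come from /proc/diskstats and the uptime from the first number in /proc/uptime.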

    9. Re:Don't SSD's have a pre-set number of writes? by DJRumpy · · Score: 2, Insightful

      TFA would disagree with you, as it states that write performance does indeed drop, sometimes to half the original performance or worse, due to the wear leveling and write combining techniques used. You're talking read access times, where we're talking write/erase access times.

  29. Re:SSD's should have no problem with fragmentation by diskis · · Score: 1

    Yes, but for SSD's the blocks are larger - problems when essentially all software is optimized for smaller blocks.

  30. What?! by Anonymous Coward · · Score: 0

    You mean after all the hoopla the Linux people made about the Anticipatory Scheduler, the code is nothing more than:

    wait_awhile()

    What a ripoff.

  31. Re:Ironically I was just going out to buy a small by Anonymous Coward · · Score: 0

    "coincidentally", not "ironically".

  32. chs no longer used by Anonymous Coward · · Score: 1, Informative

    i haven't yet found a sata device
    (even doms) that require chs addressing.

    clearly it was a mistake to use hardware
    quirks to address sectors, but then again,
    ata became a de facto standard before
    anyone realized it might become one.

    1. Re:chs no longer used by Hal_Porter · · Score: 2, Informative

      CHS disappeared ages ago. The maximum device supported was ~8 Gbyte (1023 cylinders * 255 heads * 63 sectors * 512 bytes)
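
      The multiplication spelled out in Python, using the geometry limits quoted above:

```python
cylinders, heads, sectors_per_track, bytes_per_sector = 1023, 255, 63, 512

# Largest capacity addressable with this cylinder/head/sector geometry.
chs_limit_bytes = cylinders * heads * sectors_per_track * bytes_per_sector

print(chs_limit_bytes)                     # bytes
print(round(chs_limit_bytes / 10**9, 2))   # decimal gigabytes
print(round(chs_limit_bytes / 2**30, 2))   # binary gigabytes (GiB)
```

      That is about 8.4 GB in decimal units, or a little under 8 GiB.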

      --
      echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;
  33. Another file strategy - file segregation by f(x) by spineboy · · Score: 4, Insightful

    Why not functionally group files to decrease or eliminate fragmentation? Or maybe this is already done.
    For example - I have a large collection of MP3 files. They essentially do not change, as in I don't edit them, and rarely erase them. The file system could look at the type of file (mp3, vs doc) and place it accordingly. It could also look at the last change in the file and place it in a certain area. Older unchanged files are placed in a tightly placed/packed file area that is optimized and not fragmented.

    --
    ..........FULL STOP.
  34. The responses prove... by Anonymous Coward · · Score: 0

    that there's way too much effort and so much overhead for so little gain and the fatal problem of SSDs having a limited lifespan is just too much to overcome.

    SSDs are awesome as a simple storage medium for stuff you don't change around much, i.e. a replacement for floppies/optical media/etc. They are NOT, however, a replacement for hard drives, and it's sad that people continue to push them in that direction when it is utterly futile and, frankly, stupid to do so.

    AC because this is a harsh truth that no one wants to admit and therefore would be modded down to oblivion by mods that believe it's a troll.

  35. "shock-resistant" by Anonymous Coward · · Score: 1, Funny

    Sure. There are *lots* of considerations beyond speed to want SSDs

    And SSD drives are also shock-resistant.

    The drives will be shocked when they see what I have in my pr0n collection.

  36. take a look at zfs by Anonymous Coward · · Score: 0

    Seems to me that Sun's zfs filesystem is ready to use the ssd storage. The copy-on-write strategy would seem to avoid the hot spots as zfs picks new blocks from the free pool rather than rewriting the same block.

    1. Re:take a look at zfs by tytso · · Score: 1

      Seems to me that Sun's zfs filesystem is ready to use the ssd storage. The copy-on-write strategy would seem to avoid the hot spots as zfs picks new blocks from the free pool rather than rewriting the same block.

      Actually, given the X25-M's lack of TRIM support, using a log-structured filesystem, a write-anywhere filesystem, or a copy-on-write type system is actually a really bad use of the X25-M, since the X25-M will think the entire disk is in use. The X25-M is actually implemented to optimize for filesystems that reuse blocks as much as possible, since it is internally doing the equivalent of a log-structured filesystem to do wear leveling. TRIM support will obviously help, but for ZFS, the X25-M is probably not a good choice. A cheaper flash drive which doesn't try to be smart about wear leveling would actually be better for ZFS.

    2. Re:take a look at zfs by erudified · · Score: 1

      Actually, given the X25-M's lack of TRIM support, using a log-structured filesystem, a write-anywhere filesystem, or a copy-on-write type system is actually a really bad use of the X25-M, since the X25-M will think the entire disk is in use. The X25-M is actually implemented to optimize for filesystems that reuse blocks as much as possible, since it is internally doing the equivalent of a log-structured filesystem to do wear leveling.

      Interesting. My understanding is that HFS+ meets these requirements pretty well - would the X25-M be a good choice for a Macintosh system?

    3. Re:take a look at zfs by Henk+Poley · · Score: 1

      You just need a background task that zeros blocks used by deleted files. EasyCo's MFT implements such a log filesystem as a generic block device for x86 Linux and Windows.

    4. Re:take a look at zfs by tytso · · Score: 1

      It's not obvious to me that X25-M treats a block that has been zero'ed out as an "unallocated block". It could do this, but it's not at all guaranteed that it does this. Do you know for certain (via an Intel specification sheet) that writing all ZERO's is the equivalent of an ATA TRIM?

    5. Re:take a look at zfs by Henk+Poley · · Score: 1

      It is not the equivalent, because more data needs to be transferred. But the effect is the same, according to Intel. Also works for most SSDs.

    6. Re:take a look at zfs by tytso · · Score: 1

      Can you give me a URL or citation from someone official at Intel who has said this? As near as I can tell, Intel has been very tight-lipped about what the X25-M does internally.

    7. Re:take a look at zfs by Henk+Poley · · Score: 1

      I think I slightly misinterpreted this article:

      http://www.pcper.com/article.php?aid=669&type=expert&pid=5

      Anyways, writing zeros, or writing something else sequentially should essentially be the same. More dumb Flash based SSDs actually respond to writing zeros, so I just remembered "here you can do the same zeroing trick".

    8. Re:take a look at zfs by tytso · · Score: 1

      Anyways, writing zeros, or writing something else sequentially should essentially be the same.

      No, writing sequentially is not the same as an ATA TRIM command, since the X25-M can't reuse the blocks for real data. It might (or might not) help the internal fragmentation of the X25-M's internal LBA redirection table --- but given that the PC Perspective article pointed out that when things got bad, even a complete write pass across the entire disk was not sufficient to restore performance, I doubt it.

      This makes sense, actually; without an ATA trim command, if you write the entire disk, the X25-M won't have much in the way of spare room in order for it to do its garbage collection/defragmentation operation. All it will have is the difference between 80 (real) GB (or GiB's for people who like that notation) and 80 (hd marketing) GB's. And apparently that is not enough.
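
      To put a rough number on that difference, a Python sketch. It assumes the drive has exactly 80 GiB of raw flash and exposes exactly 80 decimal GB; that is only my reading of the marketing numbers, not something from an Intel data sheet.

```python
raw_flash_bytes = 80 * 2**30   # 80 "real" (binary) gigabytes
exposed_bytes = 80 * 10**9     # 80 "hd marketing" (decimal) gigabytes

# Space the drive could keep for itself under this assumption.
spare_bytes = raw_flash_bytes - exposed_bytes
spare_fraction = spare_bytes / raw_flash_bytes

print(round(spare_bytes / 10**9, 1), "GB spare")
print(round(100 * spare_fraction, 1), "% of the raw flash")
```

      Under that assumption the drive has a bit under 6 GB, or roughly 7% of the raw flash, to work with once every visible block has been written.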

      I've had some people suggest that reserving a partition with a few gig's and never using it helps, since that provides some extra room for the X25-M to recover; but I don't have anything authoritative.

      But back to the original point, what we really need is a way to tell the disk, "we don't care about the contents of the blocks any more". It *might* be that writing some magic pattern, whether all zero's or all one's --- and in fact, all one's makes more sense since an erased flash memory cell returns '1', not '0'. But the key question is whether or not the SSD's firmware treats this as "ok to reuse" or not. And for that we need a definitive answer from Intel.

  37. One True File System by Anonymous Coward · · Score: 0

    Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem.

    I thought ZFS was.

    1. Re:One True File System by ggendel · · Score: 2, Insightful

      Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem.

      I thought ZFS was.

      And ZFS has native support for SSD as L2ARC ( http://www.c0t0d0s0.org/media/presentations/ssd.pdf ). I have nothing but praise for ZFS. Simple to manage, reliable, fast. With native CIFS instead of user-space Samba, I've seen orders of magnitude better performance from Windows machines when doing networked file access. Gary

  38. Install Windows by Anonymous Coward · · Score: 0

    get out of that faggot o/s while you can. it's nothing but a bunch of dog shit. i hear it's big among dick smoking faggots.

    FAGGOTS SHOULD ALL DIE!!!!!

  39. Re:SSD's should have no problem with fragmentation by jimmyhat3939 · · Score: 1

    Good analysis. The statistics I've read indicate that SSD's don't perform all that much better than hard drives in real-world scenarios. I think this is part of the reason for that performance. On the other hand, they do use less energy, which is a clear positive for a laptop.

    --
    Free Conference Call -- No Spam, High Quality
  40. Re:SSD's should have no problem with fragmentation by vux984 · · Score: 1

    On the other hand, they do use less energy, which is a clear positive for a laptop.

    And thus they are cooler. A clear positive for any system, but especially a laptop.
    They are also silent and don't vibrate.
    They are also, from what I understand, more reliable.

    I'm seriously considering flash drives for my desktop PC... they just need one more capacity jump and I think they'll be worth it. $400 for 128GB is a touch small.. but I'll go for it at $400 for 256GB. On my main PC I'm only using 236GB of my 500GB drive, and I could easily move 150GB of that onto my 1TB external e-sata drives that I turn on when I need them.

  41. Thinkpad X300 came with defrag tools by Britz · · Score: 2, Insightful

    I purchased an X300 Thinkpad for the company this week and took a close look at it. I thought expensive business notebooks come without crapware. And I was sure the X300 would be optimized. But they had defrags scheduled! I always thought defrag is a no no for ssds. Now I am not sure anymore. I deinstalled it first. But who knows?

  42. ZFS L2ARC by Anonymous Coward · · Score: 1, Informative

    I think Theodore should look into technologies like the ZFS L2ARC (just look at using SSD as an additional cache to supplement disks based on rotating rust). The L2ARC stores recently evicted pages from the primary ARC (the Adaptive Replacement Cache) of ZFS on SSD. From my view this is a more reasonable usage of SSD than just as another primary storage media.

    I recently wrote an article about the mechanism of ARC and L2ARC in conjunction with SSD in my blog, but i don't want to slashdot my site ;)

    1. Re:ZFS L2ARC by Wesley+Felter · · Score: 1

      L2ARC is interesting for servers, but on a desktop or laptop you can just put all your data on flash.

    2. Re:ZFS L2ARC by tytso · · Score: 1

      I'm familiar with the L2ARC idea. I think time will tell whether or not adding an extra layer of cache between the memory and commodity SATA hard drive really makes sense or not. For laptop use where we care about the power and shock resistance attributes of SSD's, it makes sense to pay a price premium for SSD's. However, it's not clear that SSD's will indeed become cheap enough, and even if they do, historically the cache hierarchy has 3 orders of magnitude between main memory and disks, and over the last 3 decades, there have been other technologies that have been cheaper per gigabyte than main memory, but faster at a given price level than hard drives, and for one reason or another, they have fallen into what Dr. Steve Hetzler, an IBM Fellow from the IBM Almaden Research Center, has called "the dead zone".

      I first heard this argument at the December 2008 IDEMA Symposium, where I was giving a talk as the new CTO of the Linux Foundation, and his presentation was well worth the effort I made to head out to the Bay Area to give the talk.

      It turns out that Dr. Steve Hetzler is apparently going to be giving the same talk in three days at the Santa Clara Valley Chapter of the IEEE Magnetics Society, which will be held at the Western Digital facility in San Jose on February 24th. A brief talk description and map to the facility can be found here. It's an extremely interesting, entertaining, and thought-provoking talk, and some folks that have seen the slides of Dr. Hetzler's talk have taken an extreme exception to them. However, he makes some very powerful arguments both from the supply side (specifically, the capital cost of the Silicon Fabs to replace even 10% of the HDD market is a very large number), and the demand side. For those of you who are in the Bay Area, and who are interested in storage issues, I'd strongly encourage you to listen to his talk and make your own judgements. The web site states that no RSVP's are required, and I don't think you have to be an IEEE member to attend.

    3. Re:ZFS L2ARC by Anonymous Coward · · Score: 0

      I'm not sure about that. Do you really want all your mp3s or digicam photos on flash? It's reasonable to assume that the user doesn't want to think about the perfect placement of their photos either. Thus an intelligent algorithm would be useful instead of user interaction. I assume this is one of the reasons why Apple looked into ZFS.

    4. Re:ZFS L2ARC by Anonymous Coward · · Score: 1, Interesting

      I assume SSD-augmented rotating rust is the way to go, thus technologies like L2ARC will be more widespread in the future. As you correctly state ... you need a lot of flash storage to substitute for all this magnetic storage, at least as long as there aren't further breakthroughs like even octa- or hexa-bit MLC with SLC reliability.

      When we just use flash to accelerate the working set instead of storing seldom used data on SSD, this would make more sense. I don't like this attitude of big storage vendors simply selling flash drives instead of rotating rust drives without giving a good way to manage this.

      BTW: I want both in my notebook. A flash drive for my working set (current emails, current documents and so on), and disk drives for my long term storage and mass data like video, images and music ... but I don't want to manage it.

  43. Re:Ironically I was just going out to buy a small by Cassini2 · · Score: 1

    If I mount /home on a separate drive, (good to do when upgrading) the rest of the Linux file system fits nicely on a small SSD.

    I would move /tmp to either a RAM disk or a hard drive. There is no point in having tmp files using up the lifespan of your SSD, especially after you just moved /home to extend its life. Also, you could move some of the stuff in /var to a hard drive or ramdisk. Good candidates might be /var/tmp and /var/log. Alternatively, you could just move the entire /var hierarchy to a hard drive.

  44. Organizing by partition by steveha · · Score: 3, Informative

    Why not functionally group files to decrease or eliminate fragmentation? Or maybe this is already done.

    In a Linux system, this is easily done, but few people bother.

    Most of the write activity in Linux is in /tmp, and also in /var (for example, log files live in /var/log). User files go in /home.

    So, you can use different partitions, each with its own file system, for /, /tmp, /home, and /var.

    The major problem with this is that, if you guess wrong about how big a partition should be, it's a pain to resize things. So my usual thing is just to put /tmp on its own partition, and have a separate partition for / and for /home.

    The /tmp partition and swap partition are put at the beginning of the disc, in hopes that seek penalties might be a little lower there. Then / has a generous amount of space, and /home has everything left over.

    When a *NIX system runs out of disk space in /tmp, Very Bad Things happen. Far too much software was written in C by people who didn't bother to check error codes; things like disk writes don't fail often, but when /tmp is 100% full, every write fails. A system may act oddly when /tmp is full, without actually crashing or giving you a warning. So, the moral of the story is: disk is cheap, so if you give /tmp its own partition, make it pretty big; I usually use 4 GB now. However, if you run out of disk space in /var, it is not quite as serious. Your system logs stop logging. And, many databases are in /var so you may not be able to insert into your database anymore.
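    One cheap way to keep an eye on this is to parse the output of df -P. The following is a minimal sketch, not a monitoring tool; the function name and the 90% threshold are made up for illustration.

```shell
# usage_pct: read `df -P` output on stdin and print the Capacity column
# (5th field) of the first filesystem listed, without the % sign.
usage_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example use (the 90% threshold is an arbitrary choice):
#   pct=$(df -P /tmp | usage_pct)
#   [ "$pct" -ge 90 ] && echo "/tmp is ${pct}% full" >&2
```

    Reading from stdin keeps the parsing separate from the df call, so the same function works for /var or any other mount point.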

    The main Ubuntu installer is fast, because it wipes out the / partition and puts in all new stuff. So, if you have separate partitions for / and /home, life is good: you just let the installer wipe /, and your /home is safely untouched. It's annoying when you have /home as just a subdirectory on / and you want to run the installer. But, by default, the Ubuntu installer will make one big partition for everything; if you want to organize by partitions, you will need to set things up by hand.

    steveha

    --
    lf(1): it's like ls(1) but sorts filenames by extension, tersely
    1. Re:Organizing by partition by Anonymous Coward · · Score: 0

      When a *NIX system runs out of disk space in /tmp, Very Bad Things happen.

      One nice thing about having /tmp and /var on different partitions: if the logs fill up the partition where /var is, it's no big deal.

      If you have one fracking huge partition with everything, and you nearly fill it with MP3s or whatever, and then log files fill it the rest of the way... then /tmp is full too and the Very Bad Things happen. This can be mysterious.

      But it probably happened more often back in the old days when hard disks were tiny.

    2. Re:Organizing by partition by Gothmolly · · Score: 1

      1st, put /tmp on RAM, and 2nd, use a modern filesystem and LVM so that you can extend/shrink your partitions dynamically. This ain't 1998.

      --
      I want to delete my account but Slashdot doesn't allow it.
    3. Re:Organizing by partition by Anonymous Coward · · Score: 0

      1st, put /tmp on RAM

      Is it really worth it? The Linux file I/O is pretty fast anyway and caches your I/O in RAM anyway. Has anyone benchmarked a system with /tmp in RAM and on disk? Is there a HOWTO for this?

      I didn't find any actual benchmarks, but I did find a pretty good step-by-step guide:

      http://opendevice.blogspot.com/2007/03/create-linux-ram-disk-to-use-with.html

      But my guess is that you won't really see a worthwhile speed improvement for running typical desktop applications this way. (Now, a database server might be another thing entirely...)
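      For anyone who wants to try it, here is a rough sketch using tmpfs rather than the /dev/ram approach from the linked guide (tmpfs is a different technique: it lives in the page cache and can be swapped out). The 512m size and the helper name fs_type are illustrative assumptions, not recommendations.

```shell
# One-off, as root (size is an arbitrary example value):
#   mount -t tmpfs -o size=512m,mode=1777 tmpfs /tmp
# Persistent, as a line in /etc/fstab:
#   tmpfs  /tmp  tmpfs  size=512m,mode=1777  0  0

# fs_type MOUNTPOINT [MOUNTS_FILE]
# Print the filesystem type of the last mount listed for MOUNTPOINT,
# reading /proc/mounts unless another file is given.
fs_type() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }' "${2:-/proc/mounts}"
}

# Example use: prints "tmpfs" once /tmp is on RAM.
#   fs_type /tmp
```

      This only shows how to set it up and confirm it took effect; whether it is faster for your workload is still something you would have to benchmark.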

      2nd, use a modern filesystem and LVM so that you can extend/shrink your partitions dynamically.

      I'll do that when the Ubuntu installer GUI makes it super-simple, and not before. I just want to set up my computers and use them. I'm typing this on a computer that is about seven years old, and I have never in that time wished I could conveniently resize a partition on it. Why should I jump through a whole bunch of hoops and figure out LVM? Where's the payoff for me?

      And that goes double now that hard disks are so freaking huge. This old computer has only a 30 GB hard disk. A "small" hard disk these days is 250 GB. Just throw 30 GB to the / partition and have a /home with over 200 GB... I won't need to resize those partitions anytime soon.

      LVM was probably valuable when hard disks were really small, but this ain't 1998.

    4. Re:Organizing by partition by WuphonsReach · · Score: 1

      The major problem with this is that, if you guess wrong about how big a partition should be, it's a pain to resize things. So my usual thing is just to put /tmp on its own partition, and have a separate partition for / and for /home.

      That's only a pain if you don't use LVM. Most of the Linux boxes that I set up have basically (4) partitions:

      /boot - 256MB is overly generous. Useful to have on its own partition so you can keep it mounted as read-only.

      / - Anything from 8GB up to 32GB, and I can always shove any large trees elsewhere into LVM areas.

      swap - Anything from 2GB up to 16GB. I should probably move this to LVM.

      LVM area - this partition is where all of the other file systems live.

      With LVM, it's about a 5 minute process to take a file system (at least for ext3) offline, enlarge the logical volume (LV), do the extend and fsck, and then bring the file system back online. LVM takes all of the guesswork out of initial sizing guesses. In fact, you're best to create numerous small file systems, then grow them as needed later.
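      The offline grow described above could look roughly like this. It is a sketch under assumptions: the volume group, LV name, mount point and size in the example are hypothetical, and the run wrapper with its DRY_RUN switch is added here only so the steps can be previewed without touching anything.

```shell
# run: execute a command, or just print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# grow_lv LV_DEVICE MOUNTPOINT EXTRA_SIZE
# Take the file system offline, enlarge the LV, fsck, grow the
# ext3 file system to fill the LV, and bring it back online.
grow_lv() {
    lv=$1 mnt=$2 extra=$3
    run umount "$mnt" &&
    run lvextend -L "+$extra" "$lv" &&
    run e2fsck -f "$lv" &&
    run resize2fs "$lv" &&
    run mount "$lv" "$mnt"
}

# Preview only (hypothetical names):
#   DRY_RUN=1 grow_lv /dev/vg0/home /home 2G
```

      Each step is chained with &&, so a failed fsck stops the sequence before resize2fs touches the file system.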

      --
      Wolde you bothe eate your cake, and have your cake?
    5. Re:Organizing by partition by orasio · · Score: 1

      Just wanted to point out that Ubuntu did wipe my disk, but not my users directory.

  45. Re:SSD's should have no problem with fragmentation by MichaelSmith · · Score: 1

    SSDs have a different feel to them. The time to load a file is more consistent on my eeePC than on other laptops I own which have rotating disks.

  46. Mod Parent UP by navyjeff · · Score: 1

    My kingdom for a mod point today...

  47. Solid State Disks? by Anonymous Coward · · Score: 0

    WTF! SSD means Solid State Drive. Even a 5-year old can tell there's no disk in there. What's up with these retarded self-submitted articles? They're rarely written by someone competent.

  48. Re:SSD's should have no problem with fragmentation by Anonymous Coward · · Score: 0

    I have read this as well, but you run into difficulty when assigning a specific partition for swap space, because the drive limits its distribution to the assigned space only.

    So, if 512MB of that 80GB is defined as swap, writes will stay within the range of that swap partition, burning it out much faster than the rest of the drive.

    A solution to this is to put swap in a large file on your primary partition (dd 512MB worth of zeroes, then enable it as swap), but then you are routing all your swap traffic through XFS/ext4 or whatever, adding significant overhead.
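    The swap-file route could be sketched as follows. The path /swapfile and the run/DRY_RUN wrapper are assumptions added for illustration; the four commands are the usual dd, chmod, mkswap and swapon sequence.

```shell
# run: execute a command, or just print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# make_swapfile PATH SIZE_MIB
# Create a zero-filled file, restrict its permissions, format it as
# swap and enable it. Needs root when run for real.
make_swapfile() {
    path=$1 mib=$2
    run dd if=/dev/zero of="$path" bs=1M count="$mib" &&
    run chmod 600 "$path" &&
    run mkswap "$path" &&
    run swapon "$path"
}

# Preview only (hypothetical path):
#   DRY_RUN=1 make_swapfile /swapfile 512
```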

  49. Re:SSD's should have no problem with fragmentation by MooUK · · Score: 1

    If you'll pay $400 for 256*MB*, I think you've got a little too much money and should give me some....

  50. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by Anonymous Coward · · Score: 0

    that was beautiful man, just beautiful..[sniff]

  51. Re:Another file strategy - file segregation by f(x by harry666t · · Score: 3, Interesting

    That was my idea when I've proposed an "object storage system" here on /. a few months ago: associate type and metadata with every file, making them more "object-like" (as in object-oriented programming). The storage system would know the behaviour of each object (whether it is likely to grow, or more likely to be modified in place, or probably not modified at all, etc), and would choose the most efficient way of storing every particular kind of data. I've also proposed separate namespaces for each process, capability-based security, dropping paths in favour of non-hierarchical tags, and a few other "revolutionary" ideas that all had only one downside: nobody's going to break backwards compatibility, especially while the current system still "just works".

  52. Re:Ironically I was just going out to buy a small by earthforce_1 · · Score: 1

    Good point, I will have to think about that...

    Well, I fired up Ubuntu with the new configuration and I wasn't disappointed - WOW!

    Booting is lightning quick - I am still doing a lot of downloads so I haven't had a chance to run some real performance tests, but from what I have seen so far the results are impressive.

    --
    My rights don't need management.
  53. tytso by r00t · · Score: 2, Informative

    "tytso" is Theodore Ts'o.

    He and Remy Card wrote ext2. He and Stephen Tweedie wrote ext3. He and Mingming Cao wrote ext4.

    He maintains the filesystem repair tool (e2fsck) and resizing tool for those filesystems.

    He also created the world's first /dev/random device, maintained the tsx-11.mit.edu Linux archive site for many years, and wrote a chunk of Kerberos. He's been the technical chairman for many Linux-related conferences. He pretty much runs the kernel summit.

    He's certainly not a kid. I think he's about to turn 40.

    Really, Intel ought to give tytso piles of free SSD hardware before it goes on sale. This would help Intel by encouraging tytso to optimize Linux for Intel's SSD hardware.

  54. destruction is fun too by r00t · · Score: 1

    So many choices!

    belt sander

    nitric acid

    cutting torch

    charcoal and a blower

    chip wired into an AC wall socket

    thermite

    repeated use as a model rocket blast deflector

    drill press

    1. Re:destruction is fun too by Cassini2 · · Score: 2, Interesting

      So many choices!

      This could be fun. Here are some more suggestions:

      - Welder - The little chips don't last long against a good arc welder.
      - 600 VAC - Why stop at a wall outlet?
      - Tesla Coil - 200 kV is better than 600 VAC
      - Lightning Rod. Why stop at 200 kV?
      - Oxy-acetylene Torch - higher temperatures
      - Plasma Cutter - even higher temperatures
      - Nd:YAG Laser - Etch your name into the remains of the flash chip.
      - Chew Toy for Dog - Don't underestimate some of those canines, although USB keys might not be good for them.
      - Log-Splitting Practice. How good are you at aiming that Axe?
      - Place USB in Cement Footings of a building. Do the mob thing.
      - Rock crusher
      - Grinding Machine
      - Wood chipper / pulper
      - Cement kiln
      - Blast Furnace
      - Industrial Press - Terminator Style!

      I'm pretty sure that some of these machines can destroy industrial quantities of USB keys, with little difficulty. Cement kilns and rock crushers can destroy just about anything. It would be interesting to see the resulting crushed rock in a piece of cement though. It would be colorful.

  55. Raid SSD by DrugCheese · · Score: 1

    I just recently put two 128GB SSDs in a RAID 0 set. I set up a RAM drive for use as /tmp and have /var going to another partition on a standard SATA hard drive. I changed fstab to mount the drives noatime so it doesn't record file access times. I also made some other tweaks, pointing any programs or services that write logs or use a temporary cache somewhere to use /tmp. It's a software RAID, so I'm using /dev/mapper/-- as the device and I'm not exactly sure how to set the scheduler, although I have set a line in GRUB that I think does it.

    Ubuntu 64bit boots up in about 10 seconds.

    --
    *DrugCheese rants*
    1. Re:Raid SSD by Anonymous Coward · · Score: 0

      Try using the noop scheduler.

      As a kernel flag:
      elevator=noop

      Or to enable when the system is already booted:
      echo noop > /sys/block/{DEVICE-NAME}/queue/scheduler
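      To check which scheduler is actually in effect, read the same sysfs file back: the kernel lists the available schedulers and puts the active one in square brackets. A small sketch follows; the helper name and the device sda are hypothetical.

```shell
# active_scheduler: read a scheduler line such as "noop deadline [cfq]"
# on stdin and print the name inside the square brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example use (sda is a placeholder; the set of names depends on the kernel):
#   active_scheduler < /sys/block/sda/queue/scheduler
```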

  56. Re:SSD's should have no problem with fragmentation by Anonymous Coward · · Score: 0

    Fragmentation is a *DIFFERENT* issue in the world of SSD.

    It now becomes a matter of 'number of commands issued'.

    Let's assume a completely fragmented file of size X. Let's say size X is a multiple of one cluster (of size Y). Let's also say the controller does no read-ahead into the cache.

    So to read all of the file I would need to issue X/Y reads. In the world of spinning disks there are several costs here. First, the bytes sent over the cable to the controller to ask for each cluster (time A). Second, the time waiting for the controller to retrieve the bytes from the disc (time B). Third, the time for the bytes to be sent across the bus (time C) plus the overhead of the packets (time D). Finally, the OS has to reassemble the bytes back into a coherent file (time E).

    With SSD, time B is no longer the dominating factor. Times A, D, and E become the dominating factors, and they are directly related to how fragmented the file is. Time B can vary wildly on a rotating disk depending on where the clusters are. On an SSD it is fairly constant and very low (but not 0).

    So an unfragmented file could issue 1 command to say get N clusters and get a giant blast back. Or you could have a totally fragmented file and get (X/Y)*2 packets back and forth.
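    To put numbers on that worst case, here is a toy model of the arithmetic only. It assumes one request packet and one response packet per command and ignores everything else; the function names and the 1 MiB / 4 KiB sizes are made up for the example.

```shell
# fragmented_reads FILE_SIZE CLUSTER_SIZE
# Number of read commands for a fully fragmented file (X/Y above).
fragmented_reads() { echo $(( $1 / $2 )); }

# round_trip_packets COMMANDS
# One request plus one response per command.
round_trip_packets() { echo $(( 2 * $1 )); }

# Example: a 1 MiB file in 4 KiB clusters.
#   fragmented_reads 1048576 4096    -> 256 commands
#   round_trip_packets 256           -> 512 packets
#   round_trip_packets 1             -> 2 packets for the contiguous case
```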

    Now that is a worst case. But it is not something to just be 'ignored' because 'it's now faster, so don't worry about it'. The proper answer is 'yes it is faster, how much better can we do'. The SSD is still WAY slower than memory and WAY slower than the CPU.

    With SSDs, the order of files is no longer as important as it is with current spinning disks. You also want to reduce fragmentation of 'empty' space. Why do that? To reduce the possibility of fragmenting a real data file. Both types still have this issue.

    Also why waste bytes on the bus on 'overhead' when you could be better using that for real data? The bus will become THE dominating factor real soon of how fast these things will run. Right now the drives are still under the bus limit (though not by much).

  57. SSD sucks battery life. No no, not a troll. by w0mprat · · Score: 1

    In certain situations the increased performance of an SSD removes a bottleneck, which results in increased CPU/memory load. On certain platforms this means these components spend less time in their lower power states, i.e. a lowered CPU multiplier or core voltage level.

    Task for task, an SSD saves power, possibly more than would be lost to any higher CPU speed steps, but in something like a looping benchmark more work is done in the same time, and therefore more power is drawn.

    This phenomenon had Tom's Hardware fooled: http://www.tomshardware.com/reviews/ssd-hdd-battery,1955.html ("The SSD Power Consumption Hoax : Flash SSDs Don't Improve Your Notebook Battery Runtime - they Reduce It")

    They later posted a retraction after some people pointed out this flaw.

    I would like to see optimizations in Linux that take this effect into account, perhaps by increasing the power saving state thresholds to compensate.

    --
    After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
  58. Re:SSD's should have no problem with fragmentation by jimmyhat3939 · · Score: 1

    Hah. I think he meant 256GB.

    --
    Free Conference Call -- No Spam, High Quality
  59. ATA8 std. allows reporting sector size/alignment by edgecase · · Score: 1

    There is also the ability to "free" unused blocks (with CFA commands at least), maybe so they can be erased in the background, or freed from wear-leveling tracking. There is a commercial device-mapper plugin to force large physical sectors on devices that still use 512 byte logical sectors. Not much different than md-raid devices whose stripe width or stride is much like a large physical sector.

  60. Re:Another file strategy - file segregation by f(x by Anonymous Coward · · Score: 0

    a few other "revolutionary" ideas that all had only one downside: nobody's going to break backwards compatibility, especially while the current system still "just works"

    Actually, my guess is that - like most "revolutionary ideas" people throw out there - the real issue is you expect someone else to implement them. How about you come back when you've got a proof of concept that people can get excited about?

  61. Write problem is with cheap/common SSDs by Sits · · Score: 1

    While the top quality stuff might last, my own personal experience with el cheapo SSDs is that they go bad quickly with moderate (in my case laptop) use due to shabby wear levelling. Others are also warning about (cheap) SSDs throwing away data too. Such SSDs are often the ones you are going to encounter so while the majority of SSDs out there show this behaviour I think it's a warning worth mentioning...

  62. Re:Another file strategy - file segregation by f(x by harry666t · · Score: 1

    It's a lot of work to make even a PoC, and I've got work, school, a few other small projects, and a life. This kind of system would need a very careful design, a lot of experience, and deep knowledge of how the existing solutions work -- knowledge, skill, and experience isn't something you gain overnight. I'm sure that at some point in the future I will try actually implementing it, but at the moment this point seems a little bit distant.

  63. nice one n/t by DrSkwid · · Score: 1

    nice one n/t

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  64. Re:SSD's should have no problem with fragmentation by MooUK · · Score: 1

    I think so too, but I'm allowed to hope he's feeling overly rich and generous, aren't I?

  65. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by evanspw · · Score: 1

    Frak me. Ronald, is that you....

    --
    Interstitial spaces are filled with cream.
  66. Linux can already do this to a limited extent by Sits · · Score: 1

    When it comes to reads you can't really postpone them - if someone wants to play that music file now and you don't have it already in cache (and Linux already uses unused memory as cache) then you have to hit the disk.

    When it comes to writes you can often delay them, so long as no one is waiting on you to tell them that you have written the data to the disk (in which case you have no choice but to hit the disk). This is already tunable via /proc/sys/vm/dirty_writeback_centisecs. Further, there are things like /proc/sys/vm/laptop_mode that will try to batch writes for when other I/O was going to happen anyway (e.g. when you play that music file, all the writes can happen too). Of course, in the event of a crash you lose much more data (as it wasn't on the disk) and you create more disk contention. See the Lesswatts disk tips page for more details.
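    For reference, the writeback interval is expressed in hundredths of a second, which is easy to misread. A small sketch for looking at it follows; the helper name and the 1500 value are illustrative assumptions, not tuning advice.

```shell
# centisecs_to_seconds N
# Convert a centisecond value (as used by dirty_writeback_centisecs)
# to seconds with two decimals, using integer arithmetic only.
centisecs_to_seconds() {
    printf '%d.%02d\n' "$(( $1 / 100 ))" "$(( $1 % 100 ))"
}

# Inspect the current setting:
#   centisecs_to_seconds "$(cat /proc/sys/vm/dirty_writeback_centisecs)"
# Change it as root (1500 = 15 seconds, an arbitrary example value):
#   echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
# A non-zero value in /proc/sys/vm/laptop_mode turns laptop mode on.
```

    Longer intervals mean fewer, larger bursts of writes, at the cost described above: more unwritten data at risk if the machine crashes.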

  67. Re:Another file strategy - file segregation by f(x by PiSkyHi · · Score: 1

    Context sensitive defrag - sounds like good sense to me, whichever hardware you use.

  68. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by PiSkyHi · · Score: 1

    Who said a brief historical summary of the life of PCs wouldn't look totally bonkers ?

  69. 1gb /boot? lvm? wtf... by anomaly256 · · Score: 1

    Did anyone else feel this guy lost all credibility when they read the bit where he wastes 1gb on /boot and uses lvm for a single volume as a second partition? It's an SSD dude, space and overhead are already major concerns and you just exploded them..

  70. Re:Another file strategy - file segregation by f(x by Anonymous Coward · · Score: 0

    There are defrag utilities that sort by last modified date.

  71. Re:1gb /boot? lvm? wtf... by tytso · · Score: 4, Interesting

    I use 1GB for /boot because I'm a kernel developer and I end up experimenting with a large number of kernels (yes, on my laptop --- I travel way too much, and a lot of my development time happens while I'm on an airplane). In addition, SystemTap requires compiling kernels with debuginfo enabled, which makes the resulting kernels gargantuan --- it's actually not that uncommon for me to fill my /boot partition and need to garbage collect old kernels. So yes, I really do need 1GB for /boot.

    As far as LVM, of course I use more than a single volume; separate LV's get used for test filesystems (I'm a filesystem developer, remember), but more importantly, the most important reason to use LVM is because it allows you to take snapshots of your live filesystem and then run e2fsck on the snapshot volume --- if the e2fsck is clean you can then drop the snapshot volume, and run "tune2fs -C 0 -T now /dev/XXX" on the file system. This eliminates boot-time fsck's, while still allowing me to make sure the file system is consistent. And because I'm running e2fsck on the snapshot, I can be reading e-mail or browsing the web while the e2fsck is running in the background. LVM is definitely worth the overhead (which isn't that much, in any case).
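    For readers who want to try the snapshot check, the workflow above could be sketched like this. It is not the author's script: the volume names, the 1G snapshot size, the snapshot naming scheme and the run/DRY_RUN wrapper are all assumptions made for illustration. Only the tune2fs invocation is taken from the text.

```shell
# run: execute a command, or just print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# check_lv_snapshot VG LV [SNAPSHOT_SIZE]
# Snapshot a mounted LV, fsck the snapshot read-only, and reset the
# mount count and last-check time on the origin if it came back clean.
check_lv_snapshot() {
    vg=$1 lv=$2 size=${3:-1G}
    snap="${lv}_fsck_snap"
    run lvcreate -s -L "$size" -n "$snap" "/dev/$vg/$lv" || return 1
    if run e2fsck -f -n "/dev/$vg/$snap"; then
        run tune2fs -C 0 -T now "/dev/$vg/$lv"
        rc=0
    else
        echo "e2fsck found problems on /dev/$vg/$lv; check it offline" >&2
        rc=1
    fi
    # Drop the snapshot either way so it does not fill up.
    run lvremove -f "/dev/$vg/$snap"
    return "$rc"
}

# Preview only (hypothetical names):
#   DRY_RUN=1 check_lv_snapshot vg0 root
```

    The -n flag keeps e2fsck from modifying the snapshot, so a non-zero exit just signals that the real file system deserves an offline check.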

  72. Re:No. Not Now. Not Ever. I'm Coming For All of Yo by Anonymous Coward · · Score: 0

    Dave? Dave Cutler? Is that you? Racked with guilt, now, are we, Dave?

    Look, some of us haven't even forgiven you for what you did to RSX-11D with that RSX-11M monstrosity, much less what you did cross-breeding ODS2 with DOS. Just 'cause KO cancelled that whole EPIC thing was no reason to become a Sith, dude!

    Rant all you want. You're not forgiven. Not by anybody who had less-than-6-digit badge numbers.

  73. Re:Another file strategy - file segregation by f(x by Anonymous Coward · · Score: 0

    That was my idea when I've proposed an "object storage system" here on /. a few months ago: associate type and metadata with every file, making them more "object-like" (as in object-oriented programming). The storage system would know the behaviour of each object (whether it is likely to grow, or more likely to be modified in place, or probably not modified at all, etc), and would choose the most efficient way of storing every particular kind of data. I've also proposed separate namespaces for each process, capability-based security, dropping paths in favour of non-hierarchical tags, and a few other "revolutionary" ideas that all had only one downside: nobody's going to break backwards compatibility, especially while the current system still "just works".

    File data grouping was done in reiser4 by introducing "fibers", where a "fiber" is a way to tell the FS to group file data according to some policy. Policies for commonly used extensions like *.c, *.h, *.o, *.mp3, etc., were built in; others could be added. This ensured that all *.o files are physically placed close to each other, so that read-ahead and other nice things (like a smaller number of seeks) really do their job while compiling a big tree of sources like the Linux kernel.

  74. Re:Another file strategy - file segregation by f(x by orasio · · Score: 1

    Which raises the question: does Hans Reiser have a laptop, and SVN access?
    Reiser4 was supposed to have a lot of metadata, at least eventually.
    Making use of the metadata is not the hard thing, the issue is to make it fast, and try not to break too many APIs. I trusted Reiser on that.

    Another group that already had that idea is MS. They have been messing around with that WinFS thing for at least a decade. They were trying to use MS SQL Server at some point; I think that approach is what is keeping them from succeeding.