Optimizing Linux Systems For Solid State Disks
tytso writes "I've recently started exploring ways of configuring Solid State Disks (SSDs) so they work most efficiently in Linux, in particular Intel's new 80GB X25-M, which has fallen to a street price of around $400 and is thus within my toy budget. It turns out that the Linux storage stack isn't set up well to align partitions and filesystems for use with SSDs, RAID systems, and 4k sector disks. There is also some interesting configuration and tuning that we need to do to avoid potential fragmentation problems with the current generation of Intel SSDs. I've figured out ways of addressing some of these issues, but it's clear that more work is needed to make it easy for mere mortals to use next-generation storage devices efficiently with Linux."
I think the bigger challenge will be in getting mere mortals to have a $400 toy budget to afford the SSD
Your government is working towards it.
Yes, we do need progress in that area. However, for many of us who require better-than-average data security, the read/write behaviour of SSDs makes the devices extremely vulnerable to analysis and discovery of data that its owner believes to be inaccessible to others. 'Secure wiping', or the lack of it, is the issue. As I understand it, 'secure wiping' programs fail to do their job on SSDs. It's been reported among 'criminals' that SSDs are a 'forensic analyst's dream come true', and so they must be for corporate spies and others who have a yen for theft of private data.
I know right? Send some cheddar my way Mr. Gates.
I for one hope he is successful so that when SSDs become more affordable, or even the default, Linux will be nicely optimized.
The more people buy it, the sooner it will get under $50. But given recent financial conditions, people would rather let others buy the SSD so that they can get it for $50 in August 2010. I'm afraid this time it's going to take longer than that to see a tenfold reduction in storage device costs.
Face your daemons!
This article makes me wonder if any OS is really properly optimized for SSDs. Has there been any analysis of whether Windows machines properly optimize their use of solid state disks? Perhaps the problem goes beyond just Linux?
The Matrix is real... but I'm only visiting!
If I mount /home on a separate drive, (good to do when upgrading) the rest of the Linux file system fits nicely on a small SSD.
My rights don't need management.
From economics, let's turn our attention to optimizing this toy of ours. The thing with SSDs is that they don't have a read/write head to worry about. This means that no matter where the data is stored in the device, all we need to do is specify the fetch location, and the logic circuits select that block to extract the data from the desired location. From what I've heard, SSDs have an algorithm that assigns different blocks to store the data so that the memory cells in a single location aren't overused.
Most of us can't afford to worry about this, but does the Fusion-io suffer from this issue?
The cost of that cleanup, of course, will be borne by taxpayers, not industry.
Surely it's not the block size. I know nothing about filesystems beyond basics. Windows could specify the block size to be used. I assumed that Linux did the same? I have no idea about OS X either.
Are there standard block sizes in use for Linux and OS X filesystems? Can they be modified when they are formatted? If so, and the issue really is due to blocksize and fragmentation as a result, this would seem like an easy fix. Linux and OS X already resist fragmentation. I won't speak to MS's efforts there as they state NTFS does, but the implementation seems to be very different in the real world.
Some of you FS gurus fill us in here. How hard is it to implement something like variable block sizes, or to allow you to specify block size at format time?
> Vista has already started working around this problem, since it uses a default partitioning geometry of 240 heads and 63 sectors/track. This results in a cylinder boundary which is divisible by 8, and so the partitions (with the exception of the first, which is still misaligned unless you play some additional tricks) are 4k aligned. So this is one place where Vista is ahead of Linux...
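The alignment claim in the quote can be checked with a little arithmetic. The sketch below assumes 512-byte logical sectors and compares the 240-head geometry from the quote with a 255-head, 63-sector geometry; the 255-head figure and the first partition starting at sector 63 are assumptions about the traditional layout, not something stated in the quote.

```python
# Sketch: does a partition that starts on a cylinder boundary land on a 4k boundary?
SECTOR_BYTES = 512                               # assumed logical sector size
ALIGN_BYTES = 4096                               # the 4k alignment target
SECTORS_PER_4K = ALIGN_BYTES // SECTOR_BYTES     # 8 sectors per 4k unit


def sectors_per_cylinder(heads, sectors_per_track):
    """Number of sectors in one cylinder for a given CHS geometry."""
    return heads * sectors_per_track


def is_4k_aligned(start_sector):
    """True if a partition starting at this sector begins on a 4k boundary."""
    return start_sector % SECTORS_PER_4K == 0


vista_cylinder = sectors_per_cylinder(240, 63)    # geometry from the quote
legacy_cylinder = sectors_per_cylinder(255, 63)   # assumed traditional geometry

if __name__ == "__main__":
    for name, cyl in (("240/63", vista_cylinder), ("255/63", legacy_cylinder)):
        print(name, cyl, "sectors per cylinder, 4k aligned:", is_4k_aligned(cyl))
    # The first partition conventionally starts one track in (assumed: sector 63).
    print("first partition at sector 63, 4k aligned:", is_4k_aligned(63))
```

With 240 heads every cylinder boundary is a multiple of 8 sectors, so any partition that starts on one is 4k aligned; the first partition, one track in, is not, which matches the exception the quote mentions.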
Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem. It descended from DIGITAL's ODS2 (On Disk Structure 2) which traces back to the original Five Models (PDP 1, 8, 10, 11 and 12). You see, ODS was written by passionate people with degrees and rich personal lives in Massachusetts who sang and danced before the fall of humanity to the indignant Gates series who assimilated their young wherever possible and worked them into early graves during his epic battle with the Steves before the UNIX enemy remerged after a 25 year sleep and nuked the United States, draining all of its technological secrets to the other side of the world. Gates, realizing what he's done, now travels the universe seeking to rebuild his legacy by purifying humanity while the Steve series attempts to rebuild itself. Some of the original Five are still around, left to logon to Slashdot and witness what's left of the shadow of humanity still in the game as they struggle blindly around in epic circles indulging new and different ways to steal music, art and technology to make up for their lack of creativity long ago bred out of them by the Gates series.
SSDs are gradually gaining more and more sophisticated controllers which do more and more to make the SSD seem like an ordinary hard drive, but at the end of the day the differences are great enough that they can't all be plastered over that way (the fragmentation/long term use problems the story linked to are a good example). I know that making these things run on a regular hard drive interface and tolerate being used with a regular FS is important for Windows compatibility (at present; this could and should be fixed), but it seems like a lot of cost could be avoided and a lot of performance gained by having a more direct flash interface and using flash-specific filesystems like UBIFS, YAFFS2, or LogFS. I have to wonder why vendors aren't pursuing that path.
> This means that no matter where the data is stored in the device, all we need to do is specify the fetch location and the logic circuits select that block to extract the data from desired location.
Which is why you don't need head-optimized I/O schedulers like Anticipatory, which waits a couple of ms after every read to see if there's more from that area, thus saving on seek times.
SSDs must be optimized differently. For instance, they can't write arbitrary small pieces of data, only whole blocks. Thus, if you want to optimize for them, you'd better make sure to write whole blocks at a time if possible, and not have small files cross block boundaries if they don't have to.
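To make the boundary point concrete, here is a minimal sketch that counts how many device blocks a single write touches. The 4096-byte block size is only an illustrative assumption (real flash pages and erase blocks vary by device), and `blocks_touched` is a hypothetical helper name.

```python
# Sketch: how many whole blocks does a write of `length` bytes at `offset` touch?
def blocks_touched(offset, length, block_size=4096):
    """Count the blocks overlapped by the byte range [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // block_size                  # block holding the first byte
    last = (offset + length - 1) // block_size    # block holding the last byte
    return last - first + 1


if __name__ == "__main__":
    # An aligned 4 KB write stays inside one block ...
    print(blocks_touched(0, 4096))
    # ... but the same write shifted by one 512-byte sector straddles two.
    print(blocks_touched(512, 4096))
```

A small file that crosses a boundary costs the device two block writes instead of one, which is the case the comment suggests avoiding.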
I've been wrestling this idea around as a sound studio solution, and it seems that an external storage unit makes the most sense, with a DRAM card for the currently working files. Almost affordable, anyway.
I have mod points, but cannot find the "Totally Bonkers" mod...
Generally, bash is superior to python in those environments where python is not installed.
I've considered getting a large capacity CF card (16 GB or 32 GB) to use as a solid state drive for my laptop. The CF + adapter combination is a lot cheaper than these new SSD. So why should I get a SSD vs. a CF card?
> So why should I get a SSD vs. a CF card?
10 times better performance and wear-leveling worth a crap.
. . . which runs on the Nokia N800/N810 "Internet Tablets" (www.maemo.org). They might have done some tweaking, since this is Linux running on SSDs.
Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
When I saw the headline, I was thinking not so much of the fragmentation issues as of the repeated rewriting of logs and other small, frequently accessed files that SSDs are susceptible to (maximum number of rated read-write cycles). Have there been any developments in that area?
I don't think this is going to be a significant problem when compared to normal seek time problems.
Lets say we have 100 k of data to read. 512 byte blocks would require 200 reads. 4k blocks would require 25 reads.
For rotating discs: If the data is contiguous, we have to hope that all the blocks are on the same track. If they are, then there is 1 (potentially very costly) seek to get to the track with all the blocks on it. The cost of the seek is dependent on the track it's going to, the track it's on, and whether or not the drive is sleeping or spun down. Otherwise we also get to do another very short seek, which is going to add a bit of time to get to the next adjacent track. Worst case scenario all 200 blocks are on different tracks, scattered randomly on the platter, requiring 200 seeks. Ouch ouch ouch.
For SSDs: What is important is the number of cells we have to read. Cells will be 4k in size. All seek times are essentially zero. Best case scenario, all data is contiguous, and the start block is at the start of a cell. Read time boils down to how fast the flash can read 25 cells. Worst case scenario is where the data is 100% fragmented, such that each of the 200 512-byte blocks resides in a different cell, requiring 200 cell reads (an 8-fold increase in time required). There will also be overhead in copying out the 512 byte data from each buffer and assembling things, but this time is negligible for this comparison.
While the 8x time increase (order N) looks significant, it's important to compare the probabilities involved, and just how bad things get. The most important difference between how these two drives react is the space between fragments. The "worst case" for the SSD, 100% fragmentation, is highly unlikely. I don't even want to think about what a spinning disc would do if asked to perform a head seek for 100% of the blocks in, say, a 1 MB file. The read head would probably sing like a tuning fork at the very least. With 2000 cell reads compared to 2000 seeks, the SSD will win handily every single time, even if the tracks on the disc are close.
If the spacing between fragments is anything near normal, say 30-100k, then there will be some seeking going on with the disc, and there will be some wasted cell reads with the SSD, but comparing one extra cell read with one extra head seek, again the SSD wins hands down. The advantage of the SSD actually goes down as fragmentation goes down, because most fragments are going to cause a head seek, each of which will significantly widen the time gap. Also, a spinning disc will read in the blocks much faster than the cells on an SSD.
I realize the OP was more describing the possibility of "not so much bang for the buck as you are expecting" due to fragmentation, and I know the above hits more on comparing the two than what happens to the SSD, but if you consider the effects of fragmentation on a spinning disc, and then weigh how the impact compares with a SSD, it's easy to see that fragmentation that sent you running for the defrag tool yesterday may not even be noticeable with a SSD. So I'd call this a "non-issue".
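For anyone who wants to rerun the numbers in the 100 k example, the sketch below does the arithmetic, assuming 1 k means 1024 bytes, 512-byte filesystem blocks, and 4k flash cells as in the comment. By this count the best case is 25 cell reads and the fully fragmented worst case is 200, an 8-fold difference.

```python
# Sketch of the read-count arithmetic for the 100 KB example.
DATA_BYTES = 100 * 1024      # "100 k of data", assuming 1 k = 1024 bytes
SECTOR_BYTES = 512           # small filesystem block
CELL_BYTES = 4096            # 4k flash cell, as assumed in the comment


def reads_needed(data_bytes, unit_bytes):
    """Number of unit-sized reads needed to cover data_bytes (rounded up)."""
    return -(-data_bytes // unit_bytes)


sector_reads = reads_needed(DATA_BYTES, SECTOR_BYTES)    # one read per 512-byte block
best_cell_reads = reads_needed(DATA_BYTES, CELL_BYTES)   # contiguous and cell-aligned
worst_cell_reads = sector_reads                          # every block in its own cell
slowdown = worst_cell_reads / best_cell_reads

if __name__ == "__main__":
    print("512-byte reads:", sector_reads)
    print("best-case cell reads:", best_cell_reads)
    print("worst-case cell reads:", worst_cell_reads)
    print("worst/best ratio:", slowdown)
```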
What I'm waiting for is them to invest the same dev time in read speeds as write speeds. SSDs don't appear to be doing any interleaved reads - they're doing it for the writes because they're so slow. Though at this point I wonder if read speeds are just plain running into a bus speed limit with the SSDs?
I work for the Department of Redundancy Department.
Please mod the parent funny; so say we all.
From what I can scrape together quickly off of the Internet (IANASE: I am not a software engineer), the biggest difference seems to be the lack of a need for error checking, disk defrag, etc. Since a normal spinning HDD does not actually delete a file but just removes the markers, the filesystem treats all areas the same and does the same things to both real and non-real data to keep the disk state sane. On an SSD all of this leads to a lot of unneeded disk usage and premature degradation of the drive itself.
There seems to be more about Data set management but I don't quite understand it.. maybe someone more knowledgeable could explain it?
once more into the breach
I'm just sitting here thinking. Doesn't an SSD have a preset number of writes in it due to its nature?
Does it really matter if they spread these writes around on the hard drive when the number of writes the drive is capable of doing is still the same in the end?
To drastically oversimplify, let's say that each block can be written to twice. Does it really matter if they used up the first blocks on the drive and just spread towards the end of the drive partition with general usage rather than jumping all over to try to spread the writes around?
Am I thinking about this the wrong way? What benefit does it give them to spread the writes around if the total number of writes doesn't change? Doesn't it just further fragment the files with little gain?
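One way to think about the question is with a toy model. The sketch below uses the "each block can be written twice" simplification from the comment and assumes a workload that keeps rewriting the same logical location (a log file, say). The model, the function name, and the eight-block drive are all illustrative assumptions, not how any real controller works.

```python
# Toy model: how many writes until some physical block is worn out?
def writes_until_first_failure(num_blocks, endurance, wear_leveling):
    """Simulate repeated rewrites of one hot logical block.

    Without wear leveling every rewrite lands on the same physical block.
    With wear leveling each rewrite is redirected to the least-worn block.
    """
    wear = [0] * num_blocks
    writes = 0
    while True:
        target = wear.index(min(wear)) if wear_leveling else 0
        if wear[target] >= endurance:
            return writes          # the chosen block cannot take another write
        wear[target] += 1
        writes += 1


if __name__ == "__main__":
    # 8 blocks that each survive 2 writes, as in the oversimplified example.
    print("no leveling:", writes_until_first_failure(8, 2, wear_leveling=False))
    print("leveling:   ", writes_until_first_failure(8, 2, wear_leveling=True))
```

In this model the total write budget is the same either way, but a workload that hammers one spot only gets to use all of it when the writes are spread around; without leveling the hot block dies while the rest of the drive is still fresh.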
Yes, but for SSD's the blocks are larger - problems when essentially all software is optimized for smaller blocks.
Your CF card is going to use the USB interface which maxes out at about 40Mbps as opposed to using an internal SSD's SATAII interface which maxes at 300Mbps. Not quite an order of magnitude, but close.
On the other hand, if you're going to use an external SSD connected to the USB port, then you wouldn't see any difference between the 2 in terms of speed. Lifespan might be longer w/ the SSD due to better wear leveling, but in either case you're probably going to lose or break it before you get to the fail point.
A real SSD has several advantages over using CF cards, but not for the reasons you state.
With a simple plug adapter, CF cards can be connected to an IDE interface, so speeds won't be limited by interface speed. The most recent revision of the CF spec adds support for IDE Ultra DMA 133 (133 MB/s)
A couple of additional points, just because I love nitpicking:
- A USB 2.0 mass storage device has a practical maximum speed of around 25 MB/s, not 40 Mb/s.
- The so-called SATA II interface (that name is actually incorrect and is not sanctioned by the standardization body) has a maximum speed of 300 MB/s, not Mb/s.
I haven't yet found a SATA device (even DOMs) that requires CHS addressing. Clearly it was a mistake to use hardware quirks to address sectors, but then again, ATA became a de facto standard before anyone realized it might become one.
Why not functionally group files to decrease or eliminate fragmentation? Or maybe this is already done.
For example, I have a large collection of MP3 files. They essentially do not change, as in I don't edit them and rarely erase them. The file system could look at the type of file (mp3 vs. doc) and place it accordingly. It could also look at the last change to the file and place it in a certain area. Older unchanged files would go in a tightly packed file area that is optimized and not fragmented.
..........FULL STOP.
Sure. There are *lots* of considerations beyond speed to want SSDs
And SSD drives are also shock-resistant.
The drives will be shocked when they see what I have in my pr0n collection.
If it's an older laptop or the mechanical hard disk died, go for it. Addonics make SATA CF adapters so you are not restricted to IDE CF adapters.
"This post is an artistic work of fiction and falsehood. Only a fool would take anything posted here as fact."
Why is this informative? CF with an adapter is NOT USB.
From my experience, using an adapter puts it on the native interface - notably, with CF, it's easiest to put the device into a machine that has a native IDE (not SATA) interface. CF is pin compatible with IDE.
Now, in the current offering of SLC/MLC "drives" you can actually get better read/write since they "raid" for lack of a better term the internal chips. I'm using a transcend ATA-4 CF device that gets around 30MB/sec read/write in a machine in my garage; it's an SLC device that isn't their top of the line, but it was more cost-effective.
So, using the IDE/ATA-4 interface on the CF card, it gets lower CPU utilization than a USB device. Still doesn't hit the 40MB/sec you quoted, but 40MB/sec is a pipe dream on USB in my experience.
Karnal
No doubt. But, I really think that within 5 years you're going to see most laptops using only an SSD.
Free Conference Call -- No Spam, High Quality
Good analysis. The statistics I've read indicate that SSD's don't perform all that much better than hard drives in real-world scenarios. I think this is part of the reason for that performance. On the other hand, they do use less energy, which is a clear positive for a laptop.
> On the other hand, they do use less energy, which is a clear positive for a laptop.
And thus they are cooler. A clear positive for any system, but especially a laptop.
They are also silent and don't vibrate.
They are also, from what I understand, more reliable.
I'm seriously considering flash drives for my desktop PC... they just need one more capacity jump and I think they'll be worth it. $400 for 128MB is a touch small.. but I'll go for it at $400 for 256MB. On my main PC I'm only using 236GB of my 500GB drive, and I could easily move 150GB of that onto my 1TB external e-sata drives that I turn on when I need.
> Your CF card is going to use the USB interface
This is Informative?
CF cards are actually IDE devices. The adapters that plug CF into your IDE bus are just passive wiring.. no protocol adapter needed.
It's trivial to replace a laptop drive with a modern high-density CF card, and sometimes a great thing to do.
The highest-performance CF cards today use UDMA for even higher bandwidth.
HighSpeed USB can't reasonably get over 25MB/sec from the cards using a USB-CF adapter, but you can do better by using its native bus.
I purchased an X300 ThinkPad for the company this week and took a close look at it. I thought expensive business notebooks came without crapware, and I was sure the X300 would be optimized. But they had defrags scheduled! I always thought defrag was a no-no for SSDs. Now I am not sure anymore. I uninstalled it first. But who knows?
Except that the latest-gen SSDs exceed 250MB/sec throughput. If the latest CF spec just added 133MB/sec, then that would be a huge bottleneck in throughput.
I think Theodore should look into technologies like the ZFS L2ARC, that is, using an SSD as an additional cache to supplement disks based on rotating rust. The L2ARC stores pages recently evicted from ZFS's primary ARC (the Adaptive Replacement Cache) on SSD. In my view this is a more reasonable use of SSDs than as just another primary storage medium.
I recently wrote an article on my blog about the mechanism of ARC and L2ARC in conjunction with SSDs, but I don't want to slashdot my site ;)
> So why should I get a SSD vs. a CF card?
CF works passably in WORM-like scenarios, where you basically use it in read-only mode and update it rarely and in big chunks. For random R/W access, CF lacks the wear leveling to give it a tolerable life expectancy. Thus you commonly see it used in embedded devices such as routers and dumbterms, where you may update the firmware or OS every few months; you don't see it used much in real, live writable FSs.
It also tends to have rather poor performance, with reads in the sub-5MB/s range and writes taking forever. So again, using a 32MB CF to boot a router works great; using a 32GB CF as the system partition for a modern desktop PC (even with some solution to the limited erase lifetime, such as a UnionFS against a ramdisk with commit-on-shutdown), you can expect 10+ minute boot times.
I would move /tmp to either a RAM disk or a hard drive. There is no point in having tmp files using up the lifespan of your SSD, especially after you just moved /home to extend its life. Also, you could move some of the stuff in /var to a hard drive or ramdisk. Good candidates might be /var/tmp and /var/log. Alternatively, you could just move the entire /var hierarchy to a hard drive.
> Why not functionally group files to decrease or eliminate fragmentation? Or maybe this is already done.
In a Linux system, this is easily done, but few people bother.
Most of the write activity in Linux is in /tmp, and also in /var (for example, log files live in /var/log). User files go in /home.
So, you can use different partitions, each with its own file system, for /, /tmp, /home, and /var.
The major problem with this is that, if you guess wrong about how big a partition should be, it's a pain to resize things. So my usual thing is just to put /tmp on its own partition, and have a separate partition for / and for /home.
The /tmp partition and swap partition are put at the beginning of the disc, in hopes that seek penalties might be a little lower there. Then / has a generous amount of space, and /home has everything left over.
When a *NIX system runs out of disk space in /tmp, Very Bad Things happen. Far too much software was written in C by people who didn't bother to check error codes; things like disk writes don't fail often, but when /tmp is 100% full, every write fails. A system may act oddly when /tmp is full, without actually crashing or giving you a warning. So, the moral of the story is: disk is cheap, so if you give /tmp its own partition, make it pretty big; I usually use 4 GB now. However, if you run out of disk space in /var, it is not quite as serious. Your system logs stop logging. And, many databases are in /var so you may not be able to insert into your database anymore.
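Since a full /tmp tends to fail quietly, a periodic check is cheap insurance. The following is a minimal sketch using Python's standard `shutil.disk_usage`; the 90% warning threshold and the helper names are arbitrary assumptions for illustration.

```python
# Sketch: warn before /tmp or /var fills up.
import os
import shutil


def percent_used(total, used):
    """Percentage of a filesystem in use; 0.0 if the size is reported as zero."""
    return 100.0 * used / total if total else 0.0


def check_mount(path, warn_at=90.0):
    """Return (percent used, whether that reaches the warning threshold)."""
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= warn_at


if __name__ == "__main__":
    for mount in ("/tmp", "/var"):
        if not os.path.isdir(mount):
            continue                      # skip paths missing on this system
        pct, warn = check_mount(mount)
        status = "WARNING" if warn else "ok"
        print(f"{mount}: {pct:.1f}% used [{status}]")
```

Run from cron, something like this would flag a filling partition before writes start failing.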
The main Ubuntu installer is fast, because it wipes out the / partition and puts in all new stuff. So, if you have separate partitions for / and /home, life is good: you just let the installer wipe /, and your /home is safely untouched. It's annoying when you have /home as just a subdirectory on / and you want to run the installer. But, by default, the Ubuntu installer will make one big partition for everything; if you want to organize by partitions, you will need to set things up by hand.
steveha
lf(1): it's like ls(1) but sorts filenames by extension, tersely
> Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem.
I thought ZFS was.
And ZFS has native support for SSDs as L2ARC (http://www.c0t0d0s0.org/media/presentations/ssd.pdf). I have nothing but praise for ZFS. It is simple to manage, reliable, and fast. With native CIFS instead of user-space Samba, I've seen orders-of-magnitude better performance from Windows machines doing networked file access. Gary
> Your CF card is going to use the USB interface which maxes out at about 40Mbps as opposed to using an internal SSD's SATAII interface which maxes at 300Mbps. Not quite an order of magnitude, but close.
There are three factual errors in that statement.
1. CF-cards can be connected directly to the ATA-port via a simple passive connector-adapter and therefore have a theoretical maximum transfer speed of 133MB/s, which roughly translates to 1300Mbps. There are even adapters with room for both a master and slave CF-card in the same shape, size and connector position as a 2.5" ATA drive, specifically made to use CF-cards in laptops.
2. USB is 480Mbps.
3. SATA is 3000Mbps
The big speed-difference between SSD and CF is due to the construction of the devices themselves, not the interface that connects them to the computer.
A fast CF-card can get you around 40MB/s and at the moment they also top out at 32GB sizes and they're not made to handle long term random write operations.
A fast SSD can get you all the way to the theoretical maximum of SATA, around 300MB/s, and are available in much bigger sizes.
/.Mattsson - My native language is not English, so please don't whine over linguistic errors. (That's lame anyway...)
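Much of the confusion in this subthread is megabits versus megabytes. The sketch below converts the quoted signalling rates into MB/s. One assumption is added that the thread does not spell out: SATA's 3000 Mbps line rate uses 8b/10b encoding, so ten line bits carry one data byte, which is where the 300 MB/s figure comes from. These are interface ceilings, not what any device actually delivers.

```python
# Sketch: converting interface signalling rates (Mbit/s) to MB/s.
def mbit_to_mbyte(mbit_per_s, bits_per_byte=8):
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / bits_per_byte


usb2_raw = mbit_to_mbyte(480)                     # USB 2.0 signalling rate
sata_3g = mbit_to_mbyte(3000, bits_per_byte=10)   # 8b/10b: 10 line bits per data byte
udma133_mbit = 133 * 8                            # UDMA 133 MB/s expressed in Mbit/s

if __name__ == "__main__":
    print(f"USB 2.0: 480 Mbps  = {usb2_raw:.0f} MB/s raw (the thread cites ~25 MB/s in practice)")
    print(f"SATA:    3000 Mbps = {sata_3g:.0f} MB/s after 8b/10b encoding")
    print(f"UDMA:    133 MB/s  = {udma133_mbit} Mbps at 8 bits per byte")
```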
SSDs have a different feel to them. The time to load a file is more consistent on my eeePC than on other laptops I own which have rotating disks.
http://michaelsmith.id.au
The modern hot-shit high-speed CF cards have wear leveling and do UDMA transfers, you get a CF to ATA adapter, not CF to USB, and they will outperform most hard disks.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Seems to me that Sun's zfs filesystem is ready to use the ssd storage. The copy-on-write strategy would seem to avoid the hot spots as zfs picks new blocks from the free pool rather than rewriting the same block.
Actually, given the X25-M's lack of TRIM support, using a log-structured filesystem, a write-anywhere filesystem, or a copy-on-write type system is actually a really bad use of the X25-M, since the X25-M will think the entire disk is in use. The X25-M is actually implemented to optimize for filesystems that reuse blocks as much as possible, since it is internally doing the equivalent of a log-structured filesystem to do wear leveling. TRIM support will obviously help, but for ZFS, the X25-M is probably not a good choice. A cheaper flash drive which doesn't try to be smart about wear leveling would actually be better for ZFS.
You can also get CFSATA adapters. I have a couple here
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
My kingdom for a mod point today...
If you'll pay $400 for 256*MB*, I think you've got a little too much money and should give me some....
Yea, I just wanted to stress the fact that it's not USB more than anything; haven't tested personally the CF-->SATA bridges. They work well?
Karnal
That was my idea when I proposed an "object storage system" here on /. a few months ago: associate type and metadata with every file, making them more "object-like" (as in object-oriented programming). The storage system would know the behaviour of each object (whether it is likely to grow, or more likely to be modified in place, or probably not modified at all, etc), and would choose the most efficient way of storing every particular kind of data. I also proposed separate namespaces for each process, capability-based security, dropping paths in favour of non-hierarchical tags, and a few other "revolutionary" ideas that all had only one downside: nobody's going to break backwards compatibility, especially while the current system still "just works".
It's not the volume of supply which is causing the high prices.
They are inherently expensive to make with today's methods.
Good point, I will have to think about that...
Well, I fired up Ubuntu with the new configuration and I wasn't disappointed - WOW!
Booting is lightning quick. I am still doing a lot of downloads, so I haven't had a chance to run some real performance tests, but from what I have seen so far the results are impressive.
"tytso" is Theodore Ts'o.
He and Remy Card wrote ext2. He and Stephen Tweedie wrote ext3. He and Mingming Cao wrote ext4.
He maintains the filesystem repair tool (e2fsck) and resizing tool for those filesystems.
He also created the world's first /dev/random device, maintained the tsx-11.mit.edu Linux archive site for many years, and wrote a chunk of Kerberos. He's been the technical chairman for many Linux-related conferences. He pretty much runs the kernel summit.
He's certainly not a kid. I think he's about to turn 40.
Really, Intel ought to give tytso piles of free SSD hardware before it goes on sale. This would help Intel by encouraging tytso to optimize Linux for Intel's SSD hardware.
So many choices!
belt sander
nitric acid
cutting torch
charcoal and a blower
chip wired into an AC wall socket
thermite
repeated use as a model rocket blast deflector
drill press
I just recently put two 128GB SSDs in a RAID 0 set. I set up a RAM drive for use as /tmp and have /var going to another partition on a standard SATA hard drive. I changed fstab to mount the drives noatime so it doesn't record file access times. I also made some other tweaks, pointing any programs or services that write logs or use a temporary cache at /tmp. It's a software RAID, so I'm using /dev/mapper/-- as the device, and I'm not exactly sure how to set the scheduler, although I have added a line in GRUB that I think does it.
Ubuntu 64bit boots up in about 10 seconds.
*DrugCheese rants*
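On checking which I/O scheduler is actually in effect: the kernel exposes it per block device in sysfs, in a file whose contents look like `noop anticipatory deadline [cfq]`, with the active one in brackets. The sketch below parses that format. The device name `sda` is a placeholder, and for a software RAID it is, as far as I know, the underlying disks rather than the /dev/mapper device whose scheduler matters; treat that as an assumption to verify.

```python
# Sketch: report the active I/O scheduler for a block device via sysfs.
from pathlib import Path


def parse_scheduler(text):
    """Parse e.g. 'noop anticipatory deadline [cfq]' into (active, available)."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]          # the bracketed entry is the active one
            active = name
        available.append(name)
    return active, available


def read_scheduler(device):
    """Read /sys/block/<device>/queue/scheduler; device is a name like 'sda'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())


if __name__ == "__main__":
    device = "sda"                     # placeholder device name
    try:
        active, available = read_scheduler(device)
        print(f"{device}: active={active} available={available}")
    except OSError as exc:
        print(f"could not read scheduler for {device}: {exc}")
```

Writing a scheduler name into the same file as root switches it at runtime, which is an alternative to a boot-time kernel parameter.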
In certain situations the increased performance of a SSD removes a bottleneck which would result in increased CPU/memory load. On certain platforms this means these components would spend less time in their lower power states, ie lowered cpu multiplier or core voltage level.
Task for task, an SSD saves power, possibly more than would be lost to any higher CPU speed steps, but in something like a looping benchmark more work is done in the same time, and therefore more power is drawn.
This phenomenon had Tom's Hardware fooled: http://www.tomshardware.com/reviews/ssd-hdd-battery,1955.html ("The SSD Power Consumption Hoax: Flash SSDs Don't Improve Your Notebook Battery Runtime - They Reduce It")
They later posted a retraction after some people pointed out this flaw.
I would like to see optimizations in Linux that take this effect into account, perhaps by increasing power saving state thresholds to compensate.
After logging in slashdot still does not take you back to the page you were on. It's been that way for 20 years.
Hah. I think he meant 256GB.
There is also the ability to "free" unused blocks (with CFA commands at least), maybe so they can be erased in the background, or freed from wear-leveling tracking. There is a commercial device-mapper plugin to force large physical sectors on devices that still use 512 byte logical sectors. Not much different than md-raid devices whose stripe width or stride is much like a large physical sector.
While the top quality stuff might last, my own personal experience with el cheapo SSDs is that they go bad quickly with moderate (in my case laptop) use due to shabby wear levelling. Others are also warning about (cheap) SSDs throwing away data too. Such SSDs are often the ones you are going to encounter so while the majority of SSDs out there show this behaviour I think it's a warning worth mentioning...
It's a lot of work to make even a PoC, and I've got work, school, a few other small projects, and a life. This kind of system would need a very careful design, a lot of experience, and deep knowledge of how the existing solutions work -- knowledge, skill, and experience isn't something you gain overnight. I'm sure that at some point in the future I will try actually implementing it, but at the moment this point seems a little bit distant.
nice one n/t
I think so too, but I'm allowed to hope he's feeling overly rich and generous, aren't I?
The benchmark comes down to the individual CF card, so works well = works. I had them as a software striped RAID for a while to make write speed > 25Mb/s (firewire dv video speed) and achieved that. Sadly my firewire camera uses the protocol that Linux doesn't so that project is on the shelf.
Frak me. Ronald, is that you....
Interstitial spaces are filled with cream.
When it comes to reads you can't really postpone them - if someone wants to play that music file now and you don't have it already in cache (and Linux already uses unused memory as cache) then you have to hit the disk.
When it comes to writes, you can often delay them so long as no one is waiting on you to tell them that you have written the data to the disk (in which case you have no choice but to hit the disk). This is already tunable via /proc/sys/vm/dirty_writeback_centisecs. Further, there are things like /proc/sys/vm/laptop_mode that will try to batch writes with other I/O that was going to happen anyway (e.g. when you play that music file, all the writes can happen too). Of course, in the event of a crash you lose much more data (as it wasn't on the disk) and you create more disk contention. See the Lesswatts disk tips page for more details.
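To see what those two knobs are currently set to, a small read-only sketch like the following should do. It only reads the files named in the comment; the helper names are hypothetical, and it returns None rather than failing on systems where the files are missing.

```python
# Sketch: read the writeback tunables mentioned above (read-only).
from pathlib import Path

VM_DIR = Path("/proc/sys/vm")


def centisecs_to_seconds(value):
    """Convert a centisecond setting (e.g. '500') to seconds."""
    return int(value) / 100.0


def read_vm_tunable(name):
    """Return the integer value of /proc/sys/vm/<name>, or None if unreadable."""
    try:
        return int((VM_DIR / name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    writeback = read_vm_tunable("dirty_writeback_centisecs")
    laptop = read_vm_tunable("laptop_mode")
    if writeback is not None:
        print(f"flusher wakes every {centisecs_to_seconds(writeback):.2f} s")
    if laptop is not None:
        print("laptop_mode:", "on" if laptop else "off", f"({laptop})")
```

Raising the writeback interval means fewer, larger bursts of writes, at the cost of losing more unwritten data in a crash, as the comment notes.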
Context sensitive defrag - sounds like good sense to me, whichever hardware you use.
Who said a brief historical summary of the life of PCs wouldn't look totally bonkers ?
Did anyone else feel this guy lost all credibility when they read the bit where he wastes 1gb on /boot and uses lvm for a single volume as a second partition?
It's an SSD dude, space and overhead are already major concerns and you just exploded them..
Interesting. My understanding is that HFS+ meets these requirements pretty well - would the X25-M be a good choice for a Macintosh system?
You just need a background task that zeros blocks used by deleted files. EasyCo's MFT implements such a log filesystem as a generic block device for x86 Linux and Windows.
Well I've noticed that I don't have a CF-IDE adapter that does 3.3v, so I'm being limited to ATA-4 instead of ATA-5 on my current card. Have you noticed/read anything similar?
Karnal
It's not obvious to me that X25-M treats a block that has been zero'ed out as an "unallocated block". It could do this, but it's not at all guaranteed that it does this. Do you know for certain (via an Intel specification sheet) that writing all ZERO's is the equivalent of an ATA TRIM?
I use 1GB for /boot because I'm a kernel developer and I end up experimenting with a large number of kernels (yes, on my laptop --- I travel way too much, and a lot of my development time happens while I'm on an airplane). In addition, SystemTap requires compiling kernels with debuginfo enabled, which makes the resulting kernels gargantuan --- it's actually not that uncommon for me to fill my /boot partition and need to garbage collect old kernels. So yes, I really do need 1GB for /boot.
As far as LVM, of course I use more than a single volume; separate LV's get used for test filesystems (I'm a filesystem developer, remember), but more importantly, the most important reason to use LVM is because it allows you to take snapshots of your live filesystem and then run e2fsck on the snapshot volume --- if the e2fsck is clean you can then drop the snapshot volume, and run "tune2fs -C 0 -T now /dev/XXX" on the file system. This eliminates boot-time fsck's, while still allowing me to make sure the file system is consistent. And because I'm running e2fsck on the snapshot, I can be reading e-mail or browsing the web while the e2fsck is running in the background. LVM is definitely worth the overhead (which isn't that much, in any case).
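For anyone who wants to try the snapshot-then-fsck routine, here is a rough Python sketch of the sequence. Only the tune2fs invocation comes from the post above. The volume group and LV names, the snapshot name "fsck-snap", the 1G snapshot size, and the choice of "e2fsck -f -n" (forced, read-only check) are my assumptions, and the function names are invented for illustration. It needs root and the LVM and e2fsprogs tools to do anything real.

```python
import subprocess


def snapshot_fsck_plan(vg, lv, snap="fsck-snap", snap_size="1G"):
    """Build the argv lists for each step; nothing is executed here."""
    origin = f"/dev/{vg}/{lv}"
    snapshot = f"/dev/{vg}/{snap}"
    return {
        "create": ["lvcreate", "--snapshot", "--size", snap_size,
                   "--name", snap, origin],
        # -f forces a full check, -n opens the snapshot read-only.
        "check": ["e2fsck", "-f", "-n", snapshot],
        "drop": ["lvremove", "-f", snapshot],
        # Reset mount count and last-checked time on the live filesystem.
        "mark_clean": ["tune2fs", "-C", "0", "-T", "now", origin],
    }


def run_snapshot_fsck(vg, lv, runner=subprocess.call, **kwargs):
    """Check a snapshot of the LV; mark the origin clean only if e2fsck exits 0."""
    plan = snapshot_fsck_plan(vg, lv, **kwargs)
    if runner(plan["create"]) != 0:
        return False
    try:
        clean = runner(plan["check"]) == 0
    finally:
        # Always drop the snapshot, whether or not the check passed.
        runner(plan["drop"])
    if clean:
        runner(plan["mark_clean"])
    return clean
```

Passing a different runner (for example one that just prints the argv) gives a dry run, which is worth doing before pointing this at a real volume group.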
Which raises the question: does Hans Reiser have a laptop, and SVN access?
Reiser4 was supposed to have a lot of metadata, at least eventually.
Making use of the metadata is not the hard thing, the issue is to make it fast, and try not to break too many APIs. I trusted Reiser on that.
Another group that already had that idea is MS. They have been messing around with that WinFS thing for at least a decade. They were trying to use MS SQL Server at some point; I think that approach is what is keeping them from succeeding.
It is not the equivalent, because more data needs to be transferred. But the effect is the same, according to Intel. It also works for most SSDs.
Can you give me a URL or citation from someone official at Intel who has said this? As near as I can tell, Intel has been very tight-lipped about what the X25-M does internally.
I think I slightly misinterpreted this article:
http://www.pcper.com/article.php?aid=669&type=expert&pid=5
Anyways, writing zeros, or writing something else sequentially should essentially be the same. More dumb Flash based SSDs actually respond to writing zeros, so I just remembered "here you can do the same zeroing trick".
Anyways, writing zeros, or writing something else sequentially should essentially be the same.
No, writing sequentially is not the same as an ATA TRIM command, since the X25-M can't reuse the blocks for real data. It might (or might not) help the internal fragmentation of the X25-M's internal LBA redirection table --- but given that the PC Perspective article pointed out that when things got bad, even a complete write pass across the entire disk was not sufficient to restore performance, I doubt it.
This makes sense, actually; without an ATA trim command, if you write the entire disk, the X25-M won't have much in the way of spare room in order for it to do its garbage collection/defragmentation operation. All it will have is the difference between 80 (real) GB (or GiB's for people who like that notation) and 80 (hd marketing) GB's. And apparently that is not enough.
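To put a rough number on that difference, here is a quick back-of-the-envelope calculation in Python. It assumes, as the paragraph above does, that the drive carries exactly 80 GiB of raw flash and exposes 80 decimal GB; Intel hasn't published the real figure, so treat this as illustrative only.

```python
GIB = 2**30   # binary gigabyte, as flash chips are sized
GB = 10**9    # decimal gigabyte, as drives are marketed


def spare_area(raw_gib, advertised_gb):
    """Return (spare bytes, spare as a fraction of raw flash)."""
    raw_bytes = raw_gib * GIB
    visible_bytes = advertised_gb * GB
    spare = raw_bytes - visible_bytes
    return spare, spare / raw_bytes


spare_bytes, fraction = spare_area(80, 80)
print(f"{spare_bytes / GIB:.2f} GiB spare, {fraction:.1%} of raw flash")
# prints "5.49 GiB spare, 6.9% of raw flash"
```

So under that assumption the drive has under 7% of its flash to work with once every visible block has been written.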
I've had some people suggest that reserving a partition with a few gig's and never using it helps, since that provides some extra room for the X25-M to recover; but I don't have anything authoritative.
But back to the original point, what we really need is a way to tell the disk, "we don't care about the contents of the blocks any more". It *might* be that writing some magic pattern, whether all zero's or all one's, does the trick --- and in fact, all one's makes more sense since an erased flash memory cell returns '1', not '0'. But the key question is whether or not the SSD's firmware treats this as "ok to reuse". And for that we need a definitive answer from Intel.