Changes in HDD Sector Usage After 30 Years
freitasm writes "A story on Geekzone tells us that IDEMA (Disk Drive, Equipment, and Materials Association) is planning to implement a new standard for HDD sector usage, replacing the old 512-byte sector with a new 4096-byte sector. The association says it will be more efficient. According to the article Windows Vista will ship with this support already."
Why not a 4MB sector?
Well, CD-ROMs use 2352 bytes per sector, ending up with 2048 actual bytes after error correction. Looking at the size of the HDDs these days a 4096-byte sector seems pretty reasonable.
Serving time in Aristotelean prison for violating laws of physics
Is Ingo Molnar working on this?
I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes. So how does changing the sector size change things? (Especially when we don't access drives by sector/cylinder anymore?)
In Soviet Russia, articles before post read *you*!
So long as this new format is transparent, built internally into the drives, and doesn't affect older hardware or software, there shouldn't be a problem. It also should not contain any DRM junk.
All too often, an advantage such as a speed improvement is more than countered by added overhead junk.
Now maybe I should RTFA...
Will something like NTFS disk compression actually make it less than one sector, or are you right that every file will take up at least 4096 bytes? Does anyone know if this will do anything to improve retrieval speed?
So... If I write down a little 16-byte message to myself in Notepad containing a name and a phone number, it will take up 4096 bytes.
On most systems in use today, it already does.
Blame the file system, not the sector size on the media.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
You're thinking of 'cluster'. This is tied to the file system that is actually used on the disk. Even with the current 512-byte sector, a normal NTFS partition of, say, 200GB, uses 4KB cluster and a single file takes up a minimum of 4KB already.
Most "normal use" filesystems nowadays (FAT32, Ext3, HFS, Reiser) use 4K blocks by default. That means the smallest amount of data you can change at a time is 4K, so every time you change a block, the HDD has to do 8 sector reads or writes. That leaves the drive performing 8x the number of commands it would otherwise need.
As filesystems slowly move towards larger block sizes, now that the "wasted" space at the ends of blocks is not as noticeable, increasing the sector size on the underlying hardware also makes sense. I don't think this can make things much faster, but it would allow SATA drives (and SCSI too) to queue more commands in their internal buffers, as they will only be receiving one command per filesystem read/write instead of 8.
My UID is prime and so is this number: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0.
NTFS will write something that small into the MFT.
Best analogy is a gym locker room
You have, say, 10 lockers up and 20 lockers across.
You can only put one thing in a locker, so you can't put your gym shorts in the same one as your shoes. But if you have lots of socks, you can pile them in, and take up two or three lockers if necessary.
Space is wasted if you have a really big locker, but it's only holding a sock.
Now, you've got to record where all of this stuff is, or you will take forever to find that sock. So you set aside a locker to hold the clipboard with designations.
Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...
No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
Small devices like cellphones typically save files of several kilobytes, whether they be the phonebook database or something like camera images. Whether the data is saved in a couple large sectors or 8 times that many small sectors isn't really an issue. Either way will work fine, as far as the data is concerned. The biggest problem is the amount of battery power used to transfer those files. If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.
Also, squaring away each sector after processing is a round trip back to the filesystem which can be eliminated by reading a larger sector size in the first place.
Some semi-ATA disks already force a minimum 4096-byte sector size. It's not necessarily the best way to get the most usage out of your disks, but it is one way of speeding up the disk just a little bit more to reduce power consumption.
It is perhaps already taking it or even more.
Today's filesystems usually use larger chunks than 512 bytes to save data. But of course it depends on the filesystem you are using.
And judging from your talk about Notepad, you are using Windows. Windows NTFS uses 4k blocks by default on large (> 2GB) disks.
OSes allocate disk space by the sector. If the last part of a file fills less than 1 sector, the rest of that sector is wasted. This is called internal fragmentation. Unless its size is an exact multiple of the sector size, every file wastes sector_size - file_size % sector_size bytes. On average, that's sector_size/2 bytes per file.
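The slack formula above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes an empty file occupies no allocation units at all.

```python
def slack_bytes(file_size: int, unit_size: int) -> int:
    """Bytes left unused in the last allocation unit of a file."""
    if file_size == 0:
        return 0  # assumption: an empty file occupies no data units
    remainder = file_size % unit_size
    # An exact multiple fills its last unit completely, so nothing is wasted.
    return 0 if remainder == 0 else unit_size - remainder


def allocated_bytes(file_size: int, unit_size: int) -> int:
    """Total space taken on disk once the file is rounded up to whole units."""
    return file_size + slack_bytes(file_size, unit_size)


if __name__ == "__main__":
    # The 16-byte Notepad note from earlier in the thread.
    for unit in (512, 4096):
        print(unit, slack_bytes(16, unit), allocated_bytes(16, unit))
```

The same functions work whether "unit" means a hardware sector or a filesystem cluster; which of the two actually sets the minimum is exactly what the thread is arguing about.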
I still have more fans than freaks. WTF is wrong with you people?
...to back up the claim that this is more efficient, though it intuitively "feels" like it should be faster, but not necessarily more efficient in terms of space. Suppose a file with a size of 1 byte takes up 512 bytes of space on the disk. With this larger sector size, that file would take 4k. I don't see why this isn't an option that can be set through drive initialization parameters, and why you can't choose any size for the sectors, depending on whatever tweaking you can do to figure out what's best for your application.
Also, Solitaire will be replaced by Duke Nukem Forever on every shipped copy of Vista. And if you're one of the first 100 in line at any Best Buy when you pick up Vista, you will also get a free Phantom game console.
Well of course Vista will ship with this supported already. Just like WinFS...er..
Um, it already does take up 4K or more. Unless you have a hard disk smaller than 256MB.
See: http://www.microsoft.com/technet/prodtechnol/winxppro/reskit/c13621675.mspx and scroll down to Table 13-4
If you notice, in most of the useful cases the cluster size is 4K. Making the hard disk match this seems like a good idea to me.
And EXT2 also uses a 4K block size.
Also remember it's for large disks; no FS that I know of supports a cluster (or block) size smaller than 4K for large disks.
-Ariel
To answer someone's question, if all disk transfers are multiples of 4K anyway, you're better off using that as the hardware sector size because there's less overhead -- less spatial overhead on the disk because you have fewer sector headers and intersector gaps, and less temporal overhead in the I/O protocol because you're only sending one transfer command instead of eight.
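A rough way to see the spatial-overhead argument is below. This is a sketch only: the per-sector overhead figure (header, ECC, gap) is a placeholder invented for illustration, not a measured value for any real drive, and both function names are hypothetical.

```python
def on_disk_bytes(data_bytes: int, sector_payload: int, overhead_per_sector: int) -> int:
    """Media bytes consumed when every sector carries a fixed overhead."""
    sectors = -(-data_bytes // sector_payload)  # ceiling division
    return sectors * (sector_payload + overhead_per_sector)


def format_efficiency(sector_payload: int, overhead_per_sector: int) -> float:
    """Fraction of the media that ends up holding user data."""
    return sector_payload / (sector_payload + overhead_per_sector)


if __name__ == "__main__":
    OVERHEAD = 50  # placeholder value, chosen only to make the comparison concrete
    for payload in (512, 4096):
        print(payload,
              on_disk_bytes(4096, payload, OVERHEAD),
              round(format_efficiency(payload, OVERHEAD), 4))
```

In practice a larger sector would likely need a somewhat larger ECC field, so the overhead would not stay constant; the point is only that it is paid once per 4K instead of eight times.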
Your god may be dead, but mine aren't!
I'm willing to sell this account for the right price.
Given the current cost per GB, and that most "small files" are now 10K+, does the overhead caused by this "wasted space" really matter? It still takes a hell of a lot of documents to fill up even a 250GB disk, and as you can now get these disks for next to nothing, I'm happy to get the extra performance.
SolarVPS - Quality Windows and Linux Virtual Servers
Actually, if you're using NTFS, the data will be stored directly in the file entry in the MFT, taking zero dedicated clusters or sectors. The maximum size for this to happen is like 800 bytes.
Here's a short description of how NTFS allocates space. On volumes larger than 2GB, the cluster size (the granularity the FS uses to allocate space) was 4k already unless you specified something else when formatting the drive. Also, Windows NT has supported disk sector sizes larger than 512 bytes for a long time; it's just that anything else has been rare.
I'm sorry, Your response has to be in some form of star-trek (or sci-fi) I would have accepted this however...
Best analogy is Spock's gym locker room
Spock has, say, 10 space lockers up and 20 space lockers across.
Spock can only put one thing in a locker, so Spock can't put his gym shorts in the same one as your shoes. But since Spock has lots of socks, he can pile them in, and take up two or three lockers if necessary.
Space is wasted if Spock uses a really big locker, but it's only holding a sock.
Now, you've got to record where all of this stuff is, or you will take forever to find that sock. (I guess the tricorders are broken) So Spock sets aside a locker to hold the clipboard with designations.
Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...
Isn't this what Apple tried to do 5+ years ago with HFS+?
Competent file system handlers can use disk blocks larger or smaller than the file system block size, but there are some benefits to using the same number for both. Although larger blocks may provide more data per drive, and you can index larger drives with 32-bit numbers, the drive has to use better (larger and more complex) CRCs to ensure sector data integrity, the granularity of replacement blocks may end up wasting more space simply to provide an adequate count of replacements, and there are still some disk space management tools that insist on working in terms of "cylinders", regardless of the fact that disk drives have had variable density zones for ages. The range from 4K (common disk block size) to 16K works as a decent compromise.
"Back in the day" running System V on SMD drives, where you could use almost any block size from 128 Bytes to 32K (the CRCs were weak after that) and control the cylinder-to-cylinder offset of block 0 from the index, I spent a few days trying different tuning parameters and found that, due to the 4K size of the CPU pages, and of the file blocks and swap it really did give a significant improvement in performance. I tried 8K and 16K, because the file system handler could be convinced to break them up, but didn't get any better performance, so used 4k for the spares granularity.
Perhaps I should take one of my late-model SCSI drives, which support low-level reformatting, and try the tests again. 16KByte file system blocks on 16KByte sectors might really be a win now. Have to do some research to see what I can do with CPU page sizes, too.
HDD manufacturers are looking to increase the amount of data stored on each platter. With larger sector sizes, the HDD vendor can use more efficient codes. This means better format efficiency and more bytes for the end user. The primary argument is that many OSes already use 4K clusters.
During the transition from 512-byte to 1K, and ultimately 4K sectors, HDDs will be able to emulate 512-byte modes to the host (i.e. making a 1K or 4K native drive 'look' like a standard 512-byte drive). If the OS is using 4K clusters, this will come with no performance decrease. For any application performing random single-block writes, the HDD will suffer 1 rev per write (for a read-modify-write operation), but that's really only a condition that would be found during a test.
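The read-modify-write penalty of 512-byte emulation can be sketched as a toy model. Everything here is hypothetical (the class name, the counters, the in-memory "media"); it only counts physical sector operations and does not model rotation time or real firmware behaviour.

```python
PHYSICAL = 4096  # native sector size in bytes
LOGICAL = 512    # sector size presented to the host


class Emulated512Drive:
    """Toy model of a 4K-native drive that presents 512-byte sectors."""

    def __init__(self, physical_sectors: int) -> None:
        self.media = [bytearray(PHYSICAL) for _ in range(physical_sectors)]
        self.physical_reads = 0
        self.physical_writes = 0

    def write(self, lba: int, data: bytes) -> None:
        """Write whole logical sectors starting at logical block address lba."""
        if len(data) % LOGICAL:
            raise ValueError("data must be a multiple of the logical sector size")
        start = lba * LOGICAL
        end = start + len(data)
        pos = start
        while pos < end:
            phys, offset = divmod(pos, PHYSICAL)
            chunk = min(PHYSICAL - offset, end - pos)
            if chunk < PHYSICAL:
                # Partial update: the old contents must be read first.
                self.physical_reads += 1
                sector = bytearray(self.media[phys])
            else:
                # Full physical sector: no read needed.
                sector = bytearray(PHYSICAL)
            sector[offset:offset + chunk] = data[pos - start:pos - start + chunk]
            self.media[phys] = sector
            self.physical_writes += 1
            pos += chunk


if __name__ == "__main__":
    drive = Emulated512Drive(physical_sectors=4)
    drive.write(3, b"\x01" * LOGICAL)    # lone 512-byte write: read + write
    drive.write(8, b"\x02" * PHYSICAL)   # aligned 4K write: write only
    print(drive.physical_reads, drive.physical_writes)
```

A host that always writes aligned 4K clusters never triggers the read step, which matches the claim that emulation costs nothing when the OS already uses 4K clusters.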
Almost all filesystems I know of use at least 4Kb clusters. NTFS does come with 512 byte on smaller partitions.
LBA addresses on sector boundaries, so for larger HDDs you need more bits (currently 28-bit LBA, which some older BIOSes support, means a maximum of 128GB: 2^28*512 = 2^28*2^9 = 2^37). Since 512-byte sectors have been used for 30 years, I think it is easy to assume they will not last for 10 more years (getting to the LBA32 limit). So why not shave off 3 bits and also make it an even number of bits (12 against 9)?
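The arithmetic is easy to reproduce. A small sketch (the function name is made up for illustration):

```python
GIB = 2 ** 30


def max_capacity_bytes(address_bits: int, sector_size: int) -> int:
    """Largest capacity addressable with the given LBA width and sector size."""
    return (2 ** address_bits) * sector_size


if __name__ == "__main__":
    print(max_capacity_bytes(28, 512) // GIB)   # 128
    print(max_capacity_bytes(28, 4096) // GIB)  # 1024
    print(max_capacity_bytes(32, 512) // GIB)   # 2048
```

Going from 512-byte to 4096-byte sectors multiplies the reach of any fixed address width by 8, which is the "shave off 3 bits" in the comment.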
Also there is something called "multiple block access" where you make only one request for up to 16 (on most HDDs) sectors. For 512-byte sectors you have 8K, but for 4K sectors that means 64K. Great for large files (I/O overhead and stuff).
On the application side this should not affect anyone using 64-bit sizes (since only the OS would know of sector sizes); as for 32-bit sizes, it already is a problem (4G limit).
So this should not be a problem, because on a large partition you will not have too much wasted space (I have around 40MB of wasted space on my OS drive for 5520MB of files, and I would even accept 200MB).
Finally! This is what I really wanted for years. Can't believe this innovation has not materialized earlier. This is great for performance, TCO, iPods, everything. I can't wait to get my hands on one of those new goodies!
It can't be, at least not efficiently. Like flash devices, it's impossible to write less than a sector at a time.
If this were transparently implemented by the hardware, the OS would frequently try to write a single 512-byte sector. In order for this to work, the hard drive controller would have to read the existing sector, then write it back with the 512 bytes changed. This is a big waste, as a read then a write costs at least a full platter rotation (1/120 second, about 8.3 ms, at 7200 RPM). Do this hundreds or thousands of times per second, and you have a nice slow hard drive.
Melissa
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
Funniest comment I have read in ages...
:(){
Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.
OK, let's all get real for a minute: if block size is not consistent, how the hell will disk defragmenter and optimizer software work? The whole trick to that software is moving the blocks around. Now if you want to lose the ability to fix and speed up your hard drive, go for variable block size. But those of us who run multiple large drives want a small fixed block (to save space on things like fonts and e-mail), otherwise our drives will quickly become unmanageable. Dan
That's a bonus for all those boot-sector virus writers - 8 times more space to do their dirty deeds...
Who the hell uses notepad to make notes?
But really...think about this: if each sector has overhead, then any file over 512 bytes will have less overhead, and you'll effectively get more space in most cases. What percentage of YOUR files are less than 4k?
like /swap?
Man that takes me back. Where's my toupee....
Do not mock my vision of impractical footwear
If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.
Well sorry, but that's the way it is.
Hard drives generally have the ability to read/write multiple sectors with a single command. (Go read the ATA standards). And DMA is usually used [ program I/O just plain sucks].
I don't see how changing the sector size is going to save power... Either way they have to increase the size of the buffers for the read/write multiple operations. So these could just be increased while keeping 512-byte sectors and the same benefit would result.
The real reason for this is that as densities go up, the number of bits affected by a bad spot goes up. So it's desirable to error correct over longer bit strings. The issue is not the size of the file allocation unit; that's up to the file system software. It's the size of the block for error correction purposes. See Reed-Solomon error correction.
Modern DASD architecture is almost completely hidden from the user. In the (good?) old days system software needed to interface closely with the DASD and needed to understand the hardware architecture to gain maximum performance from the devices (I know because I work on such systems within an IBM mainframe environment on airline systems which require extremely high speed data access).
Nowadays the disk 'address' of where the data actually resides is still couched in terms that appear to refer to the hardware itself but in 'serious' DASD subsystems (e.g. the IBM DS8000 enterprise storage systems )the actual way in which the hardware handles the data is masked from the operating system. Data for the same file is spread across many physical devices and some version of RAID is used for integrity.
The 4096 value for data 'chunks' has to do with the most efficient amount of data that can be transmitted down a DASD channel (between a host and storage in large systems or the bus in self-contained systems)
The idea of a 'file address' would cease to exist and it would be replaced by a generic 'data address' if it weren't for the in-built assumptions about data retrieval within all current Operating Systems.
5 K becomes 8 K.. times 500 ... is a whopping 1.5 MEGAbytes wasted. I mean, that is more than fits on a floppy. What a waste.
Actually a 200 GB drive can still store 25 million files. How many fonts do you have?
FWIW the advantage is in the error correction. For a 1-bit sector size, you'd need 3 bits to store it with error correction. As the block becomes larger, the error correction becomes more powerful. That is where the advantage is.
Of course data can still be stored byte-wise on the disk - it is only that a small update will require a read-modify-write transaction.
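The "1 bit needs 3 bits" remark matches a single-error-correcting Hamming code, where r check bits protect m data bits if 2^r >= m + r + 1. The sketch below uses that bound only because it fits the remark; real drives use stronger codes (Reed-Solomon is mentioned elsewhere in the thread), so treat the numbers as an illustration of how relative overhead shrinks with block size, not as what a drive actually stores.

```python
def hamming_parity_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    for size_bytes in (512, 4096):
        m = size_bytes * 8
        r = hamming_parity_bits(m)
        print(size_bytes, r, f"{r / m:.5%}")
    # The 1-bit case from the comment: 1 data bit + 2 check bits = 3 bits stored.
    print(1 + hamming_parity_bits(1))  # 3
```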
Windows Vista will ship with this support already.
Oh YEAH? Well Linux has had support for it for eleventeen years, and the Linux approach is more streamlined anyway!
What would you use, Word? For simple text-only notes, less is better. If I need to jot down a phone number quickly before I forget, I'm just gonna hit notepad (or pico on unix) and be typing it in seconds; I don't want to wait several minutes for Word to get its bloated ass in gear.
What about /swap? Do you mean the swapfile partition in linux? That too uses a 4K page size.
-Ariel
Nevertheless, Heineman was still one heck of an airplane designer.
As far as the 8" floppy, ISTR that they were intended to replace punched cards; 77 tracks with 26 sectors (hard coded) came out to be pretty close to a box of 2000 Hollerith cards (80 columns with 12 bits per column). 8" drives were available before the end of 1975, and the VAX came out in 1977(?). One of the uses for the floppies was loading the microprogram store on the VAX and IBM machines of the same era.
Anyway, why do you give a flying fuck? You can get a 250 GB hard drive for less than $100 these days ... at that rate, that 4096 bytes costs you about $100/64000000~= $.0000016. Was that really worth your time bitching? I didn't think so.
The 11/750 loaded its microcode from a little magnetic tape. It used to take (seemingly) ages to get going. I used to boot PDP 11/84's and 83's from TK50 tape. This was in traffic signal cabins out in the middle of nowhere, usually at 0200 or so. I could go for a walk and listen for the console printer to start chattering as the system came up.
I think the pdp's are the reason I now use NetBSD. Not sure why. Just a similar feel.
http://michaelsmith.id.au
I have my eyes peeled for a bio-drive, something noxious smelling that you feed with potato rinds which stores your data directly in its DNA. What d'you reckon? Another thirty years.
Informative, but wrong.
Some file systems can pack multiple tail fragments into one block.
Watch this Heartland Institute video
You've got porn vids of less than 4KB? Cool!
You all miss the point. To answer the coward: I have over 1100 fonts plus 1300 active emails, mostly text, under 6K each. Unlike you, I use HFS+. When I copied my fonts and email over to HFS, they grew by over 2 gigs. I still don't get the error thing: error correction should not be affected by block allocation size, as it just links to the next block. Then we still have the defrag and optimize problem.
https://addons.mozilla.org/extensions/moreinfo.php?id=2011&application=firefox
Uh, don't most email programs store mail as a single large database file, rather than zillions of individual files?
All that is still trivial on even the smallest disks supplied today. In any case, your OS is already using 4K blocks so you're already doing that anyway regardless of the underlying sector size.
Oolite: Elite-like game. For Mac, Linux and Windows
Hmm. This reminds me of the time when I bought my first external Firewire drive (120Gb) and used it to back up my 10Gb iMac, which had lots of small files (fonts, Word 5.1 documents, etc). Those 10Gb of backups ended up occupying 90Gb of drive space because the external drive had been pre-formatted with some large sector size, and even the smallest file took up half a megabyte! So I had to reformat the drive and start again...
You must think in Russian.
All modern operating systems do demand-page loading of executables and use paging space on disk (the swapper). Memory pages are 4Kbyte on all the CPU architectures we are using at the moment in a personal computer. Therefore, 4Kbyte is probably the ideal size (since loading a page into memory now takes only one read command instead of 8). Making it bigger, say 8Kbyte, would complicate VMM design (since if you only need to load one page, you now wind up loading two and having to throw one away, or at best, you'd wait twice as long while 8Kbyte loads instead of 4Kbyte).
Actually, this almost can't be anything but a good thing.
First of all, most OSes these days use a memory page size of 4k. Having your IO system page match your CPU page makes it much more efficient to DMA data and the like. Testing has shown that this is generally helpful.
Second, RAID will benefit here. Larger blocks mean larger disk reads and writes. In terms of RAID performance, this is probably a good thing. Of course, the real performance comes from the size of the drive cache, but don't underestimate the benefit of larger blocks. Larger blocks mean the RAID system can spend more time crunching the data and less time handling block overhead. The fact that more data must be crunched for a sector write is of concern, but I'd bet it won't matter too much (it only really matters for massive small writes, not generally a RAID use case).
Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache).
Netware 4 did something called Block Suballocation. While not as tightly packed as Reiser tail blocks, it did split their larger 32KB or 64KB blocks (which were chosen to keep block addresses small and large file streaming faster) into disk sectors and store tails in them.
NTFS has block suballocation akin to Netware, but Windows users are, to my knowledge, out of luck until MS finally addresses their filesystem (they've been putting this off forever). Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).
To my knowledge, ReiserFS is the only filesystem with tail packing. If you are really interested in this, see your replacement brain on the Internet.
Fourth, larger sectors means smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers were the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.
Finally, someone mentioned sector size issues with defragmenters and disk optimizers. These programs don't really care as long as all of the sectors on the system are the same size. Additionally, they could be modified to deal with different sector sizes. Ironically, modern filesystems don't really require defragmentation, as they are designed to keep fragments small on their own (usually using "extents"). Ext2, Ext3, Reiserfs and the like do this. NTFS does it too, although it can have problems if the disk ever gets full (basically, magic reserved space called the MFT gets data stored in it and the management information for the disk gets fragmented permanently). If it weren't for a design choice (I wouldn't call it a flaw as much as a compromise) NTFS wouldn't really need defragmentation. ReiserFS can suffer from a limited form of fragmentation. However, v4 is getting a repacker that will actively defragment and optimize (by spreading out the free space evenly to increase performance) the filesystem in the background.
I really don't see how this can be bad unless somebody makes a mistake on backwards compatibility. For those Linux junkies, I'm not sure about the IDE code, but I bet the SATA code will be overhauled to support it in a matter of weeks (if not a single weekend).
I think Mauve has the most RAM. --PHB (Dilbert Comic)
Thanks for the first laugh of the day!!! :)
http://www.intellipool.se/ - Intellipool Network Monitor
Oh, come on. A Vulcan would never store his gym shorts in someone else's locker.
Stasis is death. Embrace change.
This can be a little inconvenient if you deal with a lot of small files though. Logging for the application I develop generates 2500 files. This is 10 megs of wasted space for each run. Okay, this still isn't all that significant, but there may be people who generate considerably more than that.
Coward, I'm beginning to see why you use your handle. Disk block, disk sector size and cluster size have nothing to do with each other.
There have been some good comments about FS/sectors and such. I think it can be dumbed down to 2 options:
Create a file system and sector size to maximise capacity or.....
Create a FS and sector combo to maximise performance (speed).
As far as the defragmentation issue, this could be lessened by creating a 'system managed' partitioning structure that allows file reads and writes only on the drive surface it actually needs: ie a partition that grows. The less mapping it has to do- the faster it is. I really think that the HD logic can really be tweaked on this one.
Don't be apathetic. Procrastinate!
I was going to write the same.
Allocation size is irrelevant, as many advanced systems support fragments (however, still not implemented in ext2/ext3 :-( ), but sector size matching memory page size can increase performance.
And from a past discussion, some people think that the 512 bytes comes from the memory page size of the VAX.
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
On Linux, the ReiserFS and Reiser4 filesystems "squish small files together" to avoid this problem.
On the alien planet Kamarr they can compress small file sizes together natively for the past 5 versions of Door. However we do not live on the planet Kamarr.
Wow, finally, a new block size, never heard of that idea before.
Doesn't anyone remember that SCSI drives supporting a changeable block size have been around basically forever? Of course, with hard disks it was used mostly to account for additional error-correcting / parity bits, but magneto-optical media could also be written with 512 or 2k (if I remember correctly).
(first hit I found: http://www.starline.de/en/produkte/hitachi/ul10k300/ul10k300.htm allows 512, 516, 520, 524, 528 but there are devices that do several steps between 128 and 2k or so...)
Yesterday, we had Soviet Britannia. Today we have this.
What're we going to get to top that tomorrow?
Isn't there some truth to this? I thought there was currently a 512byte limit on bootsector virus sizes, or at least a 512byte limit to tell the system which and how to execute the next block.
Won't this also affect lilo and the like? Now I foresee all sorts of things needing to be rewritten, so that's why Microsoft knew they wouldn't ship till after Christmas. Wow, they're so clairvoyant! But honestly, the current forms should still work, just not take up all the space, eh? How does this affect the linux boot sector limit? It probably won't.
2^3 * 31 * 647
You're all missing one key point. Your 512 byte sector is NOT 512 bytes on disk. The drive stores extra track/ecc/etc information. So a 4096-byte sector means less waste, more sectors, more useable space.
Tom
Someday, I'll have a real sig.
Uhmm... NO!
This is a quick and dirty hack to check that the generated data is correct. I'm not going to spend weeks designing a data file format, and an API plus conversion tools to export the files to an Excel-compatible format, just because I've got an inefficient file system.
A new hard drive would be a better investment. Or alternatively just ignore the problem, since NTFS seems to handle these adequately.
And sometimes its simply impossible to write a solution that will work like this. Some applications require a large number of discrete files.
I could make this joke every fucking day on this site. To be quite frank, 90% of the people who post on this site could step into my shoes quite easily.
I really don't know much about how drives store data. So this may be a really stupid question. But do larger sectors also mean the boot sector? Is this good news for boot loaders?
Did you come up with the name BadAnalogyGuy? It sounds like it could have been one of the superhero names on "Whose Line Is It Anyway?"
1983, trying to convince the CDC engineer that yes, I did want him to configure the disk for 336 byte sectors.
Ah, the joys of using a Harris 24 bit word/8 bit byte/112 word disk sector machine.
Back in the late 70's (1979?) Digital Research, of CP/M fame, provided the same capability in CP/M 2.0. They called it their Sector Blocking / Deblocking algorithm. As you increase the sector size, which has NOTHING to do with the minimum allocated size the OS uses, you get more disk space per track. I played around with sector blocking / deblocking on my (still functioning) Thinker Toy's 2D disk controller (double density floppy disk controller). I was able to increase the size of the sector from 256 bytes to 1024 bytes, which gave me extra space. I don't remember the differences now, but disk space went from something like 490 KB to 596 KB for a single sided 8 inch floppy. A few years later, I had access to the Shugart 14 inch Winchester hard disk, which at 256 bytes per sector gave 20 MB of space. The drive allowed bigger sectors and I played around with 1024 byte sectors, and it gave over 26 MB of disk space.
Since the OS determines the minimum number of "logical" sectors per allocation unit, this determines how efficient the file system is. Also, big files like big allocation units, while small files like small allocation units. It's just a trade-off of space against performance.
Who's bitching? I merely make the observation that inefficient filesystems can cause a problem with small files if you have a large number of small files, illustrating it with an example of a typical case. Then you bitched about my code quality, and suggested an alternative that aside from being useless, is in certain cases impossible, and where possible doesn't solve the problem.
"The operating system does/should not "know" anything about how the data is physically stored by a device"
You're talking about LBA, but that only applies to cylinders/heads. The OS does map to the sector (e.g., file inode stored at sector 12345 from the partition beginning, which says that file begins on sector 23123 etc). If it didn't use sectors, it would need an extra 9 bits to store the location of everything within the filesystem.
The filesystem also communicates with the driver using sector numbers. It's only when you reach the 'file' abstraction level (either IO calls or memory mapped) that you switch to using bytes.
Although modern FS's will share a block for small files (or the tails of files), I don't think they do this for the actual FS data structures, which I'd guess are block quantized, so they would need to be aware of the block change to make use of the rest of the sector (either storing more info per inode, or more than one inode per sector).
The revolution will not be televised... but it will have a page on Wikipedia
That post was illogical.
sig?
I'm sure they probably did something like that on the show. But this name itself came to me like a bolt out of the blue. Angels sang and birds chirped and thus I was born, from the forehead of Zeus, you could say.
sorry, early-after-waking slashdot post :-p
The filesystem does communicate with the driver with sector numbers, but it uses its own block size for addressing, and then shifts the address to get the sector number.
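A minimal sketch of that shift, in Python. The function names are hypothetical and the 512-byte default is just today's sector size; real filesystems keep the shift precomputed in the superblock rather than recalculating it.

```python
SECTOR_SIZE = 512  # bytes; the current standard discussed in this thread


def sectors_per_block_shift(block_size, sector_size=SECTOR_SIZE):
    """Return log2(block_size / sector_size); the ratio must be a power of two."""
    ratio, remainder = divmod(block_size, sector_size)
    if remainder or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("block size must be a power-of-two multiple of the sector size")
    return ratio.bit_length() - 1


def block_to_sector(block_number, block_size, sector_size=SECTOR_SIZE):
    """First sector of a filesystem block, counted from the partition start."""
    return block_number << sectors_per_block_shift(block_size, sector_size)


if __name__ == "__main__":
    print(block_to_sector(23123, 4096))        # 4K blocks on 512-byte sectors
    print(block_to_sector(23123, 4096, 4096))  # 4K blocks on 4K sectors
```

With 4096-byte sectors and 4K blocks the shift is zero, so the block number and the sector number are the same thing.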
Scratch pretty much all else I said!
The revolution will not be televised... but it will have a page on Wikipedia
...the new bootstrap loader for Vista will be a mini VBScript interpreter... and built into the shell... oh those clever MS folk
The revolution will not be televised... but it will have a page on Wikipedia
It was done with hard drives too. Back around 1986 I was working in data recovery. Tandon computers used to sell MS-DOS machines with 1KiB sectors. They ran a specially modified version of DOS.
The problem, of course, was that people wanted to upgrade to the latest MS-DOS from Microsoft. So they would replace Tandon DOS with MS-DOS, and suddenly their entire hard drive would be scrambled.
And then they'd call us.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
Statistically speaking, on an FS that allocates whole blocks, the wasted space will be the block size * half the number of files on the drive.
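Here is that rule of thumb as a few lines of Python, next to the exact slack for one file. It assumes file sizes are spread evenly relative to the block size, which is a simplification; the function names are made up.

```python
def slack_bytes(file_size, block_size):
    """Bytes left unused in the last block of a file."""
    return -file_size % block_size


def estimated_waste(file_count, block_size):
    """Rule of thumb: each file wastes half a block on average."""
    return file_count * block_size // 2


if __name__ == "__main__":
    # The 16-byte note from earlier in the thread, on 4K and 512-byte blocks.
    print(slack_bytes(16, 4096), slack_bytes(16, 512))
    # A million files on 4K blocks.
    print(estimated_waste(1_000_000, 4096))
```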
;-)
Let's not kid ourselves, this is mostly gonna be useful for the drive we store our movies on
The revolution will not be televised... but it will have a page on Wikipedia
Reiser is not the first file system with this idea:"Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache)."
Most filesystems don't, and haven't for decades, wasted these final blocks: "With larger block sizes, disks with many small files would waste a lot of space, so BSD added block level fragmentation, where the last partial block of data from several files may be stored in a single "fragment" block instead of multiple mostly empty blocks."
The performance boost is that we can store small dot files together in 1 block, and (with readahead) speed up things like logins and other operations that read in this small set of data.
FAT and FAT32 couldn't handle this, though (nor many other FS features). While I haven't studied NTFS in the detail that I've studied UNIX file systems, I doubt they don't have support for this. NTFS as of 5.0 supports pretty much every feature of a modern UNIX file system.
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
About 15 years ago, when the 3.5 in HDD held 500Mbyte, you could reformat your SCSI disk to get an optimum sector size. The disks I was using handled, I think, any even sector size from 128 bytes to 4096 bytes.
Because the disks were so low capacity, you wanted to use every byte, so I reformatted the disks to an optimum sector size for my application, which was about 1812 bytes IIRC. This achieved about 5% extra useful data on the system.
I think there was at one time a need for 2040 byte sectors for the IBM System/38, which had a 33rd bit on each word, which had to be saved to disk.
When the generation of disk changed to 1Gbyte, the controllers had an error one in every few tens of thousands of reads: it simply never completed the transaction. It never happened with 512 byte sectors, and the drive manufacturer only tested at 512 bytes, so we switched back to 512 and have been there ever since.
Consciousness is an illusion caused by an excess of self consciousness.
ReiserFS with its "tail conversion" roxorz.
Don't they read the whole track into a cacheline anyway?
The revolution will not be televised... but it will have a page on Wikipedia
It's like the film... I, DEMA... about an intelligent disk drive who err... needed to save the world *cough*
The revolution will not be televised... but it will have a page on Wikipedia
Uhm. This is a discussion forum. We're discussing whether the overhead is worth it for the improvements in I/O efficiency.
I think it all started with the first Vax 780, or possibly the first IBM 370 channel controller. Those old machines booted with a 7" floppy that had a capacity of 0.5k. Yep, 512 bytes. Early bootstraps could store the entire contents on to a hard disk with very few instructions if the sector size matched.
Man that takes me back. Where's my toupee....
I think it's in the same place as your vacuum tubes.
haha. I'll take it!
;)
Yah, definitely a bad analogy but I couldn't resist
No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
In 1963, when IBM was still firmly committed to variable length records on disks, DEC was shipping a block-replaceable personal storage device called the DECtape. This consisted of a wide magnetic tape wrapped around a wheel small enough to fit in your pocket. Unlike the much larger IBM-compatible tape drives, DECtape drives could write a block in the middle of the tape without disturbing other blocks, so it was in effect a slow disk. To make block replacement possible all blocks had to be the same size, and on the PDP-6 DEC set the size to 128 36-bit words, or 4608 bits. This number (or 4096, a rounder number for 8-bit computers) carried over into later disks which also used fixed sector sizes. As time passed, there were occasional discussions about the proper sector size, but at least once the argument to keep it small won based on the desire to avoid wasting space within a sector, since the last sector of a file would on average be only half full.
For simple text-only notes, less is better.
But only for viewing.
Give me Classic Slashdot or give me death!
I hope that this backward compatibility remains for a very long time or archives that are stored on older drives will be lost eventually.
I know that the expected life of data on magnetic drives is not all that long but it does not mean we do not try to recover ancient data. I still pull stuff from reel tape using dd from over 10 years ago. Migration is a solution but I am too lazy to do it.
In my first real job, I worked for Prime Computer. They stored a bit of data along with each filesystem 512 byte block. If I remember correctly, they stored a forward and a backward link. If the filesystem was corrupted, they could restore many of the files because of the links in the data portion of the disk. This led to 528 byte sectors (if I remember correctly).
Having a weird sector size limited the manufacturers who would supply disks.
This decision also made file deletion slow -- as every block of a file was re-written.
The filesystem supported several file-organization types. The two most prevalent were sequential access and dynamic access. Sequential access files had a pointer from the directory to the file. The file had forward and backward links at the head of each block. To read the 300th block, it would be necessary to read the first 299 blocks before reading the 300th block. To erase the file, each block would need to be read (to obtain the forward link to the next block), and then written.
Dynamic access files would have a table of pointers to disk blocks, residing in the first blocks of the file. Normal reads wouldn't see the data in the first block directly.
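To make the difference concrete, here is a toy model in Python. It is only a sketch of the two layouts as described above, not Prime's actual on-disk format: the dictionary "disk", the field names, and the assumption that the pointer table costs exactly one read are all mine.

```python
def build_chain(block_numbers):
    """Build a toy disk where each block records its neighbours and a payload."""
    disk = {}
    for i, number in enumerate(block_numbers):
        disk[number] = {
            "prev": block_numbers[i - 1] if i > 0 else None,
            "next": block_numbers[i + 1] if i + 1 < len(block_numbers) else None,
            "data": f"payload {i}",
        }
    return disk


def read_sequential(disk, first_block, position):
    """Reach the block at `position` (0-based) by following forward links."""
    current, reads = first_block, 0
    for _ in range(position):
        current = disk[current]["next"]  # each hop means reading that block
        reads += 1
    reads += 1                           # finally read the target block itself
    return disk[current]["data"], reads


def read_dynamic(disk, pointer_table, position):
    """Reach the same block via a pointer table (one table read assumed)."""
    return disk[pointer_table[position]]["data"], 2


if __name__ == "__main__":
    numbers = [7 * i + 3 for i in range(300)]   # scattered block numbers
    disk = build_chain(numbers)
    print(read_sequential(disk, numbers[0], 299))
    print(read_dynamic(disk, numbers, 299))
```

The links are what made recovery after corruption possible, and also why deleting a file meant touching every one of its blocks.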
Amazing what cruft sits in my brain after over a decade.
Where law ends, tyranny begins -- William Pitt
"I see you are trying to use your computer, would you like me to use it as a spam server?"
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
Or Knotes (or the Gnome equivalent)
May contain traces of nut.
Made from the freshest electrons.
About time. SCSI has a 32-bit block number built into the protocol. With 512 byte blocks, that's a 2 TB architectural limit. We can expect to be able to buy a 2 TB disk off the retailer's shelf in another two or three years.
We're not talking about filesystems. If we were I can remember choosing my ext2 block size anything between 1k and 4k about six years ago. Apple is easily predated here.
Malike Bamiyi wanted my assistance.
When using DMA it doesn't really matter since the whole transfer is performed by the drive. When using PIO, software can use MULTIPLE MODE versions of read and write commands to transfer more than one sector at a time. This results in better performance only because of host software and interface issues. It's not a hard drive internals thing. Multiple mode was added to IDE in the late 80's. You're a little out of date.
Changing the sector size will in no way reduce "the number of error checks and sector head seeks (sic)". It increases the density of user data on the disk with essentially no downside. PC's have not been able to use different sector sizes in the past since BIOS'es assume 512 bytes. It's about time that got fixed.
Forcing a 4K sector size means that the smallest IO will now be 4K. Disk and storage controller vendors will like that I'm sure. Designing high performance storage while worrying about 512 byte performance is ridiculous. Caches no longer have to worry about small sectors. Hoorah!
This is true by default, but is not a requirement. I just got a new 160 GB drive. See http://predelusional.blogspot.com/2006/03/using-pricewatch-effectively.html
By using a command like
mke2fs -j -b 1024 -m 0 -N 2000000 /dev/hdb1
one can have 1K-sized blocks. I argue that with mostly contiguous filesystems, the block size is largely irrelevant to performance. A small block size reduces the wasted space at the ends of files. For me, going from 4K to 1K gave me 4 GB more free space on my 160 GB drive for my current files - 2.5%.
System supported file compression would yield more, of course. Sure, the mp3 & jpg files won't compress, but I have lots of text files. A good compression system will know this and not bother compressing files that are incompressible. It might even use a file like magic cookie mechanism so it doesn't have to attempt compression to find out it's pointless.
If the improvement in space due to consolidated ECC codes and other overhead saves more total space than the end-of-file wastage, it would still be a win. The article doesn't say how much better the new standard will be.
It may be that Linux will provide 1K filesystem blocks on top of the 4K physical blocks. Performance will be worse. However, the original ext filesystem provided half K blocks, and that option is now moribund.
For new systems, this standard is fine. However, I run my machines into the ground, and the better ones have lasted fifteen years. Given that my current machine could last another ten years, it would be a shame to have to toss it into the landfill in five years because the disk drives can't be replaced. Progress is good. Forced upgrades are not.
-- Stephen.
Yeah, but does it run Linux?
Use some better filesystem then. ReiserFS can group small files so they won't take that much space.
You can get a 250 GB hard drive for less than $100 these days ... at that rate, that 4096 bytes costs you about $100/64000000~= $.0000016. Was that really worth your time bitching?
Maildir.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
most modern disk utils let you choose between 512b 1k 2k and 4k sectors
what we will start to see here is our inodes will fill up faster than our disks.
or, on NTFS our $MFT will fill up past the files.
I format all my drives with 512 byte sectors.
this means more sectors on a disk than with 4096 byte sectors.
if you have a lot of small files you should use 512 byte sectors.
the caveat is more sectors means more to go wrong.
They're using their grammar skills there.
Vista will also ship with support for holographic storage, 3D-free air holographic display drivers, and cerebral direct interface.
Vista, like Visa, will leave you further in debt and miserable about it.
Atheism is a religion like not collecting stamps is a hobby.
Not always the case. In many "modern" file systems (jfs, ReiserFS, xfs(?), and likely others), small files can be stored directly within the directory structure of the file system itself - eg, as part of the B*-tree that forms the file index. So you'll actually have a number of tiny files together taking up only one disk sector. Obviously that's not possible on older filesystem architectures (ext2/3, FAT, etc) where each inode (or the moral equivalent) points directly at the unique start sector for the file.
Don't know how NTFS works - someone more knowledgeable can chime in here.
--
/tsg/
It's a tradeoff between storage granularity and potential for speed. If a system can read or write more in a given cycle, that can be exploited for performance. A big sector or cluster size can mean much better performance for some types of data, specifically, A/V media. The sector size on a low level format also dictates the way the firmware is written, and a bigger sector size might mean an order of magnitude more address space for the controller; I wonder if that's becoming some kind of bottleneck for the manufacturers? (I didn't RTFA, it was slashdotted.)
-fb Everything not expressly forbidden is now mandatory.
Efficient support for very small files would allow a lot of crap on Windows/Linux (such as the Registry and the Gnome copy of it) to be eliminated, and allow all "metadata" (things like the artist in a song) to be stored as files. Read up on the ReiserFS plans. Ideally 99.99% of the files on a disk would be less than 100 bytes.
However this is best solved by writing the filesystem to put these small files into the blocks with other data, such as many small files together, or inserted directly into the directory. Correctly done you would get a bunch of small files at once with a single read, and since use of this would probably need to look at many at once, this could be very efficient. In any case larger sectors are harmless.
This is nonsense. All OS'es allocate space by the cluster, typically 4K up to 32K for fat32. The driver talks to the drive in 512-byte sectors, something which is never visible to the end user. If the sector size changes from 512 bytes to 4K it would have no effect on any Windows version, wastage would remain the same. Only harddisks and drivers would become more efficient and maybe a little faster. The only users affected would be people using fdisk-like tools that allow you to work in head/track/sector notation.
I have some 5+ year old scsi disks that can be low-level formatted with 4k sectors. This is nothing new, aside from the 'standard' increasing the sector size.
Memory serving, the scsi-2 spec also allows sector sizes up to 16k or more during a format, if the drive supports it.
And if you read a tail end of a file, or a very small file, it still has to read the block, mask out the other data and shift it to the beginning of a block in memory and then return it.
Worse, a normal block write is just a write, but a small block write needs a read, an update in memory and a write.
With 300 GB disks the space saved is probably worth $0.0001, but the slow speed remains. If you keep 100 GB of free space anyway, wouldn't you want to turn off this 'useful' feature?
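The extra read on a small write looks roughly like this in Python. It is a toy in-memory device that only counts I/Os; BlockDevice and write_partial are made-up names, not any real driver API.

```python
class BlockDevice:
    """Toy device that only transfers whole blocks and counts the transfers."""

    def __init__(self, block_count, block_size=4096):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(block_count)]
        self.reads = 0
        self.writes = 0

    def read_block(self, number):
        self.reads += 1
        return self.blocks[number]

    def write_block(self, number, data):
        if len(data) != self.block_size:
            raise ValueError("only whole blocks can be written")
        self.writes += 1
        self.blocks[number] = bytes(data)


def write_partial(device, number, offset, data):
    """Small write into a shared block: read, patch in memory, write back."""
    if offset < 0 or offset + len(data) > device.block_size:
        raise ValueError("data does not fit inside the block")
    buffer = bytearray(device.read_block(number))
    buffer[offset:offset + len(data)] = data
    device.write_block(number, buffer)


if __name__ == "__main__":
    dev = BlockDevice(4)
    dev.write_block(0, b"x" * 4096)        # full block: one write, no read
    write_partial(dev, 1, 100, b"tail")    # partial block: one read plus one write
    print(dev.reads, dev.writes)
```

A cache can hide the read if the block is already in memory, so how much this hurts in practice depends on the workload.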
Anyone know anything about this?
Ya, it's BS and totally untrue. I've got two Hitachi drives in RAID-0 using the built-in Promise controller on my MB. I've never had a single problem.
Data in RAID-0 is twice as likely to get corrupted, however, if you have bad memory or memory timing. This happens a LOT. The only way to ensure data is getting read from and written to correctly is to make sure the bits aren't flipping in RAM. Run Memtest 86 Plus to verify. Also, people like to mount two hard drives next to each other without proper cooling. A drive that overheats will cause all sorts of controller problems and even stress on the bearings (spindle and actuator arm).
Life is not for the lazy.
"This is true by default, but is not a requirement."
I know. It's just a good idea to do so.
"By using a command like
mke2fs -j -b 1024 -m 0 -N 2000000 /dev/hdb1
one can have 1K-sized blocks. I argue that with mostly contiguous filesystems, the block size is largely irrelevant to performance."
And there you have the difference between theory and reality :) Actual testing finds you to be wrong. Remember there is 4 times the overhead in dealing with smaller block sizes - and also that when you don't match the page size of the CPU in question it adds even more overhead. You have a blocks in use map that is 4 times larger, and for every file the block list is 4 times larger. That wastes a little bit of space, and more overhead.
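To put rough numbers on the "4 times larger" bookkeeping, here is a back-of-envelope sketch in Python. It assumes one bit per block in the in-use map and a 160 GB drive counted as 160 * 10**9 bytes, and it ignores how ext2/ext3 actually split the map across block groups; the function names are made up.

```python
def block_bitmap_bytes(volume_bytes, block_size):
    """Size of a one-bit-per-block in-use map, rounded up to whole bytes."""
    blocks = volume_bytes // block_size
    return (blocks + 7) // 8


def block_pointers_for_file(file_bytes, block_size):
    """Number of block addresses needed to describe one file."""
    return -(-file_bytes // block_size)   # ceiling division


if __name__ == "__main__":
    volume = 160 * 10**9                  # assumed size of the 160 GB drive
    for block_size in (1024, 4096):
        print(block_size,
              block_bitmap_bytes(volume, block_size),
              block_pointers_for_file(2**20, block_size))
```

The map itself is small either way; the cost is more in the extra block addresses that have to be walked for every file.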
"A small block size reduces the wasted space at the ends of files. For me, going from 4K to 1K gave me 4 GB more free space on my 160 GB drive for my current files - 2.5%."
No it didn't. If you assume a maximum waste of 3K per file that would mean you have more than 1 million (1,048,576) files! I somehow doubt that. Especially in light of the next paragraph:
The actual savings came from using -N 2000000. You significantly reduced the number of inodes you have on the hd. inodes take up space and that's where your 4GB came from. And here's the best part - if you really do have millions of files on the system you are in trouble because you only have 2 million inodes!
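A quick check of that file count in Python, treating the 4 GB as 4 GiB (my assumption). The 1,048,576 figure is what you get dividing by a full 4K; at the 3K worst case it comes out closer to 1.4 million, so the "more than 1 million" conclusion holds either way.

```python
def files_needed(space_saved, max_waste_per_file):
    """Fewest files that could account for the saving if every file hit the worst case."""
    return -(-space_saved // max_waste_per_file)   # ceiling division


if __name__ == "__main__":
    saved = 4 * 2**30                       # 4 GiB, assumed reading of "4 GB"
    print(files_needed(saved, 3 * 1024))    # worst case going from 4K to 1K blocks
    print(files_needed(saved, 4 * 1024))    # the figure quoted above
```

Real files would waste less than the worst case on average, so the true number of files required would be higher still.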
"It may be that Linux will provide 1K filesystem blocks on top of the 4K phsyical blocks. Performance will be worse. However, the original ext filesystem provided half K blocks, and that option is now moribund."
I quite doubt that it would do so. Why would anyone want that? Performance would be embarrassing. And half K blocks are no problem when the sector size is half a K. You can't have a block size smaller than the sector size.
"For new systems, this standard is fine. However, I run my machines into the ground, and the better ones have lasted fifteen years. Given that my current machine could last another ten years, it would be a shame to have to toss it into the landfill in five years because the disk drives can't be replaced. Progress is good. Forced upgrades are not."
Did you even read TFA? First of all the large sector sizes are only for large hd's. Buy a small one and you'll be fine. Second, they have backward compatibility modes - not for the hardware, but for the OS (i.e. windows) which can't deal with unexpected sector sizes. Linux can handle the sector sizes no problem, so no land fill for you.
-Ariel
Or will it hurt overall storage because a Bad Sector now requires a full 4KB spare?
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Physical Disk Sector Sizes Supported
512 bytes through to 32 kilobytes (in powers of 2), with the caveat that the sector size must be less than or equal to the filesystem blocksize.
http://oss.sgi.com/projects/xfs/
Building a better backup.
Zettabyte Storage
Bumping up the sector size from 512 to 4096 means we can access disks 8 times larger without widening the address registers. We already have 48bit sector addressing which yields up to 128 petabytes PER DISK. 10 years ago drives were 1/1000th the size of current models, perhaps in 10 years we will be seeing drives approaching the petabyte range.. perhaps not. Either way we don't need enlarged sectors for capacity reasons yet.
If this helps to reduce overhead in the design and manufacturing of hardware, then be my guest! What I would really love to see though, is more speed. Capacity grows several orders of magnitude faster than speed, and it is a significant bottleneck for most data-intensive jobs. That's why we have RAID in desktop rigs. Why not implement a black-box raid-like solution for future hard drives ? To hell with form factors, give me a 5.25" height hard drive that boasts 2 terabytes across 8 platters running in parallel, just like a bigass raid-0 stripe, but transparent. 8 platters times 512 bytes per sector = 4096 byte striped sectors. The real advantage would be the 400mb/sec sustained transfer rate (or better). That sort of performance leap would warrant a new interface. SATA is convenient, but we went from 133mb/sec with ATA-133 to 150mb.. coitus interruptus ?
-Billco, Fnarg.com
How do you figure? I'd rather have annoying bloat slowing me down while reading a note than while writing one.
Indeed, that is useful. But I often have a terminal window open anyway, and always a blank Textedit window (I'm on OSX for normal usage), so it's just as easy to simply alt-tab over.
Wouldn't this waste a *lot* of space for items like shortcuts, weblinks, C modules..
I realize that drives are getting bigger and bigger, but is that an excuse to waste ?
---- Booth was a patriot ----
Good news. This will be nice to have in software, a 4096-byte sector. As others have said, this will exactly match the cluster size used by most modern filesystems, and the page size used by the Intel x86 architecture. This happy coincidence will mean that operating systems can just do a 1-to-1 read/write, and not need to waste time blocking/deblocking.
:)
Isn't this already done in drive hardware? I thought, that in order to save physical space on the disc surface by reducing inter-sector overhead, the drive already internally uses much larger sectors than the current 512-byte standard. The disc controller just takes care of this automatically, doing the blocking/deblocking transparently. It reads/writes the larger sectors just fine, but just serves 512 bytes at a time to the computer, buffering the rest.
From a software point of view, 4096-byte sectors will be nice. I hope they take the time to get in a new partition table format! Drop the obsolete CHS fields, as they've been maxed out for a long time now. As for the LBA fields, widen them to 64 bits, so that there's plenty of room for the future. With 512-byte sectors and the current 32-bit LBA fields, the maximum is 4G sectors, 2TB. With RAID becoming popular these days, this limit is easily reachable! Going to 4096-byte sectors will push this limit back to 16TB, a good thing, but widening the fields to 64 bits will really push this limit out of sight. About 75 ZB. Maybe that will even be enough for Google?
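The limits above are just address width times sector size. A small Python sketch, reporting in binary units (so "2TB" shows up as 2 TiB); the helper names are made up.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def max_capacity_bytes(address_bits, sector_size):
    """Largest addressable size: number of sector addresses times sector size."""
    return (2 ** address_bits) * sector_size


def format_binary(size):
    """Render a byte count using power-of-1024 units."""
    index = 0
    while size >= 1024 and index < len(UNITS) - 1:
        size /= 1024
        index += 1
    return f"{size:g} {UNITS[index]}"


if __name__ == "__main__":
    for bits, sector in ((32, 512), (32, 4096), (64, 4096)):
        print(bits, sector, format_binary(max_capacity_bytes(bits, sector)))
```

The 64-bit case with 4096-byte sectors is 2**76 bytes, which is 64 ZiB in binary units or roughly 75.6 * 10**21 bytes in decimal.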
Dr. Demento On The 'Net!
Why not just use a 16-character filename on an empty file?
my sig's at the bottom of the page.
You should get a prize for completely missing the point. http://www.greenwoodsoftware.com/less/
Oh, god, I can't believe I missed that. I'm going to go hide in the corner now.
And SATA2 has already doubled to 300mb, a la 33mb > 66mb > 100mb > 133mb of ATA. That leaves 450mb & 600mb ready for the future, and possibly more (I don't know enough about SATA to speculate).
Short story: the comparison isn't 133mb to 150mb, but 33mb to 150mb (first gen of each).
-bZj
.sig
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
"4096-bytes should be enough for anyone"
And yes, I remember booting 750's, 780's, 785's, 8550's, 6550's ... all manner of Vaxen, repeatedly, about 10 years of it.
Mind you, at least 7 years of that was waiting for the 750 microcode to load...
Hardware was all very compatible, worked very well when it was working at all -- sort of like a Citroen.
But I still miss DCL, and logical names. They waz cool.
I'm standing here, watching a tape drive, spinning around, and talking...
Do not mock my vision of impractical footwear
Shall we compare to the lovely rates of ATARI hard disks ? I had a fun little 20mb CoolDisk that was probably in the 100's of kb/sec range.
:p Sure, SATA-150 is ten times faster than the first ATA hard drives from over a decade ago, but what does that prove ? That new interfaces are faster than older.. big surprise.
.. SATA-1200, now that would be something to sing about! But then we'll face the problem of a slow main bus across the motherboard.
I don't think there's much point in comparing first gen ATA with first gen SATA, especially when it was initially 16mb/sec and not 33
My original point was that SATA-150 isn't much of an improvement over ATA-133. Reading between the lines, that means I think the SATA working group should have aimed higher. When designing something next-gen, shoot for the stars, especially when dealing with a huge partnership of bureaucrats that will take years before actually producing something tangible. At least take that serial bus and parallelize it so I can keep using 80-pin conductors and get 8x the performance
This ain't rocket science, it's data transmission over copper. Why can't we have one general-purpose system with extremely high speeds that everything could plug into ? Many specialized server devices (blade systems especially) have crazy fast master busses while the lowly PC is still trucking along with 33-mhz PCI that the average FPGA-hacking toddler can max out, yet we change CPU sockets every 10 months, and VGA slots every 2 years.. Lose the legacy!
-Billco, Fnarg.com
Actually, the floppies on the VAX 11/780 were 8", and they held about 243 kilobytes. Still pretty tiny by today's standards.
Doh! I will shut up and go back to my RFP's now.
Do not mock my vision of impractical footwear