Changes in HDD Sector Usage After 30 Years

Posted by ryuzaki0 on Thursday March 23, 2006 @07:07PM from the new-and-improved dept.

freitasm writes "A story on Geekzone tells us that IDEMA (International Disk Drive Equipment and Materials Association) is planning to implement a new standard for HDD sector usage, replacing the old 512-byte sector with a new 4096-byte sector. The association says it will be more efficient. According to the article, Windows Vista will ship with this support already."

14 of 360 comments

Ah, error correction. by wesley96 · 2006-03-23 19:10 · Score: 5, Insightful

Well, CD-ROMs use 2352 bytes per sector, ending up with 2048 actual bytes after error correction. Looking at the size of the HDDs these days a 4096-byte sector seems pretty reasonable.

--
Serving time in Aristotelean prison for violating laws of physics
Hrm, that kind of makes sense... by I kan Spl · 2006-03-23 19:17 · Score: 2, Insightful

Most "normal use" filesystems nowadays (FAT32, Ext3, HFS, Reiser) use 4K blocks by default. That means that the smallest amount of data that you can change at a time is 4K, so every time you change a block, the HDD has to do 8 writes or reads. That leaves the drive performing 8x the number of commands that it would need to.

As filesystems are slowly moving towards larger block sizes, now that the "wasted" space on drives due to unused space at the ends of blocks is not as noticeable, moving up the size on the underlying hardware also makes sense. I don't think that this can make things too much faster, but it would allow SATA drives (and SCSI also) to queue more commands in their internal buffers, as they will only be receiving one command per read/write that the filesystem does, instead of 8.
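
A rough sketch of the arithmetic behind the "8 commands" figure, assuming the comment's simplified model of one command per sector. The function name `sector_commands` is made up for illustration, and real drives can transfer several sectors per command, so treat this as a model of the argument rather than of any actual drive:

```python
def sector_commands(fs_block_bytes: int, sector_bytes: int) -> int:
    """Sector-sized transfers needed to cover one filesystem block.

    Assumes one transfer per sector, as in the comment above.
    """
    # Ceiling division: a partially used sector still costs a full transfer.
    return -(-fs_block_bytes // sector_bytes)


if __name__ == "__main__":
    for sector in (512, 4096):
        count = sector_commands(4096, sector)
        print(f"4K block on {sector}-byte sectors: {count} transfer(s)")
```

With a 4K filesystem block this gives 8 transfers on 512-byte sectors and 1 on 4096-byte sectors.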

--
My UID is prime and so is this number: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0.
Re:4MB by LardBrattish · 2006-03-23 19:31 · Score: 3, Insightful

Simple answer - every file would then have a minimum size of 4MB

--
What are you listening to? (http://megamanic.blogetery.com/)
Re:4MB by Anonymous Coward · 2006-03-23 19:41 · Score: 2, Insightful

It only means that a 4MB block would be the smallest atomic unit you could write on a disk. Writing to part of it would require first reading it, then modifying it, then writing it back. A lot of filesystems would implement this by always caching full blocks. But you could still pack many files in a single block. Most filesystems already work with pretty large (logical) block sizes (16KB ain't uncommon) and will "fragment" them for very small files. Databases often compact records end to end in a block.

But of course, 4MB is fscking large. One problem would be to make them truly atomic. Current drives are supposed to have enough power to be able to complete a 512-byte write even if power is lost, and stuff like that.
Because of R-M-W by alanmeyer · 2006-03-23 19:42 · Score: 3, Insightful

Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.
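
A minimal sketch of that read-modify-write cycle, using an in-memory stand-in for a drive. `FakeDisk` and `write_byte` are hypothetical names invented for this example, and the 4096-byte sector size is an assumption (the same logic applies to a 4MB sector):

```python
SECTOR = 4096  # assumed sector size in bytes


class FakeDisk:
    """In-memory stand-in for a drive that only transfers whole sectors."""

    def __init__(self, sectors: int, sector_size: int = SECTOR):
        self.sector_size = sector_size
        self.data = bytearray(sectors * sector_size)
        self.reads = 0   # number of whole-sector reads issued
        self.writes = 0  # number of whole-sector writes issued

    def read_sector(self, n: int) -> bytearray:
        self.reads += 1
        start = n * self.sector_size
        return bytearray(self.data[start:start + self.sector_size])

    def write_sector(self, n: int, buf: bytes) -> None:
        if len(buf) != self.sector_size:
            raise ValueError("only whole sectors can be written")
        self.writes += 1
        start = n * self.sector_size
        self.data[start:start + self.sector_size] = buf


def write_byte(disk: FakeDisk, offset: int, value: int) -> None:
    """Change one byte by reading, modifying and rewriting its sector."""
    sector, within = divmod(offset, disk.sector_size)
    buf = disk.read_sector(sector)   # read the whole sector
    buf[within] = value              # modify a single byte
    disk.write_sector(sector, buf)   # write the whole sector back


if __name__ == "__main__":
    disk = FakeDisk(sectors=4)
    write_byte(disk, 5000, 0x41)
    print(disk.reads, disk.writes, disk.sector_size)
```

One changed byte costs one full-sector read plus one full-sector write, which is the overhead the comment is pointing at.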
You're all complaining about tiny files... by NalosLayor · 2006-03-23 19:54 · Score: 2, Insightful

But really...think about this: if each sector has overhead, then any file over 512 bytes will have less overhead, and you'll effectively get more space in most cases. What percentage of YOUR files are less than 4k?
How do you know how your data is actually stored ? by Horus1664 · 2006-03-23 20:07 · Score: 4, Insightful

Modern DASD architecture is almost completely hidden from the user. In the (good?) old days system software needed to interface closely with the DASD and needed to understand the hardware architecture to gain maximum performance from the devices (I know because I work on such systems within an IBM mainframe environment on airline systems which require extremely high speed data access).
Nowadays the disk 'address' of where the data actually resides is still couched in terms that appear to refer to the hardware itself but in 'serious' DASD subsystems (e.g. the IBM DS8000 enterprise storage systems) the actual way in which the hardware handles the data is masked from the operating system. Data for the same file is spread across many physical devices and some version of RAID is used for integrity.
The 4096 value for data 'chunks' has to do with the most efficient amount of data that can be transmitted down a DASD channel (between a host and storage in large systems or the bus in self-contained systems).
The idea of a 'file address' would cease to exist and it would be replaced by a generic 'data address' if it weren't for the in-built assumptions about data retrieval within all current Operating Systems.
Re:file size by Anonymous Coward · 2006-03-23 20:14 · Score: 2, Insightful

5 K becomes 8 K.. times 500 ... is a whopping 1.5 MEGAbytes wasted. I mean, that is more than fits on a floppy. What a waste.

Actually a 200 GB drive can still store 25 million files. How many fonts do you have?
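
The numbers in this exchange can be checked with a few lines. This is only a sketch: `allocated_bytes` and `slack_bytes` are illustrative names, "K" is taken as 1024 bytes and "200 GB" as 200 * 10**9 bytes, and filesystem metadata is ignored:

```python
def allocated_bytes(file_bytes: int, sector_bytes: int) -> int:
    """Space a file occupies when rounded up to whole sectors."""
    sectors = -(-file_bytes // sector_bytes)  # ceiling division
    return sectors * sector_bytes


def slack_bytes(file_bytes: int, sector_bytes: int) -> int:
    """Unused space at the end of the file's last sector."""
    return allocated_bytes(file_bytes, sector_bytes) - file_bytes


if __name__ == "__main__":
    KIB = 1024
    per_file = slack_bytes(5 * KIB, 4096)
    print("slack per 5K file:", per_file, "bytes")
    print("slack for 500 such files:", 500 * per_file, "bytes")
    # How many 5K files (8K allocated each) fit on an assumed 200 GB drive.
    print("files per drive:", 200 * 10**9 // allocated_bytes(5 * KIB, 4096))
```

Under these assumptions the result lands close to the figures quoted above: about 1.5 MB of slack for 500 files, and a little under 25 million such files per drive.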

FWIW the advantage is in the error correction. For a 1 bit sector size, you'd need 3 bits to store it with error correction. As the block becomes larger, the error correction becomes more powerful. That is where the advantage is.
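
One way to see the trend is to count check bits for a single-error-correcting Hamming code, which needs the smallest r with 2**r >= m + r + 1 for m data bits. This is an assumption chosen for illustration: actual drives use stronger multi-bit codes, so only the shape of the result carries over. `hamming_parity_bits` is a name made up for this sketch:

```python
def hamming_parity_bits(data_bits: int) -> int:
    """Check bits for a single-error-correcting Hamming code.

    Returns the smallest r such that 2**r >= data_bits + r + 1.
    """
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    # One data bit needs two check bits, i.e. 3 stored bits in total.
    print("1 bit ->", 1 + hamming_parity_bits(1), "stored bits")
    for size_bytes in (512, 4096):
        m = size_bytes * 8
        r = hamming_parity_bits(m)
        print(f"{size_bytes}-byte sector: {r} check bits for {m} data bits")
```

In this toy model, eight separate 512-byte sectors need 8 x 13 = 104 check bits, while one 4096-byte sector needs 16 for the same amount of data.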

Of course data can still be stored byte-wise on the disk - it is only that a small update will require a read-modify-write transaction.
Re:That's nice by arrrrg · 2006-03-23 20:34 · Score: 1, Insightful

Anyway, why do you give a flying fuck? You can get a 250 GB hard drive for less than $100 these days ... at that rate, that 4096 bytes costs you about $100/64000000~= $.0000016. Was that really worth your time bitching? I didn't think so.
Re:That's nice by Eunuchswear · 2006-03-23 20:40 · Score: 2, Insightful

Informative, but wrong.

Some file systems can pack multiple tail fragments into one block.

--
Watch this Heartland Institute video
System Pages, RAID, Tail Blocks, and Addressing by KagatoLNX · 2006-03-23 21:07 · Score: 4, Insightful

Actually, this almost can't be anything but a good thing.

First of all, most OSes these days use a memory page size of 4k. Having your IO system page match your CPU page makes it much more efficient to DMA data and the like. Testing has shown that this is generally helpful.

Second, RAID will benefit here. Larger blocks mean larger disk reads and writes. In terms of RAID performance, this is probably a good thing. Of course, the real performance comes from the size of the drive cache, but don't underestimate the benefit of larger blocks. Larger blocks mean the RAID system can spend more time crunching the data and less time handling block overhead. The fact that more data must be crunched for a sector write is of concern, but I'd bet it won't matter too much (it only really matters for massive small writes, not generally a RAID use case).

Third (and EVERYONE seems to be missing this), some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tails of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache).
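
A toy illustration of the idea, not ReiserFS's actual algorithm: full blocks stay where they are, and the leftover tails are packed first-fit into shared blocks. The function names, the 4096-byte block size and the packing strategy are all assumptions made for this sketch:

```python
BLOCK = 4096  # assumed block size in bytes


def blocks_without_packing(sizes, block=BLOCK):
    """Every file rounds up to whole blocks of its own."""
    return sum(-(-s // block) for s in sizes)


def blocks_with_tail_packing(sizes, block=BLOCK):
    """Full blocks per file, plus shared blocks holding the packed tails."""
    full = sum(s // block for s in sizes)
    # Tails are the partial last blocks; pack the largest first.
    tails = sorted((s % block for s in sizes if s % block), reverse=True)
    free = []  # free bytes left in each shared tail block
    for tail in tails:
        for i, room in enumerate(free):
            if tail <= room:
                free[i] -= tail
                break
        else:
            # No shared block has room, so open a new one.
            free.append(block - tail)
    return full + len(free)


if __name__ == "__main__":
    sizes = [5000, 100, 200, 9000]  # hypothetical file sizes in bytes
    print(blocks_without_packing(sizes), blocks_with_tail_packing(sizes))
```

For the four example files, the unpacked layout needs 7 blocks and the packed one needs 4, since all four tails fit in a single shared block.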

Netware 4 did something called Block Suballocation. While not as tightly packed as Reiser tail blocks, it did split its larger 32kb or 64kb blocks (which were chosen to keep block addresses small and large file streaming faster) into disk sectors and store tails in them.

NTFS has block suballocation akin to Netware, but Windows users are, to my knowledge, out of luck until MS finally addresses their filesystem (they've been putting this off forever). Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).

To my knowledge, ReiserFS is the only filesystem with tail packing. If you are really interested in this, see your replacement brain on the Internet.

Fourth, larger sectors mean smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers was the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.
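
The addressing argument comes down to capacity = 2**bits * unit size. The sketch below treats the reference width as a plain bit count and the addressed unit as a sector, which is a simplification of how real FAT variants work; `max_addressable_bytes` is an illustrative name:

```python
def max_addressable_bytes(address_bits: int, unit_bytes: int) -> int:
    """Largest capacity reachable with fixed-width unit references."""
    return (2 ** address_bits) * unit_bytes


if __name__ == "__main__":
    TIB = 2 ** 40
    for unit in (512, 4096):
        cap = max_addressable_bytes(32, unit)
        print(f"32-bit references, {unit}-byte units: {cap // TIB} TiB")
```

With the same 32-bit reference, moving from 512-byte to 4096-byte units raises the ceiling by a factor of 8 (from 2 TiB to 16 TiB).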

Finally, someone mentioned sector size issues with defragmenters and disk optimizers. These programs don't really care as long as all of the sectors on the system are the same size. Additionally, they could be modified to deal with different sector sizes. Ironically, modern filesystems don't really require defragmentation, as they are designed to keep fragments small on their own (usually using "extents"). Ext2, Ext3, Reiserfs and the like do this. NTFS does it too, although it can have problems if the disk ever gets full (basically, magic reserved space called the MFT gets data stored in it and the management information for the disk gets fragmented permanently). If it weren't for a design choice (I wouldn't call it a flaw as much as a compromise) NTFS wouldn't really need defragmentation. ReiserFS can suffer from a limited form of fragmentation. However, v4 is getting a repacker that will actively defragment and optimize (by spreading out the free space evenly to increase performance) the filesystem in the background.

I really don't see how this can be bad unless somebody makes a mistake on backwards compatibility. For those Linux junkies, I'm not sure about the IDE code, but I bet the SATA code will be overhauled to support it in a matter of weeks (if not a single weekend).

--
I think Mauve has the most RAM. --PHB (Dilbert Comic)
1. Re:System Pages, RAID, Tail Blocks, and Addressing by McSnarf · 2006-03-23 23:43 · Score: 2, Insightful
  
  Forget waste of space in something as small as a sector.
  If this is an issue, you're using the wrong application - one Word file per phone number?
  File systems became simpler over time. This is a GOOD THING AND THE ONLY WAY TO GO.
  If you try to optimize too much, you end up with something like the IBM mainframe file systems from the 70s, which are still somewhat around.
  Create a simple file, called a data set? Sure, in TSO (what passes for a shell, more or less), you use the ALLOCATE command: http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/IKJ4C550/1.7.5?SHELF=&DT=20040721160158&CASE=
  Simple, isn't it?
  Forget complicated file systems, let the hardware handle speed. And possibly defragmentation.
Re:That's nice by odaen · 2006-03-23 22:11 · Score: 1, Insightful

On the alien planet Kamarr they have been able to pack small files together natively for the past 5 versions of Door. However we do not live on the planet Kamarr.
Configurable Sector Size by cnvogel · 2006-03-23 22:12 · Score: 2, Insightful

Wow, finally, a new block size, never heard of that idea before.

Doesn't anyone remember that SCSI drives that support a changeable block size have been around basically forever? Of course with harddisks it was used mostly to account for additional error-correcting / parity bits, but magneto-optical media could also be written with 512 or 2k blocks (if I remember correctly).

(first hit I found: http://www.starline.de/en/produkte/hitachi/ul10k300/ul10k300.htm allows 512, 516, 520, 524, 528 but there are devices that do several steps between 128 and 2k or so...)