Changes in HDD Sector Usage After 30 Years
freitasm writes "A story on Geekzone tells us that IDEMA (Disk Drive, Equipment, and Materials Association) is planning to implement a new standard for HDD sector usage, replacing the old 512-byte sector with a new 4096-byte sector. The association says it will be more efficient. According to the article Windows Vista will ship with this support already."
Why not a 4MB sector?
Well, CD-ROMs use 2352 bytes per sector, ending up with 2048 actual bytes after error correction. Looking at the size of the HDDs these days a 4096-byte sector seems pretty reasonable.
Serving time in Aristotelean prison for violating laws of physics
Is Ingo Molnar working on this?
So... If I write down a little 16-byte message to myself in Notepad containing a name and a phone number, it will take up 4096 bytes. That's good. Thanks. I guess since disks are getting so large, they have to find a way to help me waste the space even faster.
hello dear sirs my name is jamesh i are india (bihar) can u guide me install red had linux 9?
Someone explain to me how this works! why is this better? oh yeah you can only use sci-fi/startrecky terminology or else it doesn't count.
I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes. So how does changing the sector size change things? (Especially when we don't access drives by sector/cylinder anymore?)
In Soviet Russia, articles before post read *you*!
So long as this new format is transparent, built internally into the drives, and doesn't affect older hardware or software, there shouldn't be a problem. It also should not contain any DRM junk.
All too often, improvements in speed and the like are more than countered by added overhead junk.
now maybe I should RTFA...
You're thinking of 'cluster'. This is tied to the file system that is actually used on the disk. Even with the current 512-byte sector, a normal NTFS partition of, say, 200GB, uses 4KB cluster and a single file takes up a minimum of 4KB already.
Most "normal use" filesystems nowadays (FAT32, Ext3, HFS, Reiser) use 4K blocks by default. That means that the smallest amount of data that you can change at a time is 4K, so every time you change a block, the HDD has to do 8 writes or reads. That leaves the drive performing 8x the number of commands that it would need to.
As filesystems are slowly moving towards larger block sizes, now that the "wasted" space on drives due to unused space at the ends of blocks is not as noticeable, moving up the size on the underlying hardware also makes sense. I don't think that this can make things too much faster, but it would allow SATA drives (and SCSI also) to queue more commands in their internal buffers, as they will only be receiving one command per read/write that the filesystem does, instead of 8.
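To make the "8 instead of 1" figure concrete, here is a minimal sketch of the arithmetic. It takes this comment's premise at face value, namely that the drive handles one operation per hardware sector; other comments in the thread point out that a single multi-sector command can cover several sectors. The function name is made up for illustration.

```python
def sectors_per_block(block_size: int, sector_size: int) -> int:
    """Number of hardware sectors touched by one filesystem block."""
    if block_size % sector_size != 0:
        raise ValueError("block size must be a multiple of the sector size")
    return block_size // sector_size


if __name__ == "__main__":
    # A 4K filesystem block on the old and the proposed sector size.
    for sector_size in (512, 4096):
        count = sectors_per_block(4096, sector_size)
        print(f"{sector_size}-byte sectors: {count} per 4K block")
```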
My UID is prime and so is this number: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0.
Small devices like cellphones typically save files of several kilobytes, whether they be the phonebook database or something like camera images. Whether the data is saved in a couple large sectors or 8 times that many small sectors isn't really an issue. Either way will work fine, as far as the data is concerned. The biggest problem is the amount of battery power used to transfer those files. If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.
Also, squaring away each sector after processing is a round trip back to the filesystem which can be eliminated by reading a larger sector size in the first place.
Some semi-ATA disks already force a minimum 4096-byte sector size. It's not necessarily the best way to get the most usage out of your disks, but it is one way of speeding up the disk just a little bit more to reduce power consumption.
...to back up the claim that this is more efficient, though it intuitively "feels" like it should be faster, but not necessarily more efficient in terms of space. Suppose a file with a size of 1 byte takes up 512 bytes of space on the disk. With this larger sector size, that file would take 4k. I don't see why this isn't an option that can be set through drive initialization parameters, and why you can't choose any size for the sectors, depending on whatever tweaking you can do to figure out what's best for your application.
Also, Solitaire will be replaced by Duke Nukem Forever on every shipped copy of Vista. And if you're one of the first 100 in line at any Best Buy when you pick up Vista, you will also get a free Phantom game console.
Well of course Vista will ship with this supported already. Just like WinFS...er..
To answer someone's question, if all disk transfers are multiples of 4K anyway, you're better off using that as the hardware sector size because there's less overhead -- less spatial overhead on the disk because you have fewer sector headers and intersector gaps, and less temporal overhead in the I/O protocol because you're only sending one transfer command instead of eight.
Your god may be dead, but mine aren't!
Given the current cost per GB, and that most "small files" are now 10K+, does the overhead caused by "wasted space" really matter? It still takes a hell of a lot of documents to fill up even a 250GB disk, and as you can now get these disks for next to nothing, I'm happy to get the extra performance.
Isn't this what apple tried to do 5+ years ago with HFS+
Competent file system handlers can use disk blocks larger or smaller than the file system block size, but there are some benefits to using the same number for both. Although it may provide more data-per-drive to use larger blocks and you can index larger drives with 32-bit numbers, the drive has to use better (larger and more complex) CRCs to ensure sector data integrity, the granularity of replacement blocks may end up wasting more space simply to provide an adequate count of replacements, and there are still some disk space management tools that insist on working in terms of "cylinders", regardless of the fact that disk drives have had variable density zones for ages. The range from 4K (common disk block size) to 16K works as a decent compromise.
"Back in the day" running System V on SMD drives, where you could use almost any block size from 128 Bytes to 32K (the CRCs were weak after that) and control the cylinder-to-cylinder offset of block 0 from the index, I spent a few days trying different tuning parameters and found that, due to the 4K size of the CPU pages, and of the file blocks and swap it really did give a significant improvement in performance. I tried 8K and 16K, because the file system handler could be convinced to break them up, but didn't get any better performance, so used 4k for the spares granularity.
Perhaps I should take one of my late-model SCSI drives, which support low-level reformatting, and try the tests again. 16KByte file system blocks on 16KByte sectors might really be a win now. Have to do some research to see what I can do with CPU page sizes, too.
HDD manufacturers are looking to increase the amount of data stored on each platter. With larger sector sizes, the HDD vendor can use more efficient codes. This means better format efficiency and more bytes for the end user. The primary argument is that many OSes already use 4K clusters.
During the transition from 512-byte to 1K, and ultimately 4K sectors, HDDs will be able to emulate 512-byte modes to the host (i.e. making a 1K or 4K native drive 'look' like a standard 512-byte drive). If the OS is using 4K clusters, this will come with no performance decrease. For any application performing random single-block writes, the HDD will suffer 1 rev per write (for a read-modify-write operation), but that's really only a condition that would be found during a test.
Almost all filesystems I know of use at least 4KB clusters. NTFS does come with 512-byte clusters on smaller partitions.
LBA addresses on sector boundaries, so for larger HDDs you need more bits (the current 28-bit LBA, which is all some older BIOSes support, means a maximum of 128GB: 2^28*512 = 2^28*2^9 = 2^37). Since 512-byte sectors have been used for 30 years, I think it is easy to assume they will not last for 10 more years (getting to the LBA32 limit). So why not shave off 3 bits and also make it an even number of bits (12 against 9)?
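The addressing arithmetic in this comment can be checked with a short sketch. The function name is hypothetical; the only inputs are the width of the sector address and the sector size, and "GB" in the comment is read as binary gigabytes (GiB).

```python
def max_capacity_bytes(lba_bits: int, sector_size: int) -> int:
    """Largest capacity addressable with lba_bits of sector address."""
    return (1 << lba_bits) * sector_size


GIB = 1 << 30

if __name__ == "__main__":
    # 28-bit LBA with 512-byte sectors: 2^28 * 2^9 = 2^37 bytes = 128 GiB.
    print(max_capacity_bytes(28, 512) // GIB, "GiB")
    # Same address width with 4096-byte sectors: 8x as much (3 bits saved).
    print(max_capacity_bytes(28, 4096) // GIB, "GiB")
```

Going from 512-byte to 4096-byte sectors multiplies the reachable capacity by 8, which is the "shave off 3 bits" in the comment (2^12 against 2^9).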
Also there is something called "multiple block access", where you make only one request for up to 16 (on most HDDs) sectors. For 512-byte sectors you have 8K, but for 4K sectors that means 64K. Great for large files (I/O overhead and stuff).
On the application side this should not affect anyone using 64-bit sizes (since only the OS would know of sector sizes); as for 32-bit sizes, it is already a problem (4G limit).
So this should not be a problem, because on a large partition you will not have too much wasted space (I have around 40MB wasted space on my OS drive for 5520MB of files, and I would even accept 200MB).
Finally! This is what I really wanted for years. Can't believe this innovation has not materialized earlier. This is great for performance, TCO, iPods, everything. I can't wait to get my hands on one of those new goodies!
It can't be, at least not efficiently. Like flash devices, it's impossible to write less than a sector at a time.
If this were transparently implemented by the hardware, the OS would frequently try to write a single 512-byte sector. In order for this to work, the hard drive controller would have to read the existing sector, then write it back with the 512 bytes changed. This is a big waste, as a read then a write costs at least a full platter rotation (1/120 of a second, about 8.3 ms, at 7200 RPM). Do this hundreds or thousands of times per second, and you have a nice slow hard drive.
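A rough sketch of the rotation cost mentioned here, assuming each read-modify-write costs exactly one full platter rotation and ignoring seeks, caching, and command overhead. The function names are made up for illustration.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def max_rmw_per_second(rpm: float) -> float:
    """Upper bound on read-modify-write cycles per second,
    assuming each cycle costs one full rotation."""
    return rpm / 60.0


if __name__ == "__main__":
    for rpm in (5400, 7200, 10_000):
        print(f"{rpm} RPM: {rotation_time_ms(rpm):.2f} ms per rotation, "
              f"at most {max_rmw_per_second(rpm):.0f} such writes per second")
```

Under that assumption a 7200 RPM drive tops out around 120 emulated sub-sector writes per second, which is why "hundreds or thousands" of them would hurt.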
Melissa
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.
OK, let's all get real for a minute: if block size is not consistent, how the hell will disk defragmenting and optimizing software work? The whole trick to that software is moving the blocks around. Now if you want to lose the ability to fix and speed up your hard drive, go for variable block size. But those of us who run multiple large drives want a small fixed block (to save space on things like fonts and email), otherwise our drives will quickly become unmanageable. Dan
That's a bonus for all those boot-sector virus writers - 8 times more space to do their dirty deeds...
But really...think about this: if each sector has overhead, then any file over 512 bytes will have less overhead, and you'll effectively get more space in most cases. What percentage of YOUR files are less than 4k?
A 5K font file now uses 8K. Put 500 fonts on your machine as a designer and you see the problem. Email: 1.2K average file, 4K on disk. How much mail do you keep? It keeps adding up and you lose massive amounts of space.
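The rounding behind these numbers can be sketched as follows. This assumes the filesystem allocates whole units and that the allocation unit actually changes from 512 bytes to 4096 bytes; as other comments note, many filesystems already allocate in 4K clusters, in which case nothing changes. The function name and the byte counts chosen for "5K" and "1.2K" are illustrative.

```python
def allocated_size(file_size: int, unit: int) -> int:
    """Bytes occupied on disk when space is handed out in whole units."""
    units = -(-file_size // unit)  # ceiling division
    return units * unit


if __name__ == "__main__":
    font = 5 * 1024   # the "5K font file"
    email = 1229      # roughly 1.2K
    for unit in (512, 4096):
        print(f"unit {unit}: font -> {allocated_size(font, unit)} bytes, "
              f"email -> {allocated_size(email, unit)} bytes")
    # Extra space for 500 such fonts when the unit grows from 512 to 4096.
    extra = 500 * (allocated_size(font, 4096) - allocated_size(font, 512))
    print(f"500 fonts: {extra} extra bytes")
```

For 500 fonts of this size the difference is about 1.5 MB, so whether it "adds up" depends heavily on how many small files are involved.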
If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.
Well sorry, but that's the way it is.
Hard drives generally have the ability to read/write multiple sectors with a single command. (Go read the ATA standards). And DMA is usually used [ program I/O just plain sucks].
I don't see how changing the sector size is going to save power... Either way they have to increase the size of the buffers for the read/write multiple operations. So these could just be increased while keeping 512-byte sectors and the same benefit would result.
The real reason for this is that as densities go up, the number of bits affected by a bad spot goes up. So it's desirable to error correct over longer bit strings. The issue is not the size of the file allocation unit; that's up to the file system software. It's the size of the block for error correction purposes. See Reed-Solomon error correction.
Modern DASD architecture is almost completely hidden from the user. In the (good?) old days system software needed to interface closely with the DASD and needed to understand the hardware architecture to gain maximum performance from the devices (I know because I work on such systems within an IBM mainframe environment on airline systems which require extremely high speed data access).
Nowadays the disk 'address' of where the data actually resides is still couched in terms that appear to refer to the hardware itself but in 'serious' DASD subsystems (e.g. the IBM DS8000 enterprise storage systems )the actual way in which the hardware handles the data is masked from the operating system. Data for the same file is spread across many physical devices and some version of RAID is used for integrity.
The 4096 value for data 'chunks' has to do with the most efficient amount of data that can be transmitted down a DASD channel (between a host and storage in large systems or the bus in self-contained systems)
The idea of a 'file address' would cease to exist and it would be replaced by a generic 'data address' if it weren't for the in-built assumptions about data retrieval within all current Operating Systems.
Windows Vista will ship with this support already.
Oh YEAH? Well Linux has had support for it for eleventeen years, and the Linux approach is more streamlined anyway!
It will also come bundled with a universal translator and cold fusion power source (both to be developed circa 2200 AD)
Bumping this up makes sense; however, I wonder if 4KB (KiB?) is a bit low. I know that hard drive sizes have gone up way more than 8x. Let's see, I started with a 20MB drive in an XT. My desktop downstairs has a 200GB drive. So, that's about 10,000 times bigger. Intuitively, an 8x bump in sector size seems a bit small. Wouldn't something like 16KB be useful for a few extra years?
Nevertheless, Heineman was still one heck of an airplane designer.
As far as the 8" floppy, ISTR that they were intended to replace punched cards: 77 tracks with 26 sectors (hard coded) came out to be pretty close to a box of 2000 Hollerith cards (80 columns with 12 bits per column). 8" drives were available before the end of 1975, and the VAX came out in 1977(?). One of the uses for the floppies was loading the microprogram store on the VAX and IBM machines of the same era.
I have my eyes peeled for a bio-drive, something noxious smelling that you feed with potato rinds which stores your data directly in its DNA. What d'you reckon? Another thirty years.
You all miss the point. To answer the coward: over 1100 plus 1300 active emails, mostly text, under 6K. Unlike you, I use HFS+. When I copied my fonts and email over to HFS, it grew by over 2 gigs. I still don't get the error thing: error correction should not be affected by block allocation size, as it just links to the next block. Then we still have the defrag and optimize problem.
Hmm. This reminds me of the time when I bought my first external Firewire drive (120Gb) and used it to back up my 10Gb iMac, which had lots of small files (fonts, Word 5.1 documents, etc). Those 10Gb of backups ended up occupying 90Gb of drive space because the external drive had been pre-formatted with some large sector size, and even the smallest file took up half a megabyte! So I had to reformat the drive and start again...
You must think in Russian.
Actually, this almost can't be anything but a good thing.
First of all, most OSes these days use a memory page size of 4K. Having your I/O system page match your CPU page makes it much more efficient to DMA data and the like. Testing has shown that this is generally helpful.
Second, RAID will benefit here. Larger blocks mean larger disk reads and writes. In terms of RAID performance, this is probably a good thing. Of course, the real performance comes from the size of the drive cache, but don't underestimate the benefit of larger blocks. Larger blocks mean the RAID system can spend more time crunching the data and less time handling block overhead. The fact that more data must be crunched for a sector write is of concern, but I'd bet it won't matter too much (it only really matters for massive small writes, not generally a RAID use case).
Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache).
Netware 4 did something called Block Suballocation. While not as tightly packed as Reiser tail blocks, it did break its larger 32KB or 64KB blocks (which were chosen to keep block addresses small and large file streaming faster) into disk sectors and store tails in them.
NTFS has block suballocation akin to Netware, but Windows users are, to my knowledge, out of luck until MS finally addresses their filesystem (they've been putting this off forever). Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).
To my knowledge, ReiserFS is the only filesystem with tail packing. If you are really interested in this, see your replacement brain on the Internet.
Fourth, larger sectors mean smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers was the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.
Finally, someone mentioned sector size issues with defragmenters and disk optimizers. These programs don't really care as long as all of the sectors on the system are the same size. Additionally, they could be modified to deal with different sector sizes. Ironically, modern filesystems don't really require defragmentation, as they are designed to keep fragments small on their own (usually using "extents"). Ext2, Ext3, Reiserfs and the like do this. NTFS does it too, although it can have problems if the disk ever gets full (basically, magic reserved space called the MFT gets data stored in it and the management information for the disk gets fragmented permanently). If it weren't for a design choice (I wouldn't call it a flaw as much as a compromise) NTFS wouldn't really need defragmentation. ReiserFS can suffer from a limited form of fragmentation. However, v4 is getting a repacker that will actively defragment and optimize (by spreading out the free space evenly to increase performance) the filesystem in the background.
I really don't see how this can be bad unless somebody makes a mistake on backwards compatibility. For those Linux junkies, I'm not sure about the IDE code, but I bet the SATA code will be overhauled to support it in a matter of weeks (if not a single weekend).
I think Mauve has the most RAM. --PHB (Dilbert Comic)
The FAT32 file system already uses a 4KB cluster size or larger. I doubt changing the hard disk sector size to 4KB will significantly change the disk performance.
There have been some good comments about FS/sectors and such. I think it can be dumbed down to 2 options:
Create a file system and sector size to maximise capacity or.....
Create a FS and sector combo to maximise performance (speed).
As far as the defragmentation issue, this could be lessened by creating a 'system managed' partitioning structure that allows file reads and writes only on the drive surface it actually needs: ie a partition that grows. The less mapping it has to do- the faster it is. I really think that the HD logic can really be tweaked on this one.
Don't be apathetic. Procrastinate!
I was going to write the same.
Allocation size is irrelevant, as many advanced systems support fragments (however, still not implemented in ext2/ext3 :-( ), but sector size matching memory page size can increase performance.
And from a past discussion, some people think that the 512 bytes comes from the memory page size of the VAX.
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
OS is already using 4K
Yep. Unless some witless fool alters it from the default to a smaller size, thereby increasing VM overhead.
The comments to this story indicate that most people have no clue what a sector is. They think "cluster" or "block" size has something to do with disk sectors.
Wow, finally, a new block size, never heard of that idea before.
Doesn't anyone remember that SCSI drives that support a changeable block size have been around since basically forever? Of course with hard disks it was used mostly to account for additional error-correcting / parity bits, but also magneto-optical media could be written with 512 or 2k (if I remember correctly).
(first hit I found: http://www.starline.de/en/produkte/hitachi/ul10k300/ul10k300.htm allows 512, 516, 520, 524, 528 but there are devices that do several steps between 128 and 2k or so...)
Isn't there some truth to this? I thought there was currently a 512byte limit on bootsector virus sizes, or at least a 512byte limit to tell the system which and how to execute the next block.
Won't this also affect lilo and the like? Now I foresee all sorts of things needing to be rewritten, so that's why Microsoft knew they wouldn't ship till after Christmas. Wow, they're so clairvoyant! But honestly, the current forms should still work, just not take up all the space, eh? How does this affect the linux boot sector limit? It probably won't.
2^3 * 31 * 647
You're all missing one key point. Your 512 byte sector is NOT 512 bytes on disk. The drive stores extra track/ecc/etc information. So a 4096-byte sector means less waste, more sectors, more useable space.
Tom
Someday, I'll have a real sig.
There are two parts to an ATA operation.
The first is the command. This is accomplished by filling in the fields of a data structure called a "taskfile". It contains the 48-bit LBA to begin the operation, the number of sectors to read, the ATA command to issue (e.g. read or write), and other information which is pertinent to the command. This taskfile is then written to the taskfile registers of the ATA drive. The drive reads the taskfile registers and begins the operation specified by the command. A multi-sector command is easy to make, and it is the norm to make multi-sector requests.
The second part is the data transfer (for data transfer commands). This consists of the drive data being transferred in 1 sector blocks. If DMA is enabled, the data is transferred automatically. If PIO is used, the data must be read from the ATA data registers one block (16 or 32-bit data chunks) at a time. Either way, the sector must be read one at a time. When the sector is transferred, the drive checks the data for coherence (no errors) and reports an error or not in the ATA error register.
With a larger sector size, the number of error checks and sector head seeks is reduced. This results in a higher speed data transfer due to the elimination of those things.
I really don't know much about how drives store data. So this may be a really stupid question. But do larger sectors also mean the boot sector? Is this good news for boot loaders?
Did you know you can't access a disk with a sector size different from 512 in Windows, any Windows (9x, NT)? It cannot be done with the Windows API. You have to call BIOS interrupts in Windows 9x, thus running in real mode, or use a driver like http://simonowen.com/fdrawcmd/fdrawcmd.sys in NT. 30 years ago, the CP/M OS could use any sector size...
WTF does the "I" stand for?
1983, trying to convince the CDC engineer that yes, I did want him to configure the disk for 336 byte sectors.
Ah, the joys of using a Harris 24-bit word/8-bit byte/112-word disk sector machine.
Watch this Heartland Institute video
Back in the late 70s (1979?), Digital Research, of CP/M fame, provided the same capability in CP/M 2.0. They called it their Sector Blocking / Deblocking algorithm. As you increase the sector size, which has NOTHING to do with the minimum allocated size the OS uses, you get more disk space per track. I played around with sector blocking / deblocking on my (still functioning) Thinker Toy's 2D disk controller (double density floppy disk controller). I was able to increase the size of the sector from 256 bytes to 1024 bytes, which gave me extra space. I don't remember the differences now, but disk space went from something like 490 KB to 596 KB for a single-sided 8 inch floppy. A few years later, I had access to the Shugart 14 inch Winchester hard disk, which at 256 bytes per sector gave 20 MB of space. The drive allowed bigger sectors and I played around with 1024-byte sectors, and it gave over 26 MB of disk space.
Since the OS determines the minimum number of "logical" sectors per allocation unit, this determines how efficient the file system is. Also, big files like big allocation units, while small files like small allocation units. It's just a trade-off of space against performance.
"The operating system does/should not "know" anything about how the data is physically stored by a device"
You're talking about LBA, but that only applies to cylinders/heads. The OS does map to the sector (eg, file inode stored at sector 12345 from the partition beginning, which says that file begins on sector 23123 etc). If it didn't use sectors, it would need an extra 7 bits to store the location of everything within the filesystem.
The filesystem also communicates with the driver using sector numbers. It's only when you reach the 'file' abstraction level (either IO calls or memory mapped) that you switch to using bytes.
Although modern FS's will share a block for small files (or the tails of files), I don't think they do this for the actual FS data structures, which I'd guess are block quantized, so they would need to be aware of the block change to make use of the rest of the sector (either storing more info per inode, or more than one inode per sector).
The revolution will not be televised... but it will have a page on Wikipedia
NTFS doesn't necessarily allocate a cluster for each file. Small files are stored entirely in the MFT record. You probably meant some other filesystems.
sorry, early-after-waking slashdot post :-p
The filesystem does communicate with the driver with sector numbers, but it uses its own block size for addressing, and then shifts the address to get the sector number.
Scratch pretty much all else I said!
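The correction above (the filesystem addresses in its own blocks, then shifts to get a sector number) can be sketched like this. The function name and defaults are assumptions for illustration, and both sizes are assumed to be powers of two.

```python
def block_to_sector(block_number: int,
                    block_size: int = 4096,
                    sector_size: int = 512) -> int:
    """First sector number of a filesystem block, computed with a shift."""
    ratio = block_size // sector_size
    # The shift only works when the block is a power-of-two multiple
    # of the sector size.
    if block_size % sector_size != 0 or ratio & (ratio - 1) != 0:
        raise ValueError("block size must be a power-of-two multiple of sector size")
    shift = ratio.bit_length() - 1  # log2(ratio)
    return block_number << shift


if __name__ == "__main__":
    # 4K blocks on 512-byte sectors: shift by 3 (multiply by 8).
    print(block_to_sector(23123))
    # 4K blocks on 4K sectors: shift by 0, block number equals sector number.
    print(block_to_sector(23123, sector_size=4096))
```

With matching block and sector sizes the shift becomes zero, so the filesystem's block number is the sector number directly.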
How will du work now?
...the new bootstrap loader for Vista will be a mini VBScript interpreter... and built into the shell... oh those clever MS folk
It was done with hard drives too. Back around 1986 I was working in data recovery. Tandon computers used to sell MS-DOS machines with 1KiB sectors. They ran a specially modified version of DOS.
The problem, of course, was that people wanted to upgrade to the latest MS-DOS from Microsoft. So they would replace Tandon DOS with MS-DOS, and suddenly their entire hard drive would be scrambled.
And then they'd call us.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
Statistically speaking, on an FS that allocates whole blocks, the waste space will be the block size * half the number of files on the drive.
;-)
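The "half a block per file" rule of thumb can be checked with a small sketch, assuming every file's last block is equally likely to hold any number of bytes from 1 up to a full block. Real file-size distributions are not uniform, so this is only the textbook estimate; the function names are made up.

```python
def mean_slack(block_size: int) -> float:
    """Average unused bytes in a file's last block, assuming every tail
    length from 1 to block_size bytes is equally likely."""
    total = sum(block_size - tail for tail in range(1, block_size + 1))
    return total / block_size


def expected_waste(n_files: int, block_size: int) -> int:
    """Rule-of-thumb total waste: half a block per file."""
    return n_files * block_size // 2


if __name__ == "__main__":
    for block_size in (512, 4096):
        print(f"block {block_size}: mean slack {mean_slack(block_size)} bytes, "
              f"100000 files waste about {expected_waste(100_000, block_size)} bytes")
```

Under the uniform assumption the exact mean is (block size - 1) / 2, which is where the "half the number of files times the block size" estimate comes from.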
Let's not kid ourselves, this is mostly gonna be useful for the drive we store our movies on.
Reiser is not the first file system with this idea:"Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache)."
Most filesystems don't, and haven't for decades, wasted these final blocks: "With larger block sizes, disks with many small files would waste a lot of space, so BSD added block level fragmentation, where the last partial block of data from several files may be stored in a single "fragment" block instead of multiple mostly empty blocks."
The performance boost is that we can store small dot files together in 1 block, and (with readahead) speed up things like logins and other operations that read in this small set of data.
FAT and FAT32 couldn't handle this, though (nor many other FS features). While I haven't studied NTFS in the detail that I've studied UNIX file systems, I doubt they don't have support for this. NTFS as of 5.0 supports pretty much every feature of a modern UNIX file system.
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
About 15 years ago, when the 3.5 in HDD held 500Mbyte, you could reformat your SCSI disk to get an optimum sector size. The disks I was using handled, I think, any even sector size from 128 bytes to 4096 bytes.
Because the disks were so low capacity, you wanted to use every byte, so I reformatted the disks to an optimum sector size for my application, which was about 1812 bytes IIRC. This achieved about 5% extra useful data on the system.
I think there was at one time a need for 2040 byte sectors for the IBM System/38, which had a 33rd bit on each word, which had to be saved to disk.
When the generation of disk changed to 1Mbyte, the controllers had an error one in every few tens of thousands of reads: it simply never completed the transaction. It never happened with 512 byte sectors, and the drive manufacturer only tested at 512 bytes, so we switched back to 512 and have been there ever since.
Consciousness is an illusion caused by an excess of self consciousness.
Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatability near impossible).
And here I thought Windows user been taking it up the ass for a long time!
"...and is the only system capable of solving AI-hard problems (after sufficient programming)."
Like telling good jokes.
In 1963, when IBM was still firmly committed to variable length records on disks, DEC was shipping a block-replaceable personal storage device called the DECtape. This consisted of a wide magnetic tape wrapped around a wheel small enough to fit in your pocket. Unlike the much larger IBM-compatible tape drives, DECtape drives could write a block in the middle of the tape without disturbing other blocks, so it was in effect a slow disk. To make block replacement possible all blocks had to be the same size, and on the PDP-6 DEC set the size to 128 36-bit words, or 4608 bits. This number (or 4096, a rounder number for 8-bit computers) carried over into later disks which also used fixed sector sizes. As time passed, there were occasional discussions about the proper sector size, but at least once the argument to keep it small won based on the desire to avoid wasting space within a sector, since the last sector of a file would on average be only half full.
I hope that this backward compatibility remains for a very long time or archives that are stored on older drives will be lost eventually.
I know that the expected life of data on magnetic drives is not all that long but it does not mean we do not try to recover ancient data. I still pull stuff from reel tape using dd from over 10 years ago. Migration is a solution but I am too lazy to do it.
In my first real job, I worked for Prime Computer. They stored a bit of data along with each filesystem 512 byte block. If I remember correctly, they stored a forward and a backward link. If the filesystem was corrupted, they could restore many of the files because of the links in the data portion of the disk. This led to 528 byte sectors (if I remember correctly).
Having a weird sector size limited the manufacturers who would supply disks.
This decision also made file deletion slow -- as every block of a file was re-written.
The filesystem supported several file-organization types. The two most prevalent were sequential access and dynamic access. Sequential access files had a pointer from the directory to the file. The file had forward and backward links at the head of each block. To read the 300th block, it would be necessary to read the first 299 blocks before reading the 300th block. To erase the file, each block would need to be read (to obtain the forward link to the next block), and then written.
Dynamic access files would have a table of pointers to disk blocks, residing in the first blocks of the file. Normal reads wouldn't see the data in the first block directly.
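To make the difference concrete, here is a toy sketch of the two layouts as I described them. The dict-based "disk", the field names and the function names are all made up for illustration; this is not the actual Prime on-disk format.

```python
# Toy model of the two file organizations described above.
# "disk" maps block number -> block contents; every name here is hypothetical.

def build_demo_disk(num_blocks=300):
    """Blocks 1..num_blocks form a linked file; block 0 holds a pointer table."""
    disk = {}
    for i in range(1, num_blocks + 1):
        disk[i] = {
            "data": f"block {i}",
            "prev": i - 1 if i > 1 else None,           # backward link
            "next": i + 1 if i < num_blocks else None,  # forward link
        }
    disk[0] = {"pointers": list(range(1, num_blocks + 1))}
    return disk

def read_nth_sequential(disk, first_block, n):
    """Follow forward links from the first block. Returns (data, blocks_read)."""
    block_no = first_block
    reads = 0
    data = None
    for _ in range(n):
        block = disk[block_no]
        reads += 1
        data, block_no = block["data"], block["next"]
    return data, reads

def read_nth_dynamic(disk, index_block, n):
    """Read the pointer table, then jump straight to the nth data block."""
    pointers = disk[index_block]["pointers"]  # one read for the table
    data = disk[pointers[n - 1]]["data"]      # one read for the data
    return data, 2

disk = build_demo_disk()
print(read_nth_sequential(disk, 1, 300))
print(read_nth_dynamic(disk, 0, 300))
```

In this model the sequential read touches 300 blocks to reach the 300th, while the dynamic read touches two. Deleting a sequential file has the same walk-every-block cost, which is why it was slow.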
Amazing what cruft sits in my brain after over a decade.
Where law ends, tyranny begins -- William Pitt
... welcome our new 4Kbytes sector overlords.
So say we all
"I see you are trying to use your computer, would you like me to use it as a spam server?"
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
About time. SCSI has a 32-bit block number built into the protocol. With 512 byte blocks, that's a 2 TB architectural limit. We can expect to be able to buy a 2 TB disk off the retailer's shelf in another two or three years.
We're not talking about filesystems. If we were I can remember choosing my ext2 block size anything between 1k and 4k about six years ago. Apple is easily predated here.
Malike Bamiyi wanted my assistance.
When using DMA it doesn't really matter since the whole transfer is performed by the drive. When using PIO, software can use MULTIPLE MODE versions of read and write commands to transfer more than one sector at a time. This results in better performance only because of host software and interface issues. It's not a hard drive internals thing. Multiple mode was added to IDE in the late 80's. You're a little out of date.
Changing the sector size will in no way reduce "the number of error checks and sector head seeks (sic)". It increases the density of user data on the disk with essentially no downside. PCs have not been able to use different sector sizes in the past since BIOSes assume 512 bytes. It's about time that got fixed.
Forcing a 4K sector size means that the smallest IO will now be 4K. Disk and storage controller vendors will like that I'm sure. Designing high performance storage while worrying about 512 byte performance is ridiculous. Caches no longer have to worry about small sectors. Hoorah!
So, some of the boys here have indicated that SATA disks, by design, can undergo silent data corruption - esp. when used in raid arrays.
Anyone know anything about this?
Yeah, but does it run Linux?
One thing people are forgetting is that you will always waste some disk space with almost every file. File sizes are generally NOT multiples of 4096, so there will be a remainder that leaves the rest of the last sector unused.
You would lose less disk space if you used smaller blocks. The larger the block, the more disk space is wasted, even if you have only large files and avoid small files.
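A quick way to put numbers on this is the sketch below. The file sizes are made up, and it assumes every non-empty file is rounded up to a whole number of allocation units with nothing shared between files.

```python
def allocated_bytes(file_size, unit):
    """Space consumed when a file is stored in whole allocation units."""
    if file_size == 0:
        return 0  # assumption: an empty file needs no data block
    return -(-file_size // unit) * unit  # ceiling division, then scale back up

def slack(file_sizes, unit):
    """Total bytes lost in the unused tail of each file's last unit."""
    return sum(allocated_bytes(size, unit) - size for size in file_sizes)

sizes = [16, 700, 5000, 1_000_000]  # hypothetical file sizes in bytes

for unit in (512, 4096):
    print(unit, slack(sizes, unit))
```

For these four files the waste is 1388 bytes with 512-byte units and 14188 bytes with 4096-byte units. On average you lose about half a unit per file, so the total depends on how many files you have, not on how big the disk is.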
most modern disk utils let you choose between 512b 1k 2k and 4k sectors
what we will start to see here is our inodes will fill up faster than our disks.
or, on NTFS our $MFT will fill up past the files.
I format all my drives with 512 byte sectors.
this means more sectors on a disk than with 4096 byte sectors.
if you have a lot of small files you should use 512 byte sectors.
the caveat is more sectors means more to go wrong.
They're using their grammar skills there.
Vista will also ship with support for holographic storage, 3D-free air holographic display drivers, and cerebral direct interface.
Vista, like Visa, will leave you further in debt and miserable about it.
Atheism is a religion like not collecting stamps is a hobby.
Not always the case. In many "modern" file systems (jfs, ReiserFS, xfs(?), and likely others), small files can be stored directly within the directory structure of the file system itself - eg, as part of the B*-tree that forms the file index. So you'll actually have a number of tiny files together taking up only one disk sector. Obviously that's not possible on older filesystem architectures (ext2/3, FAT, etc) where each inode (or the moral equivalent) points directly at the unique start sector for the file.
Don't know how NTFS works - someone more knowledgeable can chime in here.
--
/tsg/
Efficient support for very small files would allow a lot of crap on Windows/Linux (such as the Registry and the Gnome copy of it) to be eliminated, and allow all "metadata" (things like the artist in a song) to be stored as files. Read up on the ReiserFS plans. Ideally 99.99% of the files on a disk would be less than 100 bytes.
However this is best solved by writing the filesystem to put these small files into the blocks with other data, such as many small files together, or inserted directly into the directory. Correctly done you would get a bunch of small files at once with a single read, and since use of this would probably need to look at many at once, this could be very efficient. In any case larger sectors are harmless.
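A rough sketch of the "many small files in one block" idea follows. This is not how ReiserFS or any real filesystem lays things out; the function names and the in-memory index are invented for illustration, and the index itself is not counted against the block size.

```python
BLOCK_SIZE = 4096

def pack_small_files(files, block_size=BLOCK_SIZE):
    """Greedily pack (name, data) pairs into shared blocks.

    Returns a list of (payload, index) tuples, where index maps
    name -> (offset, length) inside that block's payload.
    """
    blocks = []
    payload, index = bytearray(), {}
    for name, data in files:
        if len(data) > block_size:
            raise ValueError("this sketch only handles files that fit in one block")
        if len(payload) + len(data) > block_size:
            # Current block is full: close it and start a new one.
            blocks.append((bytes(payload), index))
            payload, index = bytearray(), {}
        index[name] = (len(payload), len(data))
        payload += data
    if index:
        blocks.append((bytes(payload), index))
    return blocks

def read_file(blocks, name):
    """Find a packed file by name and slice it out of its block."""
    for payload, index in blocks:
        if name in index:
            offset, length = index[name]
            return payload[offset:offset + length]
    raise KeyError(name)

# 100 tiny "metadata" files, each under a dozen bytes (made-up example data).
files = [(f"song{i}.artist", f"Artist {i}".encode()) for i in range(100)]
blocks = pack_small_files(files)
print(len(blocks), read_file(blocks, "song42.artist"))
```

All 100 files here fit in a single 4096-byte block, so one read fetches the lot. Stored one file per sector they would occupy 100 sectors, whatever the sector size.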
I have some 5+ year old SCSI disks that can be low-level formatted with 4k sectors. This is nothing new, aside from the 'standard' increasing the sector size.
If memory serves, the SCSI-2 spec also allows sector sizes up to 16k or more during a format, if the drive supports it.
...to be dicks and point out in a pedantic but historically and logically incorrect way that GB doesn't mean multiples of 1024 but 1000.
http://www.ss64.com/docs/bytes.html
So according to you, my 300 GB hard drive is really 300 GB even though common sense, the OS, history, and the 512 byte sector argue it isn't.
So c'mon, let your inner dick flap in the wind, along with your jaws. Please. Pretty please.
But first ask IDEMA to make the sectors 4000 bytes in length. Otherwise I'll conclude you are a lead-paint licking (mom's) basement dweller with nothing helpful to say ever. Or possibly someone working for the marketing department of a hard drive manufacturer, which would be worse.
Or will it hurt overall storage because a Bad Sector now requires a full 4KB spare?
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Physical Disk Sector Sizes Supported
512 bytes through to 32 kilobytes (in powers of 2), with the caveat that the sector size must be less than or equal to the filesystem blocksize.
http://oss.sgi.com/projects/xfs/
Building a better backup.
Zettabyte Storage
Bumping up the sector size from 512 to 4096 means we can access disks 8 times larger without widening the address registers. We already have 48bit sector addressing which yields up to 128 petabytes PER DISK. 10 years ago drives were 1/1000th the size of current models, perhaps in 10 years we will be seeing drives approaching the petabyte range.. perhaps not. Either way we don't need enlarged sectors for capacity reasons yet.
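For anyone who wants to check the arithmetic, here is a small sketch. It assumes binary prefixes (1 PiB = 2^50 bytes), and the helper name is mine.

```python
PIB = 2 ** 50  # one pebibyte, using binary prefixes

def max_capacity_bytes(address_bits, sector_size):
    """Largest disk addressable with the given address width and sector size."""
    return (2 ** address_bits) * sector_size

for sector_size in (512, 4096):
    capacity = max_capacity_bytes(48, sector_size)
    print(f"{sector_size}-byte sectors: {capacity // PIB} PiB")
```

With 48-bit addressing that comes to 128 PiB for 512-byte sectors and 1024 PiB for 4096-byte sectors, which is the factor of 8.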
If this helps to reduce overhead in the design and manufacturing of hardware, then be my guest! What I would really love to see though, is more speed. Capacity grows several orders of magnitude faster than speed, and it is a significant bottleneck for most data-intensive jobs. That's why we have RAID in desktop rigs. Why not implement a black-box raid-like solution for future hard drives ? To hell with form factors, give me a 5.25" height hard drive that boasts 2 terabytes across 8 platters running in parallel, just like a bigass raid-0 stripe, but transparent. 8 platters times 512 bytes per sector = 4096 byte striped sectors. The real advantage would be the 400mb/sec sustained transfer rate (or better). That sort of performance leap would warrant a new interface. SATA is convenient, but we went from 133mb/sec with ATA-133 to 150mb.. coitus interruptus ?
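The striping idea can be sketched in a few lines. This is only a model of the layout, assuming one 512-byte chunk per platter; it says nothing about how real drive heads or firmware would actually coordinate.

```python
PLATTERS = 8
CHUNK = 512  # bytes written to each platter per logical sector

def stripe(sector):
    """Split one 4096-byte logical sector into eight 512-byte chunks."""
    if len(sector) != PLATTERS * CHUNK:
        raise ValueError("expected a 4096-byte sector")
    return [sector[i * CHUNK:(i + 1) * CHUNK] for i in range(PLATTERS)]

def unstripe(chunks):
    """Reassemble the logical sector from the per-platter chunks."""
    if len(chunks) != PLATTERS:
        raise ValueError("expected one chunk per platter")
    return b"".join(chunks)

sector = bytes(range(256)) * 16  # 4096 bytes of test data
chunks = stripe(sector)
print(len(chunks), len(chunks[0]), unstripe(chunks) == sector)
```

If all eight chunks really moved in parallel, 400mb/sec sustained would only need about 50mb/sec per platter.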
-Billco, Fnarg.com
Wouldn't this waste a *lot* of space for items like shortcuts, weblinks, C modules..
I realize that drives are getting bigger and bigger, but is that an excuse to waste ?
---- Booth was a patriot ----
Good news. This will be nice to have in software, a 4096-byte sector. As others have said, this will exactly match the cluster size used by most modern filesystems, and the page size used by the Intel x86 architecture. This happy coincidence will mean that operating systems can just do a 1-to-1 read/write, and not need to waste time blocking/deblocking.
:)
Isn't this already done in drive hardware? I thought that, in order to save physical space on the disc surface by reducing inter-sector overhead, the drive already internally uses much larger sectors than the current 512-byte standard. The disc controller just takes care of this automatically, doing the blocking/deblocking transparently. It reads/writes the larger sectors just fine, but just serves 512 bytes at a time to the computer, buffering the rest.
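Here is a sketch of what that transparent blocking/deblocking could look like: 512-byte logical sectors served out of 4096-byte physical ones. The class and its counters are hypothetical and are not taken from any real drive firmware.

```python
PHYSICAL = 4096
LOGICAL = 512
PER_PHYSICAL = PHYSICAL // LOGICAL  # 8 logical sectors per physical sector

class EmulatedDrive:
    """Presents 512-byte logical sectors on top of 4096-byte physical ones."""

    def __init__(self, physical_sectors):
        self.media = [bytearray(PHYSICAL) for _ in range(physical_sectors)]
        self.physical_reads = 0
        self.physical_writes = 0

    def locate(self, lba):
        """Map a logical block address to (physical sector, slot within it)."""
        return divmod(lba, PER_PHYSICAL)

    def read(self, lba):
        phys, slot = self.locate(lba)
        self.physical_reads += 1  # the whole physical sector gets read
        start = slot * LOGICAL
        return bytes(self.media[phys][start:start + LOGICAL])

    def write(self, lba, data):
        if len(data) != LOGICAL:
            raise ValueError("logical writes must be exactly 512 bytes")
        phys, slot = self.locate(lba)
        # Read-modify-write: fetch the full physical sector, patch one slot,
        # then write the full physical sector back.
        buffer = bytearray(self.media[phys])
        self.physical_reads += 1
        start = slot * LOGICAL
        buffer[start:start + LOGICAL] = data
        self.media[phys] = buffer
        self.physical_writes += 1

drive = EmulatedDrive(physical_sectors=4)
drive.write(9, b"A" * LOGICAL)
print(drive.locate(9), drive.read(9)[:4], drive.physical_reads, drive.physical_writes)
```

Note that in this model a single 512-byte write costs a physical read plus a physical write. That is the price of hiding the larger sector from the host, and it goes away when the OS issues aligned 4096-byte writes.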
From a software point of view, 4096-byte sectors will be nice. I hope they take the time to get in a new partition table format! Drop the obsolete CHS fields, as they've been maxed out for a long time now. As for the LBA fields, widen them to 64 bits, so that there's plenty of room for the future. With 512-byte sectors and the current 32-bit LBA fields, the maximum is 4G sectors, 2TB. With RAID becoming popular these days, this limit is easily reachable! Going to 4096-byte sectors will push this limit back to 16TB, a good thing, but widening the fields to 64 bits will really push this limit out of sight: 64 ZiB, roughly 75ZB. Maybe that will even be enough for Google?
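For reference, the classic MBR partition entry is 16 bytes: a status byte, a 3-byte CHS start, a type byte, a 3-byte CHS end, then a 32-bit starting LBA and a 32-bit sector count, all little-endian. The sketch below packs one with Python's struct module. The helper names and the example values are mine, and the limits use binary units.

```python
import struct

# status, CHS start (3 bytes), type, CHS end (3 bytes), start LBA, sector count
ENTRY = struct.Struct("<B3sB3sII")

CHS_MAXED = b"\xfe\xff\xff"  # conventional "too big for CHS" filler value

def make_entry(start_lba, num_sectors, part_type=0x83):
    """Pack one MBR partition entry; the CHS fields are just maxed out."""
    return ENTRY.pack(0x00, CHS_MAXED, part_type, CHS_MAXED, start_lba, num_sectors)

def max_partition_bytes(sector_size, lba_bits=32):
    """Approximate size limit for a given LBA field width and sector size."""
    return (2 ** lba_bits) * sector_size

TIB = 2 ** 40
ZIB = 2 ** 70

entry = make_entry(start_lba=2048, num_sectors=0xFFFFFFFF)
print(len(entry))
print(max_partition_bytes(512) // TIB, "TiB")
print(max_partition_bytes(4096) // TIB, "TiB")
print(max_partition_bytes(4096, lba_bits=64) // ZIB, "ZiB")
```

The 32-bit fields are what cap things at 2 TiB with 512-byte sectors and 16 TiB with 4096-byte sectors. Asking struct to pack a sector count of 2**32 into the "I" field raises struct.error, which is the software version of hitting that wall.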
Dr. Demento On The 'Net!
And SATA2 has already doubled to 300mb, a la 33mb > 66mb > 100mb > 133mb of ATA. That leaves 450mb & 600mb ready for the future, and possibly more (I don't know enough about SATA to speculate).
Short story: the comparison isn't 133mb to 150mb, but 33mb to 150mb (first gen of each).
-bZj
.sig
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
Hah. :p
Well yeah, talking of eliminating atomic units...
Might be nice if they could just drop that HDD block thing and allow writing anywhere on the disk. For example, maybe with the read part of the I/O heads built in front of the write part, so that when updating part of a physical block, it could just read the unknown bits and overwrite them right after "on the fly" and compute the trailing CRC and ECC very fast. The old content could be saved in a buffer in case the CRC fails, to use it with the ECC. Or the write part could be far enough back that it has time to read the whole block before starting to overwrite it, so that it could retry the read on errors.
But I dunno that much about HDD; there's probably a good reason it's not made this way.
"4096-bytes should be enough for anyone"
Shall we compare to the lovely rates of ATARI hard disks ? I had a fun little 20mb CoolDisk that was probably in the 100's of kb/sec range.
:p Sure, SATA-150 is ten times faster than the first ATA hard drives from over a decade ago, but what does that prove ? That new interfaces are faster than older.. big surprise.
.. SATA-1200, now that would be something to sing about! But then we'll face the problem of a slow main bus across the motherboard.
I don't think there's much point in comparing first gen ATA with first gen SATA, especially when it was initially 16mb/sec and not 33
My original point was that SATA-150 isn't much of an improvement over ATA-133. Reading between the lines, that means I think the SATA working group should have aimed higher. When designing something next-gen, shoot for the stars, especially when dealing with a huge partnership of bureaucrats that will take years before actually producing something tangible. At least take that serial bus and parallelize it so I can keep using 80-pin conductors and get 8x the performance
This ain't rocket science, it's data transmission over copper. Why can't we have one general-purpose system with extremely high speeds that everything could plug into ? Many specialized server devices (blade systems especially) have crazy fast master buses while the lowly PC is still trucking along with 33 MHz PCI that the average FPGA-hacking toddler can max out, yet we change CPU sockets every 10 months, and VGA slots every 2 years.. Lose the legacy!
-Billco, Fnarg.com