HDD Manufacturers Moving To 4096-Byte Sectors
Luminous Coward writes "As previously discussed on Slashdot, according to AnandTech and The Tech Report, hard disk drive manufacturers are now ready to bump the size of the disk sector from 512 to 4096 bytes, in order to minimize storage lost to ECC and sync. This may not be a smooth transition, because some OSes do not align partitions on 4K boundaries."
According to the Anandtech article, only the pretty much end-of-life Windows XP is out of luck. Linux, OS X and modern Windows versions all work ...
Non news?
There are certain models of the Western Digital Caviar Green drives that are already shipping with a 4K sector size, such as this one: http://www.newegg.com/Product/Product.aspx?Item=N82E16822136490
I just checked my system. /dev/sda1 is /dev/sda + 32256 bytes, which is 63 512-byte sectors. /dev/sda2 is also on an odd-numbered sector alignment.
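The arithmetic in this post can be checked with a short sketch. It assumes 512-byte logical sectors and a 4096-byte physical sector; the function name and constants are made up for illustration and do not come from any partitioning tool.

```python
LOGICAL_SECTOR = 512    # bytes per logical sector (assumed)
PHYSICAL_SECTOR = 4096  # bytes per physical sector (assumed)


def is_4k_aligned(start_offset_bytes, physical_sector=PHYSICAL_SECTOR):
    """True if a partition starting at this byte offset sits on a physical sector boundary."""
    return start_offset_bytes % physical_sector == 0


# /dev/sda1 from the post: 63 logical sectors into the disk.
offset = 63 * LOGICAL_SECTOR
print(offset)                 # 32256
print(is_4k_aligned(offset))  # False

# A partition starting at logical sector 2048 (1 MiB) would be aligned.
print(is_4k_aligned(2048 * LOGICAL_SECTOR))  # True
```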
Fedora 11 fresh install, which is less than a year old.
It doesn't sound like the 512 bytes per sector is tightly bound to the hardware. It sounds more like a low-level reformat, plus a change of some #defines in the firmware, would convert a drive from one sector size to the other. That would mean there could be, e.g., a jumper setting for sector size, allowing for backward compatibility.
Also, the fact that an OS doesn't enforce partition alignment doesn't mean it won't respect a disk that already has aligned partitions. Just provide a 3rd party partitioning tool that aligns the partitions correctly, and install the OS on pre-made partitions. If your business depends on WinXP so much, your IT dept should be capable of doing it.
45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
Why does the sector size presented by the interface have to reflect anything about the hardware?
If the OS clusters aren't aligned to physical sectors, the hard drive's controller has to read-modify-write all the time.
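A rough sketch of why misalignment forces read-modify-write, assuming 512-byte logical sectors on top of 4096-byte physical sectors. Both helper functions are hypothetical names used only for this illustration; real drive firmware is not visible to the host and may behave differently.

```python
def physical_sectors_touched(start_lba, length_bytes, logical=512, physical=4096):
    """Count the physical sectors that a write of length_bytes starting at start_lba overlaps."""
    first_byte = start_lba * logical
    last_byte = first_byte + length_bytes - 1
    return last_byte // physical - first_byte // physical + 1


def needs_read_modify_write(start_lba, length_bytes, logical=512, physical=4096):
    """True if the write only partially covers at least one physical sector."""
    first_byte = start_lba * logical
    end_byte = first_byte + length_bytes
    return first_byte % physical != 0 or end_byte % physical != 0


# One 4 KB cluster written at LBA 63 (the old partition start) straddles two physical sectors.
print(physical_sectors_touched(63, 4096), needs_read_modify_write(63, 4096))  # 2 True

# The same cluster written at LBA 64 maps onto exactly one physical sector.
print(physical_sectors_touched(64, 4096), needs_read_modify_write(64, 4096))  # 1 False
```

Under these assumptions a lone 512-byte write is always a partial update of a physical sector, even when it starts on an aligned LBA.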
A sector on a HDD is the minimum writeable space. Think of it as a lot in a subdevelopment. If each lot is 50,000 sq. ft. on a 20 acre plot, and you move to 60,000 sq. ft. lots instead, the plot is still 20 acres, but the development now has fewer lots on it.
In computers, larger sectors are often better for large files, while smaller sectors are better for smaller partitions and smaller files. If a sector is 4096 bytes, and you create a 1024 byte file, it still occupies 4096 bytes on the disk, as the HDD won't write anything else but that file to the sector. If you have files that are hundreds of megabytes though, you can access the file, with minimum wastage, by using fewer sectors, which reduces thrashing and similar issues.
The discrepancy between file sizes and sector sizes is the difference between "size" and "size on disk" that Windows displays when you view a file's properties. "Size" is the actual file size, while "size on disk" is the amount of space the file occupies on the hard drive.
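The "size" versus "size on disk" arithmetic can be sketched as follows. This assumes space is handed out in whole 4096-byte allocation units; the function names are illustrative only, and real filesystems may store very small files differently.

```python
def size_on_disk(file_size, unit=4096):
    """Round file_size up to a whole number of allocation units."""
    units = -(-file_size // unit)  # ceiling division
    return units * unit


def slack(file_size, unit=4096):
    """Bytes allocated to the file but not used by its contents."""
    return size_on_disk(file_size, unit) - file_size


print(size_on_disk(1024))  # 4096
print(slack(1024))         # 3072
print(size_on_disk(4097))  # 8192
```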
FanFictionRecs.net
A byte can be 10 bits; it's an architecture-specific quantity. An octet is always 8 bits.
I am TheRaven on Soylent News
Those are "logical" sectors, which can be different from the physical sector size. According to the Anandtech article the Western Digital hard drive model numbers that end with "EARS" use the larger, 4KB physical sector size, while presenting a 512 byte logical sector size to the operating system for compatibility reasons.
Please note, of course, that the logical sector size is a drive interface level concept distinct from the filesystem cluster or block size. Filesystem block sizes have generally been larger than the logical or physical sector size for quite some time.
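On Linux, the logical and physical sizes that a drive reports can be read from sysfs. The sketch below assumes the `logical_block_size` and `physical_block_size` files under `/sys/block/<device>/queue/`; the function name is made up for this example. It only shows what the drive reports to the kernel, which is not necessarily how the platters are laid out.

```python
from pathlib import Path


def read_sector_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes as reported by the kernel."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical


if __name__ == "__main__":
    root = Path("/sys/block")
    # Only Linux exposes this tree; elsewhere the loop is simply skipped.
    devices = sorted(p.name for p in root.iterdir()) if root.is_dir() else []
    for dev in devices:
        try:
            print(dev, read_sector_sizes(dev))
        except (OSError, ValueError):
            print(dev, "sector sizes not available")
```

A result such as `(512, 4096)` would correspond to the arrangement described above: 4 KB physical sectors behind a 512-byte logical interface.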
So why have the sectors at all? [...]
The 1024 byte file could then take 1024 bytes.
That's not "not having sectors", that's having sectors 1 byte long.
Thus, apply the reasoning of "bigger sectors, faster treatment of bigger files, and vice-versa".
That's like deciding to remove the checksums from TCP and IP because a few protocols provide their own checksums.
Funny you should mention IP checksums, that's one feature removed from the IP layer in IPv6 precisely because the 'important' protocols do it themselves anyway (i.e. TCP).
XML is like violence. If it doesn't solve the problem, use more.
Yes, it's an addressing thing. The grandparent is confusing sectors with allocation units. A filesystem is perfectly at liberty to allocate sub-sectors to different files (some do). A 32-bit disk interface can address 2^32 sectors. If you have one-byte sectors, that means you're limited to 4GB disks. If you have 512-byte sectors, you're limited to 2TB. If you want a disk bigger than 2TB, you can either make the address wider or make the sector size bigger. Making the address wider requires defining a new interface[1], although ATA currently supports 48-bit addresses, so this isn't really a problem for a while. A bigger sector size is convenient for filesystems, because they can continue to use 32-bit sector indexes for partitions larger than 2TB.
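The limits quoted above follow directly from multiplying the number of addressable sectors by the sector size. A minimal sketch, with a made-up function name and sizes in binary units:

```python
def max_capacity_bytes(address_bits, sector_size):
    """Largest disk addressable with the given address width and sector size."""
    return (2 ** address_bits) * sector_size


GIB = 2 ** 30
TIB = 2 ** 40

print(max_capacity_bytes(32, 1) // GIB)     # 4      (one-byte sectors: 4 GiB)
print(max_capacity_bytes(32, 512) // TIB)   # 2      (512-byte sectors: 2 TiB)
print(max_capacity_bytes(32, 4096) // TIB)  # 16     (4096-byte sectors: 16 TiB)
print(max_capacity_bytes(48, 512) // TIB)   # 131072 (48-bit addresses: 128 PiB)
```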
The real advantage of bigger sectors is that they reduce the command overhead. To write 4KB to the disk you just need to send one write command and the data, rather than eight. All modern operating systems cache data from disk in RAM and so will write it out or read it in as a group of pages. The smallest page size of any modern architecture is 4KB, so having 4KB sectors is a lot more convenient.
I am TheRaven on Soylent News
That's insane. ECC at the hardware / firmware level corrects the vast majority of bit errors transparently in a manner that is invisible to the operating system. If you took out sector level ECC, the drives would be useless in anything other than a ZFS RAID configuration, and even then performance would drop in the presence of trivially ECC correctable errors, due to the re-reads and stripe reconstructions at the filesystem level.
Drive performance would probably drop because the heads would have to stay in closer alignment without the ability of ECC to correct data read errors caused by small vibrations and electrical noise. In addition, sector relocations would probably increase because tiny flaws that do not impair the ability of a drive to write an ECC correctable sector would force the drive to remap that sector to another part of the disk.
It is a similar issue with various wire level data transmission schemes. If DSL connections did not use error correcting codes, they would suffer much higher packet loss rates than they do now, especially at distance. Most of those packets would get retransmitted due to transport level checksum errors, but why resort to performance impairing fallback measures when the problem can be largely eliminated at a lower level?
A sector used to be quite literally a sector of a disc in the mathematical sense: a wedge shape that spins around. Now, with LBA (labeling a hard drive's blocks in series from zero rather than by their physical position), it is just like a block on your filesystem, but on the hardware instead: a blob of data that must be read or written as a whole. The rationale is that you are not likely to ever want to read or write one byte at a time, so there is no reason to make the hard disk handle requests for one byte. The difference between a "sector" and a block is that a block on a file system should not be smaller than a sector on the hard drive: an OS can pretend that two, four, etc. sectors are a single block, but it cannot cut a sector in half.
The upshot of this is that, unlike memory, which is addressable to the byte, hard discs can be much bigger than the address range, since a disc only needs volume/blocksize addresses to locate the data. Even with a block size of 512, a 2 terabyte (base 2) volume can be covered with a 32-bit address space, which makes everything a lot easier and more efficient.
Anyway, in answer to your question, sectors are still as useful as they ever were, just they might not actually be sectors anymore because of LBA. Maybe they are, I'm not sure, I've only written hard disc drivers, I've never built one of the things.
When Argumentum ad Hominem falls short, try Argumentum ad Matrem
Any change in sector size that doesn't affect the filesystem block size will not affect the number of KB required to store a file at all. Since virtually every filesystem already uses 4 KB block sizes by default a change to 4KB logical or physical sector sizes will not have an effect on storage requirements.
The original definition of "byte" was the number of bits used to encode a character of text and is the basic memory-addressable element in a computer. It never originally meant "8 bits".
If you reply, do so only to what I explicitly wrote. If I didn't write it, don't assume or infer it.
NTFS has been 4K aligned for a long time now.
That doesn't do any good if the partition it is on starts with an LBA that is not a multiple of 8. Windows versions prior to Vista create the first partition starting at LBA 63, which is not 4KB aligned.
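The "multiple of 8" rule comes from 4096 / 512 = 8 logical sectors per physical sector. A small sketch of checking a starting LBA and rounding it up to the next aligned one, assuming those two sizes; the helper names are hypothetical and do not belong to any real partitioning tool.

```python
LOGICAL = 512
PHYSICAL = 4096
SECTORS_PER_PHYSICAL = PHYSICAL // LOGICAL  # 8


def is_aligned_lba(lba):
    """True if this logical block address starts a physical sector."""
    return lba % SECTORS_PER_PHYSICAL == 0


def next_aligned_lba(lba):
    """Smallest aligned LBA that is greater than or equal to lba."""
    return -(-lba // SECTORS_PER_PHYSICAL) * SECTORS_PER_PHYSICAL


print(is_aligned_lba(63))    # False
print(next_aligned_lba(63))  # 64
print(is_aligned_lba(2048))  # True
```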
The people who will have performance problems will primarily be Windows XP users who purchase the newer style drives and do not realign the first partition accordingly. Some versions of "fdisk" on Linux have a similar deficiency, with a "cylinder" based user interface and odd size cylinders in the name of MSDOS compatibility. Not sure if that has been fixed yet.
Wrong.
A word is architecture specific.
A byte is ALWAYS 8 bits.
A byte can't possibly "always" be 8 bits, when a byte means a single character.
This is the definition of 'byte' from 1959. People only started getting confused recently (recently being the past 20 years), since the IBM 360 systems first introduced the 8-bit byte, which then became a de facto standard in the 80s. Then new computer users, such as yourself, came along and assumed a byte must be 8 bits because that is all they have seen a byte mean.
There are systems that encode a single byte with 7, 8, 9, and 10 bits still today.
The only time a byte is 8 bits is when the system is structured around 8 bit units.
Hop on a PDP or Cray system and you will see a byte is 7 or 9 bits respectively.
Origins of the word 'byte':
http://www.trailing-edge.com/~bobbemer/BYTE.HTM
[blockquote]The original definition of "byte" was the number of bits used to encode a character of text and is the basic memory-addressable element in a computer. It never originally meant "8 bits".[/blockquote]
That is the definition of 'octet', a term frequently used in telecom. People confuse byte and octet all the time, because popular hardware architectures use an octet as a byte.
Can You Say Linux? I Knew That You Could.
I believe that some of the early CDC machines (a company that is no longer around) had a 6-bit character. The Digital Equipment Corporation (DEC, also a company that is no longer around) PDP-1, maybe the PDP-20, and some others also had a 6-bit character. Some of the PDPs had 36-bit words, packing 6 characters into a word. And of course, the IBM machines (a company that is still around) used EBCDIC rather than ASCII (but did use 8 bits per character). Some of the earlier IBM machines (and even the 370s) used BCD (binary coded decimal) for arithmetic (packing a number from 0 to 9 in 4 bits, with some sign and unassigned bits left over).
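Packing six 6-bit characters into a 36-bit word can be sketched like this. The character codes are plain integers from 0 to 63; they are not the code table of any particular machine, and the function names are invented for the example.

```python
def pack_sixbit(codes):
    """Pack exactly six 6-bit codes (0..63) into one 36-bit integer, first code in the top bits."""
    if len(codes) != 6:
        raise ValueError("expected exactly six codes")
    word = 0
    for code in codes:
        if not 0 <= code < 64:
            raise ValueError("each code must fit in 6 bits")
        word = (word << 6) | code
    return word


def unpack_sixbit(word):
    """Split a 36-bit integer back into six 6-bit codes."""
    return [(word >> shift) & 0x3F for shift in range(30, -1, -6)]


word = pack_sixbit([1, 2, 3, 4, 5, 6])
print(word < 2 ** 36)       # True
print(unpack_sixbit(word))  # [1, 2, 3, 4, 5, 6]
```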
Also, back in the IBM JCL days, when allocating disk space for a file you could specify the number of cylinders (or tracks) that you wanted, the block size and the packing factor.
What a bunch of misinformed drivel. That article is missing a couple of things:
1) The issue affects all Windows versions based on a 5.x kernel. That means Windows 2000, XP, 2003 Server and Windows Home Server.
2) These drives are NOT strictly-4k-sector. The platters may be organized in 4k sectors, but the drive only talks to the OS in terms of 512-byte sectors. And since we're discussing old Windows versions: NTFS has defaulted to using 4k clusters since its introduction, so there is NO performance penalty when using NTFS on these drives. You shouldn't be using FAT32 anyway.
3) The issue can be worked around by creating partitions with a tool that understands 4k sectors, or by re-aligning the partitions after creation/installation. If you only use a drive in those systems (i.e. no repartitioning), the drive will work as it should. Even if you create partitions that are unaligned, the drive will still work - you will only lose some performance.
4) The one genuine problem raised in the linked article comes when you want to use these drives in closed-firmware devices. In this case you still have two options: either you use the WD-provided jumper setting, or you pre-create the partitions before you insert the drive.
I fail to see what the fuss is all about.
I see four numbers there, representing the decimal numbers 0 through 3 inclusive. He's not saying that "00,01,10,11" are labels for types of people, he's saying the number of types is "11", which if read as a binary number is 3 in decimal.