Slashdot Mirror


Changes in HDD Sector Usage After 30 Years

freitasm writes "A story on Geekzone tells us that IDEMA (Disk Drive, Equipment, and Materials Association) is planning to implement a new standard for HDD sector usage, replacing the old 512-byte sector with a new 4096-byte sector. The association says it will be more efficient. According to the article Windows Vista will ship with this support already."

69 of 360 comments

  1. Ah, error correction. by wesley96 · · Score: 5, Insightful

    Well, CD-ROMs use 2352 bytes per sector, ending up with 2048 actual bytes after error correction. Looking at the size of the HDDs these days a 4096-byte sector seems pretty reasonable.

    --
    Serving time in Aristotelean prison for violating laws of physics
    1. Re:Ah, error correction. by Ark42 · · Score: 5, Informative

      Hard drives do the same thing - for each 512 bytes of real data, they actually store nearly 600 bytes on the disk, with information such as ECC and sector remapping for bad sectors. There are also tiny "lead-in" and "lead-out" areas outside each sector which usually contain a simple pattern of bits to let the drive seek to the sector properly.
      Unlike CD-ROMs, I don't believe you can actually read the sector meta-data without some sort of drive-manufacturer-specific tricks.

    2. Re:Ah, error correction. by bjpirt · · Score: 2, Interesting

      I wonder if the 4096 bytes are before or after error correction. If it's after, it might make sense because (and I'm sure someone will correct me) isn't 4K a relatively common minimum size in today's filesystems? I know that the default for HFS+ on a Mac is.

    3. Re:Ah, error correction. by baadger · · Score: 2, Informative

      NTFS has a cluster/allocation size from 512 bytes to 64K. This determines the minimum possible on-disk file size, but I don't think it has too much to do with the sector size.

    4. Re:Ah, error correction. by alexhs · · Score: 5, Informative

      Unlike CD-ROMs, I don't believe you can actually read the sector meta-data

      What are you calling meta-data ?
      CDs also have "merging bits", and what is read as a byte is in fact coded on-disk as 14 bits, and you can't read C2 errors either, that are beyond the 2352 bytes that really are all used as data on an audio CD, an audio sector being 1/75 of a second, 44100/75*2(channels)*2(bytes per sample) = 2352 bytes and it has correction codes in addition too. You can however read subchannels (96 bytes / sector)
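
      The audio-sector arithmetic above can be checked in a few lines. This is only a sketch using the figures quoted in this thread; the constant and function names are made up for illustration.

```python
# Bytes in one audio CD sector, using the figures quoted in the comment above.
SAMPLE_RATE_HZ = 44100      # samples per second, per channel
SECTORS_PER_SECOND = 75     # one sector holds 1/75 of a second of audio
CHANNELS = 2                # stereo
BYTES_PER_SAMPLE = 2        # 16-bit samples

CDROM_USER_BYTES = 2048     # data payload per CD-ROM sector, as quoted earlier in the thread


def audio_sector_bytes() -> int:
    """Raw bytes per sector: samples per sector * channels * bytes per sample."""
    samples_per_sector = SAMPLE_RATE_HZ // SECTORS_PER_SECOND
    return samples_per_sector * CHANNELS * BYTES_PER_SAMPLE


raw = audio_sector_bytes()
print(raw)                     # 2352
print(raw - CDROM_USER_BYTES)  # 304 bytes per sector not available as CD-ROM payload
```

      The 304-byte difference is what a data sector gives up, relative to an audio sector, for things like sync, header and the extra error correction layer.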

      When dealing with such low-level technologies, reading bits on disk doesn't mean anything as there really are no bits on the disc, just pits and lands (CD) or magnetic particles (HD) causing little electric variations on a sensor, then no variation is interpreted as 0 and a variation is interpreted as a 1, and you need variations even when writing only 0's as a reference clock.

      without some sort of drive-manufacturer-specific tricks.

      Now of course, as you cannot change HD platters within different drive with different heads like you can do with a CD, each manufacturer can (and will !) encode differently. It has been reported that hard disks with the same reference wouldn't "interoperate" exchanging the controller part because of differing firmware versions, while the format is standardized for CDs or DVDs.

      they actually store near 600 bytes

      (that would be 4800 bits) In that light, they're not storing bytes, just magnetizing particles. Bytes are quite high-level. There are probably more than ten thousand magnetic variations for a 512-byte sector. What you call bytes is already what you can read :) But there is more "meta-data" than that.

      Here's an interesting read quickly found on Google just for you :)

      --
      I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
    5. Re:Ah, error correction. by Glooty-Us-Maximus · · Score: 2, Informative

      It doesn't, other than the FS block size should be a multiple of the disk sector size to avoid wasting extra read/writes to access/store a FS block, as well as to avoid wasting space storing an FS block.

  2. Cluster size? by dokebi · · Score: 2, Interesting

    I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes. So how does changing the sector size change things? (Especially when we don't access drives by sector/cylinder anymore?)

    --
    In Soviet Russia, articles before post read *you*!
    1. Re:Cluster size? by scdeimos · · Score: 5, Informative
      I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes.
      Cluster sizes are variable on most file systems. On our NTFS web servers we tend to have 1k clusters because it's more efficient to do it that way with lots of small files, but the default NTFS cluster size is 4k. LBA is just a different addressing scheme at the media level to make a volume appear to be a flat array of sectors (as opposed to the old CHS or Cylinder Head Sector scheme).
  3. Re:4MB by irimi_00 · · Score: 2, Funny

    Why not a 32768 bit sector?

  4. Re:That's nice by jcr · · Score: 4, Informative

    So... If I write down a little 16-byte message to myself in Notepad containing a name and a phone number, it will take up 4096 bytes.

    On most systems in use today, it already does.

    Blame the file system, not the sector size on the media.

    -jcr

    --
    The only title of honor that a tyrant can grant is "Enemy of the State."
  5. No, that's not 'sector' by wesley96 · · Score: 4, Informative

    You're thinking of 'cluster'. This is tied to the file system that is actually used on the disk. Even with the current 512-byte sector, a normal NTFS partition of, say, 200GB, uses 4KB cluster and a single file takes up a minimum of 4KB already.

    --
    Serving time in Aristotelean prison for violating laws of physics
    1. Re:No, that's not 'sector' by TapeCutter · · Score: 3, Interesting

      "So, all they doing is pushing this abstraction layer to the hardware, thus getting rid of an unnecessary layer, if I understand it correctly?"

      Nah, nothing that significant. The operating system does/should not "know" anything about how the data is physically stored by a device. The existing O/S storage abstractions will remain. (You may have trouble running a very old O/S but that would be just one of your problems)

      Every modern O/S uses disk space as virtual memory by reading and writing chunks of RAM to the HDD when it runs out of physical RAM. The standard HDD sector size is changing to the most commonly used O/S size for memory "pages" (RAM chunks written to disk).

      The larger size will (in theory) speed things up a tiny amount. The HDD will now read/write a "page" to disk in one sector rather than eight, meaning the HDD will perform fewer administrative functions to swap RAM back and forth to the disk. Hardly anyone will notice this, but constant minor tweaking of HDD internals has evolved them very rapidly. eg: In 1990 I paid $200AU for a second-hand 20MB HDD (~0.2 SECOND seek time!).

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  6. Hrm, that kind of makes sense... by I+kan+Spl · · Score: 2, Insightful

    Most "normal use" filesystems nowadays (FAT32, Ext3, HFS, Reiser) all use 4K blocks by default. That means that the smallest amount of data that you can change at a time is 4k, so every time you change a block, the HDD has to do 8 writes or reads. That would leave the drive performing 8x the number of commands that it would need to.

    As filesystems are slowly moving towards larger block sizes, now that the "wasted" space on drives due to unused space at the ends of blocks is not as noticeable, moving up the size on the underlying hardware also makes sense. I don't think that this can make things too much faster, but it would allow SATA drives (and SCSI also) to queue more commands in their internal buffers, as they will only be receiving one command per read/write that the filesystem does, instead of 8.

    --
    My UID is prime and so is this number: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0.
    1. Re:Hrm, that kind of makes sense... by Anonymous Coward · · Score: 3, Informative

      You're talking bullshit. In SCSI/SATA you can read/write big chunks of data (even 1MB) in just one command. Just read the standards.

    2. Re:Hrm, that kind of makes sense... by Anonymous Coward · · Score: 3, Informative

      I'm pretty sure he was talking about operations performed by the drive's internal controller, not those sent through the interface cable.

    3. Re:Hrm, that kind of makes sense... by darkmeridian · · Score: 2, Informative

      Grandparent is discussing "native command queueing", where the hard disk will parse the OS read/write calls and stack them in a way that optimizes hardware access. Pretend there are three consecutive blocks of data on the hard drive: 1, 2, and 3. The OS calls for 1, 3, and then 2. Instead of going three spins around, NCQ will read the data in one spin in 1, 2, 3 order but then toss it out to the OS in 1, 3, 2 order. Now, I'm not sure how much higher sector sizes will affect NCQ capability, because I thought it was limited by the amount of hardware cache.
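
      The 1, 3, 2 example above can be written as a toy model. This is a sketch of the description in this comment only, not of real drive firmware, which would also have to weigh head position and rotational angle; the function name `ncq_service` is hypothetical.

```python
def ncq_service(requests):
    """Toy model of the reordering described above.

    Returns (order the blocks are read off the platter,
             order the results are handed back to the OS).
    """
    # One sweep over the track: read the blocks in ascending order.
    read_order = sorted(set(requests))
    # Stand-in for the drive's cache holding what was just read.
    cache = {block: f"<data {block}>" for block in read_order}
    # Hand results back in the order the OS originally asked for them.
    return_order = [block for block in requests if block in cache]
    return read_order, return_order


read_order, return_order = ncq_service([1, 3, 2])
print(read_order)    # [1, 2, 3]
print(return_order)  # [1, 3, 2]
```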

      --
      A NYC lawyer blogs. http://www.chuangblog.com/
  7. Re:That's nice by Beryllium+Sphere(tm) · · Score: 2, Informative

    NTFS will write something that small into the MFT.

  8. Re:Quick Explain How! by AngelofDeath-02 · · Score: 5, Interesting

    Best analogy is a gym locker room
    You have, say, 10 lockers up and 20 lockers across
    You can only put one thing in a locker, so you can't put your gym shorts in the same one as your shoes. But if you have lots of socks, you can pile them in, and take up two or three if necessary.

    Space is wasted if you have a really big locker, but it's only holding a sock.

    Now, you've got to record where all of this stuff is, or you will take forever to find that sock. So you set aside a locker to hold the clipboard with designations.

    Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...

    --
    No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
  9. Good for small devices by BadAnalogyGuy · · Score: 4, Interesting

    Small devices like cellphones typically save files of several kilobytes, whether they be the phonebook database or something like camera images. Whether the data is saved in a couple large sectors or 8 times that many small sectors isn't really an issue. Either way will work fine, as far as the data is concerned. The biggest problem is the amount of battery power used to transfer those files. If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.

    Also, squaring away each sector after processing is a round trip back to the filesystem which can be eliminated by reading a larger sector size in the first place.

    Some semi-ATA disks already force a minimum 4096-byte sector size. It's not necessarily the best way to get the most usage out of your disks, but it is one way of speeding up the disk just a little bit more to reduce power consumption.

  10. Re:Vista by Anonymous Coward · · Score: 2, Funny

    Also, Solitaire will be replaced by Duke Nukem Forever on every shipped copy of Vista. And if you're one of the first 100 in line at any Best Buy when you pick up Vista, you will also get a free Phantom game console.

  11. In Vista already? by sinnerman · · Score: 5, Funny

    Well of course Vista will ship with this supported already. Just like WinFS...er..

  12. Re:That's nice by ars · · Score: 3, Informative

    Um, it already does take up 4K or more. Unless you have a hard disk smaller than 256MB.

    See: http://www.microsoft.com/technet/prodtechnol/winxppro/reskit/c13621675.mspx and scroll down to Table 13-4

    If you notice, in most of the useful cases the cluster size is 4K. Making the hard disk match this seems like a good idea to me.

    And EXT2 also uses a 4K block size.

    Also remember it's for large disks; no FS that I know of defaults to a cluster (or block) size smaller than 4K for large disks.

    --
    -Ariel
  13. Re:Quick Explain How! by BadAnalogyGuy · · Score: 5, Funny

    I'm willing to sell this account for the right price.

  14. Re:That's nice by Foolhardy · · Score: 3, Informative

    Actually, if you're using NTFS, the data will be stored directly in the file entry in the MFT, taking zero dedicated clusters or sectors. The maximum size for this to happen is like 800 bytes.

    Here's a short description of how NTFS allocates space. On volumes larger than 2GB, the cluster size (the granularity the FS uses to allocate space) was 4k already unless you specified something else when formatting the drive. Also, Windows NT has supported disk sector sizes larger than 512 bytes for a long time; it's just that anything else has been rare.

  15. Re:Quick Explain How! by realcoolguy425 · · Score: 5, Funny

    I'm sorry, Your response has to be in some form of star-trek (or sci-fi) I would have accepted this however...

      Best analogy is Spock's gym locker room

    Spock has, say, 10 space lockers up and 20 space lockers across

    Spock can only put one thing in a locker, so Spock can't put his gym shorts in the same one as his shoes. But since Spock has lots of socks, he can pile them in, and take up two or three if necessary.

    Space is wasted if Spock uses a really big locker, but it's only holding a sock.

    Now, you've got to record where all of this stuff is, or you will take forever to find that sock. (I guess the tricorders are broken) So Spock sets aside a locker to hold the clipboard with designations.

    Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...

  16. Why just one standard? by dltaylor · · Score: 2, Interesting

    Competent file system handlers can use disk blocks larger or smaller than the file system block size, but there are some benefits to using the same number for both. Although it may provide more data-per-drive to use larger blocks and you can index larger drives with 32-bit numbers, the drive has to use better (larger and more complex) CRCs to ensure sector data integrity, the granularity of replacement blocks may end up wasting more space simply to provide an adequate count of replacements, and there are still some disk space management tools that insist on working in terms of "cylinders", regardless of the fact that the disk drives have had variable density zones for ages. The range from 4K (common disk block size) to 16K works as a decent compromise.

    "Back in the day" running System V on SMD drives, where you could use almost any block size from 128 Bytes to 32K (the CRCs were weak after that) and control the cylinder-to-cylinder offset of block 0 from the index, I spent a few days trying different tuning parameters and found that, due to the 4K size of the CPU pages, and of the file blocks and swap it really did give a significant improvement in performance. I tried 8K and 16K, because the file system handler could be convinced to break them up, but didn't get any better performance, so used 4k for the spares granularity.

    Perhaps I should take one of my late-model SCSI drives, which support low-level reformatting, and try the tests again. 16KByte file system blocks on 16KByte sectors might really be a win now. Have to do some research to see what I can do with CPU page sizes, too.

  17. Re:4MB by LardBrattish · · Score: 3, Insightful

    Simple answer - every file would then have a minimum size of 4MB

    --
    What are you listening to? (http://megamanic.blogetery.com/)
  18. It's all about Format Efficiency by alanmeyer · · Score: 5, Informative

    HDD manufacturers are looking to increase the amount of data stored on each platter. With larger sector sizes, the HDD vendor can use more efficient codes. This means better format efficiency and more bytes to the end user. The primary argument is that many OSes already use 4K clusters.

    During the transition from 512-byte to 1K, and ultimately 4K sectors, HDDs will be able to emulate 512-byte modes to the host (i.e. making a 1K or 4K native drive 'look' like a standard 512-byte drive). If the OS is using 4K clusters, this will come with no performance decrease. For any application performing random single-block writes, the HDD will suffer 1 rev per write (for a read-modify-write operation), but that's really only a condition that would be found during a test.
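
    The read-modify-write penalty for 512-byte emulation can be sketched as a toy model. Everything here (the class name `Emulated512Drive`, its methods, the counters) is hypothetical and only illustrates the idea in this comment; it says nothing about how any real drive implements it.

```python
PHYSICAL = 4096               # native sector size in bytes
LOGICAL = 512                 # sector size presented to the host
RATIO = PHYSICAL // LOGICAL   # 8 logical sectors per physical sector


class Emulated512Drive:
    """Toy 4K-native drive that exposes 512-byte logical sectors."""

    def __init__(self, physical_sectors: int) -> None:
        self.media = [bytes(PHYSICAL) for _ in range(physical_sectors)]
        self.physical_reads = 0
        self.physical_writes = 0

    def write_logical(self, lba: int, data: bytes) -> None:
        """Single 512-byte write: needs a read-modify-write of the 4K sector."""
        assert len(data) == LOGICAL
        index, slot = divmod(lba, RATIO)
        sector = bytearray(self.media[index])   # read the whole physical sector
        self.physical_reads += 1
        sector[slot * LOGICAL:(slot + 1) * LOGICAL] = data  # patch 512 bytes
        self.media[index] = bytes(sector)       # write the whole sector back
        self.physical_writes += 1

    def write_cluster(self, first_lba: int, data: bytes) -> None:
        """Aligned 4K write: replaces the physical sector outright, no read needed."""
        assert len(data) == PHYSICAL and first_lba % RATIO == 0
        self.media[first_lba // RATIO] = bytes(data)
        self.physical_writes += 1

    def read_logical(self, lba: int) -> bytes:
        index, slot = divmod(lba, RATIO)
        self.physical_reads += 1
        return self.media[index][slot * LOGICAL:(slot + 1) * LOGICAL]
```

    In this model an aligned 4K cluster write costs one physical write, while a lone 512-byte write costs a physical read plus a physical write, which is the extra revolution mentioned above.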

  19. Seems good to me. by mathew7 · · Score: 3, Informative

    Almost all filesystems I know of use at least 4Kb clusters. NTFS does come with 512 byte on smaller partitions.
    LBA accesses on sector boundaries, so for larger HDDs you need more bits (currently 28-bit LBA, which some older BIOSes support, means a maximum of 128GB: 2^28*512=2^28*2^9=2^37). Since 512-byte sectors have been used for 30 years, I think it is easy to assume they will not last for 10 more years (getting to the LBA32 limit). So why not shave off 3 bits and also make it an even number of bits (12 against 9).
    Also there is something called "multiple block access" where you make only one request for up to 16 (on most HDDs) sectors. For 512-byte sectors you have 8K, but for 4K sectors that means 64K. Great for large files (IO overhead and stuff).
    On the application side this should not affect anyone using 64-bit sizes (since only the OS would know of sector sizes); as for 32-bit sizes, it already is a problem (4G limit).
    So this should not be a problem because on a large partition you will not have too much wasted space (I have around 40MB wasted space on my OS drive for 5520MB of files, and I would even accept 200MB)
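
    The LBA and multiple-block figures in this comment are plain powers of two, so they are easy to verify. A small sketch; the function names are made up for illustration.

```python
GIB = 2 ** 30
KIB = 2 ** 10


def max_capacity_bytes(lba_bits: int, sector_bytes: int) -> int:
    """Largest drive addressable with the given LBA width and sector size."""
    return (2 ** lba_bits) * sector_bytes


def multi_block_bytes(sectors_per_request: int, sector_bytes: int) -> int:
    """Data moved by one multiple-block request."""
    return sectors_per_request * sector_bytes


print(max_capacity_bytes(28, 512) // GIB)    # 128  (2^28 * 2^9  = 2^37 bytes)
print(max_capacity_bytes(28, 4096) // GIB)   # 1024 (2^28 * 2^12 = 2^40 bytes)
print(multi_block_bytes(16, 512) // KIB)     # 8
print(multi_block_bytes(16, 4096) // KIB)    # 64
```

    With the same 28 address bits, the 4096-byte sector moves the limit from 128GB to 1TB, which is the "shave off 3 bits" argument in numbers.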

  20. Re:4MB by Anonymous Coward · · Score: 2, Insightful

    It only means that a 4MB block would be the smallest atomic unit you could write on a disk. Writing to parts of it would require to first read it, then modify it, then write it. A lot of FS would implement this by always caching full blocks. But you could still pack many files in a single block. Most FS already work with pretty large (logical) block sizes (16KB ain't uncommon) and will "fragment" them for very small files. Databases often compact records end to end in a block.

    But of course, 4MB is fscking large. One problem would be to make them truly atomic. Current drives are supposed to have enough power to be able to complete a 512-byte write even if power is lost, and stuff like that.

  21. Because of R-M-W by alanmeyer · · Score: 3, Insightful

    Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.

    1. Re:Because of R-M-W by Mortlath · · Score: 2, Informative
      Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.

      That's why there is a system (Level 1, 2, and main memory) cache. Write-backs to the physical disk only occur when needed. That doesn't mean that 4MB would be a good sector size; it just means that write-backs are not the issue to consider here.

    2. Re:Because of R-M-W by jadavis · · Score: 2, Informative

      Many applications require that write cache is flushed.

      In his example, let's say it was a text editor. You change one letter in a document and save it; it must synchronously write the sector to disk, to the actual physical media. Otherwise, if the system crashes, you lose it, and most people don't like that in a text editor.

      Write cache at the disk level can be very bad. Databases may have no way of knowing that write cache is enabled, and tell you that your transaction is committed when it's really not. Of course, battery-backed RAID controllers are safe, but the consumer level disks with write cache enabled can mean trouble.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
  22. Boot sector virii by TrickiDicki · · Score: 5, Funny

    That's a bonus for all those boot-sector virus writers - 8 times more space to do their dirty deeds...

  23. You're all complaining about tiny files... by NalosLayor · · Score: 2, Insightful

    But really...think about this: if each sector has overhead, then any file over 512 bytes will have less overhead, and you'll effectively get more space in most cases. What percentage of YOUR files are less than 4k?

    1. Re:You're all complaining about tiny files... by mgblst · · Score: 2, Informative

      It is not just files that are less than 4k. It is almost all small files. Think about a 5k file that now uses 8k: almost 40% waste. A 9k file uses 12k: 25% waste. So the more small files you get, the more waste. The larger files you get, the less waste.
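
      The waste figures above follow from rounding each file up to a whole number of blocks. A minimal sketch, assuming a file system that always allocates whole blocks (no tail packing and no small files stored inside metadata); the function names are hypothetical.

```python
def allocated_bytes(file_bytes: int, block_bytes: int) -> int:
    """Space taken on disk when files are rounded up to whole blocks."""
    blocks = -(-file_bytes // block_bytes)  # ceiling division
    return blocks * block_bytes


def waste_fraction(file_bytes: int, block_bytes: int) -> float:
    """Fraction of the allocated space that holds no file data."""
    allocated = allocated_bytes(file_bytes, block_bytes)
    if allocated == 0:          # empty file: nothing allocated, nothing wasted
        return 0.0
    return (allocated - file_bytes) / allocated


print(allocated_bytes(16, 4096))          # 4096
print(waste_fraction(5 * 1024, 4096))     # 0.375
print(waste_fraction(9 * 1024, 4096))     # 0.25
```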

      Which is good, you don't really want lots of small files anyway.

      If you are using Windows, you can see how much space is wasted at the moment: just right-click on a directory, and it will tell you how much data is in the files, and how much disk space is actually used. It never really amounts to much.

    2. Re:You're all complaining about tiny files... by maxwell+demon · · Score: 2, Interesting

      This is of course only true for file systems which cannot allocate partial blocks.

      Of course one effect of the new sector size will be that old filesystem drivers, esp. those which come with old OSs, will likely not be able to use those disks. Which in effect means that if you want to use such a disk, you absolutely will have to upgrade your OS.

      --
      The Tao of math: The numbers you can count are not the real numbers.
  24. Re: Apple in the forground again by n.wegner · · Score: 4, Informative

    You could have added MS with FAT32 and NTFS. The problem is we're not talking about filesystem cluster sizes, which are software-configurable, but the disks' actual sector size, which is hardware that HFS+ has no effect on.

  25. Re:Quick Explain How! by Nefarious+Wheel · · Score: 2, Funny
    I think it all started with the first Vax 780, or possibly the first IBM 370 channel controller. Those old machines booted with a 7" floppy that had a capacity of 0.5k. Yep, 512 bytes. Early bootstraps could store the entire contents on to a hard disk with very few instructions if the sector size matched.

    Man that takes me back. Where's my toupee....

    --
    Do not mock my vision of impractical footwear
  26. It's so that ECC can handle bigger bad spots by Animats · · Score: 3, Interesting

    The real reason for this is that as densities go up, the number of bits affected by a bad spot goes up. So it's desirable to error correct over longer bit strings. The issue is not the size of the file allocation unit; that's up to the file system software. It's the size of the block for error correction purposes. See Reed-Solomon error correction.

  27. Re:What's the case for Linux? by Anonymous Coward · · Score: 2, Informative

    All major Linux file systems (except XFS) already support arbitrary sector sizes up to 4096 bytes, e.g. for s/390 Mainframes that traditionally use 4096 byte sectors on Linux.
    The people who would need to write support for this are Jeff Garzik (libata) and James Bottomley (scsi). It's not that this would require a terribly complicated patch though.

  28. How do you know how your data is actually stored ? by Horus1664 · · Score: 4, Insightful

    Modern DASD architecture is almost completely hidden from the user. In the (good?) old days system software needed to interface closely with the DASD and needed to understand the hardware architecture to gain maximum performance from the devices (I know because I work on such systems within an IBM mainframe environment on airline systems which require extremely high speed data access).

    Nowadays the disk 'address' of where the data actually resides is still couched in terms that appear to refer to the hardware itself, but in 'serious' DASD subsystems (e.g. the IBM DS8000 enterprise storage systems) the actual way in which the hardware handles the data is masked from the operating system. Data for the same file is spread across many physical devices and some version of RAID is used for integrity.

    The 4096 value for data 'chunks' has to do with the most efficient amount of data that can be transmitted down a DASD channel (between a host and storage in large systems or the bus in self-contained systems)

    The idea of a 'file address' would cease to exist and it would be replaced by a generic 'data address' if it weren't for the in-built assumptions about data retrieval within all current Operating Systems.

  29. Re:file size by Anonymous Coward · · Score: 2, Insightful

    5 K becomes 8 K.. times 500 ... is a whopping 1.5 MEGAbytes wasted. I mean, that is more than fits on a floppy. What a waste.

    Actually a 200 GB drive can still store 25 million files. How many fonts do you have?

    FWIW the advantage is in the error correction. For a 1-bit sector size, you'd need 3 bits to store it with error correction. As the block becomes larger, the error correction becomes more powerful. That is where the advantage is.
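
    The "1 bit needs 3 bits" figure matches a single-error-correcting Hamming code, so that code is used here as an illustration of how overhead shrinks with block size. This is an assumption made for the example: real drives use other codes (Reed-Solomon is mentioned elsewhere in this thread), and the function name is made up.

```python
def hamming_parity_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming code)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


for label, data_bits in [("1 bit", 1), ("512 bytes", 512 * 8), ("4096 bytes", 4096 * 8)]:
    parity = hamming_parity_bits(data_bits)
    print(label, parity, f"{parity / data_bits:.4%}")

# Parity bits needed: 2 for 1 bit (3 bits stored in total),
# 13 for a 512-byte block, 16 for a 4096-byte block.
```

    Going from 512 to 4096 bytes multiplies the data by 8 but adds only 3 parity bits in this model, so the relative overhead drops sharply.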

    Of course data can still be stored byte-wise on the disk - it is only that a small update will require a read-modify-write transaction.

  30. Redmond thinks they're so smart... by filterchild · · Score: 3, Funny

    Windows Vista will ship with this support already.

    Oh YEAH? Well Linux has had support for it for eleventeen years, and the Linux approach is more streamlined anyway!

  31. Re:What's the case for Linux? by Aggrav8d · · Score: 5, Funny

    I know I'm tired because I misread the first name as Inigo and the next thing through my head was

    "Hello. My name is Inigo Molnar. You changed the sectors. Prepare to die."

  32. Re:That's nice by Eunuchswear · · Score: 2, Insightful

    Informative, but wrong.

    Some file systems can pack multiple tail fragments into one block.

    --
    Watch this Heartland Institute video
  33. Re:4MB by Alioth · · Score: 3, Informative

    4Kbyte is the size of a page of memory on all modern architectures. Given all modern operating systems use demand page loading of executables, and implement paging (swap space), a sector size that matches the size of a memory page will probably result in better performance.
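
    The page size of the machine at hand can be checked rather than assumed. A small sketch using Python's standard `mmap.PAGESIZE`; the helper name `sectors_per_page` is hypothetical, and the value reported depends on the platform the code runs on.

```python
import mmap


def sectors_per_page(sector_bytes: int, page_bytes: int = mmap.PAGESIZE) -> int:
    """Number of disk sectors that must be transferred to fill one memory page."""
    return -(-page_bytes // sector_bytes)  # ceiling division


print(mmap.PAGESIZE)                              # page size reported by this machine
print(sectors_per_page(512, page_bytes=4096))     # 8
print(sectors_per_page(4096, page_bytes=4096))    # 1
```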

  34. File sizes by payndz · · Score: 2, Interesting

    Hmm. This reminds me of the time when I bought my first external Firewire drive (120Gb) and used it to back up my 10Gb iMac, which had lots of small files (fonts, Word 5.1 documents, etc). Those 10Gb of backups ended up occupying 90Gb of drive space because the external drive had been pre-formatted with some large sector size, and even the smallest file took up half a megabyte! So I had to reformat the drive and start again...

    --
    You must think in Russian.
  35. Re:30 years and now it's bumped up only 8x? by Alioth · · Score: 2, Informative

    All modern operating systems do demand page loading of executables and use paging space on disk (the swapper). Memory pages are all 4Kbyte on all the CPU architectures we are using at the moment in a personal computer. Therefore, 4Kbyte is probably the ideal size (since now loading a page into memory takes only one read command instead of 8). Making it bigger, say 8Kbyte, would complicate VMM design (since if you only need to load one page, you now wind up loading two and having to throw one away, or at best, you'd wait twice as long while 8kbyte loads instead of 4kbyte).

  36. Re:30 years doing what? by Derling+Whirvish · · Score: 5, Funny
    I have my eyes peeled for a bio-drive, something noxious smelling that you feed with potato rinds which stores your data directly in its DNA.

    That already exists. It's called a "child." Geeks might think they are hard to obtain, but in fact they tend to pop up unexpectedly quite often. They also have an audio interface, are touch-sensitive, run off of bio-mass fuel, and can even do the dishes after they have been around for a few years. They can be attached to a Playstation or an iPod too. When you first get them they are quite noisy and smelly with a few leaks, but that goes away after the break-in period. They don't come with a users manual though. Documentation is sparse. You have to get a third-party handbook.

  37. System Pages, RAID, Tail Blocks, and Addressing by KagatoLNX · · Score: 4, Insightful

    Actually, this almost can't be anything but a good thing.

    First of all, most OSes these days use a memory page size of 4k. Having your IO system page match your CPU page makes it much more efficient to DMA data and the like. Testing has shown that this is generally helpful.

    Second, RAID will benefit here. Larger blocks mean larger disk reads and writes. In terms of RAID performance, this is probably a good thing. Of course, the real performance comes from the size of the drive cache, but don't underestimate the benefit of larger blocks. Larger blocks mean the RAID system can spend more time crunching the data and less time handling block overhead. The fact that more data must be crunched for a sector write is of concern, but I'd bet it won't matter too much (it only really matters for massive small writes, not generally a RAID use case).

    Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache).
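
    The space effect of tail packing can be estimated with a toy calculation. This is not how ReiserFS actually lays out its tree; it is a sketch that packs file tails into shared blocks with a first-fit-decreasing strategy (an assumption chosen for simplicity) and ignores all metadata. The function names are hypothetical.

```python
def blocks_without_packing(sizes, block=4096):
    """Every file is rounded up to whole blocks."""
    return sum(-(-size // block) for size in sizes)


def blocks_with_tail_packing(sizes, block=4096):
    """Full blocks stay as they are; partial last blocks (tails) share blocks."""
    full_blocks = sum(size // block for size in sizes)
    tails = sorted((size % block for size in sizes if size % block), reverse=True)
    free_space = []  # free bytes remaining in each shared tail block
    for tail in tails:
        for i, free in enumerate(free_space):
            if tail <= free:            # first shared block with enough room
                free_space[i] = free - tail
                break
        else:
            free_space.append(block - tail)  # open a new shared block
    return full_blocks + len(free_space)


files = [5 * 1024, 9 * 1024, 16, 700]    # sizes in bytes
print(blocks_without_packing(files))     # 7
print(blocks_with_tail_packing(files))   # 4
```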

    Netware 4 did something called Block Suballocation. While not as tightly packed as Reiser tail blocks, it did split their larger 32kb or 64kb blocks (which were chosen to keep block addresses small and large file streaming faster) into disk sectors and store tails in them.

    NTFS has block suballocation akin to Netware, but Windows users are, to my knowledge, out of luck until MS finally addresses their filesystem (they've been putting this off forever). Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).

    To my knowledge, ReiserFS is the only filesystem with tail packing. If you are really interested in this, see your replacement brain on the Internet.

    Fourth, larger sectors means smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers were the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.

    Finally, someone mentioned sector size issues with defragmenters and disk optimizers. These programs don't really care as long as all of the sectors on the system are the same size. Additionally, they could be modified to deal with different sector sizes. Ironically, modern filesystems don't really require defragmentation, as they are designed to keep fragments small on their own (usually using "extents"). Ext2, Ext3, Reiserfs and the like do this. NTFS does it too, although it can have problems if the disk ever gets full (basically, magic reserved space called the MFT gets data stored in it and the management information for the disk gets fragmented permanently). If it weren't for a design choice (I wouldn't call it a flaw as much as a compromise) NTFS wouldn't really need defragmentation. ReiserFS can suffer from a limited form of fragmentation. However, v4 is getting a repacker that will actively defragment and optimize (by spreading out the free space evenly to increase performance) the filesystem in the background.

    I really don't see how this can be bad unless somebody makes a mistake on backwards compatibility. For those Linux junkies, I'm not sure about the IDE code, but I bet the SATA code will be overhauled to support it in a matter of weeks (if not a single weekend).

    --
    I think Mauve has the most RAM. --PHB (Dilbert Comic)
    1. Re:System Pages, RAID, Tail Blocks, and Addressing by McSnarf · · Score: 2, Insightful
      Forget waste of space in something as small as a sector.
      If this is an issue, you use the wrong application - one word file per phone number?

      File systems became simpler over time. This is a GOOD THING AND THE ONLY WAY TO GO.

      If you try to optimize too much, you end up with something like the IBM mainframe file systems from the 70s, which are still somewhat around.

      Create a simple file, called a data set? Sure, in TSO (what passes for a shell, more or less), you use the ALLOCATE command: http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/IKJ4C550/1.7.5?SHELF=&DT=20040721160158&CASE=

      Simple, isn't it ?

      Forget complicated file systems, let the hardware handle speed. And possibly defragmentation.

  38. Configurable Sector Size by cnvogel · · Score: 2, Insightful

    Wow, finally, a new block size, never heard of that idea before.

    Doesn't anyone remember that SCSI drives that support a changeable block size have been around since basically forever? Of course with hard disks it was used mostly to account for additional error-correcting / parity bits, but magneto-optical media could also be written with 512-byte or 2k blocks (if I remember correctly).

    (first hit I found: http://www.starline.de/en/produkte/hitachi/ul10k300/ul10k300.htm allows 512, 516, 520, 524, 528 but there are devices that do several steps between 128 and 2k or so...)

  39. Re:30 years doing what? by maxwell+demon · · Score: 5, Funny

    However, storing data in them can be a lot of effort (there are special institutions to help with that, called schools), and they are known to lose data every now and then. Moreover, there's often quite a bit of latency in reading data, and in some cases even repeated requests might not suffice to get at the data at all. The data reading speed isn't too fast either, and the writing speed is truly horrible. Moreover, they need years to completely start up (although some data can already be written and read during startup time), and they can't be switched off when you don't need them, because they won't restart again. Also, while they have a sleep mode, you cannot simply activate that. Usually it will only work at certain times, and even then they may refuse to go to sleep for quite some time. It seems, however, that many of them can be sent to sleep mode in the evening by sending them special large data streams (so-called bedtime stories). OTOH they must stay in sleep mode for quite some time to function properly, so don't even think of using them in a 24/7 application (although you have to prepare to support them 24/7, since sometimes they spontaneously end their sleep mode at unexpected times, and in that case they tend to demand immediate maintenance).

    All in all, they are not really a good replacement for a hard disk.

    --
    The Tao of math: The numbers you can count are not the real numbers.
  40. Size != storage by tomstdenis · · Score: 2, Informative

    You're all missing one key point. Your 512 byte sector is NOT 512 bytes on disk. The drive stores extra track/ecc/etc information. So a 4096-byte sector means less waste, more sectors, more useable space.
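    A back-of-the-envelope sketch of that point. The 88-byte overhead is taken from the "near 600 bytes per 512-byte sector" figure quoted earlier in this thread. Reusing the same overhead for a 4096-byte sector is purely an assumption for illustration; a real drive would likely spend more ECC on the larger sector, so the true gain would be smaller.

```python
def format_efficiency(payload_bytes, overhead_bytes):
    """Fraction of the on-disk bytes that hold user data."""
    return payload_bytes / (payload_bytes + overhead_bytes)


OVERHEAD = 88  # assumed per-sector ECC/sync/gap bytes, from the ~600-byte figure

for payload in (512, 4096):
    print(payload, round(format_efficiency(payload, OVERHEAD), 4))
```

    Under that assumption the per-sector overhead is paid once per 4096 bytes instead of eight times, which is where the extra usable space would come from.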

    Tom

    --
    Someday, I'll have a real sig.
  41. Re:That's nice by 91degrees · · Score: 2, Informative

    Uhmm... NO!

    This is a quick and dirty hack to check that the generated data is correct. I'm not going to spend weeks designing a data file format, and an API plus conversion tools to export the files to an Excel-compatible format, just because I've got an inefficient file system.

    A new hard drive would be a better investment. Or alternatively just ignore the problem since NTFS seems to handle these adequately.

    And sometimes it's simply impossible to write a solution that will work like this. Some applications require a large number of discrete files.

  42. Re:4MB by diegocgteleline.es · · Score: 3, Interesting

    Also, 4 KB is the size of a page in the x86 architecture. Some operating systems would have problems (i.e. they'd need to rewrite something) handling block sizes bigger than 4 KB.

  43. Re:4MB by ArsenneLupin · · Score: 2, Funny
    Yeah, that's a good idea. Then people with 10000 IE cookies (that stores each cookie in a 60 byte file) would suddenly have to have a 200gb harddrive just for IE cookie storage.

    Wow, that's nice. Time to add a small cgi script to my webserver, and link it as an image:

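    #!/bin/sh

    # The redirect counter arrives as the bare query string (cookie-madness.cgi?N),
    # which the web server passes as the first argument; default to 0 if absent.
    count=$(( ${1:-0} + 1 ))
    # Timestamp with sub-second digits (%N is a GNU date extension), trailing zeros trimmed.
    date=$(date +%s%N | sed 's/000$//')

    # IE gets another cookie and another redirect; everyone else gets the pixel.
    if echo "$HTTP_USER_AGENT" | grep -q MSIE; then
        echo "Location: cookie-madness.cgi?$count"
        echo "Set-Cookie: IE$date$count=sucks; expires=Sat, 03-Jan-2037 00:00:00 GMT"
        echo
    else
        echo "Location: empty-pixel.gif"
        echo
    fi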
  44. Re:It can't be. by kthejoker · · Score: 2, Informative

    Slight pedantry:

    Actually, it's 7200rpm, not rps. You get 120rps, so a platter rotation is actually 1/120 second.
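    The conversion spelled out, as a small sketch. The "average latency is half a rotation" line is the usual rule of thumb, not something measured on a particular drive.

```python
def rotation_time_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000 / rpm


def average_rotational_latency_ms(rpm):
    """On average the wanted sector is half a rotation away (rule of thumb)."""
    return rotation_time_ms(rpm) / 2


rpm = 7200
print(rpm / 60, "rotations per second")
print(rotation_time_ms(rpm), "ms per rotation")
print(average_rotational_latency_ms(rpm), "ms average rotational latency")
```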

  45. Re:OK, I'll ask.... by x2A · · Score: 2, Funny

    It's like the film... I, DEMA... about an intelligent disk drive who err... needed to save the world *cough*

    --
    The revolution will not be televised... but it will have a page on Wikipedia
  46. Re:What's the case for Linux? by Anonymous Coward · · Score: 2, Funny

    Now, offer me money.

    Power too. Promise me that.

    Offer me everything I ask for.

    I want my 512 byte sectors back, you son of a bitch.

  47. Re:4MB by hackstraw · · Score: 2, Informative

    4Kbyte is the size of a page of memory on all modern architectures.

    Huh? Which modern architectures?

    The only systems I run that still have 4k page sizes are x86 systems.

    x86-32 = 4k
    x86-64 = 4k
    G4,G5 = 4k
    alpha (64bit) = 8k
    sparc (64bit) = 8k
    ia64 = 16k

    and at least on the ia64 platform the page size is configurable at compile time.
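    If you want to check the machine in front of you rather than trust a table, something like this sketch works. It only reports what the running kernel uses, which, as noted above for ia64, may be a build-time choice rather than a fixed property of the CPU.

```python
import mmap


def page_size():
    """Page size, in bytes, of the system this interpreter is running on."""
    return mmap.PAGESIZE


def sectors_per_page(sector_bytes):
    """How many disk sectors of the given size fit in one memory page."""
    return page_size() // sector_bytes


print("page size:", page_size())
print("512-byte sectors per page:", sectors_per_page(512))
print("4096-byte sectors per page:", sectors_per_page(4096))
```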

  48. History of the 512-byte Sector Size by John_Sauter · · Score: 2, Informative

    In 1963, when IBM was still firmly committed to variable length records on disks, DEC was shipping a block-replaceable personal storage device called the DECtape. This consisted of a wide magnetic tape wrapped around a wheel small enough to fit in your pocket. Unlike the much larger IBM-compatible tape drives, DECtape drives could write a block in the middle of the tape without disturbing other blocks, so it was in effect a slow disk. To make block replacement possible all blocks had to be the same size, and on the PDP-6 DEC set the size to 128 36-bit words, or 4608 bits. This number (or 4096, a rounder number for 8-bit computers) carried over into later disks which also used fixed sector sizes. As time passed, there were occasional discussions about the proper sector size, but at least once the argument to keep it small won based on the desire to avoid wasting space within a sector, since the last sector of a file would on average be only half full.

  49. Re:You got modded up funny but? by ArsenneLupin · · Score: 2, Informative
    Won't this also affect lilo and the like?

    ... and it will affect FAT. Not only does FAT use 512 byte sector sizes, but it also makes sure almost the entirety of the filesystem is aligned on an odd boundary of sectors.

    (Boot sector is one, so we start off odd right after boot sector. There are usually 2 FAT copies (even), so after FAT offset stays odd. For root directory size, there is usually no compelling reason to make it an even size, however usually Windows makes it an even size anyways, guaranteeing that start of cluster space stays odd).

    So, to make a long story short, even if cluster size is a multiple of 4K, this won't help, because it is oddly aligned (meaning that each write of a 4K cluster would always straddle 2 sectors!)
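    Here is the alignment arithmetic as a sketch. The geometry (1 reserved sector, 2 FATs of 250 sectors, 512 root entries) is a made-up but plausible FAT16 layout, and the code assumes the volume itself starts exactly on a 4K physical boundary, which a real partition table may not guarantee.

```python
def fat_data_start(reserved, num_fats, fat_sectors, root_entries, sector=512):
    """First 512-byte sector of the cluster area, relative to volume start."""
    # Each FAT directory entry is 32 bytes; round the root directory up to sectors.
    root_sectors = (root_entries * 32 + sector - 1) // sector
    return reserved + num_fats * fat_sectors + root_sectors


def physical_sectors_touched(first_logical, count, logical_per_physical=8):
    """How many 4096-byte physical sectors a run of 512-byte sectors spans."""
    first = first_logical // logical_per_physical
    last = (first_logical + count - 1) // logical_per_physical
    return last - first + 1


# Hypothetical FAT16 geometry, for illustration only.
start = fat_data_start(reserved=1, num_fats=2, fat_sectors=250, root_entries=512)
print("cluster area starts at sector", start)
print("4K cluster touches", physical_sectors_touched(start, 8), "physical sectors")
```

    With an odd starting sector, every 8-sector cluster spans two physical sectors, so each cluster write becomes two read-modify-write cycles on the drive.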

    Presumably, Windows will make appropriately parametrized FAT systems once these disks become available, but there will be implications when restoring old FAT images on the new drives.

    BIOSes will also need to deal with these disks, or how will you be able to boot if you replace your old PC's hard disk with a 4K sector disk, while still keeping the old motherboard?

    And even if the BIOS can deal with it, forget about dd'ing your old system over to the new disk, because of the FAT issue mentioned above.

  50. Re:You got modded up funny but? by WWWWolf · · Score: 2, Interesting

    Well, current Linux bootloaders probably deal with lack of space just fine. For example, GRUB installs itself as 512-byte stub loader ("stage 1") + the rest of the boot loader stored in an ordinary file in the filesystem ("stage 2"). I don't think GRUB's design will change much: It's meant to be so that stage 2 and the menu.lst can be updated without touching the boot block, anyway.

    And it's probably not the OS or boot loader that sets limits to the boot block size, it's probably the BIOS that loads the stuff to memory...

  51. Re:4MB by John+Courtland · · Score: 2, Informative

    Yeah the PSE bit (bit 4) in CR4, here's some info: http://www.ddj.com/documents/s=961/ddj9605n/

    --
    Slashdot is proof that Sturgeon's Law applies to mankind.
  52. Re:LBA by jesup · · Score: 2, Interesting

    Not all operating systems use block/sector numbers at the device-driver level (and there are good arguments against it, though most OS's do it).

    The Amiga used byte-offsets and lengths for all IO's. This did eventually cause problems when disk drives (which started at 10-20MB when the Amiga was designed) got to 4GB, but a minor extension allowing 64-bit offsets solved that. 64-bit offsets shouldn't overflow very soon....

    For the device driver, it's no big deal to shift the offset if the sector size is a power-of-two, and it allows for weird-ass devices with non-power-of-two sector sizes (like old MAC SCSI drives), devices without a sector paradigm, etc all using the same API. Thus you can mount a 2048-byte block FS on a 512-byte sector device without knowing or caring; you can (with a cooperative device driver) mount a 512-byte FS on a 2048-byte sector device (if the device is willing to accept arbitrary-offset transfers, which they can, though it hurts speed), or mount a block-oriented FS on a bytestream-oriented device (like a file...).
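    Roughly what the driver-side conversion looks like, as a sketch. The function name is made up, and this is not the actual Amiga device API; it only shows the shift-versus-divide point.

```python
def offset_to_sector(byte_offset, sector_size):
    """Split a byte offset into (sector number, offset within that sector)."""
    if sector_size <= 0:
        raise ValueError("sector size must be positive")
    if sector_size & (sector_size - 1) == 0:
        # Power of two: a shift and a mask are enough.
        shift = sector_size.bit_length() - 1
        return byte_offset >> shift, byte_offset & (sector_size - 1)
    # Odd sizes (e.g. 520-byte sectors) need a real division.
    return divmod(byte_offset, sector_size)


for size in (512, 2048, 520):
    print(size, offset_to_sector(5000, size))
```

    The caller passes the same byte offset regardless of the device; only the driver needs to know the sector size.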

  53. Re:What's the case for Linux? by Alan+Cox · · Score: 2, Informative

    Linux has supported media with 4K and 2K blocksize for some years (about 7 I think offhand). 2K media comes up with optical disks a lot.

  54. Re:You got modded up funny but? by InfiniteWisdom · · Score: 2, Interesting

    You could easily have a "compatibility" mode where the interface returns 512 byte blocks even though it's stored internally as 4096-byte blocks. You'd sacrifice performance, of course, but that's probably not a huge issue when you're running legacy systems on newer hardware.
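    A toy model of such a compatibility mode, to show where the performance goes. The class and its counters are invented for illustration and do not describe any real drive firmware; the point is only that a 512-byte write turns into a read-modify-write of a whole 4096-byte block.

```python
class Emulated512:
    """512-byte logical sectors layered on 4096-byte physical sectors (toy model)."""

    PHYSICAL = 4096
    LOGICAL = 512
    PER_PHYSICAL = PHYSICAL // LOGICAL  # 8 logical sectors per physical one

    def __init__(self, physical_sectors):
        self.store = [bytearray(self.PHYSICAL) for _ in range(physical_sectors)]
        self.physical_reads = 0
        self.physical_writes = 0

    def read(self, lba):
        """Return one 512-byte logical sector."""
        phys, slot = divmod(lba, self.PER_PHYSICAL)
        self.physical_reads += 1
        start = slot * self.LOGICAL
        return bytes(self.store[phys][start:start + self.LOGICAL])

    def write(self, lba, data):
        """Write one 512-byte logical sector via read-modify-write."""
        if len(data) != self.LOGICAL:
            raise ValueError("logical sectors are exactly 512 bytes")
        phys, slot = divmod(lba, self.PER_PHYSICAL)
        block = bytearray(self.store[phys])  # read the whole physical sector
        self.physical_reads += 1
        start = slot * self.LOGICAL
        block[start:start + self.LOGICAL] = data  # patch the 512-byte slice
        self.store[phys] = block  # write the whole physical sector back
        self.physical_writes += 1


disk = Emulated512(physical_sectors=2)
disk.write(9, b"\x01" * 512)
print(disk.read(9) == b"\x01" * 512, disk.physical_reads, disk.physical_writes)
```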