Slashdot Mirror


Long Block Data Standard Finalized

An anonymous reader writes "IDEMA has finally released the LBD (Long Block Data) standard. This standard, in work since 2000, increases the length of the data blocks of each sector from 512 bytes to 4,096 bytes. This is an update that has been requested for some time by the hard-drive industry and the development of new drives will start immediately. The new standard offers many advantages — improved reliability and higher transfer rates are the two most obvious. While some manufacturers say the reliability may increase as much as tenfold, the degree of performance improvement to be expected is a bit more elusive. Overall improvements include shorter time to format and more efficient data transfers due to smaller overhead per block during read and write operations."

21 of 199 comments

  1. Re:Oh noes! by drinkypoo · · Score: 2, Informative

    Actually, they're going to take up eight times as much space... YOU FAIL IT! They will leave 3636 bytes of space unused in blocks, however, instead of only 112 bytes, so they'll be wasting over 32 times as much space. But then, won't ReiserFS already store multiple files in a single block in some cases?

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  2. Re:Higher Reliability? by 5pp000 · · Score: 5, Informative

    The longer block sizes add reliability because the error correcting codes have more to work with at a time (more data bits, but also more ECC bits).

    As for wasted space, that's under the filesystem's control, not the drive's.

    --
    Your god may be dead, but mine aren't!
  3. Re:Sounds like a good idea to me. by Animaether · · Score: 2, Informative

    In addition, it doesn't matter whether the file is less than 512 or, in this case, 4096 bytes. What matters is if the 'size % block_size' is non-zero. I.e. let's say the file is 4090 bytes. It will fit just fine, and you'll only waste 6 bytes. Now the file is 4100 bytes, only 4 bytes over. Except now you need 2 blocks, and thus waste 4092 bytes.

    Sure, on a multi-GB file that's not going to matter too much, as even on a TB drive you can only have a few hundred of those, and who's going to miss that 1MB?
    However, there's plenty of other files that hover between 1k and 10k, 10k and 100k, 100k and 1MB where those tiny fractions do add up.

    That said, GP is still right. Say you do have a TB drive.. unless you only have a few free MB left, you're not going to worry too much about the losses from block sizes.
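
The 'size % block_size' arithmetic above can be sketched in a few lines. This is only an illustration: the function name `slack_bytes` is made up here, and it assumes a filesystem that allocates whole blocks with no tail packing or suballocation.

```python
def slack_bytes(file_size: int, block_size: int) -> int:
    """Bytes left unused in the last block allocated to a file."""
    remainder = file_size % block_size
    # A file that ends exactly on a block boundary wastes nothing.
    return 0 if remainder == 0 else block_size - remainder


if __name__ == "__main__":
    for size in (4090, 4100):
        print(size, slack_bytes(size, 512), slack_bytes(size, 4096))
```

With 4096-byte blocks the 4090-byte file leaves 6 bytes unused and the 4100-byte file leaves 4092, matching the figures in the comment.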

  4. Error correction better over larger blocks by EmbeddedJanitor · · Score: 4, Informative
    If you're working with a certain number of ecc bits per data bit, then the number of corrections you can perform increases with an increased data block size. Oversimplifying, just for explanation here:

    Let's suppose you can fix one error per 512 byte block or 6 errors per 4096 byte block. Intuitively that might seem like a step back because 6/8 is smaller than 1, but that is not so. If you have 512-byte blocks and get two errors in a 512-byte sequence then that block is corrupt. However if instead you're using 4096 byte blocks then a 512-byte sequence within that block can have two errors since we can tolerate up to 6 errors in the whole block.

    Or put another way, consider a 4K sequence of data, represented by a sequence of digits dependent on the number of errors in each 512 bytes. 00000000 means no errors, 03010000 means 3 errors in the second block and 1 in the fourth block (i.e. a total of 4 errors in the whole 4096 bytes). With a scheme that can fix only one error per 512 bytes, the block with 3 errors cannot be corrected (because 3 > 1), but in the system which fixes up to 6 errors per 4096, the errors can be fixed because 4 < 6. This means that the ECC is far more reliable.
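
The digit-string notation above is easy to check mechanically. The sketch below uses the same deliberately oversimplified limits from the comment (1 error per 512 bytes, 6 errors per 4096 bytes); both function names are invented for this illustration and do not correspond to any real drive firmware.

```python
def correctable_per_sector(pattern: str, limit: int = 1) -> bool:
    """Each digit is the error count in one 512-byte sector.

    Every sector is its own codeword, so each must stay within the limit.
    """
    return all(int(digit) <= limit for digit in pattern)


def correctable_long_block(pattern: str, limit: int = 6) -> bool:
    """The whole 4096-byte block is one codeword; only the total matters."""
    return sum(int(digit) for digit in pattern) <= limit


if __name__ == "__main__":
    pattern = "03010000"
    print(correctable_per_sector(pattern))   # the sector with 3 errors exceeds 1
    print(correctable_long_block(pattern))   # 4 errors in total, within 6
```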

    --
    Engineering is the art of compromise.
    1. Re:Error correction better over larger blocks by hamanu · · Score: 2, Informative

      OK, yes you COULD move the parity data around but you'd get shitty performance. Hard drives are made so that each sector is independent of another. That makes each sector a separate codeword on disk. What you are proposing is to introduce dependency between sectors, and that would mean having to read adjacent sectors in order to write a single sector, which means going through 2 revolutions of the disk instead of one.

      --
      every _exit() is the same, but every clone() is different.
    2. Re:Error correction better over larger blocks by Anonymous Coward · · Score: 1, Informative

      Parity is an extremely bad error correction code. There are better ways, and your limit is set by Shannon's source coding theorem, http://en.wikipedia.org/wiki/Source_coding_theorem . That is, one treats the path from disc surface to computer memory as an information channel that has a certain probability of errors, and one wants to add extra redundancy to the data so the probability of failure decreases to some practical value, like only one error in 10 years.

      Shannon's theorem gives the theoretical minimal level of redundancy that is required to achieve this low overall probability. Practical error correction codes stay well above the minimal level, meaning that more space is wasted on the disk than strictly necessary. But with bigger blocks it is easier to get closer to the limit, so 4K blocks should allow for less wasteful redundancy.

    3. Re:Error correction better over larger blocks by Anonymous Coward · · Score: 1, Informative

      Most OS's coalesce small writes anyway (ref:delayed write), so you wouldn't get any performance degradation unless you are manually flushing the bytes to disk.

  5. Re:Why 4096? by 42forty-two42 · · Score: 3, Informative

    Using 4MB blocks for everything would kill memory performance - and more specifically, mmap performance. Each library loaded in your system would require at least 4MB of ram - probably more, as they have code, data, and zeroed data segments. Additionally, each process would require another 4MB*n. There's no gain for doing this either, except under specialized circumstances, as the OS can already request a batch of sectors from the drive in one operation.

  6. CD error recovery unrelated to block size by _Shorty-dammit · · Score: 2, Informative

    Block size has absolutely nothing to do with how much redundancy you can build in, and I fail to see the logic in assuming so. Makes absolutely no sense. The 2048 bytes stored on a sector of a CD only refers to your data, and absolutely none of them have anything to do with the CD's error-correction mechanisms. They add lots of extra bits to make up their error-correction, over and above your 2048 bytes of data. But, the point is it doesn't matter how much space you reserve to hold user data, you can arbitrarily reserve any amount of space you want for error-correction bits. You can have 16-byte sectors with 16MB of error-correction. Now, *that* would be a lot of redundancy. But certainly something you could do if you want to, and there's not going to be very many people arguing that those 16-byte sectors weren't covered by much redundancy. I doubt anyone would ever use that much redundancy, obviously, but it's just an outrageous example to show that the amount of redundancy has absolutely nothing to do with how much user data is stored per sector.

    1. Re:CD error recovery unrelated to block size by hamanu · · Score: 2, Informative

      The rate of a code measures how much redundancy it has, correct. But why do you think block length doesn't matter? Just because you have high redundancy doesn't mean your errors are going to magically be recoverable. To actually recover the data you need enough distance between valid codewords so that when a codeword is perturbed by errors you can still see which valid codeword it is closest to. With short block lengths you get small decoding distances, and low error correcting power. If you learn information theory a bit better you'll see Claude Shannon's channel "capacity" theory assumes infinite block length, and it does that for a REASON.

      --
      every _exit() is the same, but every clone() is different.
    2. Re:CD error recovery unrelated to block size by hamanu · · Score: 2, Informative

      I guess I should pre-emptively point out that for a hard drive you want to be able to modify each sector atomically, which means that a single sector corresponds to a single codeword, and increasing areal density means you need longer codewords to maintain error correction. So either you decrease the rate of the code and use extra redundancy, which lowers capacity and defeats the purpose of increasing areal density, or you use longer codewords at the same rate, which means using longer sectors.

      --
      every _exit() is the same, but every clone() is different.
  7. Re:Sounds like a good idea to me. by Anonymous+Cowpat · · Score: 2, Informative

    My HTPC has hundreds of files that are an average of 1 gigabyte and quite often, twice that size.
    So... 2 gigabytes?
    --
    FGD 135
  8. Re:Oh great by avxo · · Score: 4, Informative

    Now when I want to update just 256 bytes, instead of reading 512 bytes, changing 256 of them, and writing 512 back, I now have to do this with 4096 bytes. So I end up transferring 3584 more bytes than I otherwise needed to.
    So, your O/S requires that you issue all read and write operations using the hard drive's native block size? That must suck. What else must you do? Set up DMA manually in your app? Solder a microcontroller onto the board perhaps? Sarcasm aside, you seem to have a fundamental misunderstanding of what this change achieves, who it will affect, and how. Other posters have addressed those very issues eloquently, so I won't go into that.

    They really could do this transparently. Let the driver write anything in any range.
    Sorry to burst your bubble but it already is done transparently. The O/S lets you write anything -- from a single byte, to gigabytes -- transparently; all you do is tell the O/S read n bytes of file F so and so into buffer at x, or write m bytes from buffer at y into file F, which is the interface that 99% of programmers use. And after what you wrote above, I find it hard to believe that you are writing the specialized software, low-level drivers and/or controller microcode that could potentially be affected by this change.
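
For reference, the arithmetic in the quoted complaint (512 bytes versus 4096 bytes moved for a 256-byte update) can be sketched as below. This is a rough model only: `rmw_transfer` is a hypothetical helper, and it assumes an uncached read-modify-write of every block the update touches, which is the layer the O/S normally hides from applications.

```python
def rmw_transfer(offset: int, length: int, block_size: int) -> int:
    """Bytes read (and then written back) to update `length` bytes at `offset`.

    Assumes whole blocks are the only unit the device accepts and that
    nothing is already cached.
    """
    if length <= 0:
        return 0
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return (last_block - first_block + 1) * block_size


if __name__ == "__main__":
    small = rmw_transfer(0, 256, 512)
    large = rmw_transfer(0, 256, 4096)
    print(small, large, large - small)
```

Under those assumptions the difference is the 3584 bytes mentioned in the quote; with write coalescing in the O/S the cost would usually be amortized over many updates.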
  9. Re:Why 4096? by pchan- · · Score: 1, Informative

    Pretty much every paging-capable microprocessor in existence uses 4K memory blocks, thus why they're the natural size for a hard disk.

    AMD64/x86-64 uses 8KB pages. ARM uses 1KB pages.

  10. Re:Why 4096? by Scott+Wood · · Score: 3, Informative

    No, x64 and ARM both use 4K pages (though ARM has 1K subpages that you can set permissions on individually). Alpha and sparc64 use 8K pages, though.

  11. blocks and clusters by ceroklis · · Score: 4, Informative
    To all the posters complaining about the loss of space when they will be forced to use 4096 instead of 512 bytes to store their 20-byte file:

    • The cluster size (unit of disk space allocation for files) need not be equal to the physical block size. It can be a multiple or even a fraction of the physical block size. It is fairly probable that you are already using 4K clusters (or bigger), so this will not change anything. This is for example the case if you have an NTFS filesystem bigger than 2GB.
    • Not all filesystems waste space in this manner. Reiserfs or EXT3 can pack several small files in a "cluster".
    1. Re:blocks and clusters by ceroklis · · Score: 2, Informative

      s/EXT3/EXT4/

  12. Re:Sounds like a good idea to me. by EvanED · · Score: 2, Informative

    Ask Wikipedia

    It's in the table "Allocation and layout policies". Look at both tail packing and block suballocation.

    There are a few others that do, but not many. (JFS, QFS, NWFS, and VMFS are marked yes; NTFS and ZFS are marked partial.)

  13. No, the logic is not flawed. by EmbeddedJanitor · · Score: 2, Informative
    Consider it this way

    Let's say you have 4096 bytes arranged as 8x512-byte blocks and each block can correct one error. Now let's say that we RANDOMLY (i.e. statistically independently) introduce, say, 4 errors into that set of 8 blocks. Sometimes the errors will fall so that there is at most one error per block. That is correctable. Sometimes the errors will fall so that there is more than one in some block. In that case data will be lost.

    However, if we can correct up to, say, 6 arbitrarily placed errors per 4096 bytes, we can then have 4 errors anywhere in that block and we won't lose data. It does not matter whether they are spread out or clustered together; we can always handle those errors.

    This makes for stronger correction.
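
To put a number on the "sometimes" above, here is a small exact calculation. It is a sketch under stated assumptions: each error independently lands in one of the 8 sectors with equal probability, the correction limits are the illustrative ones from this comment (1 per 512 bytes, 6 per 4096 bytes), and the function names are made up for the example.

```python
from fractions import Fraction
from math import perm


def p_recoverable_per_sector(errors: int, sectors: int = 8) -> Fraction:
    """Probability that no sector receives more than one error.

    Counts the placements where all errors land in distinct sectors and
    divides by the total number of placements.
    """
    if errors > sectors:
        return Fraction(0)
    return Fraction(perm(sectors, errors), sectors ** errors)


def recoverable_long_block(errors: int, limit: int = 6) -> bool:
    """With one codeword per 4096 bytes, only the total error count matters."""
    return errors <= limit


if __name__ == "__main__":
    p = p_recoverable_per_sector(4)
    print(p, float(p))
    print(recoverable_long_block(4))
```

With 4 errors the per-sector scheme recovers the data in 105 out of 256 equally likely placements, roughly 41% of the time, while the long-block scheme recovers it in every case.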

    --
    Engineering is the art of compromise.
  14. Re:What about the MBR? by ottffssent · · Score: 3, Informative

    The word you're looking for is GPT. It has nothing to do with 4k hardware sectors, but it does support up to 128 partitions. Which ought to be enough for anybody (says the man with a 1 average number of partitions per disk in his household).

  15. Re:What about the MBR? by shawnce · · Score: 2, Informative