Long Block Data Standard Finalized
An anonymous reader writes "IDEMA has finally released the LBD (Long Block Data) standard. This standard, in work since 2000, increases the length of the data blocks of each sector from 512 bytes to 4,096 bytes. This is an update that has been requested for some time by the hard-drive industry and the development of new drives will start immediately. The new standard offers many advantages — improved reliability and higher transfer rates are the two most obvious. While some manufacturers say the reliability may increase as much as tenfold, the degree of performance improvement to be expected is a bit more elusive. Overall improvements include shorter time to format and more efficient data transfers due to smaller overhead per block during read and write operations."
How do larger block sizes result in better reliability? Intuitively, I would almost think the opposite, since a single-byte corruption means a much larger block is now erroneous. I am obviously missing something, though.
-dave
http://millionnumbers.com/ - own the number of your dreams
Is there a good reason why 4096 was chosen? Is that just an artifact of this being designed in 2000? At this point very few files on the average system would be smaller than this. It seems to me they could have quite safely chosen something like 16k which would have improved things more, future proofed them more, yet still have been small enough as to not waste a tremendous amount of space (like if they chose 512k).
Why not make it variable, in that each drive can have its own value (limited to a power of 2, between 512 and say 512k)? That way drives today could be 4k, with drives in a few years being larger, without requiring another 7 years for a new standard.
Comment forecast: Bits of genius surrounded by a sea of mediocrity.
It's a brave new world. The 10megabyte hard drive is long dead and gone.
Files nowadays are, on average, huge. My HTPC has hundreds of files that average 1 gigabyte and are quite often twice that size.
NOTHING is 512 bytes anymore. Back in the early '80s (i.e., DOS 2.11) it may have seemed a great idea.
I'm going to give them the benefit of the doubt and consider that there may be good technical reasons I don't know about that explain why a standards body can't slap something together in a few weeks that says "block sizes are now 4096 bytes instead of 512". There must be. Otherwise, I think we have a new benchmark for bureaucratic inefficiency.
Not a typewriter
All of my 400b files are now going to take up 10 times as much space!!!
Heh, glad to see this is finally going through!
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
Yeah, why 4096 bytes? Why not 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 bytes? It seems to me to be the best option
To multiply 512 by 8?
FORMAT volume [/FS:file-system] [/V:label] [/Q] [/A:size] [/C] [/X]
FORMAT volume [/V:label] [/Q] [/F:size]
FORMAT volume [/V:label] [/Q] [/T:tracks /N:sectors]
FORMAT volume [/V:label] [/Q]
FORMAT volume [/Q]

  volume          Specifies the drive letter (followed by a colon),
                  mount point, or volume name.
  /FS:filesystem  Specifies the type of the file system (FAT, FAT32, or NTFS).
  /V:label        Specifies the volume label.
  /Q              Performs a quick format.
  /C              NTFS only: Files created on the new volume will be compressed
                  by default.
  /X              Forces the volume to dismount first if necessary. All opened
                  handles to the volume would no longer be valid.
  /A:size         Overrides the default allocation unit size. Default settings
                  are strongly recommended for general use.
                  NTFS supports 512, 1024, 2048, 4096, 8192, 16K, 32K, 64K.
                  FAT supports 512, 1024, 2048, 4096, 8192, 16K, 32K, 64K,
                  (128K, 256K for sector size > 512 bytes).
                  FAT32 supports 512, 1024, 2048, 4096, 8192, 16K, 32K, 64K,
                  (128K, 256K for sector size > 512 bytes).

                  Note that the FAT and FAT32 files systems impose the
                  following restrictions on the number of clusters on a volume:

                  FAT: Number of clusters <= 65526
                  FAT32: 65526 < Number of clusters < 4177918

                  Format will immediately stop processing if it decides that
                  the above requirements cannot be met using the specified
                  cluster size.

                  NTFS compression is not supported for allocation unit sizes
                  above 4096.

  /F:size         Specifies the size of the floppy disk to format (1.44)
  /T:tracks       Specifies the number of tracks per disk side.
  /N:sectors      Specifies the number of sectors per track.
Ben Hocking
Need a professional organizer?
What concerns me is why no one is working on improving density on my 3.5" microfloppy. I am running Windows 9.5 on an AMD-K5 HP box. I like to save Slashdot tirades, but typical Slashdot tirades consume more space than I have on my microfloppy.
I would love to see that post as +5 Redundant. We all know 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0 is going to be waaaaaay over abused for ever now, but at least that post was a good use of it.
-Rick
"Most people in the U.S. wouldn't know they live in a tyrannical state if it walked up and grabbed their junk." - MyFirs
It also means more wasted space on a Windows machine where the user wants a block size of, say, 512 bytes, or on OS/2 and eComStation's HPFS, which uses only 512 bytes to prevent space waste. It doesn't seem like much, but it does add up if you have a lot of files (pr0n, music, data, images, etc).
StarTrekPhase2 - The Five Year Mission Continues!
Trying to fit an entire virus into 512 bytes was always a challenge.. but 4096 bytes? That's too easy!
How we know is more important than what we know.
As others have pointed out, most file systems already allocate 4K per block.
Ben Hocking
Need a professional organizer?
These kinds of incremental standards are simply not forward-looking! I propose that the data block size be set to a minimum of 2^32 bytes.
Gamingmuseum.com: Give your 3D accelerator a rest.
Let's suppose you can fix one error per 512 byte block or 6 errors per 4096 byte block. Intuitively that might seem like a step back because 6/8 is smaller than 1, but that is not so. If you have 512-byte blocks and get two errors in a 512-byte sequence then that block is corrupt. However if instead you're using 4096 byte blocks then a 512-byte sequence within that block can have two errors since we can tolerate up to 6 errors in the whole block.
Or put another way, consider a 4 k sequence of data, represented by a sequence of digits giving the number of errors in each 512 bytes. 00000000 means no errors; 03010000 means 3 errors in the second block and 1 in the fourth block (i.e. a total of 4 errors in the whole 4096 bytes). With a scheme that can fix only one error per 512 bytes, the block with 3 errors cannot be corrected (because 3 > 1), but in the system which fixes up to 6 errors per 4096, the errors can be fixed because 4 < 6. This means that the ECC is far more reliable.
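For anyone who wants to poke at the digit-string argument, here is a minimal Python sketch. The limits (1 error per 512 bytes, 6 errors per 4096 bytes) are the hypothetical numbers from the comment above, not figures from any real drive or from the standard, and the function names are made up for illustration.

```python
# Hypothetical correction limits taken from the comment above, not from any real drive.
PER_SECTOR_LIMIT = 1   # errors fixable in each 512-byte sector
WHOLE_BLOCK_LIMIT = 6  # errors fixable anywhere in a 4096-byte block


def correctable_per_sector(errors_per_sector, limit=PER_SECTOR_LIMIT):
    """Every 512-byte sector must stay within its own limit."""
    return all(e <= limit for e in errors_per_sector)


def correctable_whole_block(errors_per_sector, limit=WHOLE_BLOCK_LIMIT):
    """Only the total over the 4096-byte block matters."""
    return sum(errors_per_sector) <= limit


# "03010000": 3 errors in the second sector, 1 in the fourth.
pattern = [int(d) for d in "03010000"]
print(correctable_per_sector(pattern))   # False
print(correctable_whole_block(pattern))  # True
```

With these particular limits the comparison cuts both ways: the pattern 11111111 (one error in every sector, 8 in total) passes the per-sector check but exceeds the whole-block limit of 6.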
Engineering is the art of compromise.
The larger block size is bad if you have a lot of little files, but better if you have fewer large files.
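A rough Python sketch of that tradeoff, assuming a simple filesystem where every non-empty file occupies a whole number of blocks (no tail packing, no data stored in inodes). The file sizes are made-up examples.

```python
def allocated_bytes(file_size, block_size):
    """Space used on disk if files occupy whole blocks (assumed model)."""
    blocks = -(-file_size // block_size)  # ceiling division on integers
    return blocks * block_size


def slack(file_sizes, block_size):
    """Total bytes allocated but not used by file data."""
    return sum(allocated_bytes(s, block_size) - s for s in file_sizes)


small_files = [400] * 1000   # a thousand 400-byte files (example only)
big_files = [2**30] * 10     # ten 1 GiB files (example only)

for name, files in (("small", small_files), ("big", big_files)):
    for block in (512, 4096):
        print(name, block, slack(files, block))
```

Under this model a 400-byte file goes from 512 to 4096 bytes on disk, which is 8 times the space rather than 10, while files that are an exact multiple of 4096 bytes waste nothing either way.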
This will just encourage Microsoft to stop using INI and XML files and to store more settings in the big REGISTRY...
Then you should really be considering a database.
Block size has absolutely nothing to do with how much redundancy you can build in, and I fail to see the logic in assuming so. Makes absolutely no sense. The 2048 bytes stored on a sector of a CD only refers to your data, and absolutely none of them have anything to do with the CD's error-correction mechanisms. They add lots of extra bits to make up their error-correction, over and above your 2048 bytes of data. But, the point is it doesn't matter how much space you reserve to hold user data, you can arbitrarily reserve any amount of space you want for error-correction bits. You can have 16-byte sectors with 16MB of error-correction. Now, *that* would be a lot of redundancy. But certainly something you could do if you want to, and there's not going to be very many people arguing that those 16-byte sectors weren't covered by much redundancy. I doubt anyone would ever use that much redundancy, obviously, but it's just an outrageous example to show that the amount of redundancy has absolutely nothing to do with how much user data is stored per sector.
You are correct. I just looked at one of my 0-byte files and found that it occupied 0 blocks. I guess I'm a little out of touch with file systems.
Ben Hocking
Need a professional organizer?
...if you have Windows loaded.
Engineering is the art of compromise.
Did the space for the bootloader just increase to 4096 as well? For those unaware, the BIOS loads just the first sector of the disk into memory, the bootloader takes it from there. It would certainly let them get a lot more resilient, now they only barf if things are not as expected.
Live today, because you never know what tomorrow brings
Now when I want to update just 256 bytes, instead of reading 512 bytes, changing 256 of them, and writing 512 back, I now have to do this with 4096 bytes. So I end up transferring 3584 more bytes than I otherwise needed to.
They really could do this transparently. Let the driver write anything in any range. Then have the drive take care of reading the data for any sector partially written, update it, and write it back. With a method like that, we really won't have to know the physical sector size (which could even be allowed to vary depending on the position on the drive). It should also be made smart enough that if the driver writes a long sequence of smaller traditional 512 byte sectors, it will know which real physical sectors are totally written and won't make any effort to pre-read them.
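Here is a small Python sketch of the bookkeeping such a drive would need, assuming 4096-byte physical sectors and byte-addressed writes. The function name and interface are hypothetical; this is not how any particular drive firmware works.

```python
PHYSICAL = 4096  # assumed physical sector size in bytes


def plan_write(offset, length, physical=PHYSICAL):
    """Return (first_sector, last_sector, sectors_to_preread) for a write.

    A physical sector needs a pre-read only when the write covers part of it;
    sectors that are completely overwritten can be written directly.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    end = offset + length                 # exclusive end of the write
    first = offset // physical
    last = (end - 1) // physical
    preread = []
    for sector in range(first, last + 1):
        start = sector * physical
        fully_covered = offset <= start and end >= start + physical
        if not fully_covered:
            preread.append(sector)
    return first, last, preread


print(plan_write(0, 256))        # (0, 0, [0])  partial write: read-modify-write
print(plan_write(4096, 8 * 512)) # (1, 1, [])   eight coalesced 512-byte writes
print(plan_write(512, 4096))     # (0, 1, [0, 1]) misaligned: two partial sectors
print(4096 - 512)                # 3584 extra bytes for the 256-byte update
```

The last case is the one to watch: a 4096-byte write that is not aligned to a physical sector touches two sectors and needs both pre-read.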
now we need to go OSS in diesel cars
Evidently, you don't think EFS3 is the right database?
I mean, it's not relational, but it's pretty damn powerful.
Eloi, Eloi, lema sabachtani?
www.fogbound.net
I have to disagree with the whole premise here. I know that people always say that longer is better when it comes to hard drives, but I've never had any reliability problems with my smaller one. Not only that, but I've had very fast transfer rates under all sorts of strenuous loads.
Wait, we're talking about storage devices? Never mind...
Thank God for evolution.
http://www.idema.org/_smartsite/modules/local/data_file/show_file.php?cmd=standards&cat=103&h=1#
...and they're not that clean.
Long Data Block Standards->Approved Standards says "There are currently no approved standards for this committee". (as of 15:30 PST)
Note to self: If I ever post anything to slashdot I will check I have my facts straight beforehand. I wouldn't want to look like (any more of) a twit. It's the internet equivalent of giving a speech only to realize you're on stage in nothing but your underwear.
See the work on ANSI T10 Data Integrity Field, that provides end-to-end error detection. It bumps the standard block size from 512 to 520 bytes.
Mea navis aericumbens anguillis abundat
This guy is a genius. C'mon.
09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C1 bottles of beer on the wall. Take one down, pass it round... Oh, umm...
I've spent time optimizing block size, along with block offset, sparing strategies, etc. for SMD drives. 4K blocks worked well with System V on 68K, since it matched the page size (x86, commonly, too). There were also many more small files in those days.
I find that 16K works better now, when I have been able to try it, either by reformatting SCSI drives or using virtual disks at various "stripe" sizes (virtual disks don't let me tinker with sparing, of course). There are tradeoffs in the number of bits required for ECC at different block sizes, and the sparing strategy must be carefully considered, but larger blocks are going to provide more data on the media and better throughput until they start conflicting with buffering and physical media geometry. Further, the proportion of small files (4k, or less) is dropping. "man" pages are getting larger, along with the "info" pages, and mail agents often aggregate messages in folders (or globally), rather than having a message-per-file. I suppose the proliferation of ".*rc" files brings up the proportion of small files a bit. If you're on a UFS, or equivalent, what proportion of your files are tucked into the inodes (yes, I know it's different, but remember why the facility exists)? One of my continual annoyances with ext2 is the stupidly small limitation on block size.
Paging is about the only downside to 16K blocks, but that's really a small kernel tweak to schedule paging in media, in addition to CPU, block sizes. Even without the change, the read 4, merge 1, write 4 for paging ('specially if you can use a gather list for write DMA, since the "merge" is virtually ;-) free), is a smaller cost than having larger file system blocks matched to media blocks is a gain.
Hopefully it will... but as I put my entire CD collection onto my machine, that will only alleviate MS's part in the wasted space issue... :-(
Many of the other issues a larger sector size alleviates would also be addressed if MS would revise NTFS so it wouldn't fragment. They know how, as they have access to the HPFS internals (HPFS rarely exceeds 1 or 2% fragmentation). That is something else I don't understand... they have the answers to many complaints about Windows (that being only one of them) and do nothing about it. I guess it's a matter of economics (like the ability to sell "better"/"full function" defrag tools to supplement/replace the ones that come in Windows)...
StarTrekPhase2 - The Five Year Mission Continues!
I heard it on talk radio, but there weren't any more details.
That can't be true, at least not like you stated it. If you can correct fewer than 1 error per 512 bytes, and if the locations of the errors are statistically independent, then you are worse off than before. So either: 1) you left out some critical information (namely, that the errors tend to occur in clusters), or 2) you are wrong. So, which is it?
Chances are you're reading 4096 bytes already. Disk access hasn't been sector oriented since before the turn of the century.
When our name is on the back of your car, we're behind you all the way!
It's about effing time!
512 bytes was good for floppy disks. I think we should have started upping the sector size around the same time as we hit the 528 MB 1024-cylinder limit back in the early '90s. Consider that a modern hard drive has anywhere from one-half to two billion sectors; that's some serious overhead for no reason. Error correction is "easier" if it's spread over larger blocks. Why? Because most files are quite large, and corrupting a 512-byte chunk is just as bad as corrupting a 4096 or 8192 byte chunk, because it's hosing the file either way. Might as well pool the ECC together and offer better protection for the large block, while still wasting fewer bits than the sum of all the small sectors' ECC. Even without the proposed ECC algorithm overhaul, larger blocks would allow more usable data per platter.
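To put rough numbers on the sector counts, a quick Python sketch. The 250 GB and 1 TB capacities are my own assumed examples (decimal gigabytes, as drive makers count them), chosen because they land at about one-half and two billion 512-byte sectors.

```python
def sector_count(capacity_bytes, sector_size):
    """Whole sectors that fit in the given capacity (ignores ECC and gaps)."""
    return capacity_bytes // sector_size


# Assumed example capacities, in decimal bytes.
capacities = {"250 GB": 250 * 10**9, "1 TB": 10**12}

for label, capacity in capacities.items():
    for size in (512, 4096):
        print(f"{label} at {size} bytes/sector: {sector_count(capacity, size):,}")
```

Going from 512 to 4096 bytes cuts the number of sectors, and with it the number of per-sector headers, gaps and ECC fields, by a factor of eight.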
The downside is that we've had 512 byte sectors for so long, everyone's hardcoded the number in their apps and drivers. The biggest risk involved is to patch all that software... one little glitch could hose a ton of data.
-Billco, Fnarg.com
4k is the common base of the most widely used operating system page sizes. :-)
(1k being more popular for embedded systems that don't have HDs anyway)
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
Let's say you have 4096 bytes arranged as 8x512-byte blocks and each block can correct one error. Now let's say that we RANDOMLY (i.e. statistically independently) introduce, say, 4 errors into that set of 8 blocks. Sometimes the errors will fall so that there is at most one error per block. That is correctable. Sometimes the errors will fall so that there is more than one in the same block. In that case data will be lost.
However, if we can correct up to, say, 6 arbitrarily placed errors per 4096 bytes we can then have 4 errors anywhere in that block and we won't lose data. It does not matter whether they are spread out or clustered together we can always handle those errors.
This makes for stronger correction.
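The odds can be counted exactly with a short Python sketch, under the assumptions stated in this comment: 4 errors, each landing independently and uniformly in one of 8 sub-blocks, with the hypothetical limits of 1 error per 512 bytes versus 6 per 4096 bytes.

```python
from fractions import Fraction
from itertools import product

SECTORS = 8   # 512-byte blocks inside one 4096-byte block
ERRORS = 4    # independently placed errors (assumed, as in the comment)

# Every way of assigning each error to a sector, all equally likely.
placements = list(product(range(SECTORS), repeat=ERRORS))

# Per-512 scheme survives only if no sector receives two or more errors.
survivable = sum(1 for p in placements if len(set(p)) == ERRORS)

p_small_blocks = Fraction(survivable, len(placements))
p_large_block = Fraction(1) if ERRORS <= 6 else Fraction(0)

print(p_small_blocks, float(p_small_blocks))  # 105/256 0.41015625
print(p_large_block)                          # 1
```

So under these assumptions the one-error-per-512 scheme survives 4 random errors only about 41% of the time, while the six-errors-per-4096 scheme always does.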
Engineering is the art of compromise.
Most RAID systems or databases (and even most filesystems) are using block sizes FAR larger than 4k already.
The still tiny 4k is only used because Intel and most other VM paging systems all use 4k. I haven't run a machine with paging turned on in years tho, I don't think most people do since RAM is so cheap.
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
Debian Finally Supports Long Block Data
S/360 JCL (and the S/360 DCB macro) had a mechanism for 4096-byte blocks in 1964:
RECFM=F,LRECL=4096
The first IBM disk drive to hold 4096 bytes on a track was the 2314, introduced in 1965.
Yes I think many, if not most, modern filesystems will. I don't really see why a zero-length file should be any harder to handle than a directory, and they generally don't take up any storage space on disk (they obviously do take some space, but not in the way most users think of it). Seems like you could just create an empty file by writing an inode to the table and specifying a zero-byte length, so it wouldn't be linked to or take up any actual data blocks on disk. It would be like a hard link, but pointing to nothing. That doesn't sound like it would be that hard to do, if you really wanted to -- if it's not possible, it's probably less because of infeasibility than because somebody thinks it would be a misfeature or cause problems.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
'Hopefully it will... but as I put my entire CD collection onto my machine, that will only alleviate MS's part in the wasted space issue... :-('
They would never do it and the reason why is the answer to your next question.
'Many of the other issues a larger sector size alleviates would also be addressed if MS would revise NTFS so it wouldn't fragment. They know how, as they have access to the HPFS internals (HPFS rarely exceeds 1 or 2% fragmentation). That is something else I don't understand... they have the answers to many complaints about Windows (that being only one of them) and do nothing about it.'
You are right that it is a matter of economics, but it isn't about a defrag tool. It is about hardware. If your drive gets fragmented and slows (and you don't know about fragmentation) you will think your computer is getting slow. For most people this takes a long time. If you have a slow computer, you buy a new one. That means you just bought a new computer and made the hardware manufacturers happy, and bought a new copy of Windows and made Microsoft happy.
That is the same reason Windows Vista is so bloated. Other systems (read: Linux and Mac OS X) implemented greater new functionality using a small fraction of the system resources. Microsoft's top paid programmers really aren't that bad; the system is intentionally inefficient. Being inefficient means it sells more hardware. Selling more hardware makes PC manufacturers happy, and they are the ones who provide Microsoft's bread and butter by preloading Windows on PCs.
Funny that I have to see some "finalisation" of something that has existed in DOS for ages (for reference, see DOS Int 13h, AH=05h: sector sizes from 128 to 1024 bytes could be chosen) but has never been used as such. (I have been wondering why a kludge like "clusters" was used. The answer is probably that at that time MS could more easily make software changes than demand hardware changes.)
It isn't something that 'no one seems to care about'; it's something that
bugs me every time my (Unix) machine boots up. All those pesky little /etc and
other minor files that get read on startup are an occasion to wipe the disk
head over 80 bytes of information and 4016 bytes of nulls.
The disk throughput for small files thus is 80/4096 of its raw throughput,
which means my three-minute boot time would be more like
four seconds if the filesystem 'solved' this little performance issue.
But the drive manufacturers have another issue to deal with- sectors
are separated by a blank unwritten space, and bigger sectors
mean fewer such (and that means more usable bits stored).
So, the 'theoretical' storage goes up because the intersector gaps are
reduced by a factor of eight. And the cost in small-file-access throughput
isn't part of the marketing 'speed' numbers (nor should it be; there are
buffering schemes that can mask it). So, the manufacturers see it as a win.
If it also makes ECC with multibit correction feasible, then the manufacturers
win in another way, too: instead of 1 bad bit causing a bad block (invalidating 512
bytes means throwing away 4095 good bits and one bad bit), they can correct
single-bit errors and 2-bit errors, and detect multibit errors, so can keep all the blocks
with one bad bit, and only throw away if there's (for instance) three bad bits.
Correction is slow, will occur in unpredictable patterns, and costs some processor
power, BUT modern S.M.A.R.T. reporting and other technology makes it OK for
the future. Didn't see anything in the article to indicate if ECC was gonna
be part of the drive or was gonna be dumped onto the host OS's device
driver. Or is ECC just someone's speculation?
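The boot-time figure above can be checked with a few lines of Python. This is only the comment's own back-of-the-envelope model: it assumes the whole three-minute boot is spent reading small files that carry 80 useful bytes per 4096-byte block, which ignores seeks and everything else.

```python
USEFUL_BYTES = 80       # payload per small file, from the comment
BLOCK_BYTES = 4096      # bytes actually read per file
BOOT_SECONDS = 3 * 60   # the three-minute boot

padding = BLOCK_BYTES - USEFUL_BYTES          # 4016 bytes of nulls
efficiency = USEFUL_BYTES / BLOCK_BYTES       # fraction of raw throughput used
ideal_boot = BOOT_SECONDS * efficiency        # boot time if only payload were read

print(padding)      # 4016
print(efficiency)   # 0.01953125
print(ideal_boot)   # 3.515625

# The bad-bit arithmetic: a 512-byte sector holds 4096 bits, so one bad bit
# discards 4095 good ones when the whole sector is thrown away.
print(512 * 8 - 1)  # 4095
```

That works out to roughly three and a half seconds, in line with the "four seconds" quoted above.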
The mere mention of shorter format times immediately makes me think that Windows "might" benefit from this: I've never had a long disk format time for ReiserFS, Ext3/2, or UFS. But FAT and NTFS: give me a break. The long format times really make Microsoft look stupid.
They did already. The difference is that you get a whole new hard drive all at once, rather than swapping platters in and out like you do with a floppy.
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Yup. I worked in data recovery during the late 80s. A number of manufacturers used larger sector sizes to allow for bigger hard drives in their MS-DOS machines. However, that meant they had to use a specially patched version of MS-DOS.
Every now and again someone would decide to upgrade to Microsoft's regular MS-DOS, or run Norton defragment... and their entire hard drive would get trashed, and we'd be called in to try and recover whatever we could.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
It *was* used, back in the 80s. For instance, Tandon sold MS-DOS machines with larger sector sizes.
The problem was there were lots of MS-DOS programs that accessed the disk directly, or didn't know about the sector size parameter. Hence people tended to end up with their data scrambled. (I worked in data recovery at the time.)
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
The other benefit I'm looking forward to is that flash disks will most likely become 8x faster in the near term. Since they tend to be constructed with 512 byte sectors, it'll make sense to use 8 of them in parallel to get the 4K sector size. (Somewhere down the line they may decide to just increase the flash sector sizes, but they don't actually need to just yet...)
-- *My* journal is more interesting than *yours*...
Keep in mind that the number of blocks per cylinder must be an integer. Increasing block size from 512 to 4096 makes the amount of storage lost to fragmentation at the cylinder boundary much worse.
Microsoft to offer an "open" (eherm) alternative to Long Block Data...
and tries to patent it
The MAFIAA is a bunch of mindless jerks who will be the first up against the wall when the revolution comes
Excellent points and thanks for the reply!
:-)
-Rob
StarTrekPhase2 - The Five Year Mission Continues!