Changes in HDD Sector Usage After 30 Years

4MB by Anonymous Coward · 2006-03-23 19:10 · Score: 1, Funny

Why not a 4MB sector?

Re:4MB by irimi_00 · 2006-03-23 19:13 · Score: 2, Funny

Why not a 32768 bit sector?
Re:4MB by LardBrattish · 2006-03-23 19:31 · Score: 3, Insightful

Simple answer - every file would then have a minimum size of 4MB

--
What are you listening to? (http://megamanic.blogetery.com/)
Re:4MB by Anonymous Coward · 2006-03-23 19:41 · Score: 2, Insightful

It only means that a 4MB block would be the smallest atomic unit you could write on a disk. Writing to part of it would require first reading the block, then modifying it, then writing it back. A lot of FS would implement this by always caching full blocks. But you could still pack many files in a single block. Most FS already work with pretty large (logical) block sizes (16KB ain't uncommon) and will "fragment" them for very small files. Databases often compact records end to end in a block.

But of course, 4MB is fscking large. One problem would be to make them truly atomic. Current drives are supposed to have enough power to be able to complete a 512-byte write even if power is lost, and stuff like that.
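
For anyone who wants the read-modify-write cycle spelled out, here is a minimal Python sketch. The `write_partial` helper and the toy in-memory `disk` are made up for illustration; a real drive or block layer does this in firmware or kernel code, and the sketch says nothing about atomicity on power loss.

```python
def write_partial(disk: bytearray, block_size: int, offset: int, data: bytes) -> int:
    """Write `data` at byte `offset` using only whole-block reads and writes.

    Returns the number of whole blocks that had to be rewritten.
    (Hypothetical helper, not any real driver API.)
    """
    if offset < 0 or offset + len(data) > len(disk):
        raise ValueError("write falls outside the device")
    if not data:
        return 0
    first = offset // block_size
    last = (offset + len(data) - 1) // block_size
    blocks_written = 0
    for block in range(first, last + 1):
        start = block * block_size
        # 1. Read the whole block into a buffer.
        buf = bytearray(disk[start:start + block_size])
        # 2. Modify only the bytes that overlap the requested range.
        lo = max(offset, start)
        hi = min(offset + len(data), start + block_size)
        buf[lo - start:hi - start] = data[lo - offset:hi - offset]
        # 3. Write the whole block back.
        disk[start:start + block_size] = buf
        blocks_written += 1
    return blocks_written


disk = bytearray(16)                      # toy device: four 4-byte blocks
print(write_partial(disk, 4, 3, b"AB"))   # 2: the 2-byte write straddles two blocks
```

The point of the sketch is that the cost of a small write scales with the block size, not with the number of bytes changed.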
Re:4MB by Nefarious+Wheel · 2006-03-23 19:56 · Score: 1

Why not a 4MB sector?
Depends on the size of your device and system cache, really. There's a functional difference between the size of a sector and the size of the space allocated for new files or file extensions. My advice? Let the hardware vendors decide what they want for the sector size, doesn't matter all that much. Then make sure any file extension or file initial size allocation is a healthy multiple of that. If you don't use it all, truncate on close. It's small allocations, not small sector sizes, that determine the rate of file fragmentation.

--
Do not mock my vision of impractical footwear
Re:4MB by Alioth · 2006-03-23 20:49 · Score: 3, Informative

4Kbyte is the size of a page of memory on all modern architectures. Given all modern operating systems use demand page loading of executables, and implement paging (swap space), a sector size that matches the size of a memory page will probably result in better performance.

--
Oolite: Elite-like game. For Mac, Linux and Windows
Re:4MB by Warg!+The+Orcs!! · 2006-03-23 20:53 · Score: 1

I'm thinking that the Anonymous Coward was thinking that 4096 bytes = 4Mb and so why not just call it a 4Mb sector instead of a 4096 byte sector.

Of course 4096 bytes is 4Kb not 4Mb.

--
Travelling forward in time at a rate of 1 second per second.
Re:4MB by incubuz1980 · 2006-03-23 21:23 · Score: 1

Solaris 9 on SPARC uses 8K by default, and you can change it.

incubuz@marvin ~
$ pagesize
8192

incubuz@marvin ~
$ pagesize -a
8192
65536
524288
4194304
morten@marvin ~
$
Re:4MB by DavidRawling · 2006-03-23 22:11 · Score: 1, Informative

Sorry, 4Kb is 4 Kilobits, not Kilobytes. Bytes is abbreviated with a capital B to distinguish from bits - so we are looking at 4KB sectors (32Kb).

On the other hand, don't you just love the fact that capacity is in powers of 2 when dealing at the sector level, but powers of 10 at the device level - and this from the same organisations!
Re:4MB by Anonymous Coward · 2006-03-23 22:20 · Score: 1

Yeah, that's a good idea. Then people with 10000 IE cookies (IE stores each cookie in a 60-byte file) would suddenly have to have a 200GB hard drive just for IE cookie storage.

Wow, they make some people smart
Re:4MB by Warg!+The+Orcs!! · 2006-03-23 22:40 · Score: 1

OK, my bad. KB not Kb, MB not Mb

--
Travelling forward in time at a rate of 1 second per second.
Re:4MB by tzot · 2006-03-23 22:58 · Score: 1

Sorry, 4Kb is 4 Kilobits, not Kilobytes. Bytes is abbreviated with a capital B to distinguish from bits - so we are looking at 4KB sectors (32Kb).
Sorry, 4 Kb is 4000 bits while 4 Kib is 4096 bits, and the discussed new sector size is 4 KiB (4096 bytes), not 4 KB (4000 bytes).
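
The four unit spellings in this sub-thread can be checked with a few lines of Python. This is only a sketch of the SI/IEC convention as described above; the constant and function names are made up for the example.

```python
BITS_PER_BYTE = 8

KILO = 1000   # decimal prefix "K"
KIBI = 1024   # binary prefix "Ki"

def to_bits(count: int, prefix: int, is_bytes: bool) -> int:
    """Convert `count` prefixed units to bits (b = bits, B = bytes)."""
    return count * prefix * (BITS_PER_BYTE if is_bytes else 1)

print(to_bits(4, KILO, False))  # 4 Kb  -> 4000 bits
print(to_bits(4, KIBI, False))  # 4 Kib -> 4096 bits
print(to_bits(4, KILO, True))   # 4 KB  -> 32000 bits (4000 bytes)
print(to_bits(4, KIBI, True))   # 4 KiB -> 32768 bits (4096 bytes)
```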

--
I speak England very best
Re:4MB by master_p · 2006-03-23 23:14 · Score: 1

Indeed...and a clever filesystem could store smaller than 4K files within one sector, in order to make more efficient use of space; I think Reiser FS does something similar.
Re:4MB by diegocgteleline.es · 2006-03-24 00:00 · Score: 3, Interesting

Also, 4 KB is the size of a page in the x86 architecture. Some operating systems would have problems (i.e., they'd need to rewrite something) handling block sizes bigger than 4 KB.
Re:4MB by DavidRawling · 2006-03-24 00:02 · Score: 1

Yes, it is. My comment about ignoring IEEE's interference in our well defined measurements got lost in one of my edits.
Re:4MB by LordKronos · 2006-03-24 01:06 · Score: 1

That is pretty much what is commonly used, but I seem to recall reading once that Intel processors also support a mode which can switch the page size to 4MB. It might have been documented in my Pentium architecture manual.
Re:4MB by ArsenneLupin · 2006-03-24 01:35 · Score: 2, Funny

Yeah, that's a good idea. Then people with 10000 IE cookies (IE stores each cookie in a 60-byte file) would suddenly have to have a 200GB hard drive just for IE cookie storage.
Wow, that's nice. Time to add a small cgi script to my webserver, and link it as an image:
#!/bin/sh
count=$1
let count=$count+1
date=`date +%s%N | sed 's/000$//'`
echo "$HTTP_USER_AGENT" | grep -q MSIE
if [ $? = 0 ] ; then
    echo Location: cookie-madness.cgi?$count
    echo Set-Cookie: "IE$date$count=sucks; expires=Sat, 03-Jan-2037 00:00:00 GMT"
    echo
else
    echo Location: empty-pixel.gif
    echo
fi
Re:4MB by ajs318 · 2006-03-24 02:11 · Score: 1

It's much more fun to use a .png image for the empty pixel. Internet Explorer still can't display transparent PNGs properly -- they show up white against the background.

--
Je fume. Tu fumes. Nous fûmes!
Re:4MB by Rakshasa+Taisab · 2006-03-24 02:12 · Score: 1

Well, 4 MB is also the size of a page in the x86 architecture.

--
- These characters were randomly selected.
Re:4MB by Spy+der+Mann · 2006-03-24 02:23 · Score: 1

Simple answer - every file would then have a minimum size of 4MB

Not necessarily, that depends on the partition system - two files could share the same sector. Personally I'm against huge block sizes; there are people with huge collections of small files (web designers, for example), and for them big blocks just waste resources.
Re:4MB by hackstraw · 2006-03-24 02:43 · Score: 2, Informative

4Kbyte is the size of a page of memory on all modern architectures.

Huh? Which modern architectures?

The only systems I run that still have 4k page sizes are x86 systems.

x86-32 = 4k
x86-64 = 4k
G4,G5 = 4k
alpha (64bit) = 8k
sparc (64bit) = 8k
ia64 = 16k

and at least on the ia64 platform the page size is configurable at compile time.
Re:4MB by Anonymous Coward · 2006-03-24 02:48 · Score: 0

4096 bytes should be enough for anybody
Re:4MB by Mr+Pippin · 2006-03-24 03:04 · Score: 1

Very valid point. In fact, the larger systems already allow even larger page sizes for VMM management. With larger and larger amounts of system RAM, managing memory at a 4K level starts consuming sizable amounts of CPU and memory itself. Moving to 16K or larger page sizes for VMM management is likely not that far off.

One item that I do think should be considered is some transactional overhead space, much like mainframe and AS/400 disk formats use. So, instead of using a 16K block size, you might use 16K plus a 32-byte header and trailer.
Re:4MB by Mr+Z · 2006-03-24 03:10 · Score: 1

Look up "tail packing."

--
Program Intellivision!
Re:4MB by Zaatxe · 2006-03-24 03:13 · Score: 1

Because your wasted space would be (2MB * number_of_files), silly.

Just to give you an example, the directory where I store my digital camera pics has about 7,500 files. The files make 6.61 GB and the used space is 6.63 GB. If the sectors in my HD were 4MB in size, the used space would jump to about 29.63 GB.

Me wouldn't like that...

--
So say we all
Re:4MB by Mr+Z · 2006-03-24 03:13 · Score: 1

Since when are the PowerPC G4 and G5 x86 systems? Or are you saying you don't run those anymore and you just included them for completeness?

--
Program Intellivision!
Re:4MB by John+Courtland · 2006-03-24 04:44 · Score: 2, Informative

Yeah the PSE bit (bit 4) in CR4, here's some info: http://www.ddj.com/documents/s=961/ddj9605n/

--
Slashdot is proof that Sturgeon's Law applies to mankind.
Re:4MB by hackstraw · 2006-03-24 05:27 · Score: 1

Since when are the PowerPC G4 and G5 x86 systems? Or are you saying you don't run those anymore and you just included them for completeness?

I noticed that after I hit submit.

For some reason, it didn't dawn on me to include my PowerBook G4 and my iMac G5. I got their pagesize, and threw the results in at the last minute. I was surprised that the G5 only used a 4k pagesize.
Re:4MB by dgatwood · 2006-03-24 05:40 · Score: 1

I'm not surprised. There's little advantage to a larger page size unless you're running 64-bit processes where your page table can get huge. Even then, for the average end-user app, you'll probably still find 4k page size to be slightly more efficient (on the average) simply because of the smaller amount of wasted space at the end of each allocation region.
It's not like the VM activity is limited to paging in a single page at a time. For performance optimization, it isn't uncommon to page in multiple adjacent pages at a time. Same goes for filesystems. Thus, moving to larger VM allocation sizes or HD block sizes just means more wasted space per unit of allocation (unless the OS does extra work to get around those limitations), with no real speed benefit (apart from possibly better page table update and software page table walk performance).
Much ado about nothing, IMHO.

--
Check out my sci-fi/humor trilogy at PatriotsBooks.
Re:4MB by Mr+Z · 2006-03-24 07:01 · Score: 1

The main thing is the overall working set system wide. As long as your TLB has a form of "PID filter," then you don't pay the TLB flush cost across tasks. In that case, you start gaining performance when the total working set of the machine fits nicely in the TLB.

Of course, how much of a difference it makes depends on the CPU. As I recall on the Athlon 64s, it has 32 TLB entries at L1 for 4K pages, and 512 entries at L2. (There are actually 40 TLB entries at L1. The remaining 8 are for 2MB/4MB pages.) This works out pretty nicely, as the total working set the 4K pages can cover is 128K at L1 (twice the L1D cache capacity), and 2MB at L2 (between 1x and 4x the L2 cache capacity). Thus, it seems unlikely that a workload that plays well in the cache would play poorly in the TLB.
--Joe
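
The coverage figures above are just entries times page size. A quick Python check, using the entry counts as recalled in this post (not verified against AMD documentation), with `tlb_reach` as a made-up helper name:

```python
KIB = 1024

def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of address space a TLB level can map without a miss."""
    return entries * page_size

l1 = tlb_reach(32, 4 * KIB)    # 32 L1 entries for 4K pages (as recalled above)
l2 = tlb_reach(512, 4 * KIB)   # 512 L2 entries

print(l1 // KIB, "KiB")   # 128 KiB
print(l2 // KIB, "KiB")   # 2048 KiB, i.e. 2 MiB
```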

--
Program Intellivision!
Re:4MB by Tekzel · 2006-03-24 07:12 · Score: 1

You know, I tried and tried to resist posting this. Really, I did.

Tail packing? Isn't that a popular sport in prison?
Re:4MB by ichimunki · 2006-03-24 10:25 · Score: 1

File system blocks do not have to be the same size as physical device sectors. A filesystem can be written to combine both physical sector location as well as byte location within that sector when determining where to start a file. Or so I hear.
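
A rough Python sketch of that idea: a file's start is stored as a plain byte offset, and the filesystem splits it into a sector number plus an offset inside that sector. The `locate` helper and the example offsets are hypothetical, not taken from any real filesystem.

```python
def locate(byte_offset: int, sector_size: int) -> tuple[int, int]:
    """Split a byte offset on the device into (sector number, offset within sector)."""
    return divmod(byte_offset, sector_size)

# Two small files that both begin inside sector 1 of a 4096-byte-sector disk.
print(locate(4096, 4096))   # (1, 0)
print(locate(5000, 4096))   # (1, 904)
```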

--
I do not have a signature
Re:4MB by Zaatxe · 2006-03-24 13:41 · Score: 1

True... but clusters smaller than sectors would probably ruin the performance. Well, that's just something I suppose; I haven't thought much about this (let alone tested it), so it's just a supposition.

And although this could ruin my point, it still sounds like a bad idea.

--
So say we all
Re:4MB by Wolfrider · 2006-03-24 20:25 · Score: 1

Reiserfs is good for lots of small files, due to its default "tail" behavior. There is a slight speed boost if you use the "notail" mount option - but then you lose the small-file packing.

--
.
== WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??

Ah, error correction. by wesley96 · 2006-03-23 19:10 · Score: 5, Insightful

Well, CD-ROMs use 2352 bytes per sector, ending up with 2048 actual bytes after error correction. Looking at the size of the HDDs these days a 4096-byte sector seems pretty reasonable.

--
Serving time in Aristotelean prison for violating laws of physics

Re:Ah, error correction. by Ark42 · 2006-03-23 19:28 · Score: 5, Informative

Hard drives do the same thing - for each 512 bytes of real data, they actually store near 600 bytes onto the disk with information such as ECC and sector remapping for bad sectors. There are also tiny "lead-in" and "lead-out" areas outside each sector which usually contain a simple pattern of bits to let the drive seek to the sector properly.
Unlike CD-ROMs, I don't believe you can actually read the sector meta-data without some sort of drive-manufacturer-specific tricks.

--
Morphing Software
Re:Ah, error correction. by bjpirt · 2006-03-23 20:50 · Score: 2, Interesting

I wonder if the 4096 bytes are before or after error correction. If it's after, it might make sense because (and I'm sure someone will correct me) isn't 4K a relatively common minimum size in today's filesystems? I know that the default for HFS+ on a Mac is.
Re:Ah, error correction. by baadger · 2006-03-23 21:23 · Score: 2, Informative

NTFS has a cluster/allocation size from 512 bytes to 64K. This determines the minimum possible ondisk filesize, but I don't think it has too much to do with the sector size.
Re:Ah, error correction. by alexhs · 2006-03-23 22:17 · Score: 5, Informative

Unlike CD-ROMs, I don't believe you can actually read the sector meta-data

What are you calling meta-data ?
CDs also have "merging bits", and what is read as a byte is in fact coded on-disc as 14 bits. You can't read the C2 error data either; it lies beyond the 2352 bytes, which really are all used as data on an audio CD. An audio sector is 1/75 of a second: 44100/75 * 2 (channels) * 2 (bytes per sample) = 2352 bytes, with correction codes in addition. You can, however, read subchannels (96 bytes / sector).
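
The audio-sector arithmetic is easy to verify. A small Python check using only the figures quoted in this post (the constant names are made up for the example):

```python
SAMPLE_RATE = 44100        # samples per second, per channel
SECTORS_PER_SECOND = 75    # one audio sector is 1/75 of a second
CHANNELS = 2
BYTES_PER_SAMPLE = 2       # 16-bit samples

samples_per_sector = SAMPLE_RATE // SECTORS_PER_SECOND
audio_sector_bytes = samples_per_sector * CHANNELS * BYTES_PER_SAMPLE

print(samples_per_sector)   # 588
print(audio_sector_bytes)   # 2352
```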

When dealing with such low-level technologies, reading bits on disk doesn't mean anything as there really are no bits on the disc, just pits and lands (CD) or magnetic particles (HD) causing little electric variations on a sensor, then no variation is interpreted as 0 and a variation is interpreted as a 1, and you need variations even when writing only 0's as a reference clock.

without some sort of drive-manufacturer-specific tricks.

Now of course, as you cannot change HD platters within different drive with different heads like you can do with a CD, each manufacturer can (and will !) encode differently. It has been reported that hard disks with the same reference wouldn't "interoperate" exchanging the controller part because of differing firmware versions, while the format is standardized for CDs or DVDs.

they actually store near 600 bytes

(that would be 4800 bits) In that light, they're not storing bytes, just magnetizing particles. Bytes are quite high-level. There are probably more than ten thousand magnetic variations for a 512-byte sector. What you call bytes is already what you can read :) But there is more "meta-data" than that.

Here's an interesting read quickly found on Google just for you :)

--
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
Re:Ah, error correction. by Glooty-Us-Maximus · 2006-03-23 22:18 · Score: 2, Informative

It doesn't, other than the FS block size should be a multiple of the disk sector size to avoid wasting extra read/writes to access/store a FS block, as well as to avoid wasting space storing an FS block.
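
To see why a block size that is a multiple of the sector size matters, here is a hedged Python sketch that counts how many physical sectors one FS block overlaps. It assumes blocks are laid out back to back from byte 0 of the device; `sectors_touched` is a made-up helper, not a real filesystem call.

```python
def sectors_touched(block_index: int, block_size: int, sector_size: int) -> int:
    """Number of physical sectors overlapped by FS block `block_index`."""
    start = block_index * block_size
    end = start + block_size - 1          # last byte of the block
    return end // sector_size - start // sector_size + 1

# Block size is a multiple of the sector size: always exactly 8 sectors.
print(sectors_touched(0, 4096, 512))   # 8
print(sectors_touched(1, 4096, 512))   # 8

# Block size is not a multiple: 3000 bytes need 6 sectors at best,
# but a block that starts mid-sector spills into a 7th.
print(sectors_touched(0, 3000, 512))   # 6
print(sectors_touched(1, 3000, 512))   # 7
```

A block smaller than the sector (say 1024 bytes on 4096-byte sectors) touches only one sector, but every write to it then becomes a read-modify-write of the whole sector.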
Re:Ah, error correction. by Anonymous Coward · 2006-03-23 23:04 · Score: 0

Unlike CD-ROMs, I don't believe you can actually read the sector meta-data without some sort of drive-manufacturer-specific tricks.

--sure can,, google it just like you did the info you put here, google lad.

(ohh golly, I can goggle for info)
Re:Ah, error correction. by dlZ · 2006-03-23 23:36 · Score: 1

Now of course, as you cannot change HD platters within different drive with different heads like you can do with a CD, each manufacturer can (and will !) encode differently. It has been reported that hard disks with the same reference wouldn't "interoperate" exchanging the controller part because of differing firmware versions, while the format is standardized for CDs or DVDs.

I've had to change the controller on a few hard drives for clients who did some really stupid things to their drives, but didn't want to pay to send them out and needed data off of them. It has worked on a few drives that were close or the same for production date/product runs, but then on the other hand, it hasn't worked for even more drives. I'm sure in a few of these cases the abuse was too severe and the drive itself was damaged, but a few times it was obvious that the controller just wasn't going to work on the other drive for whatever reason, be it firmware or some other little difference, even though the drive would seemingly start to work (many times the BIOS on the machine would find the drive with the different controller, but label it as gibberish). How so many people managed to damage the controller is beyond me, though. We luckily haven't seen this happen anytime recently.

--
rm -rf ./evidence @ punkcomp
Re:Ah, error correction. by GNUALMAFUERTE · 2006-03-24 02:06 · Score: 1

That's exactly what I thought when I first saw the article. This has been the default for ext2 and ext3 filesystems for years ... only this time it will be implemented in hardware.

--
WTF am I doing replying to an AC at 5 A.M on a Friday night?
Re:Ah, error correction. by networkBoy · 2006-03-24 03:19 · Score: 1

So does this mean that the smallest format level sector would be 4K then?
Going high level here: Larger sectors are better when you have larger files, smaller sectors are better if you have lots of tiny files.
A 1B file will consume one full sector, so I try to optimize my storage patterns: drives that store archives (all in the high 100's of megs to low gigs) use 64K sectors, while the drives that store lots of smaller files (master's research paper and notes) use 16K sectors. How does changing the sector size on the physical drive map to the logical sector size in the formatting?

On a side note: who remembers the days of ESDI where you got to pick how many sectors per track you wanted (33, 63, 64 SPT IIRC were the popular ones), whether or not to have sector sparing, and even whether you wanted to do some weirder stuff?
-nB

--
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
Re:Ah, error correction. by 2TecTom · 2006-03-24 04:22 · Score: 1

at one chop shop, where I was a sysop, we would routinely order the smallest drive within a series and when it arrived, swap the logic from the failed, larger capacity drive, from the same series & voilà! ... more megs

ah the joys of hacking

--
Words to men, as air to birds.
Re:Ah, error correction. by Jeremi · 2006-03-24 04:58 · Score: 1

at one chop shop, where I was a sysop, we would routinely order the smallest drive within a series and when it arrived, swap the logic from the failed, larger capacity drive, from the same series & voilà! ... more megs

Kind of reminds me of turning a low-density 3.5" floppy into a "high-density" one, with a hole punch. I hope you weren't storing anything important on those drives!

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:Ah, error correction. by ultranova · 2006-03-24 06:06 · Score: 1

How does changing the sector size on the physical drive map to the logical sector size in the formatting?

On Linux at least, it doesn't. The hardware drivers show the disks (or memory segments or whatever) as a single continuous file, and the filesystem drivers then read/write that like any other file.

I presume that all other unixy systems do it the same way. Dunno about DOS/Windows.

--
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
Re:Ah, error correction. by networkBoy · 2006-03-24 06:34 · Score: 1

Thank you, I now can say I've learned something today :-)
-nB

--
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
Re:Ah, error correction. by ak_hepcat · 2006-03-24 06:57 · Score: 1

Oh, the days of flipping my apple's 5.25" floppy over with a hole-punched addition.
nothing like an extra 170KiB....

--
Support FSF: Stop thinking with your wallet, and think with your imagination. (cc/non-commercial)
Re:Ah, error correction. by panZ · 2006-03-24 07:34 · Score: 1

You are missing the point. Most simple block device systems address 32 bits. Granted, some hard drive busses use up to 48 bits to address, but the reality is most OSes/drivers use 32-bit addressing because it is damn easy to work with on 32-bit systems. This means with 512-byte blocks, you can address 512*(2^32) bytes of storage: 2199023255552 bytes, or 2 terabytes. The thinking is that this will eventually be an impediment and that making the minimal addressable unit 8 times the size will give you 16 terabytes of addressable space.
I'm writing drivers for the new CE-ATA bus. The spec makers tried to force this change on the new drives but the drive makers, fortunately, made it optional.
Adapting disk drivers so that existing block drivers can deal with the difference in the interim can be a pain in the ass, requiring extra buffer copies for non-aligned block data. I really don't see the benefit. By the time a 2 terabyte block device or drive comes out, most CPUs will probably be 64-bit, making extended-bit arithmetic in drivers unnecessary, and I'm betting most drive interfaces will be 48 to 64 bit addressable (e.g. EIDE is 48 bit today).
Most of my fellow engineers think this change is half baked.
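
The addressing limits can be worked out directly: capacity is the number of addressable blocks times the block size. A minimal Python sketch (`max_capacity` is a made-up name), which shows the 32-bit limit lands in the terabyte range:

```python
TIB = 2 ** 40

def max_capacity(address_bits: int, block_size: int) -> int:
    """Largest device size in bytes reachable with `address_bits`-bit block addresses."""
    return (2 ** address_bits) * block_size

print(max_capacity(32, 512))            # 2199023255552 bytes
print(max_capacity(32, 512) // TIB)     # 2 (TiB)
print(max_capacity(32, 4096) // TIB)    # 16 (TiB)
print(max_capacity(48, 512) // TIB)     # 131072 (TiB), i.e. 128 PiB
```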

--
--Let's hack root on 127.0.0.1 --panZ
Re:Ah, error correction. by Anonymous Coward · 2006-03-26 08:10 · Score: 0

they stored all the company's data

to my knowledge, the drives worked perfectly

smaller drives result from fucking with firmware

welcome to market manipulation

What's the case for Linux? by szobatudos · 2006-03-23 19:10 · Score: 1

Is Ingo Molnar working on this?

Re:What's the case for Linux? by A+beautiful+mind · 2006-03-23 19:55 · Score: 1

Why would he be working on this? He works with RTS, not with filesystems.

Ask Hans Reiser maybe.

--
It takes a man to suffer ignorance and smile
Be yourself no matter what they say
Re:What's the case for Linux? by Anonymous Coward · 2006-03-23 20:06 · Score: 2, Informative

All major Linux file systems (except XFS) already support arbitrary sector sizes up to 4096 bytes, e.g. for s/390 Mainframes that traditionally use 4096 byte sectors on Linux.
The people who would need to write support for this are Jeff Garzik (libata) and James Bottomley (scsi). It's not that this would require a terribly complicated patch though.
Re:What's the case for Linux? by Aggrav8d · 2006-03-23 20:33 · Score: 5, Funny

I know I'm tired because I misread the first name as Inigo and the next thing through my head was

"Hello. My name is Inigo Molnar. You changed the sectors. Prepare to die."
Re:What's the case for Linux? by caveman · 2006-03-23 21:18 · Score: 1

Linux 2.0.35 had a patch allowing 2048-byte sectors on SCSI devices; handy if you had a Fujitsu Magneto-optical drive with a capacity of 640MB/disc or more, which used 2k sectors.

As the patch was done 'properly', a couple of tweaks of some constants and a recompile (if it isn't a run-time parameter already) should enable 4k sectors, 8k sectors, even 1MB sectors, if you really want to go there.
Re:What's the case for Linux? by Anonymous Coward · 2006-03-24 02:32 · Score: 2, Funny

Now, offer me money.

Power too. Promise me that.

Offer me everything I ask for.

I want my 512 byte sectors back, you son of a bitch.
Re:What's the case for Linux? by MikePikeFL · 2006-03-24 02:54 · Score: 1

Glad I'm not the only one.

--
"Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway" -Andrew Tanenbaum
Re:What's the case for Linux? by Alan+Cox · 2006-03-24 05:31 · Score: 2, Informative

Linux has supported media with 4K and 2K blocksize for some years (about 7 I think offhand). 2K media comes up with optical disks a lot.
Re:What's the case for Linux? by fuzznutz · 2006-03-24 06:02 · Score: 1

"Hello. My name is Inigo Molnar. You changed the sectors. Prepare to die."
Inconceivable!

That's nice by Mancat · 2006-03-23 19:11 · Score: 0, Troll

So... If I write down a little 16-byte message to myself in Notepad containing a name and a phone number, it will take up 4096 bytes. That's good. Thanks. I guess since disks are getting so large, they have to find a way to help me waste the space even faster.

--
hello dear sirs my name is jamesh i are india (bihar) can u guide me install red had linux 9?

Re:That's nice by Trevahaha · 2006-03-23 19:14 · Score: 1

Will something like NTFS disk compression actually make it less than one sector, or are you right that every file will take up at least 4096 bytes? Does anyone know if this will do anything to improve retrieval speed?
Re:That's nice by jcr · 2006-03-23 19:14 · Score: 4, Informative

So... If I write down a little 16-byte message to myself in Notepad containing a name and a phone number, it will take up 4096 bytes.

On most systems in use today, it already does.

Blame the file system, not the sector size on the media.

-jcr

--
The only title of honor that a tyrant can grant is "Enemy of the State."
Re:That's nice by Beryllium+Sphere(tm) · 2006-03-23 19:18 · Score: 2, Informative

NTFS will write something that small into the MFT.
Re:That's nice by entrex · 2006-03-23 19:20 · Score: 0

You think thats bad, you should see my porn collection.

--
To a nail, every person with a hammer looks like a problem.
Re:That's nice by GoingDown · 2006-03-23 19:21 · Score: 1

It is perhaps already taking it or even more.

Today's filesystems usually use larger chunks than 512 bytes to save data. But of course it depends on the filesystem you are using.

And judging by your talk about Notepad, you are using Windows. Windows NTFS uses 4k blocks by default on large (> 2GB) disks.
Re:That's nice by AuMatar · 2006-03-23 19:21 · Score: 1, Informative

OSes allocate disk space by the sector. If the size of a file is less than 1 sector, the rest is wasted. This is called internal fragmentation. Every file wastes sector_size-file_size%sector_size bytes. On average, that's sector_size/2 bytes per file.
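
A small Python sketch of that calculation, assuming space is handed out in whole allocation units and nothing like tail packing is in play. The helper names are made up. Note one difference from the formula as written above: for a file that is an exact multiple of the unit, the formula gives a full unit of waste, while this sketch gives zero.

```python
def allocated_size(file_size: int, unit: int) -> int:
    """Bytes on disk when space is handed out in whole `unit`-byte chunks."""
    if file_size == 0:
        return 0
    return -(-file_size // unit) * unit   # ceiling division, then scale back up

def slack(file_size: int, unit: int) -> int:
    """Internal fragmentation: allocated bytes that hold no file data."""
    return allocated_size(file_size, unit) - file_size

print(slack(16, 512))     # 496
print(slack(16, 4096))    # 4080
print(slack(4096, 4096))  # 0
print(slack(4097, 4096))  # 4095
```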

--
I still have more fans than freaks. WTF is wrong with you people?
Re:That's nice by ars · 2006-03-23 19:23 · Score: 3, Informative

Um, it already does take up 4K or more. Unless you have a hard disk smaller than 256MB.

See: http://www.microsoft.com/technet/prodtechnol/winxppro/reskit/c13621675.mspx and scroll down to Table 13-4

If you notice, in most of the useful cases the cluster size is 4K. Making the hard disk match this seems like a good idea to me.

And EXT2 also uses a 4K block size.

Also remember it's for large disks; no FS that I know of supports a cluster (or block) size smaller than 4K for large disks.

--
-Ariel
Re:That's nice by Anonymous Coward · 2006-03-23 19:23 · Score: 0

Too bad, it sucks for those of you that keep gazillion 2-line files. On the other hand, you free up a few bits that were previously used to keep track of all those extra sectors.
Re:That's nice by Foolhardy · 2006-03-23 19:27 · Score: 3, Informative

Actually, if you're using NTFS, the data will be stored directly in the file entry in the MFT, taking zero dedicated clusters or sectors. The maximum size for this to happen is like 800 bytes.

Here's a short description of how NTFS allocates space. On volumes larger than 2GB, the cluster size (the granularity the FS uses to allocate space) was 4k already unless you specified something else when formatting the drive. Also, Windows NT has supported disk sector sizes larger than 512 bytes for a long time; it's just that anything else has been rare.
Re:That's nice by carterhawk001 · 2006-03-23 19:53 · Score: 1

Who the hell uses notepad to make notes?
Re:That's nice by Anonymous Coward · 2006-03-23 19:54 · Score: 0

Hello?!?!

NOTEpad? What do you think it's for?
Re:That's nice by tacocat · 2006-03-23 19:58 · Score: 1

like /swap?
Re:That's nice by Kraeloc · 2006-03-23 20:21 · Score: 1

What would you use, Word? For simple text-only notes, less is better. If I need to jot down a phone number quickly before I forget, I'm just gonna hit notepad (or pico on unix) and be typing it in seconds; I don't want to wait several minutes for Word to get its bloated ass in gear.
Re:That's nice by ars · 2006-03-23 20:25 · Score: 1

What about /swap? Do you mean the swapfile partition in linux? That too uses a 4K page size.

--
-Ariel
Re:That's nice by arrrrg · 2006-03-23 20:34 · Score: 1, Insightful

Anyway, why do you give a flying fuck? You can get a 250 GB hard drive for less than $100 these days ... at that rate, that 4096 bytes costs you about $100/64000000~= $.0000016. Was that really worth your time bitching? I didn't think so.
Re:That's nice by Eunuchswear · 2006-03-23 20:40 · Score: 2, Insightful

Informative, but wrong.

Some file systems can pack multiple tail fragments into one block.

--
Watch this Heartland Institute video
Re:That's nice by Valdoran · 2006-03-23 20:40 · Score: 1

You've got porn vids of less than 4KB? Cool!
Re:That's nice by carterhawk001 · 2006-03-23 20:44 · Score: 1

https://addons.mozilla.org/extensions/moreinfo.php?id=2011&application=firefox
Re:That's nice by Anonymous Coward · 2006-03-23 20:59 · Score: 0

ReiserFS, for example.
Re:That's nice by Anonymous Coward · 2006-03-23 21:02 · Score: 0

Oh no! You'll be wasting 0.0004 cents (yes, I actually worked it out).
Re:That's nice by Anonymous Coward · 2006-03-23 21:20 · Score: 0

Overhead is a part of any disk drive or file system. If you don't like it, don't buy the disk. I, for one, think this will make I/O much more efficient.
Re:That's nice by 91degrees · 2006-03-23 21:34 · Score: 1

This can be a little inconvenient if you deal with a lot of small files though. Logging for the application I develop generates 2500 files. This is 10 megs of wasted space for each run. Okay, this still isn't all that significant but there may be people who generate considerably more than that.
Re:That's nice by geegs · 2006-03-23 21:57 · Score: 1

On Linux, the ReiserFS and Reiser4 filesystems "squish small files together" to avoid this problem.
Re:That's nice by Anonymous Coward · 2006-03-23 21:59 · Score: 0

Why the hell was the parent modded flamebait? If the GP had talked about an application that wrote 100s of thousands of small files, that would be one thing. But we all know no /.er has the millions of names & phone numbers to make this small amount of wasted bits matter at all .
Re:That's nice by odaen · 2006-03-23 22:11 · Score: 1, Insightful

On the alien planet Kamarr they can compress small file sizes together natively for the past 5 versions of Door. However we do not live on the planet Kamarr.
Re:That's nice by Anonymous Coward · 2006-03-23 22:46 · Score: 0

Hey, I wonder what size the smallest file which is undeniably pornographic could be?

A nice cross-disciplinary challenge. Do you compress tightly, or try to do something artistic with matchstick figures, perhaps? I suppose we would need two competitive divisions, color and B/W.

Over to /.
Re:That's nice by 91degrees · 2006-03-23 23:41 · Score: 2, Informative

Uhmm... NO!

This is a quick and dirty hack to check that the generated data is correct. I'm not going to spend weeks designing a data file format, and an API plus conversion tools to export the files to an Excel-compatible format, just because I've got an inefficient file system.

A new hard drive would be a better investment. Or alternatively just ignore the problem since NTFS seems to handle these adequately.

And sometimes its simply impossible to write a solution that will work like this. Some applications require a large number of discrete files.
Re:That's nice by Anonymous Coward · 2006-03-24 00:48 · Score: 0

Well, if you are too lazy to code up a proper solution, and instead would rather generate 2500 files per run, then you have to accept that your wasted space is the "dirt" part of "quick and dirty". Stop your bitching.
Re:That's nice by 91degrees · 2006-03-24 01:30 · Score: 1

Who's bitching? I merely make the observation that inefficient filesystems can cause a problem with small files if you have a large number of small files, illustrating it with an example of a typical case. Then you bitched about my code quality, and suggested an alternative that aside from being useless, is in certain cases impossible, and where possible doesn't solve the problem.
Re:That's nice by Anonymous Coward · 2006-03-24 02:01 · Score: 0

You're welcome. Because, you know... Don't store more than one phone number in a text file. That's just stupid.
Re:That's nice by Anonymous Coward · 2006-03-24 02:05 · Score: 0

If you were using a real OS and ReiserFS, you wouldn't have any problem with tiny files.

Forget that. If you were running something less brain-damaged than _Windows ME_ you'd have a filesystem that does it properly. Even NTFS can.
Re:That's nice by ceeam · 2006-03-24 02:12 · Score: 1

ReiserFS with its "tail conversion" roxorz.
Re:That's nice by 91degrees · 2006-03-24 02:32 · Score: 1

Uhm. This is a discussion forum. We're discussing whether the overhead is worth it for the improvements in I/O efficiency.
Re:That's nice by Anonymous Coward · 2006-03-24 02:36 · Score: 0

I actually shot milk out my nose with this one.
Re:That's nice by Anonymous Coward · 2006-03-24 02:36 · Score: 0

I'm no mod, but I presume it's because the poster is a tool.
Re:That's nice by Hatta · 2006-03-24 02:51 · Score: 1

For simple text-only notes, less is better.

But only for viewing.

--
Give me Classic Slashdot or give me death!
Re:That's nice by Fred_A · 2006-03-24 03:18 · Score: 1

Or Knotes (or the Gnome equivalent)

--

May contain traces of nut.
Made from the freshest electrons.
Re:That's nice by Anonymous Coward · 2006-03-24 03:22 · Score: 0

8===>

Is that pornographic? 5 bytes!
Re:That's nice by suitti · 2006-03-24 03:32 · Score: 1

And EXT2 also uses a 4K block size.
This is true by default, but is not a requirement. I just got a new 160 GB drive. See http://predelusional.blogspot.com/2006/03/using-pricewatch-effectively.html
By using a command like
mke2fs -j -b 1024 -m 0 -N 2000000 /dev/hdb1
one can have 1K-sized blocks. I argue that with mostly contiguous filesystems, the block size is largely irrelevant to performance. A small block size reduces the wasted space at the ends of files. For me, going from 4K to 1K gave me 4 GB more free space on my 160 GB drive for my current files - 2.5%.
System supported file compression would yield more, of course. Sure, the mp3 & jpg files won't compress, but I have lots of text files. A good compression system will know this and not bother compressing files that are incompressible. It might even use a file like magic cookie mechanism so it doesn't have to attempt compression to find out it's pointless.
If the improvement in space due to consolidated ECC codes and other overhead saves more total space than the end-of-file wastage, it would still be a win. The article doesn't say how much better the new standard will be.
It may be that Linux will provide 1K filesystem blocks on top of the 4K physical blocks. Performance will be worse. However, the original ext filesystem provided half K blocks, and that option is now moribund.
For new systems, this standard is fine. However, I run my machines into the ground, and the better ones have lasted fifteen years. Given that my current machine could last another ten years, it would be a shame to have to toss it into the landfill in five years because the disk drives can't be replaced. Progress is good. Forced upgrades are not.

--
-- Stephen.
Re:That's nice by Anonymous Coward · 2006-03-24 03:34 · Score: 0

And that doesn't even take into account the space your desktop search engine will use recording where it found your little string, or the file description information in the directory tree, or the formatting information that your text editor may associate with your text, or the fact that it might be in double-byte Unicode.

Fortunately the vast majority of your data is enormously larger than the sector size, making it more efficient to use larger sectors. Compared to that even a few thousand of your messages is insignificant, and if you have that many I suggest you choose a more appropriate program to track them all.
Re:That's nice by LubosD · 2006-03-24 04:04 · Score: 1

Use some better filesystem then. ReiserFS can group small files so they won't take that much space.
Re:That's nice by swillden · 2006-03-24 04:05 · Score: 1

You can get a 250 GB hard drive for less than $100 these days ... at that rate, that 4096 bytes costs you about $100/64000000~= $.0000016. Was that really worth your time bitching?
Maildir.

--
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
Re:That's nice by robnauta · 2006-03-24 05:24 · Score: 1

This is nonsense. All OSes allocate space by the cluster, typically 4K up to 32K for FAT32. The driver talks to the drive in 512-byte sectors, something which is never visible to the end user. If the sector size changes from 512 bytes to 4K it would have no effect on any Windows version; wastage would remain the same. Only hard disks and drivers would become more efficient and maybe a little faster. The only users affected would be people using fdisk-like tools that allow you to work in head/track/sector notation.
Re:That's nice by robnauta · 2006-03-24 05:42 · Score: 1

Use some better filesystem then. ReiserFS can group small files so they won't take that much space.
And if you read a tail end of a file, or a very small file, it still has to read the block, mask out the other data and shift it to the beginning of a block in memory and then return it.
Worse, a normal block write is just a write, but a small block write needs a read, an update in memory and a write.
With 300 GB disks the space saved is probably worth $0.0001; the slow speed remains. If you keep 100 GB of free space anyway, wouldn't you want to turn off this 'useful' feature?
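The read-modify-write cost described above can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: the `CountingDisk` class, the 4096-byte block size and the `write_tail` helper are invented for the example and say nothing about how ReiserFS is actually implemented.

```python
BLOCK = 4096  # assumed filesystem block size


class CountingDisk:
    """Toy block device: transfers whole blocks only and counts the I/O."""

    def __init__(self, blocks):
        self.data = bytearray(blocks * BLOCK)
        self.reads = 0
        self.writes = 0

    def read_block(self, n):
        self.reads += 1
        return bytearray(self.data[n * BLOCK:(n + 1) * BLOCK])

    def write_block(self, n, buf):
        assert len(buf) == BLOCK, "only whole blocks can be written"
        self.writes += 1
        self.data[n * BLOCK:(n + 1) * BLOCK] = buf


def write_tail(disk, block_no, offset, payload):
    """Update a small file packed inside a shared block: read, patch, write."""
    assert offset + len(payload) <= BLOCK
    buf = disk.read_block(block_no)             # fetch the neighbours' data too
    buf[offset:offset + len(payload)] = payload  # patch only our bytes
    disk.write_block(block_no, buf)              # write the whole block back


disk = CountingDisk(blocks=4)
disk.write_block(0, bytes(BLOCK))    # ordinary full-block write: one write
write_tail(disk, 1, 100, b"hello")   # packed tail: one read plus one write
print("reads:", disk.reads, "writes:", disk.writes)
```

In this model the packed write costs an extra read that the plain block write avoids; a real filesystem would usually hide much of that behind its block cache.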
Re:That's nice by ars · 2006-03-24 06:07 · Score: 1

"This is true by default, but is not a requirement."

I know. It's just a good idea to do so.

"By using a command like
mke2fs -j -b 1024 -m 0 -N 2000000 /dev/hdb1
one can have 1K-sized blocks. I argue that with mostly contiguous filesystems, the block size is largely irrelevant to performance."

And there you have the difference between theory and reality :) Actual testing finds you to be wrong. Remember there is 4 times the overhead in dealing with smaller block sizes - and also that when you don't match the page size of the CPU in question it adds even more overhead. You have a blocks in use map that is 4 times larger, and for every file the block list is 4 times larger. That wastes a little bit of space, and more overhead.

"A small block size reduces the wasted space at the ends of files. For me, going from 4K to 1K gave me 4 GB more free space on my 160 GB drive for my current files - 2.5%."

No it didn't. If you assume a maximum waste of 3K per file, that would mean you have well over 1 million files (roughly 1.4 million)! I somehow doubt that. Especially in light of the next paragraph:

The actual savings came from using -N 2000000. You significantly reduced the number of inodes you have on the HD. Inodes take up space and that's where your 4GB came from. And here's the best part - if you really do have millions of files on the system you are in trouble because you only have 2 million inodes!

"It may be that Linux will provide 1K filesystem blocks on top of the 4K physical blocks. Performance will be worse. However, the original ext filesystem provided half K blocks, and that option is now moribund."

I quite doubt that it would do so. Why would anyone want that? Performance would be embarrassing. And half K blocks are no problem when the sector size is half a K. You can't have a block size smaller than the sector size.

"For new systems, this standard is fine. However, I run my machines into the ground, and the better ones have lasted fifteen years. Given that my current machine could last another ten years, it would be a shame to have to toss it into the landfill in five years because the disk drives can't be replaced. Progress is good. Forced upgrades are not."

Did you even read TFA? First of all, the large sector sizes are only for large HDs. Buy a small one and you'll be fine. Second, they have backward compatibility modes - not for the hardware, but for the OS (i.e. Windows) which can't deal with unexpected sector sizes. Linux can handle the sector sizes no problem, so no landfill for you.

--
-Ariel
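The two competing explanations in this exchange can be put into numbers with a few lines of Python. Treat this as a back-of-the-envelope sketch: the 128-byte inode size and the one-inode-per-8192-bytes default are assumptions about mke2fs defaults of the period, not figures given in the thread, and the function names are made up.

```python
def files_needed(total_waste, waste_per_file):
    """How many files it takes to waste total_waste at waste_per_file each."""
    return -(-total_waste // waste_per_file)  # ceiling division


def inode_table_bytes(inode_count, inode_size=128):
    """Space taken by the inode tables; 128 bytes per inode is an assumption."""
    return inode_count * inode_size


GIB = 2**30
print("files needed to waste 4 GiB at 3 KiB each:",
      files_needed(4 * GIB, 3 * 1024))

disk_bytes = 160 * 10**9
default_inodes = disk_bytes // 8192  # assumed default bytes-per-inode ratio
saved = inode_table_bytes(default_inodes) - inode_table_bytes(2_000_000)
print("inode table space saved by -N 2000000:", saved, "bytes")
```

Under these assumptions the tail-waste explanation needs about 1.4 million files, while the smaller inode table accounts for a bit over 2 GB. The `-m 0` flag, which removes the reserved-block percentage, would also change the free space that tools report, so the 4 GB may well be a mix of effects.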
Re:That's nice by Anonymous Coward · 2006-03-24 07:26 · Score: 0

You've got porn vids of less than 4KB? Cool!

Thumbnails, dude.
Re:That's nice by Kraeloc · 2006-03-24 10:26 · Score: 1

How do you figure? I'd rather have annoying bloat slowing me down while reading a note than while writing one.
Re:That's nice by Kraeloc · 2006-03-24 10:30 · Score: 1

Indeed, that is useful. But I often have a terminal window open anyway, and always a blank Textedit window (I'm on OSX for normal usage), so it's just as easy to simply alt-tab over.
Re:That's nice by ameoba · 2006-03-24 13:55 · Score: 1

Why not just use a 16-character filename on an empty file?

--
my sig's at the bottom of the page.
Re:That's nice by Cow+Herd+(Anonymous) · 2006-03-24 13:56 · Score: 1

You should get a prize for completely missing the point. http://www.greenwoodsoftware.com/less/
Re:That's nice by Kraeloc · 2006-03-24 14:16 · Score: 1

Oh, god, I can't believe I missed that. I'm going to go hide in the corner now.

Quick Explain How! by realcoolguy425 · 2006-03-23 19:13 · Score: 0

Someone explain to me how this works! Why is this better? Oh yeah, you can only use sci-fi/Star Trek-y terminology or else it doesn't count.

Re:Quick Explain How! by AngelofDeath-02 · 2006-03-23 19:18 · Score: 5, Interesting

Best analogy is a gym locker room
You have, say, 10 lockers up and 20 lockers across
You can only put one thing in a locker, so you can't put your gym shorts in the same one as your shoes. But if you have lots of socks, you can pile them in, and take up two or three if necessary.

Space is wasted if you have a really big locker, but it's only holding a sock.

Now, you've got to record where all of this stuff is, or you will take forever to find that sock. So you set aside a locker to hold the clipboard with designations.

Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...

--
No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
Re:Quick Explain How! by Anonymous Coward · 2006-03-23 19:24 · Score: 0

Who the hell has more than one gym locker and can't put their shorts with their shoes? LOL :-) Other than that, good explanation.
Re:Quick Explain How! by BadAnalogyGuy · 2006-03-23 19:24 · Score: 5, Funny

I'm willing to sell this account for the right price.
Re:Quick Explain How! by realcoolguy425 · 2006-03-23 19:28 · Score: 5, Funny

I'm sorry, your response has to be in some form of Star Trek (or sci-fi). I would have accepted this, however...

Best analogy is Spock's gym locker room

Spock has, say, 10 space lockers up and 20 space lockers across

Spock can only put one thing in a locker, so Spock can't put his gym shorts in the same one as your shoes. But since Spock has lots of socks, he can pile them in, and take up two or three if necessary.

Space is wasted if Spock uses a really big locker, but it's only holding a sock.

Now, you've got to record where all of this stuff is, or you will take forever to find that sock. (I guess the tricorders are broken) So Spock sets aside a locker to hold the clipboard with designations.

Now to bring this back into real life. There are a _lot_ of sectors on a disk. So keeping track of all of them starts requiring a substantial amount of resources. I imagine they are finding it easier to justify wasting space for small files in order to make it easier to keep track of them. Average file sizes are also going up, so it's not as big of a problem as it used to be either. It's all relative...
Re:Quick Explain How! by Bob54321 · 2006-03-23 19:41 · Score: 1

Funniest comment I have read in ages...

--
:(){ :|:& };:
Re:Quick Explain How! by Nefarious+Wheel · 2006-03-23 20:02 · Score: 2, Funny

I think it all started with the first Vax 780, or possibly the first IBM 370 channel controller. Those old machines booted with a 7" floppy that had a capacity of 0.5k. Yep, 512 bytes. Early bootstraps could store the entire contents on to a hard disk with very few instructions if the sector size matched.
Man that takes me back. Where's my toupee....

--
Do not mock my vision of impractical footwear
Re:Quick Explain How! by MichaelSmith · 2006-03-23 20:34 · Score: 1

I think it all started with the first Vax 780, or possibly the first IBM 370 channel controller. Those old machines booted with a 7" floppy that had a capacity of 0.5k. Yep, 512 bytes. Early bootstraps could store the entire contents on to a hard disk with very few instructions if the sector size matched.
The 11/750 loaded its microcode from a little magnetic tape. It used to take (seemingly) ages to get going. I used to boot PDP 11/84's and 83's from TK50 tape. This was in traffic signal cabins out in the middle of nowhere, usually at 0200 or so. I could go for a walk and listen for the console printer to start chattering as the system came up.

I think the pdp's are the reason I now use NetBSD. Not sure why. Just a similar feel.

--
http://michaelsmith.id.au
Re:Quick Explain How! by BuR4N · 2006-03-23 21:08 · Score: 1

Thanks for the first laugh of the day!!! :)

--
http://www.intellipool.se/ - Intellipool Network Monitor
Re:Quick Explain How! by SleepyHappyDoc · 2006-03-23 21:17 · Score: 1

Oh, come on. A Vulcan would never store his gym shorts in someone else's locker.

--
Stasis is death. Embrace change.
Re:Quick Explain How! by MooUK · 2006-03-23 22:42 · Score: 1

Yesterday, we had Soviet Britannia. Today we have this.

What're we going to get to top that tomorrow?
Re:Quick Explain How! by Anonymous Coward · 2006-03-24 00:25 · Score: 0

Just tell me one thing: did you create this account, and maintain the persona of "BadAnalogyGuy," just so that some day, when someone else made a really bad analogy, you'd be able to make this joke?
Re:Quick Explain How! by BadAnalogyGuy · 2006-03-24 00:27 · Score: 1

I could make this joke every fucking day on this site. To be quite frank, 90% of the people who post on this site could step into my shoes quite easily.
Re:Quick Explain How! by LordKronos · 2006-03-24 00:56 · Score: 1

Did you come up with the name BadAnalogyGuy? It sounds like it could have been one of the superhero names on "Whose Line Is It Anyway?"
Re:Quick Explain How! by zerocool^ · 2006-03-24 01:34 · Score: 1

That post was illogical.

--
sig?
Re:Quick Explain How! by BadAnalogyGuy · 2006-03-24 01:48 · Score: 1

I'm sure they probably did something like that on the show. But this name itself came to me like a bolt out of the blue. Angels sang and birds chirped and thus I was born, from the forehead of Zeus, you could say.
Re:Quick Explain How! by Anonymous Coward · 2006-03-24 01:57 · Score: 0

This forgets about the intricate quantum mechanical properties of socks. Socks behave like fermion particles, where there may be no two socks having the same quantum state (color scheme and design) in the same locker. So, if ever Spock has some matching socks in his huge pile, he will need more lockers.
On the other hand, the analogy would work differently with quantum mechanical washing machines. Here, Spock could fit in as many socks as he wanted, because they would start vanishing the more he stuffed in...
Re:Quick Explain How! by Zontar_Thing_From_Ve · 2006-03-24 02:38 · Score: 1

I think it all started with the first Vax 780, or possibly the first IBM 370 channel controller. Those old machines booted with a 7" floppy that had a capacity of 0.5k. Yep, 512 bytes. Early bootstraps could store the entire contents on to a hard disk with very few instructions if the sector size matched.

Man that takes me back. Where's my toupee....

I think it's in the same place as your vacuum tubes.
Re:Quick Explain How! by AngelofDeath-02 · 2006-03-24 02:46 · Score: 1

haha. I'll take it!

Yah, definitely a bad analogy but I couldn't resist ;)

--
No, I am not an English major. My posts are subject to typos and incorrect grammar. Do not expect perfection.
Re:Quick Explain How! by Anonymous Coward · 2006-03-24 06:29 · Score: 0

! Copyright violation!
I see that you have taken an existing work, modified it and created a derivative work.
I don't think you are allowed to do that until the parent poster has been dead for > 95 years.
Re:Quick Explain How! by Nefarious+Wheel · 2006-03-26 00:48 · Score: 1

Vacuum tubes? Discrete transistors, thanks. From the days when Seymour Cray would sand back a resistor to tune the first CDC Supers.
And yes, I remember booting 750's, 780's, 785's, 8550's, 6550's ... all manner of Vaxen, repeatedly, about 10 years of it.
Mind you, at least 7 years of that was waiting for the 750 microcode to load...
Hardware was all very compatible, worked very well when it was working at all -- sort of like a Citroen.
But I still miss DCL, and logical names. They waz cool.
I'm standing here, watching a tape drive, spinning around, and talking...

--
Do not mock my vision of impractical footwear
Re:Quick Explain How! by Phil+Karn · 2006-03-27 11:16 · Score: 1

Actually, the floppies on the VAX 11/780 were 8", and they held about 243 kilobytes. Still pretty tiny by today's standards.
Re:Quick Explain How! by Nefarious+Wheel · 2006-03-27 12:40 · Score: 1

Doh! I will shut up and go back to my RFP's now.

--
Do not mock my vision of impractical footwear

Cluster size? by dokebi · 2006-03-23 19:13 · Score: 2, Interesting

I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes. So how does changing the sector size change things? (Especially when we don't access drives by sector/cylinder anymore?)

--
In Soviet Russia, articles before post read *you*!

Re:Cluster size? by scdeimos · 2006-03-23 19:55 · Score: 5, Informative

I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes.
Cluster sizes are variable on most file systems. On our NTFS web servers we tend to have 1k clusters because it's more efficient to do it that way with lots of small files, but the default NTFS cluster size is 4k. LBA is just a different addressing scheme at the media level to make a volume appear to be a flat array of sectors (as opposed to the old CHS or Cylinder Head Sector scheme).
Re:Cluster size? by Charan · 2006-03-24 03:25 · Score: 1

By boosting the sector size up to 4KB, the OS is forced to access the disk in increments of 4KB instead of 512 bytes. Yes, if you only need 512 bytes, you're now transferring 8x the data you used to. But most modern filesystems use a block size (determined by the file system) of at least 4KB, and the entire block would be read into memory on a read operation. It won't matter if this happens to line up with the sector size (determined by the disk). The common-case access will stay the same.

This is just an interface change. It does not mean that disks suddenly grew 8x in capacity overnight. Then again, there has to be some change to prompt this...
Re:Cluster size? by shmlco · 2006-03-24 03:54 · Score: 1

I think it better maps modern OS cluster and memory page sizes to the drive, improves throughput, allows for higher capacity drives, and, not so incidentally, zaps 75% of the space needed for sector metadata (id, sync, ecc, etc.).

--
Any sect, cult, or religion will legislate its creed into law if it acquires the political power to do so.
Re:Cluster size? by rrohbeck · 2006-03-24 04:14 · Score: 1

I thought cluster sizes were already 4KB for efficiency, and LBA for larger drive sizes. So how does changing the sector size change things?

Even if the file system allocates space only in multiples of the cluster size, the disk is still read and written in sectors. That is, any overhead (space on disk for interblock gaps, time on the interface for commands) is 8 times as much for 512-byte sectors as it is for 4k sectors. ECCs (error correcting codes) also get more efficient the larger the block is, but that relation is nonlinear.

--
thegodmovie.com - watch it
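The "8 times as much" overhead argument above can be sketched numerically. The 50-byte per-sector overhead below is a hypothetical placeholder, not a real drive figure, and the model deliberately treats it as fixed even though the post notes that ECC size does not scale linearly with sector size.

```python
def sectors_for(payload_bytes, sector_size):
    """Number of sectors needed to hold payload_bytes."""
    return -(-payload_bytes // sector_size)  # ceiling division


def overhead_bytes(payload_bytes, sector_size, per_sector_overhead):
    """Total gap/sync/ECC overhead if each sector pays a fixed cost."""
    return sectors_for(payload_bytes, sector_size) * per_sector_overhead


def format_efficiency(sector_size, per_sector_overhead):
    """Fraction of the raw track space that ends up holding user data."""
    return sector_size / (sector_size + per_sector_overhead)


OVERHEAD = 50  # hypothetical bytes of gap + sync + ECC per sector

for size in (512, 4096):
    print(size, "byte sectors:",
          overhead_bytes(4096, size, OVERHEAD), "overhead bytes per 4K,",
          round(format_efficiency(size, OVERHEAD), 3), "efficiency")
```

With a fixed per-sector cost, one 4K cluster pays that cost eight times on 512-byte sectors and once on a 4K sector. In practice a 4K sector would carry a larger ECC field, so the real gain is smaller than this simple model suggests.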

problems for older hardware??? by 3seas · 2006-03-23 19:14 · Score: 1

So long as this new format is transparent, built internally into the drives, and doesn't affect older hardware or software, there shouldn't be a problem. It also should not contain any DRM junk.

All too often, an advantage such as a speed improvement is more than countered by added overhead junk.

now maybe I should RTFA...

Re:problems for older hardware??? by Anonymous Coward · 2006-03-23 21:36 · Score: 0

It also should not contain any DRM junk.
It says something that every change to hardware these days is immediately viewed as an excuse to stuff some DRM crap in there... and everyone knows it. Thanks to the baleful glare of Microsoft and Intel and their desire to lock down and gain total control over the PC platform... no technically aware PC owner looks forward to new PC technology. There's just a pervading feeling of pessimism and the feeling that we're seeing the end of the most interesting and liberating technology ever. That all new systems will be little more than locked-down kiosks controlled elsewhere.
The engineers responsible for this shit should take a good long look at what they've been doing.

No, that's not 'sector' by wesley96 · 2006-03-23 19:15 · Score: 4, Informative

You're thinking of 'cluster'. This is tied to the file system that is actually used on the disk. Even with the current 512-byte sector, a normal NTFS partition of, say, 200GB, uses 4KB cluster and a single file takes up a minimum of 4KB already.

--
Serving time in Aristotelean prison for violating laws of physics

Re:No, that's not 'sector' by A+beautiful+mind · 2006-03-23 19:58 · Score: 1

So, all they're doing is pushing this abstraction layer to the hardware, thus getting rid of an unnecessary layer, if I understand it correctly?

--
It takes a man to suffer ignorance and smile
Be yourself no matter what they say
Re:No, that's not 'sector' by Anonymous Coward · 2006-03-23 20:42 · Score: 1, Informative

Even with the current 512-byte sector, a normal NTFS partition of, say, 200GB, uses 4KB cluster and a single file takes up a minimum of 4KB already
No. If the file can fit into the MFT record (resident file) then it takes 0 bytes outside of the file's metadata, which is usually 1 KB altogether. The maximum size of such files is usually 700-800 bytes. Not many other filesystems have this capability, though.
Re:No, that's not 'sector' by TapeCutter · 2006-03-23 22:07 · Score: 3, Interesting

"So, all they doing is pushing this abstraction layer to the hardware, thus getting rid of an unnecessary layer, if I understand it correctly?"

Nah, nothing that significant. The operating system does/should not "know" anything about how the data is physically stored by a device. The existing O/S storage abstractions will remain. (You may have trouble running a very old O/S but that would be just one of your problems)

Every modern O/S uses disk space as virtual memory by reading and writing chunks of RAM to the HDD when it runs out of physical RAM. The standard HDD sector size is changing to the most commonly used O/S size for memory "pages" (RAM chunks written to disk).

The larger size will (in theory) speed things up a tiny amount. The HDD will now read/write a "page" to disk in one sector rather than eight, meaning the HDD will perform fewer administrative functions to swap RAM back and forth to the disk. Hardly anyone will notice this, but constant minor tweaking of HDD internals has evolved them very rapidly. E.g.: in 1990 I paid $200AU for a second-hand 20MB HDD (~0.2 SECOND seek time!).

--
And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.

Hrm, that kind of makes sense... by I+kan+Spl · 2006-03-23 19:17 · Score: 2, Insightful

Most "normal use" filesystems nowadays (FAT32, Ext3, HFS, Reiser) all use 4K blocks by default. That means that the smallest amount of data that you can change at a time is 4k, so every time you change a block, the HDD has to do 8 writes or reads. That would leave the drive performing 8x the number of commands that it would need to.

As filesystems are slowly moving towards larger block sizes, now that the "wasted" space on drives due to unused space at the ends of blocks is not as noticeable, moving up the size on the underlying hardware also makes sense. I don't think that this can make things too much faster, but it would allow SATA drives (and SCSI also) to queue more commands in their internal buffers, as they will only be receiving one command per read/write that the filesystem does, instead of 8.

--
My UID is prime and so is this number: 09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0.

Re:Hrm, that kind of makes sense... by Anonymous Coward · 2006-03-23 19:35 · Score: 3, Informative

You're talking bullshit. In SCSI/SATA you can read/write big chunks of data (even 1MB) in just one command. Just read the standards.
Re:Hrm, that kind of makes sense... by Anonymous Coward · 2006-03-23 21:02 · Score: 3, Informative

I'm pretty sure he was talking about operations performed by the drive's internal controller, not those sent through the interface cable.
Re:Hrm, that kind of makes sense... by Antique+Geekmeister · 2006-03-24 00:41 · Score: 1

A file-system block is not a hard-disk block. This means that block sizes smaller than 4096 bytes will not be available, and that tools that talk to the disk at a low level (such as fdisk and parted) will have to be reviewed for any assumption that the block size is always 512 bytes. It also means that old drivers that made such assumptions are not going to interoperate correctly with these new disks and controllers, unless the manufacturers are very clever about maintaining interfaces that look identical.

I expect a lot of people running old OS's who don't like to upgrade are going to be very unhappy when they can't use the new hardware and tell their poor IT person to "just find the patch!" when they can't transfer their deprecated OS and tools to the new "our sales partner recommended it!" hardware. Vista should be OK with it: Microsoft has strong partnerships with the manufacturers and the drives absolutely must work under Vista. But the NetBSD authors and the people running RedHat 6.x servers because "we know our code works on it" are in for a big surprise if this becomes popular.

Expect to see it first on the very large hard drives, 200 Gigabytes and up, where the larger block size is a real advantage.
Re:Hrm, that kind of makes sense... by darkmeridian · 2006-03-24 02:22 · Score: 2, Informative

Grandparent is discussing "native command queueing", where the hard disk will parse the OS read/write calls and stack them in a way that optimizes hardware access. Pretend there are three consecutive blocks of data on the hard drive: 1, 2, and 3. The OS calls for 1, 3, and then 2. Instead of going three spins around, NCQ will read the data in one spin in 1, 2, 3 order but then toss it out to the OS in 1, 3, 2 order. Now, I'm not sure how much higher sector sizes will affect NCQ capability, because I thought it was limited by the amount of hardware cache.

--
A NYC lawyer blogs. http://www.chuangblog.com/
Re:Hrm, that kind of makes sense... by Helios1182 · 2006-03-24 03:08 · Score: 1

200GB is already on the small end for new drives, so by your estimates it will affect everything from now on. I have heard estimates that 1TB drives will be reasonably priced by 2007, by then it should be standard.
Re:Hrm, that kind of makes sense... by Ironsides · 2006-03-24 04:02 · Score: 1

Now, I'm not sure how much higher sector sizes will affect NCQ capability, because I thought it was limited by the amount of hardware cache.

Given that the hardware cache in HDs at a minimum is 2MB and has recently gone to 8MB as pretty much a standard and I frequently see 16MB drive caches, I don't think changing the sector size from 512B to 4KB is going to be much of a problem.

--
Fly me to the moon Let me sing among those stars Let me see what spring is like On jupiter and mars
Re:Hrm, that kind of makes sense... by hamanu · 2006-03-24 04:32 · Score: 1

Actually, returning the data in 1,3,2 order would be the older ATA "tagged command queueing" way of doing things, where commands had to complete in-order. In SCSI TCQ and the new non-brain-damaged ATA NCQ the drive would return the data in the 1,2,3 order, which is different from the order that the OS requested them in, but minimizes the amount of time an application must wait for its data. SCSI TCQ and ATA NCQ are about out-of-order COMPLETION, not just out-of-order EXECUTION.

--
every _exit() is the same, but every clone() is different.
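The distinction drawn above between out-of-order execution and out-of-order completion can be shown with a toy model in Python. It is only a sketch: it assumes the head passes over the blocks in ascending order during a single sweep, and the function names are invented for the example rather than taken from any ATA or SCSI specification.

```python
def service_order(requests):
    """Order in which the head reaches the blocks in one ascending sweep."""
    return sorted(requests)


def completion_order(requests, out_of_order_completion):
    """Order in which the host sees the data come back.

    In-order completion: the drive may fetch blocks in any order but hands
    them back in request order. Out-of-order completion: each block is
    returned as soon as it has been read.
    """
    if out_of_order_completion:
        return service_order(requests)
    return list(requests)


requested = [1, 3, 2]
print("head services:        ", service_order(requested))
print("in-order completion:  ", completion_order(requested, False))
print("out-of-order complete:", completion_order(requested, True))
```

In the in-order case block 2 has already been read by the time block 3 is returned, but the host still has to wait for it; with out-of-order completion nothing sits in the drive's buffer waiting its turn.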
Re:Hrm, that kind of makes sense... by Antique+Geekmeister · 2006-03-24 09:57 · Score: 1

200 is still pretty generous for low end systems, SCSI, and high speed arrays. But you have a point. Feel free to up it by a factor of 2 or more.
Re:Hrm, that kind of makes sense... by jesup · 2006-03-24 10:10 · Score: 1

And SCSI has had tags since around ... 1991? 1992? maybe a few years earlier.

Even back around 1992/1993 we (Commodore Amiga) had tagged-queue implementations that were done entirely in the NCR 53c710 SCSI chip with no processor intervention. When a new request was added, we fiddled the "list" (really an instruction sequence for the '710) so when a tagged continuation occurred, it could be handled and DMA into memory without a processor interrupt.

It's criminal that ATA took until a couple of years ago to even get close to that.

Good for small devices by BadAnalogyGuy · 2006-03-23 19:19 · Score: 4, Interesting

Small devices like cellphones typically save files of several kilobytes, whether they be the phonebook database or something like camera images. Whether the data is saved in a couple large sectors or 8 times that many small sectors isn't really an issue. Either way will work fine, as far as the data is concerned. The biggest problem is the amount of battery power used to transfer those files. If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.

Also, squaring away each sector after processing is a round trip back to the filesystem which can be eliminated by reading a larger sector size in the first place.

Some semi-ATA disks already force a minimum 4096-byte sector size. It's not necessarily the best way to get the most usage out of your disks, but it is one way of speeding up the disk just a little bit more to reduce power consumption.

Re:Good for small devices by scdeimos · 2006-03-23 20:02 · Score: 1

I guess that's why they call you bad analogy guy. :) R/W filesystems tend to abstract media into clusters, which are groups of sectors, so that they can take advantage of multi-sector read/write commands (which have been around since MFM hard disks with CHS addressing schemes, by the way) to get more than one sector's worth of data on/off the hard disk in a single command.
Re:Good for small devices by BadAnalogyGuy · 2006-03-23 20:05 · Score: 1

The single command to the taskfile, sure. But the actual sector access must still be handled on a sector-by-sector basis. So to read 4k of data, you go from:

1 taskfile write + 8 sector reads

to

1 taskfile write + 1 sector read.
Re:Good for small devices by Anonymous Coward · 2006-03-23 20:23 · Score: 0

interleaving will allow all the cluster's sectors to be read/written in 1 shot.
Re:Good for small devices by BadAnalogyGuy · 2006-03-23 20:27 · Score: 1

At what level of abstraction? Is this truly a continuous 8-sector access?
Re:Good for small devices by Khyber · 2006-03-24 00:58 · Score: 1

Umm, excuse me, but WTF is a 'semi-ATA disk?' Either it's ATA or it's not, there is no hybrid that I'm aware of.

--
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
Re:Good for small devices by BadAnalogyGuy · 2006-03-24 01:04 · Score: 1

CE-ATA, for one. It already requires 4096-byte sectors.

There really isn't much data... by rice_burners_suck · 2006-03-23 19:21 · Score: 1

...to back up the claim that this is more efficient, though it intuitively "feels" like it should be faster, but not necessarily more efficient in terms of space. Suppose a file with a size of 1 byte takes up 512 bytes of space on the disk. With this larger sector size, that file would take 4k. I don't see why this isn't an option that can be set through drive initialization parameters, and why you can't choose any size for the sectors, depending on whatever tweaking you can do to figure out what's best for your application.

Re:There really isn't much data... by Anonymous Coward · 2006-03-23 19:32 · Score: 0

As others have noted, common file systems all use block size of 4K or bigger. Matching the disc sector size will simply expedite the IO.
Re:There really isn't much data... by gbjbaanb · 2006-03-23 22:19 · Score: 1

I am not a filesystem developer but... as pretty much all filesystems are configured to use 4k blocks, when you write a 1 byte file, it takes up 4k in the filesystem. When the filesystem writes the block to disk, it takes up 4k which translates to 8 drive blocks.

I think that's it. You will not lose space on the disk when the change happens because the filesystem block sizes are still 4k. I suppose you could use whatever compression system your FS offers to pack the data into less blocks, then you'd gain a ton of extra space.
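
A rough sketch of that point in Python. The function name and the 4096-byte filesystem block are assumptions for illustration, and the model ignores tail packing and metadata:

```python
def on_disk_footprint(file_size, fs_block=4096, sector=512):
    """Return (bytes allocated, drive sectors used) for one file.

    Assumes a simple filesystem that allocates whole blocks and whose
    block size is a multiple of the drive's sector size.
    """
    if fs_block % sector != 0:
        raise ValueError("fs_block must be a multiple of sector")
    blocks = (file_size + fs_block - 1) // fs_block  # round up to whole blocks
    allocated = blocks * fs_block
    return allocated, allocated // sector

print(on_disk_footprint(1, sector=512))   # (4096, 8)
print(on_disk_footprint(1, sector=4096))  # (4096, 1)
```

Either way the 1-byte file costs 4096 bytes; only the number of drive sectors behind the block changes.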

And, at the end of the day, how many files do you have that are less than 4k in size? Would you really care if they suddenly became 4k in size? I doubt it; your drive is taken up with movies, binaries and images. The little files take up no space whatsoever in comparison.
Re:There really isn't much data... by kv9 · 2006-03-23 23:27 · Score: 1

And, at the end of the day, how many files do you have that are less than 4k in size?
you'd be surprised:
# find / -type f -size -4096c | wc -l
107599
~69% in my case.

--
Stop Computers/Cars Analogies on S
Re:There really isn't much data... by karzan · 2006-03-23 23:46 · Score: 1

107599 * 4096 = 440725504 bytes, or approximately 440 megabytes.

If this is 69% of the size of your hard drive, I'm impressed that you can read Slashdot on your computer. Considering these days most hard drives shipping are at the very least 40GB or so, does 440MB (versus the 110MB it would be if they were all 512 byte files instead) really make much of a difference?
Re:There really isn't much data... by Anonymous Coward · 2006-03-24 01:37 · Score: 0

I think he means 69% of the NUMBER of the files, not the space on his hard drive.
Re:There really isn't much data... by kv9 · 2006-03-24 02:44 · Score: 1

it's 69% of the total *number* of files. and it's 52M as opposed to 420M, an 8 fold increase. anyway, my point was that some people do have lots of small files, i was not talking about size.
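
The two figures being argued over can be checked with a couple of lines of Python. This is only the worst-case arithmetic, assuming every one of the 107599 files occupies exactly one allocation unit; the function name is made up for the example:

```python
SMALL_FILES = 107599  # count reported by the find command upthread

def worst_case_bytes(n_files, alloc_unit):
    """Space used if every file occupies exactly one allocation unit."""
    return n_files * alloc_unit

for unit in (512, 4096):
    total = worst_case_bytes(SMALL_FILES, unit)
    print(f"{unit:>4}-byte units: {total} bytes = {total / 2**20:.1f} MiB")
```

Files between 513 and 4095 bytes would span more than one 512-byte unit, so the 512-byte total is a lower bound rather than an exact figure.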

--
Stop Computers/Cars Analogies on S
Re:There really isn't much data... by jcaren · 2006-03-24 04:48 · Score: 1

One minor point.

Does a file not start with an inode?
Would this not mean each small file would eat two blocks?

This comes to mind because a previous employer wrote small files into the inode if they were ~3K - to save two writes to the raid array.

Some wally decided to hard code this behaviour into the backup software and when this 'feature' was removed, backups of all small files were nulled out on restore.

Needless to say, this pissed off a LOT of developers who after using the restore (by directory) facility found they had lost even more code.
Re:There really isn't much data... by Anonymous Coward · 2006-03-24 05:19 · Score: 0

And with ReiserFS it uses just 1 byte (plus the directory entry, of course). FAT isn't exactly the pinnacle of filesystem technology...
Re:There really isn't much data... by Davin811a · 2006-03-24 05:21 · Score: 1

Just remember, Reiserfs handles small files efficiently. Rather than saving 1000 100-byte files in 1000 individual allocation units, Reiser will fold them into the tree.

Re:Vista by Anonymous Coward · 2006-03-23 19:22 · Score: 2, Funny

Also, Solitaire will be replaced by Duke Nukem Forever on every shipped copy of Vista. And if you're one of the first 100 in line at any Best Buy when you pick up Vista, you will also get a free Phantom game console.

In Vista already? by sinnerman · 2006-03-23 19:23 · Score: 5, Funny

Well of course Vista will ship with this supported already. Just like WinFS...er..

Re:In Vista already? by x2A · 2006-03-24 01:52 · Score: 1

"Well of course Vista will ship with this supported already"

That sentence would be just as funny with the end cut off...

"Well of course Vista will ship"

...yeah, of course... anytime now...

--
The revolution will not be televised... but it will have a page on Wikipedia
Re:In Vista already? by Anonymous Coward · 2006-03-24 05:19 · Score: 0

Of course Vista will ship with support for this already.
When it finally ships, Hard Drives will start using 16384 byte sectors.

Finally!! by 5pp000 · 2006-03-23 19:23 · Score: 1

I've wondered about this for years. In 1981 I had a CP/M system -- CP/M!!! -- that could happily handle 128-, 256-, 512-, or 1024-byte sectors on its 8-inch floppy drives. (I wrote the reblocking code in the BIOS myself.) There was no question that larger sizes were better for both capacity and performance, if perhaps slightly more prone to unrecoverable errors (not that I really noticed, though). But somehow the idea that drivers could support multiple sector sizes was forgotten.

To answer someone's question, if all disk transfers are multiples of 4K anyway, you're better off using that as the hardware sector size because there's less overhead -- less spatial overhead on the disk because you have fewer sector headers and intersector gaps, and less temporal overhead in the I/O protocol because you're only sending one transfer command instead of eight.

--
Your god may be dead, but mine aren't!

Re:Finally!! by IvyKing · 2006-03-23 20:19 · Score: 1

At the same time, 86-DOS (the original name for what is now MS-DOS) could also handle varying sized sectors - generally used 128 byte sectors for SSSD 8" disks and 1024 byte sectors for DSDD 8" disks. Tim Paterson recommended using a cluster size that was the square root of the disk capacity in bytes, so the SSSD disks used four sectors per cluster (512 bytes) and the DSDD used 1 sector per cluster (1024 bytes).
What was done with CP/M and 86-DOS, but not PC/MS-DOS, was distributing source code for the BIOS (called io.sys under 86-DOS). With that source code, it was relatively easy to support whatever sector size you wanted.
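
A small Python sketch of the square-root rule. Rounding to the nearest power of two and the disk geometries used here (77 tracks, 26 x 128-byte sectors single-sided; 77 tracks, 2 sides, 8 x 1024-byte sectors) are assumptions for illustration, not something stated in the comment:

```python
import math

def sqrt_rule_cluster_size(capacity_bytes):
    """Power of two closest (on a log scale) to sqrt(capacity)."""
    target = math.sqrt(capacity_bytes)
    return 2 ** round(math.log2(target))

# Assumed 8" floppy geometries, for illustration only.
sssd_capacity = 77 * 26 * 128        # about 250 KB
dsdd_capacity = 77 * 2 * 8 * 1024    # about 1.2 MB

print(sqrt_rule_cluster_size(sssd_capacity))  # 512
print(sqrt_rule_cluster_size(dsdd_capacity))  # 1024
```

Under those assumptions the rule lands on the 512-byte and 1024-byte clusters quoted above.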
Re:Finally!! by MichaelSmith · 2006-03-23 20:26 · Score: 1

I had a CP/M system as well and IIRC it stored contiguous files on every third or sixth sector because the CPU could not always keep up with the disk.

I remember writing some slow basic code around the same time on an apple ][ which caused the floppy drive to stop and wait for the CPU.

Also there was something about batch files in CP/M. I think the files were structured so that the shell could go back to the disk for the next step in the script. Those were the days when memory was really scarce.

--
http://michaelsmith.id.au
Re:Finally!! by Anonymous Coward · 2006-03-23 21:55 · Score: 0

To answer someone's question, if all disk transfers are multiples of 4K anyway, you're better off using that as the hardware sector size because there's less overhead -- less spatial overhead on the disk because you have fewer sector headers and intersector gaps, and less temporal overhead in the I/O protocol because you're only sending one transfer command instead of eight.
You're correct, and this seems to be an important point that many people here are missing. There is a certain amount of per-sector overhead that is of relatively constant size (regardless of sector size). If this overhead is, say, 50 bytes (a number pulled out of my ass) then the overhead is about 9.8% for a 512 byte sector, but only about 1.2% for a 4096 byte sector. Given the sector densities on modern drives, this change will probably allow a few extra sectors per track, especially on the outer cylinders.
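
The percentages are easy to reproduce. A minimal Python sketch, keeping the same made-up 50-byte per-sector overhead (the function names are mine):

```python
def overhead_percent(payload_bytes, overhead_bytes=50):
    """Per-sector overhead as a percentage of the payload size."""
    return 100.0 * overhead_bytes / payload_bytes

def format_efficiency(payload_bytes, overhead_bytes=50):
    """Fraction of the raw on-track bytes that end up holding user data."""
    return payload_bytes / (payload_bytes + overhead_bytes)

for size in (512, 4096):
    print(f"{size:>4}: overhead {overhead_percent(size):.1f}%, "
          f"efficiency {format_efficiency(size):.1%}")
```

The real per-sector overhead is not given anywhere in this thread, so only the trend matters here, not the exact numbers.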
Re:Finally!! by tricorn · 2006-03-24 02:53 · Score: 1

I don't recall there even being "shell scripts". I do remember that "the shell" would move/relocate itself to high memory, so short programs could be loaded without overwriting it. Longer programs that wanted to use all of memory would then take longer after exit because the command shell would need to be reloaded.

The coolest thing I ever did with the BIOS was to take the print driver from the supplied sample source for a daisy-wheel printer and insert it into the PRN output driver, so for instance Control-P would start properly printing to the daisy-wheel, programs that wanted to print could do so by using the PRN: device instead of having to directly support the printer, etc. Of course, to do anything really pretty with it, e.g. offset-overstrike (bold), or sub-character justification, you needed to talk directly to the printer, but Wordstar supported it so that was all you needed.

Amazing what you could do with 64KB of memory (including screen text buffer) and a 360KB diskette on a, what, 1MHz 8-bit processor?
Re:Finally!! by raduf · 2006-03-24 03:16 · Score: 1

Relax, nobody remembers what's a CP/M anymore ;)
Re:Finally!! by barronVonBackstabber · 2006-03-24 04:01 · Score: 1

CPM 2.2 had shell scripts of a sort, I used them a lot in the old days. The disk tracks did indeed have the sectors out of sequence (called sector interleave), this was so you didn't have to wait for the disk to come round again to read the next sector or jump to the next track. We did some pretty neat stuff on CPM, like re-writing the bios so that the screen handler started from bottom right and wrote from right to left, or just printed random characters so you had to really trust your fingers whilst typing. Oh to be a young programmer again.
Re:Finally!! by MichaelSmith · 2006-03-24 08:53 · Score: 1

CPM 2.2 had shell scripts of a sort, I used them a lot in the old days.
The C compiler I used (HI-TECH C) was invoked as a script. You could watch it run executables for the precompiler, code generator and assembler. It was the best education I got in compiler design, that one.

--
http://michaelsmith.id.au
Re:Finally!! by tricorn · 2006-03-25 06:20 · Score: 1

Yeah, the CP/M file system had the sector interleave built in as I recall, but disk controllers could format with an interleave as well. Using the wrong formatted interleave with software interleave would make things really slow. Besides interleave, you also used track skew, so that after a head seek after the last sector on a track, the first sector of the next track would just about be getting around. I wrote some routines to do timing on sector reads to determine the formatted interleave and track skew.

Similar techniques were used in hard disk drives as well, before all that stuff was subsumed into the controller and cache buffers started to be used to avoid needing to do most of that stuff, or at least to avoid having anything but the disk hardware be aware of it. UNIX file system may still have vestiges of that stuff left over as part of the disk read scheduling routines.
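
For anyone who never met sector interleave, here is a minimal Python sketch of how such a layout can be generated. It is one common way to build the table, not the code any particular BIOS or controller used:

```python
def interleaved_layout(sectors_per_track, interleave):
    """Return a list where index = physical slot and value = logical sector.

    Consecutive logical sectors are placed `interleave` slots apart so the
    CPU has time to process one sector before the next one arrives.
    """
    layout = [None] * sectors_per_track
    slot = 0
    for logical in range(sectors_per_track):
        # If the target slot is already taken, slide to the next free one.
        while layout[slot] is not None:
            slot = (slot + 1) % sectors_per_track
        layout[slot] = logical
        slot = (slot + interleave) % sectors_per_track
    return layout

print(interleaved_layout(8, 3))  # [0, 3, 6, 1, 4, 7, 2, 5]
```

Track skew is the same idea applied between tracks: the slot holding logical sector 0 is rotated by a fixed amount on each successive track to cover the head-seek time.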

Wasted space? by solarbob · 2006-03-23 19:26 · Score: 1

Given the current cost per GB, and that most "small files" are now 10K+, does the overhead caused by "wasted space" really matter? It still takes a hell of a lot of documents to fill up even a 250GB disk, and as you can now get these disks for next to nothing, I'm happy to get the extra performance.

--
SolarVPS - Quality Windows and Linux Virtual Servers

Apple in the forground again by danmcn · 2006-03-23 19:29 · Score: 1, Interesting

Isn't this what Apple tried to do 5+ years ago with HFS+?

Re: Apple in the forground again by n.wegner · 2006-03-23 20:00 · Score: 4, Informative

You could have added MS with FAT32 and NTFS. The problem is we're not talking about filesystem cluster sizes, which are software-configurable, but the disks' actual sector size, which is hardware that HFS+ has no effect on.
Re: Apple in the forground again by Anonymous Coward · 2006-03-23 20:10 · Score: 0

No, it isn't, you don't know what you're talking about.

Hmm, apple fanboy who doesn't know what he's talking about ... I'm sure that's a coincidence.
Re: Apple in the forground again by Anonymous Coward · 2006-03-23 20:18 · Score: 1, Informative

On the filesystem layer everybody moved the block size up more than 15 years ago. Even Microsoft's POS FAT32 has a 4096 byte cluster. This time Microsoft was only 10 years behind Unix. We're talking hardware layer here. If you can address 2^48 sectors (or whatever is the limit these days) and your sector size is 8x you can have much bigger disks. Think in terms of SAN arrays and you'll see that such a need might be here right now on big iron already.
Re: Apple in the forground again by ocelotbob · 2006-03-23 20:28 · Score: 1

Apple is hardly at the forefront in FS research here. Pretty much every FS on the planet has had variable cluster sizes for many, many years; I remember back in the DOS days being able to format disks with cluster sizes from 512 bytes to 32 or even 64K. This proposal is talking about the physical sector sizes and getting them equalized with what is the default cluster size for most operating systems.

--
Marxism is the opiate of dumbasses
Re: Apple in the forground again by Anonymous Coward · 2006-03-23 21:55 · Score: 0

off on a tangent, back in the Apple// days, Nibble magazine had some sort of hyper-fast graphic slide show program. The way they did it was to reformat the normally 16 sector track as one big sector. I don't know the technical aspects of how it worked (other than the abstract idea of now having to read just one sector instead of 16), but it apparently worked fast enough to publish.
Re: Apple in the forground again by bzipitidoo · 2006-03-24 07:15 · Score: 1

Warning, the following is an extremely boring post about very obsolete hardware and software.
Apple ][s were very close to the hardware. The drive controller was just a couple of ROMs and an interface to the cable. The 6502 had to access 4 memory locations in specific orders at a specific rate to pulse the stepper motor that moved the drive arm, another 2 memory locations to turn the rotation on or off, and a few more to read and write data. The code was loaded into RAM on booting up Apple DOS. The boot code in the ROM was very simple: pulse the drive arm motor to move the arm to track 0 (you would hear the arm bounce against the stop 40 times minus whatever track the arm happened to start on), then read sector 0 and jump to the start of that code. That 256 bytes of code could read the rest of track 0 but not move the arm. The rest of track 0 had code for moving the arm either way (the boot ROM only moved the arm one direction), and this was immediately used to read track 1 and 2 and finish loading DOS. Primitive boot strapping.
Data storage on the disk was very wasteful. Each sector had a lead in so the computer could figure out where a sector started, and the lead in for the track had to be bigger yet. Then, partly because drive rotation speeds were not precise (was supposed to be 300 rpm, and the drive had a screw for adjusting the rotation speed in case it got out of spec), Apple/Shugart used an encoding scheme that allocated something like 10 bits per byte so the computer could figure out the byte boundaries within a sector and correct if it missed a bit. Was error correction on a per byte basis. Apple DOS 3.3 improved the encoding over 3.2, allowing 16 sectors per track instead of 13. Sectors were 256 bytes, so the gap between the start and end of a track got very large on the outer edges of the disk.
Apple DOS was slow because it could not process a sector quite fast enough to be ready to read the next sector before the disk rotation had moved the start past the head. 3rd party DOS's were much faster thanks to such things as optimizing the code (DOS 3.3 had some double buffering going on), or interleaving, or spreading the sectors out a little more. Typically, improvements could get 4x the speed. I found I could speed up sequential operations a little more yet by adding in the disk format code a slight delay before changing tracks, because of course the format routine was a little faster and consequently started formatting the next track at the worst possible place, slightly before where the read or write routines would be after accessing the last sector of the last track and moving the arm to the next track.
So, yeah, a lot of things that slide show code could've done to optimize. I don't know so much about modern drive controller hardware but expect it's all sophisticated enough that one would not be able to find such easy improvements. Separating the work of positioning (5.25" PC floppy drives used a little hole near the center for alignment, 3.5" floppies have a dual purpose cog hole used to grip for rotation and to position the sectors, hard drives all have low level formatting) I think made some things like sector positioning and gap sizes moot.

--
Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
Re: Apple in the forground again by An+ominous+Cow+art · 2006-03-24 09:23 · Score: 1

Ah, memories. I wish I hadn't given away my copy of "Beneath Apple DOS"... :-(

Why just one standard? by dltaylor · 2006-03-23 19:29 · Score: 2, Interesting

Competent file system handlers can use disk blocks larger or smaller than the file system block size, but there are some benefits to using the same number for both. Although it may provide more data-per-drive to use larger blocks and you can index larger drives with 32-bit numbers, the drive has to use better (larger and more complex) CRCs to ensure sector data integrity, the granularity of replacement blocks may end up wasting more space simply to provide an adequate count of replacements, and there are still some disk space management tools that insist on working in terms of "cylinders", regardless of the fact that the disk drives have had variable density zones for ages. The range from 4K (common disk block size) to 16K works as a decent compromise.

"Back in the day" running System V on SMD drives, where you could use almost any block size from 128 Bytes to 32K (the CRCs were weak after that) and control the cylinder-to-cylinder offset of block 0 from the index, I spent a few days trying different tuning parameters and found that, due to the 4K size of the CPU pages, and of the file blocks and swap it really did give a significant improvement in performance. I tried 8K and 16K, because the file system handler could be convinced to break them up, but didn't get any better performance, so used 4k for the spares granularity.

Perhaps I should take one of my late-model SCSI drives, which support low-level reformatting, and try the tests again. 16KByte file system blocks on 16KByte sectors might really be a win now. Have to do some research to see what I can do with CPU page sizes, too.

Re:Why just one standard? by Anonymous Coward · 2006-03-23 23:34 · Score: 0

ummmm.... doesn't "standard" mean "just one"?
Re:Why just one standard? by jesup · 2006-03-24 04:53 · Score: 1

On the Amiga, we supported fairly arbitrary sector sizes (though there was a lower limit), and allowed a FS to use N sectors/block. Given the FS structure (hash-chains in directories, block-lists for files, and shadow directory blocks for speeding up directory searches/listing), larger FS block sizes (even 1K or 2K) made huge improvements in many common operations. There was little win above that - but those were in the days of 200MB-1GB drives, 3600 RPM (I was beta-testing the first Maxtor higher-RPM ("Magic") SCSI drives at the time).

I vaguely remember that old Apple Macs used some oddball HD sector size like 532; some SCSI drives back then could be low-level-formatted for that.

It's all about Format Efficiency by alanmeyer · 2006-03-23 19:33 · Score: 5, Informative

HDD manufacturers are looking to increase the amount of data stored on each platter. With larger sector sizes, the HDD vendor can use more efficient codes. This means better format efficiency and more bytes to the end user. The primary argument is that many OSes already use 4K clusters.

During the transition from 512-byte to 1K, and ultimately 4K sectors, HDDs will be able to emulate 512-byte modes to the host (i.e. making a 1K or 4K native drive 'look' like a standard 512-byte drive). If the OS is using 4K clusters, this will come with no performance decrease. For any application performing random single-block writes, the HDD will suffer 1 rev per write (for a read-modify-write operation), but that's really only a condition that would be found during a test.
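
To make the read-modify-write case concrete, here is a toy Python model of a 4K-native drive that presents 512-byte logical sectors. The class and its counters are invented for this sketch and say nothing about how real firmware is written:

```python
class Emulated512Drive:
    """Toy model of a 4K-native drive exposing 512-byte logical sectors."""

    PHYS = 4096     # native sector size on the media
    LOGICAL = 512   # sector size presented to the host

    def __init__(self, n_physical):
        self.media = [bytearray(self.PHYS) for _ in range(n_physical)]
        self.physical_reads = 0
        self.physical_writes = 0

    def write(self, lba, data):
        """Write whole logical sectors starting at logical block address lba."""
        if len(data) % self.LOGICAL:
            raise ValueError("data must be a multiple of 512 bytes")
        offset = lba * self.LOGICAL
        end = offset + len(data)
        while offset < end:
            phys, start = divmod(offset, self.PHYS)
            chunk = min(self.PHYS - start, end - offset)
            if chunk < self.PHYS:
                # Partial update: the old 4K sector has to be read first.
                self.physical_reads += 1
            self.media[phys][start:start + chunk] = data[:chunk]
            data = data[chunk:]
            self.physical_writes += 1
            offset += chunk

drive = Emulated512Drive(4)
drive.write(8, bytes(4096))   # aligned 4K cluster: no read needed
drive.write(3, bytes(512))    # single 512-byte block: read-modify-write
print(drive.physical_reads, drive.physical_writes)  # 1 2
```

In this model a 4K write that starts on a 4K boundary never triggers a read, while a 4K write starting at logical sector 1 straddles two native sectors and costs two read-modify-write cycles.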

Re:It's all about Format Efficiency by Anonymous Coward · 2006-03-23 23:43 · Score: 0

Some cautions to keep in mind during a transition:
Read-modify-write operations on blocks larger than requested (e.g. 512B block write requested, 4KB block read-modified-written) have reliability implications. Writes assumed to be independent may not be. A write of a 512B block that's interrupted could silently corrupt a larger block.
Re:It's all about Format Efficiency by dfghjk · 2006-03-24 02:51 · Score: 1

This is nonsense
Re:It's all about Format Efficiency by Wesley+Felter · 2006-03-24 04:19 · Score: 1

Maybe you should explain.
Re:It's all about Format Efficiency by nerdbert · 2006-03-26 03:54 · Score: 1

Since you asked (and since I design the things), you gain a lot with better ECC codes, and those codes (which are multi-bit optimized) prefer bigger sectors on which to operate. Further, there are format overheads to sectors. Each sector is preceded by a small (4-32 byte) training field to allow each sector to get good timing and gain adjustments, as well as to servo correctly over the data. And there is some "slop" in where you can drop the read/write gate signal that you have to allow for since the controllers are *much* slower than the read channels (controllers typically run less than 800 MHz, while channels typically can run up to almost 3 GHz).

How hard do we push density now? How about splitting even today's sectors around servo zones to squeeze out more bytes?

So yes, 4K sectors will help efficiency in many ways. You can use better ECC algorithms to help increase density (bigger sectors allow more correction to be done, which means more error recovery). There will be some physical recovery of space, too, on the hard drives.

Seems good to me. by mathew7 · 2006-03-23 19:37 · Score: 3, Informative

Almost all filesystems I know of use at least 4KB clusters. NTFS does come with 512 byte on smaller partitions.
LBA addresses whole sectors, so for larger HDDs you need more bits (currently 28-bit LBA, which some older BIOSes support, means a maximum of 128GB: 2^28*512 = 2^28*2^9 = 2^37). Since 512-byte sectors have been used for 30 years, I think it is easy to assume they will not last for 10 more years (getting to the LBA32 limit). So why not shave off 3 bits and also make it an even number of bits (12 against 9)?
Also there is something called "multiple block access" where you make only one request for up to 16 (on most HDDs) sectors. For 512-byte sectors you have 8K, but for 4K sectors that means 64K. Great for large files (IO overhead and stuff).
On the application side this should not affect anyone using 64-bit sizes (since only the OS would know of sector sizes); as for 32-bit sizes, it already is a problem (4G limit).
So this should not be a problem because on a large partition you will not have too much wasted space (I have around 40MB wasted space on my OS drive for 5520MB of files, and I would even accept 200MB).

Re:Seems good to me. by farnz · 2006-03-23 20:02 · Score: 1

But we've solved the LBA28 limit, by switching to LBA48 (48-bit sector addresses, maximum of 128PiB (131,072TiB) with 512 byte sectors, going up to 1EiB (1,048,576TiB) with 4k sectors). Multiple block access allows a drive in PIO mode to return up to 16 contiguous sectors; in DMA modes, the drive is allowed to set a limit of up to 128MiB of contiguous sectors. Thus, a larger sector size helps with MSDOS (PIO mode only), but not Windows 2000 or above, or Linux 2.0 or above (or probably Linux 1.2, but I've not investigated that).
The gain is that for large disks, the filesystems already use 4K blocks; having the sector size == 4K just means that the sector size and block size match, so that sector and block numbers match, and the drive can use more efficient ECC to get more usable space.
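
The addressing limits in this subthread are just powers of two, so they can be checked with a short Python sketch (function names are mine, and the formatter is only meant for exact powers of two):

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]

def max_capacity_bytes(lba_bits, sector_bytes):
    """Largest drive addressable with lba_bits-wide sector addresses."""
    return (2 ** lba_bits) * sector_bytes

def human(n):
    """Format a byte count using binary units."""
    i = 0
    while n >= 1024 and i < len(UNITS) - 1:
        n /= 1024
        i += 1
    return f"{n:g} {UNITS[i]}"

for bits, sector in ((28, 512), (28, 4096), (48, 512), (48, 4096)):
    print(f"LBA{bits}, {sector}-byte sectors: "
          f"{human(max_capacity_bytes(bits, sector))}")
```

Going from 512-byte to 4096-byte sectors adds three bits of reach, i.e. a factor of eight, whatever the address width.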

--
I appear to have a blog. Odd.

Finally! by Zo0ok · 2006-03-23 19:41 · Score: 1, Redundant

Finally! This is what I really wanted for years. Can't believe this innovation has not materialized earlier. This is great for performance, TCO, iPods, everything. I can't wait to get my hands on one of those new goodies!

It can't be. by Myria · 2006-03-23 19:41 · Score: 1

It can't be, at least not efficiently. Like flash devices, it's impossible to write less than a sector at a time.

If this were transparently implemented by the hardware, the OS would frequently try to write a single 512 byte sector. In order for this to work, the hard drive controller would have to read the existing sector then write it back with the 512 bytes changed. This is a big waste, as a read then a write costs at least a full platter rotation (1/7200 second). Do this hundreds or thousands of times per second, and you have a nice slow hard drive.

Melissa

--
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager

Re:It can't be. by kthejoker · 2006-03-24 01:55 · Score: 2, Informative

Slight pedantry:

Actually, it's 7200rpm, not rps. You get 120rps, so a platter rotation is actually 1/120 second.
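
In Python, for anyone who wants to plug in other spindle speeds. The one-revolution-per-update cost is the assumption made upthread, not a measured figure:

```python
def revolution_seconds(rpm):
    """Time for one full platter rotation."""
    return 60.0 / rpm

def max_rmw_per_second(rpm):
    """Upper bound on read-modify-write cycles per second, assuming each
    one costs a full revolution between the read and the write."""
    return rpm / 60.0

for rpm in (5400, 7200, 10000, 15000):
    print(f"{rpm:>5} rpm: {revolution_seconds(rpm) * 1000:.2f} ms/rev, "
          f"at most {max_rmw_per_second(rpm):.0f} RMW/s")
```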

Because of R-M-W by alanmeyer · 2006-03-23 19:42 · Score: 3, Insightful

Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.

Re:Because of R-M-W by Mortlath · 2006-03-23 20:25 · Score: 2, Informative

Want to write a single byte? Then read 4MB, modify 1 byte, and write 4MB back to the disk.
That's why there is a system (Level 1, 2, and main memory) cache. Write-backs to the physical disk only occur when needed. That doesn't mean that 4MB would be a good sector size; it just means that write-backs are not the issue to consider here.
Re:Because of R-M-W by sverdlichenko · 2006-03-24 03:38 · Score: 1

Read single sector. Modify 1 byte. Write single sector.

It takes same time no matter how big is sector size: 512 bytes, 1K, 4K. It's limited by rotation speed, not data size: SATA is fast enough.
Re:Because of R-M-W by jadavis · 2006-03-24 09:24 · Score: 2, Informative

Many applications require that write cache is flushed.

In his example, let's say it was a text editor. You change one letter in a document, and save it, it must synchronously write the sector to disk, to the actual physical media. Otherwise, if the system crashes, you lose it, and most people don't like that in a text editor.

Write cache at the disk level can be very bad. Databases may have no way of knowing that write cache is enabled, and tell you that your transaction is committed when it's really not. Of course, battery-backed RAID controllers are safe, but the consumer level disks with write cache enabled can mean trouble.
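
A minimal Python sketch of what such a synchronous save looks like from the application side. The helper name is made up; `os.fsync` is the standard call. Whether the drive's own write cache is also flushed depends on the OS, the filesystem and the drive, which is exactly the caveat above:

```python
import os

def save_durably(path, text):
    """Write text and ask the OS to push it all the way to the device."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
        f.flush()              # drain Python's userspace buffer
        os.fsync(f.fileno())   # ask the kernel to flush its cache to the device
```

Every call like this forces at least one sector-sized write to the media, no matter how small the change was.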

--
Social scientists are inspired by theories; scientists are humbled by facts.

Block size issue by danmcn · 2006-03-23 19:48 · Score: 1

Ok lets all get real for a minute if block size is not consistent how the hell will disk defragment and optimizer soft wear work. The Whole trick to those soft wear is moving the blocks around. Now if you want to loose the ability to fix and speed up your hard drive go for variable block size. But those of us who run multiple large drives want a small fixed block (to save space on things like fonts and e mail) otherwise our drives will quickly become unmanageable. Dan

Re:Block size issue by ocelotbob · 2006-03-23 20:42 · Score: 1

Most drives nowadays are formatted at 4k already because it's faster than a 512 byte format. With the size of drives, the lost space caused by large clusters is not much of an issue these days. Finally, people concerned about disk space could use a file with a tail gathering scheme, like reiserfs, in order to gain more space.

--
Marxism is the opiate of dumbasses
Re:Block size issue by ocelotbob · 2006-03-23 20:46 · Score: 1

Oh, and with inconsistent blocksizes, they're already inconsistent. Some people use 512 byte blocks, some use 32k blocks. The defragger will continue working just as it always has; the OS will take care of things and not create a partition smaller than the sector size.

--
Marxism is the opiate of dumbasses
Re:Block size issue by danmcn · 2006-03-23 20:59 · Score: 1

But all the block sizes have to be the same or the program can't move them around
Re:Block size issue by Anonymous Coward · 2006-03-23 21:12 · Score: 0

You should really move away from the computer if you think your programs are called "soft wear". You are neither worthy to use it nor ask questions about it. It's smarter than you, run away quickly!
Re:Block size issue by somersault · 2006-03-23 23:46 · Score: 1

The block sizes would be consistent on any single partition, though I guess there's no reason they couldnt be, but then you'd presumably have to have some kind of lookup that keeps track of the different block sizes for each block, which seems like a waste of space/time. Your email is not likely to be stored in separate small files, more likely to be a PST file or equivalent.

Not thought about stuff this low level for quite a while.. I wonder how many geeks are truly informed as to how the basics of their system work. We did a fairly basic course on OSs where they covered file systems, though now I've forgotten specifics since I've not spent any time worrying about coding up a filesystem :p Just interesting to think that it's possible to, for example, administer a server/network, but when it comes down to it, people these days will be brought up using 'advanced' HCI's etc, and never really have to understand what their machine is doing under the hood so to speak. When working with GBs of storage etc these days, it's quite strange to start thinking on the bytes and sectors level :p Anyway I'd better stop ranting and get back to a semblance of work

--
which is totally what she said
Re:Block size issue by tepples · 2006-03-24 01:53 · Score: 1

people concerned about disk space could use a file with a tail gathering scheme, like reiserfs, in order to gain more space.

My paid-for peripherals have only Windows drivers. Which file system that works with Windows has tail gathering? Does NTFS?
Re:Block size issue by Anonymous Coward · 2006-03-24 03:23 · Score: 0

My paid-for peripherals have only Windows drivers.
That's your problem, dumbass.
Re:Block size issue by tepples · 2006-03-24 05:43 · Score: 1

That's your problem, dumbass.

Then you find me richer parents who can afford to buy the more expensive flatbed scanners that work with operating systems other than Microsoft Windows.
Re:Block size issue by ocelotbob · 2006-03-24 10:03 · Score: 1

Sucks to be you. Sorry, next time, buy decent hardware if you're concerned about FS space.

--
Marxism is the opiate of dumbasses

Boot sector virii by TrickiDicki · 2006-03-23 19:49 · Score: 5, Funny

That's a bonus for all those boot-sector virus writers - 8 times more space to do their dirty deeds...

Re:Boot sector virii by Pharmboy · 2006-03-24 01:16 · Score: 1

That's a bonus for all those boot-sector virus writers - 8 times more space to do their dirty deeds...

Oh great, now my viruses can be bloatware, too. I guess with that much space, they can even install a GUI for the virus, or maybe "Clippy" to keep me distracted while he formats my hard drive.

--
Tequila: It's not just for breakfast anymore!
Re:Boot sector virii by Anonymous Coward · 2006-03-24 02:30 · Score: 0

It's VIRUSES.

You're all complaining about tiny files... by NalosLayor · 2006-03-23 19:54 · Score: 2, Insightful

But really...think about this: if each sector has overhead, then any file over 512 bytes will have less overhead, and you'll effectively get more space in most cases. What percentage YOUR files are less than 4k?

Re:You're all complaining about tiny files... by NalosLayor · 2006-03-23 19:57 · Score: 1

Sorry to reply to my own post, but I'd like to add: TFA makes it sound like the engineers on the committee made this analysis and concluded that 4k sectors WERE in most cases more efficient.
Re:You're all complaining about tiny files... by BadAnalogyGuy · 2006-03-23 19:58 · Score: 1

On average, you'll waste about 2048 bytes in each file's final sector vs. about 256 bytes with 512-byte sectors. If the filesystem already allocates space in 4K blocks, you may already be wasting 2048 bytes per file anyway.

The problem isn't small files, but files that spill over into another sector and waste space.
Re:You're all complaining about tiny files... by NalosLayor · 2006-03-23 20:01 · Score: 1

Don't you think that the large # of sectors in the middle of a file outweighs the one at the end?
Re:You're all complaining about tiny files... by BadAnalogyGuy · 2006-03-23 20:15 · Score: 1

It depends on usage, of course. As an extreme example (and extremely simplified), if I have 1,000,000 4097-byte files, on a 512-byte sector disk there will be 511*1,000,000 bytes of wasted space. With a 4096-byte sector size, there will be 4095*1,000,000 bytes wasted.

On average, the wasted space will be 256 and 2048 bytes respectively. And I'm only talking about space which is in use but not usable.
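
The arithmetic above can be checked with a minimal sketch. It assumes a file always occupies a whole number of sectors and that nothing else (metadata, tail packing) changes the picture; the function names are made up for illustration.

```python
def slack_bytes(file_size: int, sector_size: int) -> int:
    """Bytes allocated but unused in the file's last sector."""
    remainder = file_size % sector_size
    return 0 if remainder == 0 else sector_size - remainder


def average_slack(sector_size: int) -> float:
    """Mean slack over every file size from 1 byte up to one full sector."""
    sizes = range(1, sector_size + 1)
    return sum(slack_bytes(size, sector_size) for size in sizes) / sector_size


for sector in (512, 4096):
    per_file = slack_bytes(4097, sector)
    # Worst-case example from the comment: one million 4097-byte files.
    print(sector, per_file, per_file * 1_000_000, average_slack(sector))
```

With uniformly distributed sizes the average comes out at just under half a sector, which is where the "256 and 2048" figures come from.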
Re:You're all complaining about tiny files... by mgblst · 2006-03-23 21:25 · Score: 2, Informative

It is not just files that are less than 4k. It is almost all small files. Think about a 5k file that now uses 8k: almost 40% waste. A 9k file uses 12k: 25% waste. So the more small files you have, the more waste. The larger the files, the less waste.

Which is good, you don't really want lots of small files anyway.

If you are using Windows, you can see how much space is wasted at the moment: just right-click on a directory and open its properties, and it will tell you how much data is in the files and how much disk space is actually used. The difference never really amounts to much.
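
For anyone who wants the same "size vs. size on disk" comparison outside the Windows dialog, here is a rough sketch. It assumes every file is rounded up to whole clusters of a fixed size (4096 bytes by default, an assumption), so it ignores compression, sparse files, and filesystems that store very small files inline. The function name and the demo file names are hypothetical.

```python
import os
import tempfile
from typing import Tuple


def size_and_allocated(root: str, cluster_size: int = 4096) -> Tuple[int, int]:
    """Return (bytes of file data, bytes after rounding each file up to whole clusters)."""
    total = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so targets are not counted twice
            size = os.path.getsize(path)
            total += size
            allocated += -(-size // cluster_size) * cluster_size  # ceiling division
    return total, allocated


# Demo on a throwaway directory holding the 5k and 9k files from the comment.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, size in (("five_k.bin", 5 * 1024), ("nine_k.bin", 9 * 1024)):
        with open(os.path.join(demo_dir, name), "wb") as fh:
            fh.write(b"\0" * size)
    data, on_disk = size_and_allocated(demo_dir)
    print(data, on_disk, on_disk - data)
```

Pointing `size_and_allocated` at a real directory gives an estimate of the slack a 4K allocation unit costs there.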
Re:You're all complaining about tiny files... by maxwell+demon · 2006-03-23 21:46 · Score: 2, Interesting

This is of course only true for file systems which cannot allocate partial blocks.

Of course one effect of the new sector size will be that old filesystem drivers, esp. those which come with old OSs, will likely not be able to use those disks. Which in effect means that if you want to use such a disk, you absolutely will have to upgrade your OS.

--
The Tao of math: The numbers you can count are not the real numbers.
Re:You're all complaining about tiny files... by 91degrees · 2006-03-23 22:39 · Score: 1

What percentage YOUR files are less than 4k?

66.9274%. Why do you ask? :P
Re:You're all complaining about tiny files... by Anonymous Coward · 2006-03-24 03:32 · Score: 0

What percentage of your systems have a cluster size less than 4K?

At 4K NTFS can handle 2TB of data. This cluster size is used in anything above 2GB. When was the last time you had a HD that was smaller than that? I am sure they still exist but they are not that common anymore.

A better question to ask is what percentage of your data is wasted by this.

I have LOTS of little files. However at home I also have 50GB of 2-3GB files.

The reality is across say 5000 files you're talking 5 to 10 megabytes of data.

If you are really worried about it build a better file system. You will find the cluster/sector idea is fairly good. You could also archive your files into something like zip/rar/arj/7z if you do not use them much.

I think at this point with 500GB drives we can 'waste' a few meg (and on almost all drives we already are, because of the OS). The speed boost is well worth it and well documented (look into SCSI RAID arrays). 8x fewer commands to issue just to get one small file? That means lower-latency file access. Think of your OS booting faster because the OS doesn't have to issue a bunch of commands to get the bunch of 4K clusters it is going to go get ANYWAY for all the files.

I also just checked the 'waste' space on my current drive: 200k files, 'waste' of 40MB. That is out of a 70GB drive. Could I use that space for something else? Sure, but I also have another 30GB free. So I do not think it's that big of an issue. So I have 'wasted' .1% of my data.

file size by danmcn · 2006-03-23 20:02 · Score: 0

A 5K font file now uses 8K. Put 500 fonts on your machine as a designer and you see the problem. Email: 1.2K average file, now 4K. How much mail do you keep? It keeps adding up and you loose massave amounts of space

Re:file size by Anonymous Coward · 2006-03-23 20:14 · Score: 2, Insightful

5 K becomes 8 K.. times 500 ... is a whopping 1.5 MEGAbytes wasted. I mean, that is more than fits on a floppy. What a waste.

Actually a 200 GB drive can still store 25 million files. How many fonts do you have?

FWIW the advantage is in the error correction. For a 1 bit sector size, you'd need 3 bits to store it with error correction. As the block becomes larger, the error correction becomes more powerful. That is where the advantage is.

Of course data can still be stored byte-wise on the disk - it is only that a small update will require a read-modify-write transaction.
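
The "1 bit needs 3 bits" figure matches a single-error-correcting Hamming code, so that code makes a convenient illustration of how the overhead shrinks as the block grows. This is only a sketch: real drives use stronger codes, and Hamming is swapped in here purely because its overhead is easy to compute. The function name is made up.

```python
def hamming_parity_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming code)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


for data_bits in (1, 512 * 8, 4096 * 8):
    r = hamming_parity_bits(data_bits)
    # data bits, check bits, total stored bits, check bits per data bit
    print(data_bits, r, data_bits + r, r / data_bits)
```

Under this toy code, eight separate 512-byte sectors need 8 x 13 check bits, while one 4096-byte sector needs 16 for the same data.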
Re:file size by Rob+Simpson · 2006-03-23 20:56 · Score: 1

Uh, don't most email programs store mail as a single large database file, rather than zillions of individual files?
Re:file size by Alioth · 2006-03-23 20:57 · Score: 1

All that is still trivial on even the smallest disks supplied today. In any case, your OS is already using 4K blocks so you're already doing that anyway regardless of the underlying sector size.

--
Oolite: Elite-like game. For Mac, Linux and Windows
Re:file size by Anonymous Coward · 2006-03-23 22:26 · Score: 0

> It keeps adding up and you loose massave amounts of space

What's the difference between tight and loose space? I've been in this business for 35 years, and I've never heard the term "loose space" used wrt storage. What do you mean?

Things don't work that way. by woolio · 2006-03-23 20:05 · Score: 1

If you have to re-issue a read or write command (well, the filesystem would do this) for each 512-byte block, that means that you will spend 8 times more energy (give or take a bit) to read or write the same 4k block of data.

Well sorry, but that's the way it is.

Hard drives generally have the ability to read/write multiple sectors with a single command. (Go read the ATA standards). And DMA is usually used [ program I/O just plain sucks].

I don't see how changing the sector size is going to save power... Either way they have to increase the size of the buffers for the read/write multiple operations. So these could just be increased while keeping 512-byte sectors and the same benefit would result.

Re:Things don't work that way. by BadAnalogyGuy · 2006-03-23 20:10 · Score: 1

Yes, as I mentioned in another reply, the command itself is handled as a single operation, then the data transfer is handled as another operation.

At the DMA level, the HDD hardware will still have to either access the sectors as 8 separate sectors or as 1 large sector, whether this is invisible to the software and CPU or not. The microseconds eliminated from those sector seeks add up.
Re:Things don't work that way. by Antique+Geekmeister · 2006-03-24 00:46 · Score: 1

The differential is even worse on a drive that is fragmented: those 8 blocks may be scattered all over the disk and require additional fascinating head movements to get them all off in order.

There are losses of file space for file systems that contain lots of small files: the maximum space wasted by having only one byte on a block goes from 511 bytes to 4095 bytes, and that will affect available disk space in some systems that use lots of small files.
Re:Things don't work that way. by AlecC · 2006-03-24 02:21 · Score: 1

There would not normally be any sector seeks in a consecutive burst. Any modern drive is capable of reading consecutive sectors (a) in one command and (b) in one pass of the oxide under the heads. 4096 data bytes is 32768 bits. With ECC etc, the total number of data bits which have to be transferred is probably about 38 k. If it were formatted as 8 512 byte sectors, the heads would also have to fly over 7 inter-block gaps, each probably a hundred or two bits long. Also, since there would be separate ECC blocks for each 512 bytes instead of a single slightly larger one, there would be some wastage there. But the time taken would be about the time for the heads to fly over 40k raw bits instead of 38k raw bits - negligible. And the data storage would go up comparably.
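
A back-of-envelope version of those numbers, as a sketch only. The 38k and 40k totals and the 100 to 200 bit gap sizes are the rough guesses from the comment itself, not measured values, and the names are made up for illustration.

```python
DATA_BITS = 4096 * 8                  # 32768 data bits in 4096 bytes
RAW_BITS_ONE_4K_SECTOR = 38_000       # the comment's "about 38 k" with ECC etc.
RAW_BITS_EIGHT_512_SECTORS = 40_000   # the comment's "40k raw bits"
GAPS = 7                              # inter-block gaps between 8 consecutive sectors


def gap_bits(bits_per_gap: int) -> int:
    """Total bits spent on inter-block gaps for eight 512-byte sectors."""
    return GAPS * bits_per_gap


low, high = gap_bits(100), gap_bits(200)
extra_fraction = (RAW_BITS_EIGHT_512_SECTORS - RAW_BITS_ONE_4K_SECTOR) / RAW_BITS_ONE_4K_SECTOR
print(DATA_BITS, low, high, round(extra_fraction * 100, 1))
```

On these guesses the gaps account for 700 to 1400 of the roughly 2000 extra raw bits, and the whole difference is about 5% of the transfer time.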

--
Consciousness is an illusion caused by an excess of self consciousness.

It's so that ECC can handle bigger bad spots by Animats · 2006-03-23 20:05 · Score: 3, Interesting

The real reason for this is that as densities go up, the number of bits affected by a bad spot goes up. So it's desirable to error correct over longer bit strings. The issue is not the size of the file allocation unit; that's up to the file system software. It's the size of the block for error correction purposes. See Reed-Solomon error correction.

Re:It's so that ECC can handle bigger bad spots by danmcn · 2006-03-23 20:08 · Score: 1

Error correction will step in to the next available block. Reread the story.

How do you know how your data is actually stored ? by Horus1664 · 2006-03-23 20:07 · Score: 4, Insightful

Modern DASD architecture is almost completely hidden from the user. In the (good?) old days system software needed to interface closely with the DASD and needed to understand the hardware architecture to gain maximum performance from the devices (I know because I work on such systems within an IBM mainframe environment on airline systems which require extremely high speed data access).

Nowadays the disk 'address' of where the data actually resides is still couched in terms that appear to refer to the hardware itself but in 'serious' DASD subsystems (e.g. the IBM DS8000 enterprise storage systems )the actual way in which the hardware handles the data is masked from the operating system. Data for the same file is spread across many physical devices and some version of RAID is used for integrity.

The 4096 value for data 'chunks' has to do with the most efficient amount of data that can be transmitted down a DASD channel (between a host and storage in large systems or the bus in self-contained systems)

The idea of a 'file address' would cease to exist and it would be replaced by a generic 'data address' if it weren't for the in-built assumptions about data retrieval within all current Operating Systems.

Redmond thinks they're so smart... by filterchild · 2006-03-23 20:14 · Score: 3, Funny

Windows Vista will ship with this support already.

Oh YEAH? Well Linux has had support for it for eleventeen years, and the Linux approach is more streamlined anyway!

Re:Redmond thinks they're so smart... by Anonymous Coward · 2006-03-23 21:53 · Score: 0

Drunk?
Re:Redmond thinks they're so smart... by 91degrees · 2006-03-23 21:56 · Score: 1

Well Linux has had support for it for eleventeen years, and the Linux approach is more streamlined anyway!

Actually, it wouldn't surprise me if Linux beats Windows for support. Some manufacturers probably used Linux for development, and what with constant delays for Vista, Linux has an edge.
Re:Redmond thinks they're so smart... by Anonymous Coward · 2006-03-24 01:24 · Score: 0

No, but the moderator might have been. He was trying to be funny (and with some success, I might add) but I'm betting the moderator was busy compiling Gentoo and didn't catch the nuances in the comment and just modded it Insightful in either a stupor or a fit of fanboyism.

Of course Vista will support it... by Anonymous Coward · 2006-03-23 20:17 · Score: 0

It will also come bundled with a universal translator and cold fusion power source (both to be developed circa 2200 AD)

Re:Of course Vista will support it... by Anonymous Coward · 2006-03-23 21:08 · Score: 0

the Ultimate Edition bundled with Duke Nukem Forever?

30 years and now it's bumped up only 8x? by WoTG · 2006-03-23 20:25 · Score: 0, Redundant

Bumping this up makes sense; however, I wonder if 4KB (KiB?) is a bit low. I know that hard drive sizes have gone up way more than 8x. Let's see, I started with a 20MB drive in an XT. My desktop downstairs has a 200GB drive. So, that's about 10,000 times bigger. Intuitively, a 8x bump in sector size seems a bit small. Wouldn't something like 16KB be useful for a few extra years?

Re:30 years and now it's bumped up only 8x? by Alioth · 2006-03-23 21:00 · Score: 2, Informative

All modern operating systems do demand page loading of executables and use paging space on disk (the swapper). Memory pages are all 4Kbyte on all the CPU architectures we are using at the moment in a personal computer. Therefore, 4Kbyte is probably the ideal size (since now loading a page into memory takes only one read command instead of 8). Making it bigger, say 8Kbyte, would complicate VMM design (since if you only need to load one page, you now wind up loading two and having to throw one away, or at best, you'd wait twice as long while 8kbyte loads instead of 4kbyte).
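
A small sketch of that trade-off, assuming a 4096-byte page read that starts on a sector boundary. It counts sectors touched and bytes moved, not commands issued, since a single command can cover several sectors. The function name is hypothetical.

```python
PAGE_SIZE = 4096


def page_read_cost(page_size: int, sector_size: int):
    """Return (sectors touched, bytes transferred) for one sector-aligned page read."""
    sectors = -(-page_size // sector_size)  # ceiling division
    return sectors, sectors * sector_size


for sector_size in (512, 4096, 8192):
    sectors, transferred = page_read_cost(PAGE_SIZE, sector_size)
    # Last column: bytes read from disk that the pager did not ask for.
    print(sector_size, sectors, transferred, transferred - PAGE_SIZE)
```

Only the 8192-byte case moves data nobody asked for: one sector, but 4096 surplus bytes per page fault.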

--
Oolite: Elite-like game. For Mac, Linux and Windows
Re:30 years and now it's bumped up only 8x? by Anonymous Coward · 2006-03-24 00:00 · Score: 0

Therefore, 4Kbyte is probably the ideal size (since now loading a page into memory takes only one read command instead of 8).

Why would you issue 8 512-byte reads instead of a single larger one to read one 4 KB page?
Is any OS really so inefficient as to break up I/Os like that?

Modern OSes try to cluster page reads/writes together, so even single page I/O operations are avoided if possible.

Wrong attribution on your sig by IvyKing · 2006-03-23 20:29 · Score: 1

It was Bill Stout, not Ed Heineman who coined the phrase "Simplicate and add lightness".

Nevertheless, Heineman was still one heck of an airplane designer.

As far as the 8" floppy, ISTR that they were intended to replace punched cards: 77 tracks with 26 sectors (hard coded) came out to be pretty close to a box of 2000 Hollerith cards (80 columns with 12 bits per column). 8" drives were available before the end of 1975, and the VAX came out in 1977(?). One of the uses for the floppies was loading the microprogram store on the VAX and IBM machines of the same era.
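
The floppy-versus-card-box comparison works out as follows. The track, sector, card, and column counts are from the comment; the 128 bytes per sector is an assumption (the usual single-density figure for early 8" disks) that the comment does not state.

```python
TRACKS = 77
SECTORS_PER_TRACK = 26
BYTES_PER_SECTOR = 128  # assumption: single-density format, not stated in the comment

floppy_sectors = TRACKS * SECTORS_PER_TRACK
floppy_bytes = floppy_sectors * BYTES_PER_SECTOR

CARDS = 2000
COLUMNS = 80
BITS_PER_COLUMN = 12

card_characters = CARDS * COLUMNS                        # one character per column
card_raw_bytes = CARDS * COLUMNS * BITS_PER_COLUMN // 8  # every punch position counted

print(floppy_sectors, floppy_bytes, card_characters, card_raw_bytes)
```

Under that assumption the disk has 2002 sectors, almost exactly one per card in a 2000-card box, and each 128-byte sector has room for one 80-column card image.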

30 years doing what? by glas_gow · 2006-03-23 20:38 · Score: 1

So after thirty years these guys have come up with the idea of consolidating disk density through an 8x decrease in sector resolution. By now I'd rather hoped the magnetic HD had been replaced. I seem to be repairing and replacing these things a lot lately. I doubt this breakthrough will alter that.

I have my eyes peeled for a bio-drive, something noxious smelling that you feed with potato rinds which stores your data directly in its DNA. What d'you reckon? Another thirty years.

Re:30 years doing what? by Derling+Whirvish · 2006-03-23 21:04 · Score: 5, Funny

I have my eyes peeled for a bio-drive, something noxious smelling that you feed with potato rinds which stores your data directly in its DNA.
That already exists. It's called a "child." Geeks might think they are hard to obtain, but in fact they tend to pop up unexpectedly quite often. They also have an audio interface, are touch-sensitive, run off of bio-mass fuel, and can even do the dishes after they have been around for a few years. They can be attached to a Playstation or an iPod too. When you first get them they are quite noisy and smelly with a few leaks, but that goes away after the break-in period. They don't come with a users manual though. Documentation is sparse. You have to get a third-party handbook.
Re:30 years doing what? by maxwell+demon · 2006-03-23 22:13 · Score: 5, Funny

However, storing data in them can be a lot of effort (there are special institutions to help with that, called schools), and they are known to lose data every now and then. Moreover, there's often quite a bit of latency in reading data, and in some cases even repeated requests might not suffice to get at the data at all. The data reading speed isn't too fast either, and the writing speed is truly horrible. Moreover, they need years to completely start up (although some data can already be written and read during startup time), and they can't be switched off when you don't need them, because they won't restart again. Also, while they have a sleep mode, you cannot simply activate that. Usually it will only work at certain times, and even then they may refuse to go to sleep for quite some time. It seems, however, that many of them can be sent to sleep mode in the evening by sending them special large data streams (so-called bedtime stories). OTOH they must stay in sleep mode for quite some time to function properly, so don't even think of using them in a 24/7 application (although you have to prepare to support them 24/7, since sometimes they spontaneously end their sleep mode at unexpected times, and in that case they tend to demand immediate maintenance).

All in all, they are not really a good replacement for a hard disk.

--
The Tao of math: The numbers you can count are not the real numbers.
Re:30 years doing what? by kalidasa · 2006-03-24 00:30 · Score: 1

On the other hand, the storage medium is integrated with a really, really powerful (if rather idiosyncratic) processing system, which, while excruciatingly slow at simple problems, is partially self-programming, and is the only system capable of solving AI-hard problems (after sufficient programming).
Re:30 years doing what? by milgr · 2006-03-24 02:46 · Score: 1

Some day, after lots of hard work, they may be able to pass the Turing Test.

--
Where law ends, tyranny begins -- William Pitt
Re:30 years doing what? by riffzifnab · 2006-03-24 03:20 · Score: 1

You totally didn't mention the great multiplayer co-op game that you normally have to play to get one of these babies. I hear it has a high replay value but can become quite expensive after a while.
Re:30 years doing what? by Courageous · 2006-03-24 04:09 · Score: 1

Not to mention the fact that they are the only form of bio media storage that's prone to occasionally putting peanut butter sandwiches in your non bio mechanical storage device readers.

C//

is any body reading the mail by danmcn · 2006-03-23 20:42 · Score: 1

You all miss the point. To answer the coward: over 1100 plus 1300 active emails, mostly text, under 6k. Unlike you I use HFS+. When I copied my fonts and email over to HFS it grew by over 2 gigs. I still don't get the error thing: error correction should not be affected by block allocation size, as it just links to the next block. Then we still have the defrag and optimize problem.

Re:is any body reading the mail by gabeman-o · 2006-03-24 03:09 · Score: 1

Perhaps whatever email program you are using should change the way it stores email. Why on earth would you store 1 file per email? If there was only one large file (ie. PST) that contained all of your emails, you could avoid this problem entirely.

File sizes by payndz · 2006-03-23 20:59 · Score: 2, Interesting

Hmm. This reminds me of the time when I bought my first external Firewire drive (120Gb) and used it to back up my 10Gb iMac, which had lots of small files (fonts, Word 5.1 documents, etc). Those 10Gb of backups ended up occupying 90Gb of drive space because the external drive had been pre-formatted with some large sector size, and even the smallest file took up half a megabyte! So I had to reformat the drive and start again...

--
You must think in Russian.

Re:File sizes by danmcn · 2006-03-23 21:31 · Score: 1

Gee you think I might have a clue here
Re:File sizes by marcosdumay · 2006-03-24 08:43 · Score: 1

There is a program out there called "tar". You should take a look into it.

--
Rethinking email

System Pages, RAID, Tail Blocks, and Addressing by KagatoLNX · 2006-03-23 21:07 · Score: 4, Insightful

Actually, this almost can't be anything but a good thing.

First of all, most OSes these days use a memory page size of 4k. Having your IO system page match your CPU page makes it much more efficient to DMA data and the like. Testing has shown that this is generally helpful.

Second, RAID will benefit here. Larger blocks mean larger disk reads and writes. In terms of RAID performance, this is probably a good thing. Of course, the real performance comes from the size of the drive cache, but don't underestimate the benefit of larger blocks. Larger blocks mean the RAID system can spend more time crunching the data and less time handling block overhead. The fact that more data must be crunched for a sector write is of concern, but I'd bet it won't matter too much (it only really matters for massive small writes, not generally a RAID use case).

Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache).

Netware 4 did something called Block Suballocation. While not as tightly packed as Reiser tail blocks, it did take their larger 32kb or 64kb blocks (which were chosen to keep block addresses small and large file streaming faster), split them into disk sectors, and store tails in them.

NTFS has block suballocation akin to Netware, but Windows users are, to my knowledge, out of luck until MS finally addresses their filesystem (they've been putting this off forever). Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).

To my knowledge, ReiserFS is the only filesystem with tail packing. If you are really interested in this, see your replacement brain on the Internet.
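
To get a feel for what tail packing buys, here is an idealized sketch. It assumes tails can be packed end to end into shared blocks with no metadata overhead, which is simpler than what ReiserFS actually does; the function names are made up for illustration.

```python
def blocks_without_packing(file_sizes, block_size):
    """Every file gets its own whole blocks; the last one is usually part empty."""
    return sum(-(-size // block_size) for size in file_sizes)


def blocks_with_tail_packing(file_sizes, block_size):
    """Full blocks stay per-file; the leftover tails share blocks (idealized)."""
    full_blocks = sum(size // block_size for size in file_sizes)
    tail_bytes = sum(size % block_size for size in file_sizes)
    shared_blocks = -(-tail_bytes // block_size)  # ceiling division
    return full_blocks + shared_blocks


fonts = [5 * 1024] * 500  # the 500 5K font files mentioned elsewhere in the thread
print(blocks_without_packing(fonts, 4096), blocks_with_tail_packing(fonts, 4096))
```

For those 500 files the block count drops from 1000 to 625 in this model; a real filesystem would land somewhere in between once metadata is counted.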

Fourth, larger sectors means smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers were the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.

Finally, someone mentioned sector size issues with defragmenters and disk optimizers. These programs don't really care as long as all of the sectors on the system are the same size. Additionally, they could be modified to deal with different sector sizes. Ironically, modern filesystems don't really require defragmentation, as they are designed to keep fragments small on their own (usually using "extents"). Ext2, Ext3, Reiserfs and the like do this. NTFS does it too, although it can have problems if the disk ever gets full (basically, magic reserved space called the MFT gets data stored in it and the management information for the disk gets fragmented permanently). If it weren't for a design choice (I wouldn't call it a flaw as much as a compromise) NTFS wouldn't really need defragmentation. ReiserFS can suffer from a limited form of fragmentation. However, v4 is getting a repacker that will actively defragment and optimize (by spreading out the free space evenly to increase performance) the filesystem in the background.

I really don't see how this can be bad unless somebody makes a mistake on backwards compatibility. For those Linux junkies, I'm not sure about the IDE code, but I bet the SATA code will be overhauled to support it in a matter of weeks (if not a single weekend).

--
I think Mauve has the most RAM. --PHB (Dilbert Comic)

Re:System Pages, RAID, Tail Blocks, and Addressing by danmcn · 2006-03-23 21:23 · Score: 1

Ok, this all started over variable block size, which is why I made the comment about defrag and opt. Next, your statement "modern file systems don't need to have defrag or opt run against them on a regular basis" is horse shit. I don't know what kind of shop you work in, but if you moved large files (200 megs or better) across your machines and multi-terabyte storage systems 100's of times a day on both Mac and Windows, you would not have made that statement. I know 'cause I do. Other than that I kind of agree with you.
Re:System Pages, RAID, Tail Blocks, and Addressing by McSnarf · 2006-03-23 23:43 · Score: 2, Insightful

Forget waste of space in something as small as a sector.
If this is an issue, you use the wrong application - one word file per phone number?
File systems became simpler over time. This is a GOOD THING AND THE ONLY WAY TO GO.
If you try to optimize too much, you end up with something like the IBM mainframe file systems from the 70s, which are still somewhat around.
Create a simple file, called a data set ? Sure, in TSO (what passes for a shell, more or less), you use the ALLOCATE command: http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/IKJ4C550/1.7.5?SHELF=&DT=20040721160158&CASE=
Simple, isn't it ?
Forget complicated file systems, let the hardware handle speed. And possibly defragmentation.
Re:System Pages, RAID, Tail Blocks, and Addressing by Anonymous Coward · 2006-03-24 00:02 · Score: 0

Hehe. This might be true if you use a really crappy filesystem. But it's not true for reiserfs (and probably other modern filesystems too, but I don't know them like I know reiserfs), unless your filesystem is very close to full, which is not a good idea anyway.
Re:System Pages, RAID, Tail Blocks, and Addressing by Brane2 · 2006-03-24 01:41 · Score: 1

I don't see what all the fuss is about. As the OP said, about the only thing that this might be good for is ECC.

On the minus side, there will be more wasted bits as a result of having integral number of sectors per track, so on average there will be 1/2 sector=2Kbytes wasted instead of previous 256 bytes/track. But OTOH, smaller number of sectors mean less wasted bits for sector headers and tails...

Everything else is just about the same. Modern PATA/SATA disk transfer data by multisector transfers, so it's about the same if you transfer 16x512bytes or 2x4096 bytes per request, all buffering on disk etc being the same.

Some visible difference is to be expected, but nothing to get excited for...
Re:System Pages, RAID, Tail Blocks, and Addressing by Ziviyr · 2006-03-24 01:44 · Score: 1

Reiser user here. I agree, defragging has its uses.

I've seen many cases where growing a handful of files in parallel causes remarkably reduced read performance on those files.

Further, in the case of a heavily used filesystem, I imagine straightening out the unallocated space would also still be useful.

I don't support the line that I should have to keep a large chunk of my storage empty to maintain my filesystem. (though I typically have 40% free)

--

Someone set us up the bomb, so shine we are!
Re:System Pages, RAID, Tail Blocks, and Addressing by djmurdoch · 2006-03-24 01:45 · Score: 1

This was a very informative post -- thanks.

You did make one error, however:

Fourth, larger sectors means smaller sector numbers. Any filesystem that needs to address sectors usually has to choose a size for the sector addresses. Remember FAT8, FAT12, FAT16, and FAT32? Each of those numbers were the size of sector references (and thus, how big of a filesystem they could address). This will prevent us from needing to crank up the size of filesystem references eventually.

I don't think there was ever a FAT8, but the 12, 16 and 32 are the bit counts of cluster numbers, not sector numbers. The FAT file systems grouped sectors into contiguous blocks, and addressed those. The blocks were usually 1K-32K in size (i.e. 2 to 64 sectors, always a power of 2). They bore no relation to the drive geometry.
Re:System Pages, RAID, Tail Blocks, and Addressing by dfghjk · 2006-03-24 04:13 · Score: 1

There was no FAT8 and the numbers stored were cluster numbers, not sector numbers. A larger sector size would have meant that smaller FAT clusters were not possible. The largest FAT cluster would still have been 32K due to BIOS PIO limitations.

Larger sector sizes make it harder for RAID systems to implement parallel access designs because single sector interleaves are now 8x larger than before. Such designs aren't that common and those vendors can still get small sector drives if they need them. For independent access, raising the minimum IO size to 4K will reduce the burden on optimizing small IO. RAID hardware vendors can now export large sector sizes themselves and know that the OS can support them.

Frankly, I don't think RAID people will mind 4K sectors too much. It's overall probably better for them but it won't make much difference. The average IO size won't change much if at all, the smallest IO will clearly be 4K but may already be, and the amount of computation won't change noticeably. Remember that larger accesses are not needed or necessarily desired in independent access RAID designs. The goal of those designs is to make the RAID interleave some integer multiple larger than the average access size. Like I said, larger sectors probably won't increase the average IO size but, even if they did, it would only change the tuning. The goal of these RAID implementations is lots of parallel IO, not lots of large IO requests.
Re:System Pages, RAID, Tail Blocks, and Addressing by Jeremi · 2006-03-24 05:08 · Score: 1

Forget waste of space in something as small as a sector.
If this is an issue, you use the wrong application - one word file per phone number?

The only reason having "one file per phone number" is considered unreasonable is because (traditionally) filesystems couldn't support that usage pattern efficiently. If you had a file system that could support, say, 20 million 10-byte files in a directory without unacceptable overhead, that would be a perfectly valid design for your app.

Look at it this way: existing filesystems force application developers to reinvent the wheel for every application, by implementing a separate database layer on top of the filesystem. Wouldn't it be nicer if the filesystem itself was powerful enough to act as a basic database on its own?

BeFS and ReiserFS were both moves in this direction, and I think it's a good idea.

--

I don't care if it's 90,000 hectares. That lake was not my doing.
Re:System Pages, RAID, Tail Blocks, and Addressing by Anonymous Coward · 2006-03-24 05:36 · Score: 0

First of all, most OSes these days use a memory page size of 4k.

Well yeah, that's because x86 only supports 2 or 3 different page sizes and the alternatives are way too huge.
Re:System Pages, RAID, Tail Blocks, and Addressing by marcosdumay · 2006-03-24 08:40 · Score: 1

With 512-byte sectors, a 32-bit sector number can address 2TiB. That is not much nowadays, but increasing the sector size is the wrong way to deal with it. With 4KiB sectors, the limit becomes 16TiB, which is a bit more, but will be too little very soon.

The only solution is getting 64-bit sector numbers. It will be a long time (if ever) before we need something bigger than 64 bits.
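
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name and the `TIB` constant are my own, chosen only for illustration.

```python
TIB = 1 << 40  # bytes in one tebibyte


def max_capacity_bytes(sector_size: int, address_bits: int) -> int:
    """Largest capacity reachable with address_bits-bit sector numbers."""
    return (1 << address_bits) * sector_size


# Compare the two sector sizes discussed above, with 32-bit and 64-bit numbers.
for sector_size in (512, 4096):
    for bits in (32, 64):
        capacity = max_capacity_bytes(sector_size, bits)
        print(f"{sector_size:>4}-byte sectors, {bits}-bit numbers: "
              f"{capacity / TIB:,.0f} TiB")
```

Going from 512 to 4096 bytes buys only a factor of 8, while going from 32 to 64 bits buys a factor of 2^32.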

--
Rethinking email
Re:System Pages, RAID, Tail Blocks, and Addressing by k8to · 2006-03-24 08:52 · Score: 1

You are confusing interface with implementation. File system implementations have gotten more sophisticated. File system interfaces have gotten simpler.

--
-josh
Re:System Pages, RAID, Tail Blocks, and Addressing by McSnarf · 2006-03-24 10:07 · Score: 1

Nope... The file system in question actually SUPPORTS record structures, multiple access methods, file allocation in cylinders, multiple catalogues etc.
A lot of flexibility to get the most performance out of something that used to cost a fortune. The OS would take care of application data at a record level, if needed, adding an index, allocating for optimum speed or space. Very powerful, very sophisticated, but a pain to use.
Today's file systems offer access to blobs of storage, thanks to abstraction layers. Something that, in the 80s would have been pretty expensive to do. For the stuff the general user is doing, this is by far the best approach.
Allocating storage in units less than 4K doesn't make much sense nowadays.
Example:
I just created a file called "phonenumber.odt". An OpenOffice 2.0 text file containing my name and my phone number.
Size for this is 2.76 KB. It uses 8KB on the disk.
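The gap between file size and size on disk is just rounding up to the allocation unit. A rough sketch in Python follows. The helper name is made up, and the 8 KB allocation unit is only my guess at what would produce the figure above, since the post does not state the cluster size.

```python
def allocated_bytes(file_size: int, cluster_size: int) -> int:
    """Space consumed when a file is rounded up to whole allocation units."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


file_size = 2826  # roughly 2.76 KB, as in the example above

# 8192 is an assumed allocation unit; 512 and 4096 are shown for comparison.
for cluster_size in (512, 4096, 8192):
    used = allocated_bytes(file_size, cluster_size)
    print(f"cluster {cluster_size:>4}: uses {used} bytes, wastes {used - file_size}")
```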
And when it comes to sophistication - remember that in the days of proprietary "minicomputers", with operating systems like VMS, AOS/VS and the like, file systems had features that I'd love to see in current operating systems from a usability point of view.
As an example, SINTRAN-III, a slightly outdated OS, would allow for automatic versioning of files. You would edit the latest version and it would autosave not only the last version but up to 254 older ones in total. Having access to the last three versions of anything (not limited to text!) in the times before version control was really nice to have.
Both the file systems of AOS/VS (Data General) and Guardian 90 (Tandem) allowed for block suballocation and extents, respectively.
Basically, a lot of "new" trends have been there before.
Re:System Pages, RAID, Tail Blocks, and Addressing by runderwo · 2006-03-24 16:48 · Score: 1

First of all, most OSes these days use a memory page size of 4k.
i386 OSes do, since they hardly have a choice.
Having your IO system page match your CPU page makes it much more efficient to DMA data and the like.
IO system page? Give me a break. The IO hardware has no knowledge of what the system page size is, nor does it need to, because unless you are using something like Virtual DMA, the IO hardware is dealing with physical contiguous blocks of memory. The CPU's paging capabilities are completely irrelevant!
Testing has shown that this is generally a helpful.
What testing? Please link to it.

--
LRC, the best-read libertarian site on the web

No Change In Performance by Anonymous Coward · 2006-03-23 21:21 · Score: 0

The FAT32 file system already uses a 4KB cluster size or larger. I doubt changing the hard disk sector size to 4KB will significantly change the disk performance.

Re:No Change In Performance by danmcn · 2006-03-23 21:36 · Score: 1

Coward, I'm beginning to see why you use your handle. Disk block size, disk sector size and cluster size have nothing to do with each other.

Dynamic Partitioning by Whiteox · 2006-03-23 21:36 · Score: 1

There have been some good comments about FS/sectors and such. I think it can be dumbed down to 2 options:

Create a file system and sector size to maximise capacity or.....
Create a FS and sector combo to maximise performance (speed).

As far as the defragmentation issue, this could be lessened by creating a 'system managed' partitioning structure that allows file reads and writes only on the drive surface it actually needs: ie a partition that grows. The less mapping it has to do- the faster it is. I really think that the HD logic can really be tweaked on this one.

--
Don't be apathetic. Procrastinate!

Re:Dynamic Partitioning by danmcn · 2006-03-23 21:51 · Score: 1

Dynamic drive allocation? I don't even want to think about the system overhead that would take.
Re:Dynamic Partitioning by x2A · 2006-03-24 02:25 · Score: 1

Modern FS's get around this anyway... a really simple example would be:

|-FS-|-----data-----|-FS-|-----data-----|

so each 'fs' bit contains allocation etc for the following data block, not the whole disc. This has an added extra benefit, of not needing to seek to the beginning of the disc to update FS data when making a write - just to its own closer fs block.

--
The revolution will not be televised... but it will have a page on Wikipedia

MOD PARENT UP ! by alexhs · 2006-03-23 21:37 · Score: 1

I was going to write the same.

Allocation size is irrelevant as many advanced systems are supporting fragments (however still not implemented in ext2/ext3 :-( ), but sector size matching memory page size can increase performance.

And from a past discussion some people are thinking that the 512 bytes comes from the memory page size of the VAX.

--
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.

MOD PARENT UP by Anonymous Coward · 2006-03-23 21:40 · Score: 0

OS is already using 4K

Yep. Unless some witless fool alters it from the default to a smaller size, thereby increasing VM overhead.

The comments to this story indicate that most people have no clue what a sector is. They think "cluster" or "block" size has something to do with disk sectors.

Configureable Sector Size by cnvogel · 2006-03-23 22:12 · Score: 2, Insightful

Wow, finally, a new block size, never heard of that idea before.

Doesn't anyone remember that SCSI drives that support a changeable block size have been around since basically forever? Of course with harddisks it was used mostly to account for additional error-correcting / parity bits, but also magneto-optical media could be written with 512 or 2k (if I remember correctly).

(first hit I found: http://www.starline.de/en/produkte/hitachi/ul10k300/ul10k300.htm allows 512, 516, 520, 524, 528 but there are devices that do several steps between 128 and 2k or so...)

Re:Configureable Sector Size by Anonymous Coward · 2006-03-24 02:26 · Score: 0

You are talking about another thing. Even if you could set the block size to more than 512 bytes, there is no way in hell you can do an atomic write on say 2048 bytes. They are talking about the important physical stuff here. The atomic write sector size is and has always been 512 bytes. It is crucially important for file systems and databases. If you write 1024 bytes and cut the power in the middle, the disks can only guarantee that either 0 bytes, 512 bytes or 1024 bytes were written.

You got modded up funny but? by drachenstern · 2006-03-23 23:37 · Score: 1

Isn't there some truth to this? I thought there was currently a 512byte limit on bootsector virus sizes, or at least a 512byte limit to tell the system which and how to execute the next block.

Won't this also affect lilo and the like? Now I foresee all sorts of things needing to be rewritten, so that's why Microsoft knew they wouldn't ship till after Christmas. Wow, they're so clairvoyant! But honestly, the current forms should still work, just not take up all the space, eh? How does this affect the linux boot sector limit? It probably won't.

--
2^3 * 31 * 647

Re:You got modded up funny but? by ArsenneLupin · 2006-03-24 02:51 · Score: 2, Informative

Won't this also affect lilo and the like?
... and it will affect FAT. Not only does FAT use 512 byte sector sizes, but it also makes sure almost the entirety of the filesystem is aligned on an odd boundary of sectors.
(Boot sector is one, so we start off odd right after boot sector. There are usually 2 FAT copies (even), so after FAT offset stays odd. For root directory size, there is usually no compelling reason to make it an even size, however usually Windows makes it an even size anyways, guaranteeing that start of cluster space stays odd).
So, to make a long story short, even if cluster size is a multiple of 4K, this won't help, because it is oddly aligned (meaning that each write of a 4K cluster would always straddle 2 sectors!)
Presumably, Windows will make appropriately parametrized FAT systems once these disks become available, but there will be implications when restoring old FAT images on the new drives.
BIOSes will also need to deal with these disks, or how will you be able to boot if you replace your old PC's hard disk with a 4K sector disk, while still keeping the old motherboard?
And even if the BIOS can deal with it, forget about dd'ing your old system over to the new disk, because of the FAT issue mentioned above.
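
The alignment argument above can be sketched in a few lines of Python. The function names are mine. The sample values are the classic 1.44 MB floppy layout (1 reserved sector, 2 FATs of 9 sectors each, 224 root directory entries), used only as a familiar illustration. The sketch also ignores the partition's own starting offset, which shifts things further on a hard disk.

```python
def data_area_start(reserved_sectors: int, num_fats: int, sectors_per_fat: int,
                    root_entries: int, bytes_per_sector: int = 512) -> int:
    """First sector of the cluster area on FAT12/FAT16 (32-byte dir entries)."""
    root_dir_sectors = -(-(root_entries * 32) // bytes_per_sector)  # round up
    return reserved_sectors + num_fats * sectors_per_fat + root_dir_sectors


def is_aligned(start_sector: int, physical_size: int = 4096,
               logical_size: int = 512) -> bool:
    """True if a 512-byte sector number falls on a 4K physical boundary."""
    return start_sector % (physical_size // logical_size) == 0


start = data_area_start(reserved_sectors=1, num_fats=2,
                        sectors_per_fat=9, root_entries=224)
print(f"data area starts at sector {start}, 4K-aligned: {is_aligned(start)}")
```

An odd start sector can never be a multiple of 8, so every 4K cluster in such a layout straddles two physical sectors.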
Re:You got modded up funny but? by WWWWolf · 2006-03-24 03:37 · Score: 2, Interesting

Well, current Linux bootloaders probably deal with lack of space just fine. For example, GRUB installs itself as 512-byte stub loader ("stage 1") + the rest of the boot loader stored in an ordinary file in the filesystem ("stage 2"). I don't think GRUB's design will change much: It's meant to be so that stage 2 and the menu.lst can be updated without touching the boot block, anyway.

And it's probably not the OS or boot loader that sets limits to the boot block size, it's probably the BIOS that loads the stuff to memory...
Re:You got modded up funny but? by InfiniteWisdom · 2006-03-24 07:17 · Score: 2, Interesting

You could easily have a "compatibility" mode where the interface returns 512 byte blocks even though it's stored internally as 4096-byte blocks. You'd sacrifice performance, of course, but that's probably not a huge issue when you're running legacy systems on newer hardware.
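
A toy model of such a compatibility mode, in Python, to show where the performance goes. The class and its counters are entirely hypothetical and do not describe any real drive firmware. The point is only that a 512-byte write turns into a read-modify-write of a whole 4096-byte physical sector.

```python
class Emulated512Disk:
    """Toy disk exposing 512-byte logical sectors over 4096-byte physical ones."""

    PHYSICAL = 4096
    LOGICAL = 512

    def __init__(self, physical_sectors: int) -> None:
        self._store = [bytes(self.PHYSICAL) for _ in range(physical_sectors)]
        self.physical_reads = 0
        self.physical_writes = 0

    def _locate(self, lba: int) -> tuple[int, int]:
        """Map a logical sector number to (physical index, byte offset)."""
        per_physical = self.PHYSICAL // self.LOGICAL
        return lba // per_physical, (lba % per_physical) * self.LOGICAL

    def read_logical(self, lba: int) -> bytes:
        index, offset = self._locate(lba)
        self.physical_reads += 1
        return self._store[index][offset:offset + self.LOGICAL]

    def write_logical(self, lba: int, data: bytes) -> None:
        if len(data) != self.LOGICAL:
            raise ValueError("logical writes must be exactly 512 bytes")
        index, offset = self._locate(lba)
        # Read-modify-write: fetch the whole physical sector, patch, write back.
        self.physical_reads += 1
        block = bytearray(self._store[index])
        block[offset:offset + self.LOGICAL] = data
        self._store[index] = bytes(block)
        self.physical_writes += 1


disk = Emulated512Disk(physical_sectors=2)
disk.write_logical(9, b"\xaa" * 512)
print(disk.physical_reads, disk.physical_writes)
```

A real implementation would presumably cache the physical sector so that eight consecutive logical writes do not cost eight read-modify-write cycles, but that is a guess on my part.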
Re:You got modded up funny but? by drachenstern · 2006-03-24 17:42 · Score: 1

but surely a bios that recognized a pata 512 and a sata 4k could still read from both drives, b/c of different h/d i/o controllers being put in place. many boards already have those. then you would need a o/s that would recognize both drives that could be iso bootable for the purpose of transferring all files from the one drive to the other. this is not outside the realm of the possible for the /. crowd. so really what is at stake is that if you buy one of these new drives, you at least need a new mobo, or, depending on the complexity of the mobo you have, at least a rom flash.

cost of new h/d: $400
cost to use new h/d: $1200

sign me up!! I already don't have enough money as it is.

--
2^3 * 31 * 647

Size != storage by tomstdenis · 2006-03-23 23:39 · Score: 2, Informative

You're all missing one key point. Your 512 byte sector is NOT 512 bytes on disk. The drive stores extra track/ecc/etc information. So a 4096-byte sector means less waste, more sectors, more useable space.

Tom
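
A back-of-the-envelope model of that point, in Python. The overhead figures below are invented purely for illustration and are not taken from any drive datasheet. All the sketch shows is that per-sector overhead is amortised over more payload when sectors are larger.

```python
def format_efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of on-disk bytes that hold user data."""
    return payload_bytes / (payload_bytes + overhead_bytes)


# HYPOTHETICAL per-sector overhead (gap, sync, ECC); real values will differ.
ASSUMED_OVERHEAD = {512: 50, 4096: 100}

for payload, overhead in ASSUMED_OVERHEAD.items():
    efficiency = format_efficiency(payload, overhead)
    print(f"{payload:>4}-byte sector, {overhead} bytes overhead: {efficiency:.1%}")
```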

--
Someday, I'll have a real sig.

Re:Size != storage by Anonymous Coward · 2006-03-24 02:06 · Score: 0

Why can't this change be transparent and let drive firmware/hardware do the translation?
Just have a logical sector addressing scheme at 512 Bytes, but the underlying data structure
stored on the HD can be any size. I don't even care if it is the entire track.
It is not that difficult to do that.

Need I remind you that the TCPIP packets are segmented and reassembled...
This case is a lot easier as the logical sectors are smaller.
Re:Size != storage by Gothmolly · 2006-03-24 02:06 · Score: 1

No it means FEWER sectors, and therefore less waste. Like enabling Jumbo Frames on GigE. If the cost to create or analyze a sector or frame is constant (since CPU power >>>> HD read rate) then increasing the payload size (frame size or sector size) should provide higher throughput.

--
I want to delete my account but Slashdot doesn't allow it.
Re:Size != storage by tomstdenis · 2006-03-24 02:23 · Score: 1

the ECC and sector alignment are done in HARDWARE. If you wanted a literal "bit level" access to the device it would be wickedly slower.

Tom

--
Someday, I'll have a real sig.

Crash course in ATA by Anonymous Coward · 2006-03-24 00:25 · Score: 0

There are two parts to an ATA operation.

The first is the command. This is accomplished by filling in the fields of a data structure called a "taskfile". It contains the 48-bit LBA to begin the operation, the number of sectors to read, the ATA command to issue (e.g. read or write), and other information which is pertinent to the command. This taskfile is then written to the taskfile registers of the ATA drive. The drive reads the taskfile registers and begins the operation specified by the command. A multi-sector command is easy to make, and it is the norm to make multi-sector requests.

The second part is the data transfer (for data transfer commands). This consists of the drive data being transferred in 1 sector blocks. If DMA is enabled, the data is transferred automatically. If PIO is used, the data must be read from the ATA data registers one block (16 or 32-bit data chunks) at a time. Either way, the sectors must be read one at a time. When the sector is transferred, the drive checks the data for coherence (no errors) and reports an error or not in the ATA error register.

With a larger sector size, the number of error checks and sector head seeks is reduced. This results in a higher speed data transfer due to the elimination of those things.
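
To make the taskfile description a bit more concrete, here is a simplified sketch in Python. The `Taskfile` class and its field names are my own and leave out several real registers (features, device). The two opcodes are the 48-bit DMA read and write commands as I remember them from the ATA command set, so check them against the spec before relying on them.

```python
from dataclasses import dataclass

READ_DMA_EXT = 0x25   # 48-bit LBA DMA read (from memory; verify against the spec)
WRITE_DMA_EXT = 0x35  # 48-bit LBA DMA write (same caveat)


@dataclass
class Taskfile:
    """Simplified stand-in for the fields a host fills in before a command."""
    command: int
    lba: int
    sector_count: int

    def __post_init__(self) -> None:
        if not 0 <= self.lba < (1 << 48):
            raise ValueError("LBA must fit in 48 bits")
        if not 1 <= self.sector_count <= 65536:
            raise ValueError("sector count must be between 1 and 65536")

    def lba_bytes(self) -> bytes:
        """The 48-bit LBA as six bytes, least significant byte first."""
        return self.lba.to_bytes(6, "little")

    def transfer_bytes(self, sector_size: int = 512) -> int:
        """Total data moved by this one command."""
        return self.sector_count * sector_size


request = Taskfile(command=READ_DMA_EXT, lba=0x12345678, sector_count=8)
print(request.lba_bytes().hex(), request.transfer_bytes())
```

One command already covers many sectors, so the number of commands issued depends on the request size, not on the sector size.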

Re:Crash course in ATA by Anonymous Coward · 2006-03-24 01:27 · Score: 0

With a larger sector size, the number of error checks and sector head seeks is reduced. This results in a higher speed data transfer due to the elimination of those things.

The error checking is done in the drive, essentially at "wire speed". I don't see this being a bottleneck in any ATA or SCSI drives I'm familiar with.

If file system I/O is always done at a FS block size of 4 KB or larger, or if VM pages are 4 KB or larger or page file clustering is used, then these units will be contiguous on disk and I/O is already being done in chunks that are multiples of 4 KB. I don't see where the reduction in the number of seeks would come from if you increased the disk sector size up to 4 KB.
Re:Crash course in ATA by Anonymous Coward · 2006-03-24 01:43 · Score: 0

The state of the art is such that what you say is true, in practice. However, there is no requirement in the ATA spec that the hard disk either do the error checking in real time (though it is difficult to think of any other good way to do it) or to process multi-sector commands in sequential data blocks on the physical disk. As was discussed earlier in the thread about interleaving, the actual access order of each sector may not necessarily be in order since the disk head may be more ready to read sector 3 instead of sector 2 in a series. That practice alone shows that there is a non-negligible (from a HW perspective, it's all microseconds to us) lag between each sector, even in multi-sector block accesses.
Re:Crash course in ATA by x2A · 2006-03-24 02:13 · Score: 1

Don't they read the whole track into a cacheline anyway?

--
The revolution will not be televised... but it will have a page on Wikipedia

GRUB and LILO by Mike+deVice · 2006-03-24 00:28 · Score: 1

I really don't know much about how drives store data. So this may be a really stupid question. But do larger sectors also mean the boot sector? Is this good news for boot loaders?

Re:GRUB and LILO by SirTalon42 · 2006-03-24 07:42 · Score: 1

Grub probably won't care cause it only stores a stub in the boot sector (stage1), and the rest of it on the file system (/boot/grub/* or /grub/* generally), I'm not sure how LILO works so I can't answer that question, though most likely it won't make much of a difference either.

512 bytes/sector only by Anonymous Coward · 2006-03-24 00:30 · Score: 0

Did you know you can't access a disk with a sector size different from 512 in Windows, any Windows (9x, NT)? It cannot be done with the Windows API. You have to call BIOS interrupts in Windows 9x, thus running in real mode, or use a driver like http://simonowen.com/fdrawcmd/fdrawcmd.sys in NT. 30 years ago, CP/M OS could use any sector size...

OK, I'll ask.... by Anonymous Coward · 2006-03-24 00:47 · Score: 0

Geekzone tells us that IDEMA (Disk Drive, Equipment, and Materials Association)

WTF does the "I" stand for?

Re:OK, I'll ask.... by x2A · 2006-03-24 02:28 · Score: 2, Funny

It's like the film... I, DEMA... about an intelligent disk drive who err... needed to save the world *cough*

--
The revolution will not be televised... but it will have a page on Wikipedia

Old 512 byte sector? What about my 336 bytes? by Eunuchswear · 2006-03-24 00:59 · Score: 1

1983, trying to convince the CDC engineer that yes, I did want him to configure the disk for 336 byte sectors.

Ah the joys of using a Harris 24 bit word/8 bit byte/112 word disk sector machine.

--
Watch this Heartland Institute video

Been there, done that by sloepoke51 · 2006-03-24 01:25 · Score: 1

Back in the late 70's (1979?) Digital Research, of CP/M fame, provided the same capability in CP/M 2.0. They called it their Sector Blocking / Deblocking algorithm. As you increase the sector size, which has NOTHING to do with the minimum allocated size the OS uses, you get more disk space per track. I played around with sector blocking / deblocking on my (still functioning) Thinker Toy's 2D disk controller (double density floppy disk controller). I was able to increase the size of the sector from 256 bytes to 1024 bytes which gave me extra space. I don't remember the differences now, but disk space went from something like 490 KB to 596 KB for a single sided 8 inch floppy. A few years later, I had access to the Shugart 14 inch Winchester Hard disk, which at 256 bytes per sector gave 20 MB of space. The drive allowed bigger sectors and I played around with 1024 byte sectors, and it gave over 26 MB of disk space.
Since the OS determines the minimum number of "logical" sectors per allocation unit, this determines how efficient the file system is. Also big files like big allocation units, while small files like small allocation units. It's just a trade off of speed against space.

LBA by x2A · 2006-03-24 01:33 · Score: 1

"The operating system does/should not "know" anything about how the data is physically stored by a device"

You're talking about LBA, but that only applies to cylinders/heads. The OS does map to the sector (eg, file inode stored at sector 12345 from the partition beginning, which says that file begins on sector 23123 etc). If it didn't use sectors, it would need an extra 9 bits (512 = 2^9) to store the location of everything within the filesystem.

The filesystem also communicates with the driver using sector numbers. It's only when you reach the 'file' abstraction level (either IO calls or memory mapped) that you switch to using bytes.

Although modern FS's will share a block for small files (or the tails of files), I don't think they do this for the actual FS data structures, which I'd guess are block quantized, so they would need to be aware of the block change to make use of the rest of the sector (either storing more info per inode, or more than one inode per sector).

--
The revolution will not be televised... but it will have a page on Wikipedia

Re:LBA by jesup · 2006-03-24 05:10 · Score: 2, Interesting

Not all operating systems use block/sector numbers at the device-driver level (and there are good arguments against it, though most OS's do it).

The Amiga used byte-offsets and lengths for all IO's. This did eventually cause problems when disk drives (which started at 10-20MB when the Amiga was designed) got to 4GB, but a minor extension allowing 64-bit offsets solved that. 64-bit offsets shouldn't overflow very soon....

For the device driver, it's no big deal to shift the offset if the sector size is a power-of-two, and it allows for weird-ass devices with non-power-of-two sector sizes (like old MAC SCSI drives), devices without a sector paradigm, etc all using the same API. Thus you can mount a 2048-byte block FS on a 512-byte sector device without knowing or caring; you can (with a cooperative device driver) mount a 512-byte FS on a 2048-byte sector device (if the device is willing to accept arbitrary-offset transfers, which they can, though it hurts speed), or mount a block-oriented FS on a bytestream-oriented device (like a file...).
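
A rough sketch of what the driver side of a byte-offset API has to do, in Python. The function is hypothetical and is not the Amiga's actual interface. It uses plain division so that it also covers non-power-of-two sector sizes. With a power-of-two size the division becomes a shift.

```python
def byte_range_to_sectors(offset: int, length: int,
                          sector_size: int) -> tuple[int, int, int]:
    """Map a byte range to (first sector, sector count, bytes to skip).

    'bytes to skip' is how far into the first sector the requested data
    starts; it is non-zero when the request is not sector-aligned.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // sector_size
    last = (offset + length - 1) // sector_size
    return first, last - first + 1, offset - first * sector_size


# A 2048-byte FS block on a 512-byte sector device: aligned, four sectors.
print(byte_range_to_sectors(2048, 2048, 512))
# A 512-byte FS block on a 2048-byte sector device: one sector, partial use.
print(byte_range_to_sectors(512, 512, 2048))
# An odd sector size such as 528 bytes works the same way.
print(byte_range_to_sectors(1000, 100, 528))
```

The second case is the one that hurts speed: the device moves a whole 2048-byte sector to deliver 512 bytes.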

Not on NTFS by Anonymous Coward · 2006-03-24 01:43 · Score: 0

NTFS doesn't necessarily allocate a cluster for each file. Small files are stored entirely in the MFT record. You probably meant some other filesystems.

I'm talking crap by x2A · 2006-03-24 01:49 · Score: 1

sorry, early-after-waking slashdot post :-p

The filesystem does communicate with the driver with sector numbers, but it uses its own block size for addressing, and then shifts the address to get the sector number.

Scratch pretty much all else I said!
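
The block-to-sector shift mentioned above looks roughly like this in Python. The function and its defaults are illustrative only and are not taken from any particular filesystem.

```python
def block_to_sector(block_number: int, block_size: int = 4096,
                    sector_size: int = 512, partition_start: int = 0) -> int:
    """Translate a filesystem block number into an absolute sector number."""
    ratio, remainder = divmod(block_size, sector_size)
    if remainder or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("block size must be a power-of-two multiple of sector size")
    shift = ratio.bit_length() - 1  # e.g. 4096 / 512 = 8 sectors -> shift by 3
    return partition_start + (block_number << shift)


print(block_to_sector(10))                      # 4K blocks on 512-byte sectors
print(block_to_sector(10, partition_start=63))  # partition beginning at sector 63
```

If the device sector size changes to match the block size, the shift simply becomes zero and the filesystem's own addressing is untouched.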

--
The revolution will not be televised... but it will have a page on Wikipedia

How will du work now? by Anonymous Coward · 2006-03-24 01:52 · Score: 0

How will du work now?

To combat this... by x2A · 2006-03-24 02:02 · Score: 1

...the new bootstrap loader for Vista will be a mini VBScript interpreter... and built into the shell... oh those clever MS folk

--
The revolution will not be televised... but it will have a page on Wikipedia

Let's go back to 1986 by metamatic · 2006-03-24 02:05 · Score: 1

It was done with hard drives too. Back around 1986 I was working in data recovery. Tandon computers used to sell MS-DOS machines with 1KiB sectors. They ran a specially modified version of DOS.

The problem, of course, was that people wanted to upgrade to the latest MS-DOS from Microsoft. So they would replace Tandon DOS with MS-DOS, and suddenly their entire hard drive would be scrambled.

And then they'd call us.

--
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak

Statistics... by x2A · 2006-03-24 02:07 · Score: 1

Statistically speaking, on an FS that allocates whole blocks, the wasted space will on average be the block size * half the number of files on the drive.

Let's not kid ourselves, this is mostly gonna be useful for the drive we store our movies on ;-)
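
A quick way to sanity-check the half-a-block-per-file rule, sketched in Python. The file sizes are randomly generated for the purpose and are not measurements from any real drive, and the function names are mine.

```python
import random


def slack_bytes(file_sizes: list[int], block_size: int) -> int:
    """Exact unused space in the last block of every file."""
    return sum((-size) % block_size for size in file_sizes)


def estimated_slack(n_files: int, block_size: int) -> int:
    """The rule of thumb: half a block wasted per file."""
    return n_files * block_size // 2


rng = random.Random(0)  # fixed seed so repeated runs agree
sizes = [rng.randint(1, 1_000_000) for _ in range(10_000)]  # synthetic sizes

for block_size in (512, 4096):
    exact = slack_bytes(sizes, block_size)
    estimate = estimated_slack(len(sizes), block_size)
    print(f"block {block_size:>4}: exact {exact}, rule of thumb {estimate}")
```

The rule assumes file sizes are spread evenly relative to the block size. A drive full of tiny files would waste closer to a whole block per file.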

--
The revolution will not be televised... but it will have a page on Wikipedia

You talk like this is new. by Inoshiro · 2006-03-24 02:08 · Score: 1

Reiser is not the first file system with this idea: "Third, (and EVERYONE seems to be missing this) some file systems DON'T waste slack space in a sector. Reiserfs (v3 and v4) actually takes the underused blocks at the end of the files (called the "tail" of the file) and creates blocks with a bunch of them crammed together (often mixed in with metadata). This has been shown to actually increase performance, because the tail of files are usually where they are most active and tail blocks collect those tails into often accessed blocks (which have a better chance of being in the disk cache)."

Most filesystems don't, and haven't for decades, wasted these final blocks: "With larger block sizes, disks with many small files would waste a lot of space, so BSD added block level fragmentation, where the last partial block of data from several files may be stored in a single "fragment" block instead of multiple mostly empty blocks."

The performance boost is that we can store small dot files together in 1 block, and (with readahead) speed up things like logins and other operations that read in this small set of data.

FAT and FAT32 couldn't handle this, though (nor many other FS features). While I haven't studied NTFS in the detail that I've studied UNIX file systems, I doubt they don't have support for this. NTFS as of 5.0 supports pretty much every feature of a modern UNIX file system.

--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.

You used to be able to do this manually by AlecC · 2006-03-24 02:12 · Score: 1

About 15 years ago, when the 3.5 in HDD held 500Mbyte, you could reformat your SCSI disk to get an optimum sector size. The disks I was using handled, I think, any even sector size from 128 bytes to 4096 bytes.

Because the disks were so low capacity, you wanted to use every byte, so I reformatted the disks to an optimum sector size for my application, which was about 1812 bytes IIRC. This achieved about 5% extra useful data on the system.

I think there was at one time a need for 2040 byte sectors for the IBM System/38, which had a 33rd bit on each word, which had to be saved to disk.

When the generation of disk changed to 1Gbyte, the controllers had an error once in every few tens of thousands of reads: it simply never completed the transaction. It never happened with 512 byte sectors, and the drive manufacturer only tested at 512 bytes, so we switched back to 512 and have been there ever since.

--
Consciousness is an illusion caused by an excess of self consciousness.

Tail Backing by Anonymous Coward · 2006-03-24 02:24 · Score: 0

Windows really would benefit from tail packing (although the infrastructure to support it would make backwards compatibility near impossible).

And here I thought Windows users had been taking it up the ass for a long time!

30 years smilling. by Anonymous Coward · 2006-03-24 02:30 · Score: 0

"...and is the only system capable of solving AI-hard problems (after sufficient programming)."

Like telling good jokes.

History of the 512-byte Sector Size by John_Sauter · 2006-03-24 02:48 · Score: 2, Informative

In 1963, when IBM was still firmly committed to variable length records on disks, DEC was shipping a block-replaceable personal storage device called the DECtape. This consisted of a wide magnetic tape wrapped around a wheel small enough to fit in your pocket. Unlike the much larger IBM-compatible tape drives, DECtape drives could write a block in the middle of the tape without disturbing other blocks, so it was in effect a slow disk. To make block replacement possible all blocks had to be the same size, and on the PDP-6 DEC set the size to 128 36-bit words, or 4608 bits. This number (or 4096, a rounder number for 8-bit computers) carried over into later disks which also used fixed sector sizes. As time passed, there were occasional discussions about the proper sector size, but at least once the argument to keep it small won based on the desire to avoid wasting space within a sector, since the last sector of a file would on average be only half full.

Planned obsolesence... by Valleye · 2006-03-24 03:01 · Score: 1

From the article, "Backward compatibility with existing 512-byte products, both in hardware and in software, will be defined and accommodated during the phase over period."

I hope that this backward compatibility remains for a very long time or archives that are stored on older drives will be lost eventually.

I know that the expected life of data on magnetic drives is not all that long but it does not mean we do not try to recover ancient data. I still pull stuff from reel tape using dd from over 10 years ago. Migration is a solution but I am too lazy to do it.

Sector size isn't always power of two by milgr · 2006-03-24 03:04 · Score: 1

Usually, sectors are a power of two, but not always.

In my first real job, I worked for Prime Computer. They stored a bit of data along with each filesystem 512 byte block. If I remember correctly, they stored a forward and a backward link. If the filesystem was corrupted, they could restore many of the files because of the links in the data portion of the disk. This led to 528 byte sectors (if I remember correctly).

Having a weird sector size limited the manufacturers who would supply disks.

This decision also made file deletion slow -- as every block of a file was re-written.

The filesystem supported several file-organization types. The two most prevalent were sequential access and dynamic access. Sequential access files had a pointer from the directory to the file. The file had forward and backward links at the head of each block. To read the 300th block, it would be necessary to read the first 299 blocks before reading the 300th block. To erase the file, each block would need to be read (to obtain the forward link to the next block), and then written.

Dynamic access files would have a table of pointers to disk blocks - residing in the first blocks of the file. Normal reads wouldn't see the data in the first block directly.
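
The difference between the two organizations is easy to see in a toy model. The Python below is only a sketch of the idea, not Prime's actual on-disk format. The dictionary standing in for the disk and the single index list are simplifications I made up.

```python
def build_linked_file(n_blocks: int) -> tuple[dict, int]:
    """Return (disk, first block id); disk maps id -> (next id, payload)."""
    disk = {}
    for i in range(n_blocks):
        next_id = i + 1 if i + 1 < n_blocks else None
        disk[i] = (next_id, f"data-{i + 1}")
    return disk, 0


def read_nth_linked(disk: dict, first_block: int, n: int) -> tuple[str, int]:
    """Sequential-style access: follow forward links, counting block reads."""
    reads = 0
    block_id = first_block
    payload = ""
    for _ in range(n):
        next_id, payload = disk[block_id]
        reads += 1
        block_id = next_id
    return payload, reads


def read_nth_indexed(disk: dict, index: list[int], n: int) -> tuple[str, int]:
    """Dynamic-style access: one read of the pointer table, one of the data."""
    block_id = index[n - 1]
    return disk[block_id][1], 2


disk, first = build_linked_file(300)
index = list(range(300))  # stand-in for the pointer table in the first block
print(read_nth_linked(disk, first, 300))
print(read_nth_indexed(disk, index, 300))
```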

Amazing what cruft sits in my brain after over a decade.

--
Where law ends, tyranny begins -- William Pitt

I, for one... by Zaatxe · 2006-03-24 03:04 · Score: 0, Redundant

... welcome our new 4Kbytes sector overlords.

--
So say we all

Clippy Virus by Hoi+Polloi · 2006-03-24 03:05 · Score: 1

"I see you are trying to use your computer, would you like me to use it as a spam server?"

--
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning

Extending the SCSI hard limit by January's+Child · 2006-03-24 03:24 · Score: 1

About time. SCSI has a 32-bit block number built into the protocol. With 512 byte blocks, that's a 2 TB architectural limit. We can expect to be able to buy a 2 TB disk off the retailer's shelf in another two or three years.

Clueless Apple fanboys again by labratuk · 2006-03-24 03:28 · Score: 1

We're not talking about filesystems. If we were I can remember choosing my ext2 block size anything between 1k and 4k about six years ago. Apple is easily predated here.

--
Malike Bamiyi wanted my assistance.

Correction by dfghjk · 2006-03-24 03:30 · Score: 1

When using DMA it doesn't really matter since the whole transfer is performed by the drive. When using PIO, software can use MULTIPLE MODE versions of read and write commands to transfer more than one sector at a time. This results in better performance only because of host software and interface issues. It's not a hard drive internals thing. Multiple mode was added to IDE in the late 80's. You're a little out of date.

Changing the sector size will in no way reduce "the number of error checks and sector head seeks (sic)". It increases the density of user data on the disk with essentially no downside. PC's have not been able to use different sector sizes in the past since BIOS'es assume 512 bytes. It's about time that got fixed.

Forcing a 4K sector size means that the smallest IO will now be 4K. Disk and storage controller vendors will like that I'm sure. Designing high performance storage while worrying about 512 byte performance is ridiculous. Caches no longer have to worry about small sectors. Hoorah!

SATA and silent data corruption by Anonymous Coward · 2006-03-24 03:38 · Score: 0

So, some of the boys here have indicated that SATA disks, by design, can undergo silent data corruption - esp. when used in raid arrays.

Anyone know anything about this?

Re:SATA and silent data corruption by DigiShaman · 2006-03-24 05:59 · Score: 1

Anyone know anything about this?

Ya, it's BS and totally untrue. I've got two Hitachi drives in RAID-0 using the built in Promise controller on my MB. I've never had a single problem.

Data in RAID-0 is twice as likely to get corrupted, however, if you have bad memory or memory timing. This happens a LOT. The only way to ensure data is getting read from and written to correctly is to make sure the bits aren't flipping in RAM. Run Memtest 86 Plus to verify. Also, people like to mount two hard drives next to each other without proper cooling. A drive that overheats will cause all sorts of controller problems and even stress on the bearings (spindle and actuator arm).

--
Life is not for the lazy.

Yeah... by ichigo+2.0 · 2006-03-24 03:40 · Score: 1

Yeah, but does it run Linux?

Not All files are equally divisible by Anonymous Coward · 2006-03-24 03:49 · Score: 0

One thing people are forgetting is that you will always waste disk space with all files. All files are NOT divisible by 4096. There will be some remainder which will waste the remaining sector.

You would lose less disk space if you use smaller blocks. The larger the block, the more disk space will be wasted, even if you have only large files and avoid small files.
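The slack-space argument above can be put into numbers with a minimal sketch. The function name and the sample file sizes are made up for illustration; the only thing taken from the comment is the idea that a file occupies a whole number of blocks and the unused tail of the last block is lost. If file sizes are spread roughly evenly modulo the block size, the expected loss is about half a block per file.

```python
def slack_bytes(file_size: int, block_size: int) -> int:
    """Bytes left unused in the last block allocated to a file."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size - file_size


# Hypothetical file sizes in bytes, purely for illustration.
sample_sizes = [1, 700, 5000, 1_000_000]

for block_size in (512, 4096):
    total = sum(slack_bytes(size, block_size) for size in sample_sizes)
    print(f"block size {block_size}: {total} bytes of slack")
```

Note that this only models allocation in whole blocks; filesystems that pack small files or tails together would waste less.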

Re:Not All files are equally divisible by fishbowl · 2006-03-24 04:33 · Score: 1

It's a tradeoff between storage granularity and potential for speed. If a system can read or write more in a given cycle, that can be exploited for performance. A big sector or cluster size can mean much better performance for some types of data, specifically, A/V media. The sector size on a low level format also dictates the way the firmware is written, and a bigger sector size might mean an order of magnitude more address space for the controller; I wonder if that's becoming some kind of bottleneck for the manufacturers? (I didn't RTFA, it was slashdotted.)

--
-fb Everything not expressly forbidden is now mandatory.

512 bytes vs 4096 bytes by DragonTHC · 2006-03-24 04:07 · Score: 1

most modern disk utils let you choose between 512b 1k 2k and 4k sectors

what we will start to see here is our inodes will fill up faster than our disks.
or, on NTFS our $MFT will fill up past the files.

I format all my drives with 512 byte sectors.
this means more sectors on a disk than with 4096 byte sectors.
if you have a lot of small files you should use 512 byte sectors.

the caveat is more sectors means more to go wrong.

--
They're using their grammar skills there.

Re:512 bytes vs 4096 bytes by KingMotley · 2006-03-24 05:32 · Score: 0

No. Most modern disk utils let you choose between 512b, 1k, 2k, 4k, 8k, 16k, 32k, or 64k CLUSTER sizes. All current drives come pre-formatted from the factory (ATA, SATA, and SCSI), and you can no longer do what is called a "low level format" on the drive. What the OS calls "Formatting" for hard disks is no more than writing the file system structure onto the disk (and writing 0's to all unused portions of the disk, checking for bad areas). This change isn't really that big of a deal. You'll get a slight performance increase, and slightly better use of disk space (not all that much). Since most people use the default cluster size of 4k now anyhow, NO EXTRA SPACE WILL BE WASTED. And to the guy who complains that he wants to store 2 billion 13-byte files on his drive, well, those 13-byte files won't be affected anyhow since NTFS stores them directly into the directory structure. It's the files that are 128 bytes to 256 bytes that waste the most space. And if you have THAT many small files, you've got other issues. Buy old drives, use a different file system (like FAT32), or just get more drive space.

Re:Vista by InsaneProcessor · 2006-03-24 04:13 · Score: 1

Vista will also ship with support for holographic storage, 3D-free air holographic display drivers, and cerebral direct interface.

Vista, like Visa, will leave you further in debt and miserable about it.

--

Atheism is a religion like not collecting stamps is a hobby.

Not always true.... by thatseattleguy · 2006-03-24 04:32 · Score: 1

Larger sectors are better when you have larger files, smaller sectors are better if you have lots of tiny files. a 1B file will consume one full sector,

Not always the case. In many "modern" file systems (jfs, ReiserFS, xfs(?), and likely others), small files can be stored directly within the directory structure of the file system itself, e.g., as part of the B*-tree that forms the file index. So you'll actually have a number of tiny files together taking up only one disk sector. Obviously that's not possible on older filesystem architectures (ext2/3, FAT, etc) where each inode (or the moral equivalent) points directly at the unique start sector for the file.

Don't know how NTFS works - someone more knowledgeable can chime in here.

--
/tsg/

Re:Not always true.... by networkBoy · 2006-03-24 05:35 · Score: 1

NTFS4 behaves like ext2/3 and FAT in this regard. Don't see why NTFS5 would be different.

-nB

--
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump

Re:Not always true.... by Anonymous Coward · 2006-03-24 06:50 · Score: 0

Maybe because they, you know, added features for a new version?

Actually small files are important by spitzak · 2006-03-24 05:01 · Score: 1

Efficient support for very small files would allow a lot of crap on Windows/Linux (such as the Registry and the Gnome copy of it) to be eliminated, and allow all "metadata" (things like the artist in a song) to be stored as files. Read up on the ReiserFS plans. Ideally 99.99% of the files on a disk would be less than 100 bytes.

However this is best solved by writing the filesystem to put these small files into the blocks with other data, such as many small files together, or inserted directly into the directory. Correctly done you would get a bunch of small files at once with a single read, and since use of this would probably need to look at many at once, this could be very efficient. In any case larger sectors are harmless.
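As a rough illustration of putting many small files into one block, here is a toy first-fit packer. It is only a sketch: the function name, the in-memory layout, and the 4096-byte default are assumptions made for this example, and it does not reflect how ReiserFS or any real filesystem lays out its data.

```python
def pack_small_files(files, block_size=4096):
    """Pack small files into fixed-size blocks using first-fit.

    `files` maps a name to its bytes. Returns a list of blocks, each a dict
    with the bytes used so far and a {name: (offset, length)} index.
    Hypothetical layout for illustration only.
    """
    blocks = []
    for name, data in files.items():
        size = len(data)
        if size > block_size:
            raise ValueError(f"{name} does not fit in one block")
        for block in blocks:
            if block["used"] + size <= block_size:
                break  # first block with enough room
        else:
            block = {"used": 0, "entries": {}}
            blocks.append(block)
        block["entries"][name] = (block["used"], size)
        block["used"] += size
    return blocks


# One hundred 40-byte "metadata" files fit in a single 4096-byte block,
# so one block read would return all of them.
tiny = {f"tag{i}": b"x" * 40 for i in range(100)}
print(len(pack_small_files(tiny)), "block(s) used")
```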

SCSI has done it for ages by jbevren · 2006-03-24 05:38 · Score: 1

I have some 5+ year old scsi disks that can be low-level formatted with 4k sectors. This is nothing new, aside from the 'standard' increasing the sector size.

If memory serves, the scsi-2 spec also allows sector sizes up to 16k or more during a format, if the drive supports it.

All you GiB versus GB folks this is your chance... by Anonymous Coward · 2006-03-24 06:54 · Score: 0

...to be dicks and point out in a pedantic but historically and logically incorrect way that GB doesn't mean multiples of 1024 but 1000.
http://www.ss64.com/docs/bytes.html

So according to you, my 300 GB hard drive is really 300 GB even though common sense, the OS, history, and the 512 byte sector argue it isn't.

So c'mon, let your inner dick flap in the wind, along with your jaws. Please. Pretty please.

But first ask IDEMA to make the sectors 4000 bytes in length. Otherwise I'll conclude you are a lead-paint licking (mom's) basement dweller with nothing helpful to say ever. Or possibly someone working for the marketing department of a hard drive manufacturer, which would be worse.

Improve Existing Technologies by Nom+du+Keyboard · 2006-03-24 07:18 · Score: 1

Will this improve the storage on existing drives simply by removing 7/8's of the sector headers and inter-record gaps?

Or will it hurt overall storage because a Bad Sector now requires a full 4KB spare?
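Both questions come down to simple ratios, which a sketch can make concrete. The per-sector overhead figures below are placeholders invented for this example (the thread gives no real numbers for headers, gaps, or ECC), so only the shape of the calculation should be taken from it.

```python
def format_efficiency(data_bytes: int, overhead_bytes: int) -> float:
    """Fraction of the on-platter space that holds user data."""
    return data_bytes / (data_bytes + overhead_bytes)


def spare_cost_bytes(bad_sectors: int, sector_size: int) -> int:
    """User capacity consumed by remapping each bad sector to a spare."""
    return bad_sectors * sector_size


# HYPOTHETICAL overheads: 64 bytes per 512-byte sector, and 128 bytes per
# 4096-byte sector (assuming a longer ECC field but only one header and gap).
small = format_efficiency(512, 64)
large = format_efficiency(4096, 128)
print(f"512-byte sectors:  {small:.1%} efficient")
print(f"4096-byte sectors: {large:.1%} efficient")

# The same number of defects costs eight times as much spare area.
print(spare_cost_bytes(100, 512), "vs", spare_cost_bytes(100, 4096), "bytes")
```

Under these assumed figures the gain in efficiency applies to the whole disk, while the extra spare cost applies only to the defective sectors.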

--
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."

Except XFS? by Makoss · 2006-03-24 08:36 · Score: 1

Physical Disk Sector Sizes Supported

512 bytes through to 32 kilobytes (in powers of 2), with the caveat that the sector size must be less than or equal to the filesystem blocksize.

http://oss.sgi.com/projects/xfs/
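The constraint quoted above is easy to express as a check. This is a minimal sketch with a made-up function name; it encodes only the three conditions in the quoted text and is not taken from the XFS tools.

```python
def valid_sector_size(sector_size: int, block_size: int) -> bool:
    """True if sector_size is a power of two from 512 bytes to 32 KiB
    and does not exceed the filesystem block size."""
    is_power_of_two = sector_size > 0 and (sector_size & (sector_size - 1)) == 0
    return is_power_of_two and 512 <= sector_size <= 32768 and sector_size <= block_size


print(valid_sector_size(4096, 4096))  # sector size equal to block size
print(valid_sector_size(8192, 4096))  # sector larger than block
```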

--
Building a better backup.
Zettabyte Storage

Will we really hit the sector limit anytime soon ? by billcopc · 2006-03-24 08:38 · Score: 1

Bumping up the sector size from 512 to 4096 means we can access disks 8 times larger without widening the address registers. We already have 48bit sector addressing which yields up to 128 petabytes PER DISK. 10 years ago drives were 1/1000th the size of current models, perhaps in 10 years we will be seeing drives approaching the petabyte range.. perhaps not. Either way we don't need enlarged sectors for capacity reasons yet.
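The arithmetic here is just the number of addressable sectors times the sector size. A small sketch, with helper names invented for this example and sizes reported in binary units:

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def max_capacity_bytes(address_bits: int, sector_size: int) -> int:
    """Largest disk addressable with the given sector address width."""
    return (2 ** address_bits) * sector_size


def human(n: float) -> str:
    """Format a byte count using binary (1024-based) units."""
    i = 0
    while n >= 1024 and i < len(UNITS) - 1:
        n /= 1024
        i += 1
    return f"{n:g} {UNITS[i]}"


print(human(max_capacity_bytes(48, 512)))   # 128 PiB
print(human(max_capacity_bytes(48, 4096)))  # 1 EiB, eight times larger
print(human(max_capacity_bytes(32, 512)))   # 2 TiB, the 32-bit case
```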

If this helps to reduce overhead in the design and manufacturing of hardware, then be my guest! What I would really love to see though, is more speed. Capacity grows several orders of magnitude faster than speed, and it is a significant bottleneck for most data-intensive jobs. That's why we have RAID in desktop rigs. Why not implement a black-box raid-like solution for future hard drives ? To hell with form factors, give me a 5.25" height hard drive that boasts 2 terabytes across 8 platters running in parallel, just like a bigass raid-0 stripe, but transparent. 8 platters times 512 bytes per sector = 4096 byte striped sectors. The real advantage would be the 400mb/sec sustained transfer rate (or better). That sort of performance leap would warrant a new interface. SATA is convenient, but we went from 133mb/sec with ATA-133 to 150mb.. coitus interruptus ?
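The "8 platters times 512 bytes" idea can be sketched as plain striping: one 4096-byte logical sector is cut into eight 512-byte chunks, one per platter. The function names are made up for this example, and the per-platter transfer rate is an assumption chosen only because eight times 50 gives the 400 quoted above. It says nothing about whether real drives can drive all heads at once.

```python
def stripe(sector: bytes, platters: int = 8, chunk: int = 512) -> list:
    """Split one logical sector into equal chunks, one per platter."""
    if len(sector) != platters * chunk:
        raise ValueError("sector must be exactly platters * chunk bytes")
    return [sector[i * chunk:(i + 1) * chunk] for i in range(platters)]


def unstripe(chunks: list) -> bytes:
    """Reassemble the logical sector from its per-platter chunks."""
    return b"".join(chunks)


def aggregate_rate(per_platter_rate: float, platters: int = 8) -> float:
    """Ideal combined rate if every platter transfers in parallel."""
    return per_platter_rate * platters


logical = bytes(range(256)) * 16            # a 4096-byte logical sector
chunks = stripe(logical)
assert unstripe(chunks) == logical
print(len(chunks), "chunks of", len(chunks[0]), "bytes")
print(aggregate_rate(50.0), "MB/s, assuming 50 MB/s per platter")
```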

--
-Billco, Fnarg.com

Wasted Space by nurb432 · 2006-03-24 13:27 · Score: 1

Wouldn't this waste a *lot* of space for items like shortcuts, weblinks, C modules..

I realize that drives are getting bigger and bigger, but is that an excuse to waste ?

--
---- Booth was a patriot ----

Re:Wasted Space by Script_God · 2006-03-24 17:09 · Score: 1

It's already being wasted. Almost every drive formatted with NTFS (or any other file system for that matter) will have a cluster size of 4KB -- 8 physical sectors. NTFS won't give less than 1 cluster to a file (very small files are an exception, they're stored in a more efficient manner already), so you're allocating 4KB to the small file to begin with. Changing the size of the sector will have no effect on the size of the cluster, it will just reduce the number of sectors in a cluster.
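The point that allocation follows the cluster size, not the sector size, can be shown with a short sketch. The function names are invented for this example, and it deliberately ignores the special handling of very small files mentioned above.

```python
def sectors_per_cluster(cluster_size: int, sector_size: int) -> int:
    """Number of physical sectors that make up one cluster."""
    if cluster_size % sector_size != 0:
        raise ValueError("cluster size must be a multiple of the sector size")
    return cluster_size // sector_size


def allocated_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Space a file occupies when allocation is in whole clusters.

    The sector size does not appear here at all: it only changes how many
    sectors sit behind each cluster.
    """
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


for sector_size in (512, 4096):
    print(f"{sector_size}-byte sectors: "
          f"{sectors_per_cluster(4096, sector_size)} per 4KB cluster, "
          f"a 200-byte file still occupies {allocated_bytes(200)} bytes")
```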

Hope they can get a new partition table format in by Krellan · 2006-03-24 13:46 · Score: 1

Good news. This will be nice to have in software, a 4096-byte sector. As others have said, this will exactly match the cluster size used by most modern filesystems, and the page size used by the Intel x86 architecture. This happy coincidence will mean that operating systems can just do a 1-to-1 read/write, and not need to waste time blocking/deblocking.

Isn't this already done in drive hardware? I thought, that in order to save physical space on the disc surface by reducing inter-sector overhead, the drive already internally uses much larger sectors than the current 512-byte standard. The disc controller just takes care of this automatically, doing the blocking/deblocking transparently. It reads/writes the larger sectors just fine, but just serves 512 bytes at a time to the computer, buffering the rest.
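Whether drives really do this internally is the commenter's guess, but the blocking/deblocking being described can be modelled in a few lines. This is a toy in-memory model, not firmware: the class and method names are made up, and the counters exist only to show that a single 512-byte write turns into a read-modify-write of a whole 4096-byte physical sector.

```python
PHYSICAL = 4096   # internal sector size
LOGICAL = 512     # sector size presented to the host
PER_PHYSICAL = PHYSICAL // LOGICAL


class EmulatedDisk:
    """Presents 512-byte logical sectors on top of 4096-byte physical ones."""

    def __init__(self, physical_sectors: int):
        self.media = bytearray(physical_sectors * PHYSICAL)
        self.physical_reads = 0
        self.physical_writes = 0

    def _read_physical(self, p: int) -> bytearray:
        self.physical_reads += 1
        return bytearray(self.media[p * PHYSICAL:(p + 1) * PHYSICAL])

    def _write_physical(self, p: int, buf: bytearray) -> None:
        self.physical_writes += 1
        self.media[p * PHYSICAL:(p + 1) * PHYSICAL] = buf

    def read_logical(self, lba: int) -> bytes:
        p, idx = divmod(lba, PER_PHYSICAL)
        buf = self._read_physical(p)
        return bytes(buf[idx * LOGICAL:(idx + 1) * LOGICAL])

    def write_logical(self, lba: int, data: bytes) -> None:
        if len(data) != LOGICAL:
            raise ValueError("logical writes must be exactly 512 bytes")
        p, idx = divmod(lba, PER_PHYSICAL)
        buf = self._read_physical(p)                     # read
        buf[idx * LOGICAL:(idx + 1) * LOGICAL] = data    # modify
        self._write_physical(p, buf)                     # write


disk = EmulatedDisk(physical_sectors=2)
disk.write_logical(9, b"\xab" * LOGICAL)
print(disk.physical_reads, "physical read,", disk.physical_writes, "physical write")
```

With native 4096-byte sectors exposed to the host, an aligned 4096-byte write would skip the read step entirely.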

From a software point of view, 4096-byte sectors will be nice. I hope they take the time to get in a new partition table format! Drop the obsolete CHS fields, as they've been maxed out for a long time now. As for the LBA fields, widen them to 64 bits, so that there's plenty of room for the future. With 512-byte sectors and the current 32-bit LBA fields, the maximum is 4G sectors, 2TB. With RAID becoming popular these days, this limit is easily reachable! Going to 4096-byte sectors will push this limit back to 16TB, a good thing, but widening the fields to 64 bits will really push this limit out of sight. 64ZB. Maybe that will even be enough for Google? :)
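For reference, the fields being discussed live in the classic 16-byte MBR partition entry: a status byte, a 3-byte CHS start, a type byte, a 3-byte CHS end, and two 32-bit little-endian values for the starting LBA and the sector count. The sketch below unpacks one entry and works out the limits; the function and dictionary key names are made up for this example. In binary units the three cases come to 2 TiB, 16 TiB, and 64 ZiB.

```python
import struct

# status, CHS first, type, CHS last, LBA start, sector count (little-endian)
ENTRY = struct.Struct("<B3sB3sII")


def parse_partition_entry(raw: bytes) -> dict:
    """Decode one 16-byte MBR partition table entry."""
    status, chs_first, ptype, chs_last, lba_start, num_sectors = ENTRY.unpack(raw)
    return {
        "bootable": status == 0x80,
        "chs_first": chs_first,      # the "maxed out" CHS fields
        "type": ptype,
        "chs_last": chs_last,
        "lba_start": lba_start,      # 32 bits
        "num_sectors": num_sectors,  # 32 bits
    }


def max_addressable_bytes(lba_bits: int, sector_size: int) -> int:
    """Capacity reachable with an LBA field of the given width."""
    return (1 << lba_bits) * sector_size


# A made-up entry: bootable, type 0x83, CHS saturated, largest sector count.
raw = ENTRY.pack(0x80, b"\xfe\xff\xff", 0x83, b"\xfe\xff\xff", 2048, 0xFFFFFFFF)
print(parse_partition_entry(raw))

TIB = 2 ** 40
print(max_addressable_bytes(32, 512) // TIB, "TiB with 512-byte sectors")
print(max_addressable_bytes(32, 4096) // TIB, "TiB with 4096-byte sectors")
print(max_addressable_bytes(64, 4096) // 2 ** 70, "ZiB with 64-bit LBA")
```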

--

Dr. Demento On The 'Net!

Re:Will we really hit the sector limit anytime soo by Down8 · 2006-03-24 14:47 · Score: 1

And SATA2 has already doubled to 300mb, a la 33mb > 66mb > 100mb > 133mb of ATA. That leaves 450mb & 600mb ready for the future, and possibly more (I don't know enough about SATA to speculate).

Short story: the comparison isn't 133mb to 150mb, but 33mb to 150mb (first gen of each).

-bZj

--
.sig

Re: Atomic Units by some+guy+I+know · 2006-03-24 18:01 · Score: 1

a 4MB block would be the smallest atomic unit you could write on a disk.

When oh when will the disk drive industry embrace ecological alternatives, such as solar units or hydro-electric units?

--
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana

Re: Atomic Units by Anonymous Coward · 2006-03-24 21:14 · Score: 0

Hah. :p

Well yeah, talking of eliminating atomic units...

Might be nice if they could just drop that HDD block thing and allow writing anywhere on the disk. E.g., maybe with the read part of the I/O heads built in front of the write part, so that when updating part of a physical block, it could just read the unknown bits and overwrite them right after "on the fly" and compute the trailing CRC and ECC very fast. The old content could be saved in a buffer in case the CRC fails, to use it with the ECC. Or the write part could be far enough back that it has time to read the whole block before starting to overwrite it, so that it could retry the read on errors.

But I dunno that much about HDD; there's probably a good reason it's not made this way.

Because... by JamesGecko · 2006-03-25 09:34 · Score: 1

"4096-bytes should be enough for anyone"

Re:Will we really hit the sector limit anytime soo by billcopc · 2006-03-26 15:40 · Score: 1

Shall we compare to the lovely rates of ATARI hard disks ? I had a fun little 20mb CoolDisk that was probably in the 100's of kb/sec range.

I don't think there's much point in comparing first gen ATA with first gen SATA, especially when it was initially 16mb/sec and not 33 :p Sure, SATA-150 is ten times faster than the first ATA hard drives from over a decade ago, but what does that prove ? That new interfaces are faster than older.. big surprise.

My original point was that SATA-150 isn't much of an improvement over ATA-133. Reading between the lines, that means I think the SATA working group should have aimed higher. When designing something next-gen, shoot for the stars, especially when dealing with a huge partnership of bureaucrats that will take years before actually producing something tangible. At least take that serial bus and parallelize it so I can keep using 80-conductor cables and get 8x the performance .. SATA-1200, now that would be something to sing about! But then we'll face the problem of a slow main bus across the motherboard.

This ain't rocket science, it's data transmission over copper. Why can't we have one general-purpose system with extremely high speeds that everything could plug into ? Many specialized server devices (blade systems especially) have crazy fast master busses while the lowly PC is still trucking along with 33-mhz PCI that the average FPGA-hacking toddler can max out, yet we change CPU sockets every 10 months, and VGA slots every 2 years.. Lose the legacy!

--
-Billco, Fnarg.com

Slashdot Mirror

Changes in HDD Sector Usage After 30 Years

360 comments