Slashdot Mirror


Hard Drive Capacity Confusion, Lucidly Explained

mrklin writes "James Wiebe of wiebetech.com has written a clear example of how hard drive capacity is calculated (PDF file) by hard drive manufacturers (base 10) and OS (base 2). He failed to name how the capacity should be described, though."

18 of 482 comments

  1. Link to text version of PDF file by Anonymous Coward · · Score: 1, Informative
  2. 6 pages?! by TwistedGreen · · Score: 5, Informative
    The 6 pages of the article, summarized in three lines:
    Hard drive manufacturers measure capacity in multiples of 1,000,000,000 (10^9) Bytes.
    Operating systems measure capacity in multiples of 1,073,741,824 (2^30) Bytes.
    Some people get confused because they both call it a gigabyte.
    I really don't think this is such a big deal. OSes are starting to specify the proper GiB instead of GB, so there shouldn't be a problem anymore.
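
For anyone who wants to see how far apart the two conventions in the summary above actually are, here is a minimal sketch. The table and the function name `prefix_gap` are illustrative and do not come from the article.

```python
# Compare decimal (SI) prefixes with the binary units of the same rank.
# Names here are illustrative; nothing below comes from the article itself.
PREFIXES = [("kilo", "kibi"), ("mega", "mebi"), ("giga", "gibi"), ("tera", "tebi")]


def prefix_gap(rank: int) -> float:
    """Return how much larger the binary unit is than the decimal one, in percent."""
    decimal_unit = 1000 ** rank
    binary_unit = 1024 ** rank
    return (binary_unit - decimal_unit) / decimal_unit * 100


for rank, (si_name, bin_name) in enumerate(PREFIXES, start=1):
    print(f"{si_name}byte = 10^{3 * rank:<2} bytes, "
          f"{bin_name}byte = 2^{10 * rank} bytes, gap = {prefix_gap(rank):.1f}%")
```

The gap widens with each prefix, which is roughly why the mismatch went mostly unnoticed at kilobyte scale and became visible at gigabyte scale.
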
  3. Differences between drive sizes/companies by stfvon007 · · Score: 2, Informative

    I've noticed that some companies tend to go a little over the hard drive's specified size, most notably Maxtor. My 160GB and 200GB hard drives are actually 163.9GB and 203.9GB. On the other hand, I've found that Western Digital seems to have drives slightly smaller than their advertised capacity (59.8GB for a 60GB drive and 79.97GB for an 80GB drive).

    --
    All misspellings and grammatical errors in the above post are intentional and part of my artistic expression.
  4. Re:Does it matter anymore? by dtfinch · · Score: 4, Informative

    I'm a whiny nerd, and it doesn't matter much to me whether hard disk manufacturers define sizes in multiples of base 10 or base 1010.

    But I want to know how each drive handles error correction. A sector isn't REALLY 100000000 bytes when stored on disk, but has extra information to help it detect and correct most small errors. Some manufacturer could skimp on the error correction to increase storage capacity or reduce cost, but the drive would likely crap out sooner than others on the market.

  5. Re:Base 2 by den_erpel · · Score: 3, Informative

    hear hear!

    a CDR 650/700 MB
    a DVD[+-]R: 4.7 salesman GB
    = 4.7*1000*1000*1000/1024 = 4589843 KB (= 4.37 GB)

    AFAIK base-10 is just plain cheating.

    --
    Genius doesn't work on an assembly line basis. You can't simply say, "Today I will be brilliant."
  6. Naming reference by dcollins · · Score: 2, Informative
    He failed to name how the capacity should be described, though.

    Well, he does say this:
    ...because 1024 (a true kilobyte) is definitely not equal to 1000.


    And this:
    The author has recently heard about a naming convention that will attempt to clarify these terms, including confusion on kilobytes, etc.


    But personally I strongly reject this "kibibytes" attempt at CS revisionist history. Stick with what CS people have been using as measurements for decades, I say, and don't submit to what the drive manufacturers want to use for inflated advertising.
    --
    We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
  7. Re:Does it matter anymore? by fo0bar · · Score: 2, Informative
    Seen in isolation it doesn't really matter. But the point remains that the HD sellers are using the wrong count and the question that comes to the person who knows is "why?". The answer is simple - to mislead, by making the customer feel they are getting more than they actually are. In a free market it is important that any attempts to mislead the consumer be addressed, for it is a greedy system.

    The hard drive manufacturers are not trying to mislead anybody. They are using the correct notation for the capacity of the drive. 1GB is 1,000,000,000 bytes; 1GiB is 1,073,741,824. And since an 80GB disk is 80,000,000,000 bytes, they are in the right. As it stands, pretty much everybody else is in the wrong, and it just happens to make hard drive manufacturers look a bit better.

  8. Re:What's next? by Max+Romantschuk · · Score: 2, Informative

    What's next? Monitor sizes? I love my 19" (18" viewable) monitor!

    Monitors are measured by the diagonal of the actual physical glass tube inside the monitor. It's a clear and unambiguous way to measure things, not perfect, but it's no trickery.

    But when Joe Windows formats his new 120 gig HD and finds it only holds 112 GB he's going to feel cheated on those "missing" 8 GB.

    --
    .: Max Romantschuk :: http://max.romantschuk.fi/
  9. When all else fails, refer to Wikipedia... by Jerk+City+Troll · · Score: 2, Informative

    I think Wikipedia's entry on gigabyte should make this crap appear really stupid. Here's a clip from the entry:

    Because of irregularities in definition and usage of the kilobyte, the exact number could be any of the following:

    1. 1073741824 bytes - 1024 times 1024 times 1024, or 2^30. This is the definition used in computer science and computer programming.
    2. 1000000000 bytes or 10^9 - this is the definition used by telecommunications engineers and storage manufacturers.

    Since most people who buy computers are not in "computer science or computer programming", I would argue the value used by storage manufacturers is perfectly applicable when selling computers in the mainstream.

    Sadly, it appears lawsuits rather than education on a minor issue will be used to settle this matter, which will lead to a precedent that will be yet another aggravation for the computer industry. Dammit, if you're a lay person, it's safe to say that 1,000 Megabytes is roughly 1 Gigabyte.

  10. Re:Does it matter anymore? by tankdilla · · Score: 2, Informative

    But if I'm going to buy a 120 GB hard drive, I expect there to be 120 * 2^30 = 128,849,018,880 bytes on the drive. The hard drive I got had 113 GB (113*2^30 = 121,332,826,112 bytes). That is a difference of 7,516,192,768 bytes (7 GB). If the box says 120 GB, there should be 120 GB on the hard drive. If there's actually 113 GB on the hard drive, that's the number that should be on the box. Allowing those two hard drives to be on the same shelf in the stores is misleading to consumers and it should be regulated. After using computers with a 6 GB HD, where space is gone before you know it, one tends to notice the difference between 113 GB and 120 GB.

    --

    -Look lively. LOOK LIVELY!!! --Mr. Shmallow

  11. Re:Ditch binary units by Monkelectric · · Score: 4, Informative
    Huh? no reason to use binary units? What are you smoking and can I have some? :)

    We use binary units for engineering reasons ... Back in the way back time there was no such thing as a disk drive, and there was only RAM. RAM had/has to be made in a power of two because it has to completely fill its address space so the NEXT RAM chip begins where the other ends. Otherwise you'd have holes in your address space.

    --

    Religion is a gateway psychosis. -- Dave Foley
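
The address-space argument in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the chip size of 1024 bytes and the name `decode` are made up for the example, not taken from any real hardware.

```python
# Sketch: why power-of-two chip sizes tile an address space with no holes.
# CHIP_BITS and the simple chip-select scheme are assumptions for illustration.
CHIP_BITS = 10                 # each chip holds 2**10 = 1024 bytes
CHIP_SIZE = 1 << CHIP_BITS


def decode(address: int) -> tuple[int, int]:
    """Split an address into (chip index, offset within that chip)."""
    chip = address >> CHIP_BITS          # high bits select the chip
    offset = address & (CHIP_SIZE - 1)   # low bits address a byte on the chip
    return chip, offset


# The last byte of chip 0 is followed directly by the first byte of chip 1.
print(decode(1023))   # (0, 1023)
print(decode(1024))   # (1, 0)
```

With this kind of decoding, a hypothetical 1000-byte chip would leave offsets 1000 through 1023 of every chip unbacked, which is the "holes" problem the comment describes.
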

  12. Re:Does it matter anymore? by kryonD · · Score: 3, Informative

    Please take note that the amount of free space on an empty, but FORMATTED hard drive will always be a noticeable chunk less than full capacity, as the OS requires storage space overhead for the file system.

    I just finished explaining this to someone who was whining about their 128MB USB keychain drive only having 123MB of space.

    Your directory structure has to be kept somewhere.

    --
    I've dirtied my hands writing poetry, for the sake of seduction; that is, for the sake of a useful cause. --Dostoevsky
  13. Re:Does it matter anymore? by ColaMan · · Score: 2, Informative

    On the contrary, with every drive manufacturer pushing their physical media to the limit, sector errors happen a *lot*.

    Disks die suddenly because they *suddenly* run out of redundant sectors to remap your data to. This remapping happens transparently to the OS, inside the drive electronics, and can usually only be picked up by deteriorating S.M.A.R.T. characteristics. There are only so many redundant sectors, and once they're all in use your drive goes downhill with every bump and jolt.

    --

    You are in a twisty maze of processor lines, all alike.
    There is a lot of hype here.
  14. Re:Does it matter anymore? by vrt3 · · Score: 3, Informative
    if I'm going to buy a 120 GB hard drive, I expect there to be 120 * 2^30 = 128,849,018,880 bytes on the drive.

    if I'm going to buy a 120 GB hard drive, I expect there to be 120 * 10^9 = 120,000,000,000 bytes on the drive.

    The hard drive I got had 113 GB (113*2^30 = 121,332,826,112 bytes).

    The hard drive I got had 113 GiB (113 * 2^30 = 121,332,826,112 bytes).

    That is a difference of 7,516,192,768 bytes (7 GB).

    That is a difference of -1,332,826,112 bytes... actually there were more bytes than you should have expected.

    --
    This sig under construction. Please check back later.
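
The corrected arithmetic in the comment above is easy to check. A minimal sketch, assuming the box means decimal gigabytes and the OS figure means binary gibibytes; the variable names are illustrative.

```python
# Redo the arithmetic from the exchange above with both readings of "GB".
GB = 10 ** 9    # decimal gigabyte, as printed on the box
GIB = 2 ** 30   # binary gibibyte, as reported by the OS

advertised = 120 * GB   # what the label promises
reported = 113 * GIB    # what the OS displayed, per the quoted post

print(f"advertised: {advertised:,} bytes")
print(f"reported:   {reported:,} bytes")
print(f"shortfall:  {advertised - reported:,} bytes")
```

A negative shortfall means the drive holds more bytes than the label promised, which is the point of the reply.
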
  15. Article inaccurate and uninformed by rpwoodbu · · Score: 3, Informative
    The basic point of the article is accurate: that HDD manufacturers use "standard" metric prefixes and OSes use "computer-ese" "metric-esque" prefixes, thus the confusion. However, the article notably lacks in these areas (and perhaps less notably in others):
    • It uses terms like "binary math" versus "decimal math". Last I checked, they were both equally viable ways of doing math, and as any viable method of doing math should be, they both always get the same answer! See section 3.5 if you want to get really mad! It isn't that the math is different that is causing a problem, it is that the algorithm is different. It just so happens that the algorithm was inspired by a number which is convenient when dealing with binary because it is an even power of 2.
    • There is no discussion of why HDD makers use normal math while OS makers use "computer-ese". It isn't wholly discountable that HDD makers are interested in making their drives look as big as possible against the competition, and if one manufacturer says a Gigabyte is 10^9 bytes then they all have to. And he paints the 1024-byte KiloByte basically as a stupid idea, which it isn't (albeit confusing).
    • The explanation (such as it is) for how much data is lost to OS overhead is inaccurate at best. He got his info for the Mac from the Drive Utility (akin to Disk Management or fdisk in MS-land), but probably got his WinXP info from Explorer. Fdisk will not report any filesystem size considerations, just the partition sizes, so neither should the Drive Utility. I'm betting the 1026 "lost" bytes are the partition table. This makes it look like the Mac loses 1026 bytes, while Windows tosses about 11 MB out the door. While I'm not trying to advocate for Windows, that simply isn't fair. He goes on to say that he has "no explanation for these variations", which brings me to my next point.
    • He can't explain the size variations between OSes, yet he makes this statement:
      We note that operating systems take a portion of drive capacity for use as file tables. A typical drive utilizes 70MegaBytes for this function, which is not significant on a drive with a capacity of 120GB.
      So now he's trying to explain it, and not doing a very good job. First of all, the FS overhead will vary roughly proportionally to the size of the partition, so giving out a number like 70 MB and saying that a "typical drive" loses this much is careless at best. Secondly, I'm not convinced that he doesn't actually have 70 MB of data on that drive. There's no accounting for the 11 MB that aren't showing up as "used", which sounds like FS metadata to me. I don't have a drive handy to format, so I don't know if Windows shows "0 used" on a clean NTFS drive or not (oh, is he using NTFS or FAT32... the world may never know). The bottom line: he should have used the Disk Management tool to compare apples to apples (no pun intended).
    • And the bottom bottom line is that he's in the storage business, and shouldn't be so ignorant. He's got a degree in mathematics for crying out loud!
    I appreciate that this needs to be explained, and I know all too well that the average computer user (read average American) can hardly count, much less do it in binary, so a simple explanation is good. But I never think things should be simplified to the point of gross inaccuracy. This is just further compounded with the obvious lack of a clue. Someone write a better (and perhaps shorter) account for this, please!
  16. Floppy by !the!bad!fish! · · Score: 2, Informative
    As long as I can fit 1.7M on a 1.3M floppy, why should I care?

    --
    Kids today are tyrants. They contradict their parent, gobble their food, and tyrannize their teachers. - Socrates 400 BC
  17. Re:Does it matter anymore? by jimbolaya · · Score: 2, Informative

    Read the article, and you'll see that the directory structure takes up a negligible amount of space. The primary difference is the base-10 vs. base-2 issue.

    --

    There ain't no rules here; we're trying to accomplish something.

  18. Re:Does it matter anymore? by HTH+NE1 · · Score: 2, Informative

    And then compare to what is commonly called the 137 GB limit, when if you were to call it the 128 GiB limit you'd have a better idea of why that is the limit. Calling it the 137 GB limit is just going along with the drive capacity numbers.

    It wasn't always like this though. When storage capacities were under 1 MB, the units were in KiB (though the notation didn't exist). Then as they started to pass the 1000 KiB value, they started having 1.44 "MB" disks which are really 1440 KiB disks, where the "MB" is a mixture of metric and binary measures (1,024,000 == 1000 * 1024).

    Then they started with the all-metric units for hard drives. Everyone except, apparently, Maxtor, which still has one factor of 1024 in their units, which is why a 45 "GB" drive from them was actually 46.1 GB (42.9 GiB).

    Though not all. Some Maxtor models do use 1 GB == 1,000,000,000 bytes, but there are some where they apparently have 1 "GB" == 1,024,000,000 bytes. (Reminder: 1 GiB == 1,073,741,824 bytes.) This is the source of the drives that are larger than their metric capacity states. They're larger by 24 MB (metric) for every 1 "GB". That's what makes their 80 "GB" drives actually 81.92 GB.

    How is this creeping into their figures? Because instead of a byte count, they're calculating capacity from a block count, and blocks on disks are still governed by binary units (usu. 512 or 1024 bytes per block, though I've seen filesystems with a block size of 1 MiB). They inherit the binary factor from that.

    --
    Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
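
The figures in the comment above can be reproduced from block counts. This is a sketch, not a statement of how any manufacturer computes its labels. One assumption is not in the comment itself: the "137 GB limit" is taken here to be 2^28 addressable sectors of 512 bytes, as with 28-bit LBA. The helper name `mixed_gb` is hypothetical.

```python
# Sketch of the figures in the comment above, computed from block counts.
# Assumption (not stated in the comment): the "137 GB limit" is taken to be
# 2**28 addressable sectors of 512 bytes each, as with 28-bit LBA.
GB = 10 ** 9
GIB = 2 ** 30
SECTOR = 512

limit = (2 ** 28) * SECTOR
print(f"28-bit limit: {limit:,} bytes = {limit / GB:.1f} GB = {limit / GIB:.0f} GiB")

floppy = 1440 * 1024          # the "1.44 MB" floppy: 1440 KiB
print(f"floppy: {floppy:,} bytes = {floppy / 10**6:.2f} MB = {floppy / 2**20:.2f} MiB")


def mixed_gb(n: int) -> int:
    """Bytes in n mixed-unit "GB", where 1 "GB" = 1000 * 1024 * 1000 bytes."""
    return n * 1000 * 1024 * 1000


for label in (45, 80):
    size = mixed_gb(label)
    print(f'{label} "GB" drive: {size / GB:.2f} GB = {size / GIB:.1f} GiB')
```

Note that the floppy comes out as neither 1.44 MB nor 1.44 MiB, which is the mixed-unit effect the comment points out.
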