Slashdot Mirror


Hard Drive Capacity Confusion, Lucidly Explained

mrklin writes "James Wiebe of wiebetech.com has written a clear example of how hard drive capacity is calculated (PDF file) by hard drive manufacturers (base 10) and OS (base 2). He failed to name how the capacity should be described, though."

34 of 482 comments

  1. Does it matter anymore? by Dancin_Santa · · Score: 5, Insightful

    With storage prices falling through the floor, does it matter to anyone except whiny nerds whether the byte counts are done in base 10 or base 2?

    In the words of William Shatner, "Get a life!"

    1. Re:Does it matter anymore? by gooru · · Score: 2, Insightful

      With storage prices falling through the floor, does it matter to anyone except whiny nerds whether the byte counts are done in base 10 or base 2?

      I don't think that's the point. The point is that what is advertised is NOT what you get. The problem doesn't just apply to hard drive manufacturers but to everyone under the sun. It's a question of being open and truthful about what you are really selling.

    2. Re:Does it matter anymore? by geekmetal · · Score: 2, Insightful
      With storage prices falling through the floor, does it matter to anyone except whiny nerds whether the byte counts are done in base 10 or base 2?

      Seen in isolation it doesn't really matter. But the point remains that the HD sellers are using the wrong count and the question that comes to the person who knows is "why?". The answer is simple - to mislead, by making the customer feel they are getting more than they actually are. In a free market it is important that any attempts to mislead the consumer be addressed, for it is a greedy system.

      --
      There are two kinds of egotists: 1) Those who admit it 2) The rest of us
    3. Re:Does it matter anymore? by cujo_1111 · · Score: 1, Insightful

      Technically I believe they are correct.

      Isn't the correct multiplier for the prefix giga, 10^9?
      Isn't the correct multiplier for the prefix mega, 10^6?
      Isn't the correct multiplier for the prefix kilo, 10^3?

      Come up with your own prefixes if you want to measure them at base 2.

      --
      If I point out that you are incorrect, making me a foe does not make you any more correct.
    4. Re:Does it matter anymore? by MSZ · · Score: 2, Insightful

      The hard drive manufacturers are not trying to mislead anybody.

      Oh, they are. Just in a less obvious way.

      They are using the correct notation for the capacity of the drive.

      I will then suppose that when you buy 512MB memory module, you expect it to have exactly 512000000 bytes of capacity, right? It's the proper way, right?

      The traditional and accepted way is to go with powers of 2. This is incompatible with ISO/SI/whatever but it's the way we all (except some deviants and marketeers) love.

      Now I would believe the HD makers are doing this for the pure love of standards if only they would clearly describe the product as having size calculated with nontraditional units. In reality it seems that they want to sell product with decimal G capacities but have customers believe they are buying disk with conventionally calculated capacity and hoping that no one would notice.

      --
      The moon is not fully subjugated. I demand a second assault wave preceded by a massive nuclear bombardment.
    5. Re:Does it matter anymore? by olderchurch · · Score: 2, Insightful
      I will then suppose that when you buy 512MB memory module, you expect it to have exactly 512000000 bytes of capacity, right? It's the proper way, right?

      Actually, yes. As a scientist I have always wondered how the computer nerds (which I am myself now) can get away with using Kilo and Mega inappropriately. I'm very glad the IEC is finally trying to come up with a solution. It will get a lot clearer for everybody.

      --
      Disclaimer: This opinion was created without the use of any facts
    6. Re:Does it matter anymore? by Fweeky · · Score: 3, Insightful

      Um; if your drive's reporting a lot of reallocated sectors you should RMA it -- even with top-end 80G platters, sector remapping happens seldom.

      There are plenty of failure modes which will result in lots of remapped sectors, but that's a side-effect of the drive having difficulty reading/writing in general due to component failure, which to be honest is probably less common now than it has been.. uh.. ever (cooked and/or shocked to death drives excepted).

    7. Re:Does it matter anymore? by Theatetus · · Score: 4, Insightful
      But the point remains that the HD sellers are using the wrong count and the question that comes to the person who knows is "why?". The answer is simple - to mislead

      Maybe I'm being a naive optimist here, but there seems to be a much more sensible reason:

      The way memory is addressed makes it convenient to use the base-2 units.

      Storage is not addressed in a way that makes it particularly convenient to use base-2 units.

      Got that? That's why we use them on memory. Storage is not addressed that way, so like everything else we tend to use base 10 to describe it.

      --
      All's true that is mistrusted
    8. Re:Does it matter anymore? by ergo98 · · Score: 5, Insightful

      "In reality it seems that they want to sell product with decimal G capacities but have customers believe they are buying disk with conventionally calculated capacity and hoping that no one would notice."

      This is all so absolutely ridiculous. Firstly, about 99% of people on the streets, including most computer users, aren't mentally calculating the power of 2 capacities when you say that a hard-drive has 40GB, or a memory module has 512MB -- Instead they mentally have an awareness that 40GB is "big, but 80GB is better", and "512MB is good". I highly doubt they're going to get their shiny new drive, and DRATS! - they have 42949672960 bytes of virus-filled emails to fit in there, but instead they only got 40000000000.

      Secondly, hard drive manufacturers, as a general rule, have used the power of 10 rule since before I first became interested in computers about 18 years ago - this is the standard, and if you haven't read the byline "GB refers to 1,000,000,000 bytes" then you just haven't been looking.

      This whole campaign is just contrived and attention seeking nonsense. I suspect that someone just finished their "Computers 101" course, and they think they've discovered an amazing fraud being perpetrated upon the public by those dastardly harddrive manufacturers.

    9. Re:Does it matter anymore? by fo0bar · · Score: 2, Insightful
      I will then suppose that when you buy 512MB memory module, you expect it to have exactly 512000000 bytes of capacity, right? It's the proper way, right?

      I would expect the module to contain 536870912 bytes, but that's only because I know that memory manufacturers are using the wrong unit of measurement. If they advertised the module as 512MiB, then I would clearly know the capacity. (But probably nobody else would because most of the industry has been perpetuating this incorrect unit of measurement. Who's misleading people again?)
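      For anyone who wants to check the two readings of "512MB" in this sub-thread, here is a minimal Python sketch. The constant names are made up for illustration and are not from any standard library.

```python
# Two readings of "512 MB". Constant names are illustrative only.
MB_DECIMAL = 10**6   # SI megabyte: 1,000,000 bytes
MIB_BINARY = 2**20   # IEC mebibyte: 1,048,576 bytes

decimal_bytes = 512 * MB_DECIMAL
binary_bytes = 512 * MIB_BINARY

print(decimal_bytes)                 # 512000000
print(binary_bytes)                  # 536870912
print(binary_bytes - decimal_bytes)  # 24870912
```

      The gap between the two readings is a little under 25 million bytes, roughly 4.9% of the decimal figure.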

      Look at it this way. Say there are 2 local hardware stores. If somebody walks into Store A and buys a 1 yard board, he gets a 1 yard (3 foot) board. Then he walks into Store B and sees a 1 yard board advertised, but it's actually 1 meter (~3.28 feet). But nobody complains because they're "close enough".

      Over time the two stores become national home improvement retailers. People are also buying more lumber in bulk. But because of Store B's false advertising early on (even if it is advantageous to the customer), people are now convinced that 1 yard is ~3.28 feet. So when they go into Store A and ask for 10,000 yards of lumber, they get angry that they're "only" getting 30,000 feet of lumber, not 32,808 like they expect.

      Store A (hard drive manufacturers) are the ones in the right, but because Store B (pretty much everybody else) made the populace accept the "close enough" argument, Store A is now looking bad.

      Now I would believe the HD makers are doing this for the pure love of standards if only they would clearly describe the product as having size calculated with nontraditional units.

      First, nearly every hard drive I've bought in the last 8 years or so has had that warning. Second, I'm going to love the day I walk into Home Depot and see the disclaimer, "1 foot is represented as 12 inches here. Your method of representing feet may vary."

    10. Re:Does it matter anymore? by edwdig · · Score: 2, Insightful

      Storage is not addressed in a way that makes it particularly convenient to use base-2 units.

      Yes it is. The smallest addressable unit of a hard disk is a sector - which is 512 bytes.

    11. Re:Does it matter anymore? by PainKilleR-CE · · Score: 2, Insightful

      So what's a byte again?

      The operating system uses GB just as it has been used in the computer industry since its beginning. The NIST can't change that, regardless of how much they'd like to, and the prefix Gibi (or GiB, though cool in some ways, isn't any better) just isn't going to happen in normal speech any time soon.

      Actually, though, as far as I have discerned, most geeks know quite well what's going on with the hard drive sizes. It's the average user that comes home with a new drive and has someone install it for them that asks wtf is going on when their 120,000,000,000 byte hard drive that was advertised as 120GB is actually reported as about 112GB.

      --
      -PainKilleR-[CE]
    12. Re:Does it matter anymore? by miyoo · · Score: 3, Insightful
      It's not really that hard to figure out. AFAIK, ALL hard disk manufacturers report their drive sizes in terms of 10^9 bytes. Because of some grand conspiracy to deceive? No. Simply because statistically speaking a person who walks down the aisle of his local electronics store is more likely to buy the drive with the big number "120" on it than the one that has a "112". Anybody who used the 'binary' system would be giving up a lot of sales because people would simply choose the one with the bigger number.

      AMD started calling their processors names like "XP2000" rather than advertising the clock speed. AMD was getting killed because most people measure the value of their computer by how many GHz it is (AMD being behind Intel), not by how well it actually runs their applications (AMD being comparable). Misleading? Maybe, but I think they pretty much had to do this to stay competitive.

      In other words, they're not lying about hard disk sizes, they're marketing. They don't actually want to deliberately deceive people because that would make their customers angry and give them a bad name. But they do want to influence their customers' perception of the value they are getting from a particular product. Why do you think you're paying $199.99 for that hard disk instead of $200.00?

  2. Base 2 by The ZoNiE · · Score: 1, Insightful

    Our computers are binary, so the hard drives that we put in them should be measured using the binary (Base-2) representation.

    Build me a microprocessor containing transistors that switch between 10 different voltage levels before you continue with your Base-10 tomfoolery.

    1. Re:Base 2 by NanoGator · · Score: 2, Insightful

      "Our computers are binary, so the hard drives that we put in them should be measured using the binary (Base-2) representation."

      Eh, no. Binary is interesting to computers, not to humans. Humans care about numbers multipliable by 10.

      A human can understand the concept of a byte, a single letter. However, a human, unless he's really into computers, doesn't care much about how many bits are in a byte. It may be 8-bits per byte, but what about error correction etc?

      A human can easily multiply 1000 by 1000 and know what the answer is, but ask him to do 1024 by 1024 and he's going to scratch his head. But if he knows that he's got 1,000 useful bytes/characters, then he doesn't need to know about how many bits are in a byte, and the powers of 2, etc.

      So no, I don't agree with you. Human readability is at issue here. If somebody really wants to know how many bits are on an HD, they're wanting to know more than most people who'll plunk down money for a drive.

      (note: I realize you didn't necessarily mean bits, but I did kind of need to make that point so the rest of my statement made more sense. Hope I didn't sound like I was misrepresenting what you said.)

      --
      "Derp de derp."
    2. Re:Base 2 by Ho-Lee-Chow · · Score: 2, Insightful

      No, dummy. He is talking notation, not numbers. We have to change the words we use to describe numbers in computer science, not the numbers themselves.

      The "kilo" in kilobytes is an abuse of SI metric notation. "kilo", "mega" and "giga" mean 1000, 1 000 000, and 1 000 000 000 to physicists, engineers, chemists, and the general scientific community. How arrogant or short-sighted were computer scientists to think that they could simply re-define these prefixes to mean 1024, 1024 * 1024, and 1024 * 1024 * 1024?

      The real solution is to stop abusing widely accepted terminology and switch to the suggested "kibibytes", "mebibytes" and "gibibytes". Yes, it sounds stupid, but that's only because it's unfamiliar. It's not as stupid as using one set of prefixes for two different purposes. In fact, it's that very usage that led to this stupid conflict between "hard drive manufacturer gigabytes" and "operating system gigabytes".

      From a consumer standpoint it makes sense to make 1K = 1000 bytes and so on, but from a computer viewpoint, it's best to leave it as is. All in all, people should research what a kilobyte is (in terms of how many bytes it is) before they become experts in storage capacities for computers.

      Geez. Repeat after me: Computers are intended to be used by PEOPLE, not the other way around. Nobody, I repeat, nobody, outside of the CS community uses kilo, mega, or giga to mean anything but 10^3 (10 to the power of 3), 10^6 or 10^9. Why should Joe Sixpack on the street, or even a Physics professor with no CS knowledge, have to "research" what "gigabyte" means in the context of computer science? It should mean 1 000 000 000 bytes, plain and simple. If someone wants to express the number 1024^3, they should make up a new word such as "gibi-" instead of using existing terminology.

      Of course, this will NEVER happen, because in any given community, the majority of people would rather stick with widely accepted and entrenched mistakes than bother to change their behaviour or ideas. Just witness the ridiculous C notation for assignment:

      a = a + 1

      In many other programming languages and mathematics "=" means "equality" NOT assignment; saner languages use ":=" for assignment. Yet, because of C's popularity, we will be stuck with this abuse of notation forever, especially since any new languages (such as Java) will try to cater to C programmers.

      If you can't see why this is a mistake, consider this. In a language with "=" for equality and ":=" for assignment, you only have to learn one new thing: that ":=" means assignment. In C, you have to learn two things: "=" means assignment, NOT equality, and "==" means equality. How stupid is that? Everyone already knows that "=" means equality; why change that? Everyone already knows that "kilo" means 1000; why change that?

      Now, thanks to the "grandfathers of CS" or whoever, I have to remember my standard SI prefixes (okay, that's no problem), I need to know that in most CS applications kilo, mega, giga, etc. mean 1024, 1024^2, 1024^3, etc. and I need to remember that in CERTAIN CS applications kilo, mega, giga, etc. have their standard meanings.

      Oh sorry, but what was I thinking? It's the hard drive manufacturers who are stupid.... (sarcasm). Did you ever think that one of the reasons they use the standard definition of "giga-" to calculate drive sizes is that most NORMAL PEOPLE (i.e. the majority of computer users) don't know that giga means 1024^3? More to the point, how many ordinary people care to calculate (or memorize) the exact value of a gigabyte? (Of course, I'm sure another reason is that they get to "inflate" their hard drive sizes).

      To summarize my overly long post, one of the main reasons computer consumers are constantly being ripped off, misled and confused is that CS geeks like us keep forgetting or never cared that computers are nothing more than tools for people. Maybe you need to take a Human-Computer Interaction course or something, if you can't understand that.

  3. Ditch binary units by achurch · · Score: 4, Insightful

    As far as ordinary users (i.e. anyone who doesn't have to deal with TLBs, memory pages, disk sectors and the like) are concerned, there's really no reason left to use binary units; 2^9 bytes per sector, 8 sectors per filesystem block, etc. are all low-level conveniences that the user shouldn't have to even notice. Though I personally am too used to the binary units to switch easily, the vast majority of users probably wouldn't even notice the difference, aside from their computers finally reporting the right size for their hard disks. Granted, overcoming the huge momentum for binary units will be difficult, but one could always consider it practice for getting the USA to accept metric.
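    The low-level figures mentioned above can be checked with a short Python sketch. The 120 GB capacity is an assumed example, not a number from this comment, and the variable names are illustrative.

```python
# Sector and block sizes as described in the comment above.
BYTES_PER_SECTOR = 2**9        # 512 bytes
SECTORS_PER_BLOCK = 8
bytes_per_block = BYTES_PER_SECTOR * SECTORS_PER_BLOCK

# Assumed example: a drive advertised as 120 GB in decimal units.
advertised_bytes = 120 * 10**9
sectors, leftover = divmod(advertised_bytes, BYTES_PER_SECTOR)

print(bytes_per_block)    # 4096
print(sectors, leftover)  # 234375000 0
```

    In this assumed case the decimal capacity still divides into whole 512-byte sectors, so the binary sector size does not by itself force binary units on the label.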

    1. Re:Ditch binary units by Jugalator · · Score: 2, Insightful

      Granted, overcoming the huge momentum for binary units will be difficult, but one could always consider it practice for getting the USA to accept metric.

      So you're saying that the USA should use 1 KB = 1000 bytes, while the rest of the world doesn't need to? (sounds weird to me)

      Or are you saying that a group of people should try to enforce a new global standard where 1 KB = 1000 bytes? (sounds impossible to me)

      --
      Beware: In C++, your friends can see your privates!
    2. Re:Ditch binary units by achurch · · Score: 2, Insightful

      I'm saying that the world should adopt 1kB = 1000 bytes, and that getting the world to do so would be nearly as difficult as getting the USA to switch to metric.

  4. WTF? by MarvinIsANerd · · Score: 5, Insightful

    This is not a matter of base-10 vs base-2... a base-10 number is written as "2875" for example. A base-2 number is written as "10100110". A base-16 number is written as "8A3F0"...

    This is a matter of UNITS used - like inches vs. feet, or in this case GiB vs GB.
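    A small Python sketch of the distinction, using only built-in functions: the first part writes one number in three bases, the second shows that the two unit sizes are different numbers no matter which base they are written in.

```python
# One value, three notations: the base only changes how it is written.
n = 2875
print(n)       # 2875
print(bin(n))  # 0b101100111011
print(hex(n))  # 0xb3b

# Two units, two different values, whatever base they are printed in.
GB = 10**9     # gigabyte (decimal prefix)
GIB = 2**30    # gibibyte (binary prefix)
print(GB, hex(GB))    # 1000000000 0x3b9aca00
print(GIB, hex(GIB))  # 1073741824 0x40000000
```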

    Geez, get the terminology right...

    1. Re:WTF? by Lars T. · · Score: 2, Insightful

      Yes, and the unit is Byte in both cases. Giga is shorthand for a factor of 1,000,000,000 like kilo is for a factor of 1,000. The problem is that some decades ago some geek thought that 1024 is close enough to 1000, so it would be k3wl to use "Kilo" (with a capital K) for a factor of 1024 (a base two factor). Hey, Kilo should be enough for everybody, nobody will ever run into having to distinguish between Mega (factor of 1,000,000) and, errm, Mega (or mega?) (factor of 1024*1024 - or 1000*1024?).

      --

      Lars T.

      To the guy who modded me down from perfect to terrible Karma - Apple haters still suck

  5. I've said this before by Sunlighter · · Score: 4, Insightful

    About two years ago there was a debate about this. Can't remember the details of that debate. Maybe it was when those "mebibytes" were introduced. I still say now what I said then.

    I think there should be "short megabytes" and "long megabytes", and the same for gigabytes. Like this:

    • One short ton is 2,000 pounds and one long ton is 2,240 pounds.
    • One short kilobyte is 1,000 bytes and one long kilobyte is 1,024 bytes.
    • One short megabyte is 1,000,000 bytes and one long megabyte is 1,048,576 bytes.
    • One short gigabyte is 1,000,000,000 bytes and one long gigabyte is 1,073,741,824 bytes.
    • One short terabyte is 1,000,000,000,000 bytes and one long terabyte is 1,099,511,627,776 bytes.
    • And so forth...
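    The figures in the list above can be regenerated with a few lines of Python. The function name `short_and_long` is made up for this sketch; the percentage column is an addition showing how the gap grows with each prefix.

```python
# "Short" (powers of 1000) versus "long" (powers of 1024) sizes.
PREFIXES = ["kilo", "mega", "giga", "tera"]

def short_and_long(power):
    """Return (short, long) byte counts for the given prefix power."""
    return 1000**power, 1024**power

for power, name in enumerate(PREFIXES, start=1):
    short, long_ = short_and_long(power)
    gap_percent = (long_ - short) / short * 100
    print(f"{name}byte: short={short:,} long={long_:,} gap={gap_percent:.1f}%")
```

    The gap is about 2.4% at kilo and reaches roughly 10% at tera, which is why the mismatch gets more noticeable as drives get bigger.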

    Then all we need is to get hard drive manufacturers and OS vendors to state whether they are using short or long tons, er, gigabytes.

    As to abbreviations, take Donald Knuth's suggestion. Use the capital letter twice to suggest binaryness. 1 MMB = one long megabyte; 1 GGB = one long gigabyte. I like this much better than the now-standardized MiB men-in-black abbreviation for long megabytes (which are still not called long megabytes in the standard, they are called mebibytes, which sounds silly and no one uses it).

    Who's with me?

    --
    Sunlit World Scheme. Weird and different.
    1. Re:I've said this before by kzadot · · Score: 2, Insightful

      Look, all you have done is rename something perfectly good to something longer and stupider sounding.

      Mebibytes does not sound silly and people do use it. Long megabytes? Yuck...

      Computer scientists never intended for their misuse of kilo, mega etc, to become a standard, it was always just a shorthand slang.

      Hard drive manufacturers have got it right this time. Now that we have the new kibi, mebi etc, units, there is no excuse to falsely claim that kilo can be anything other than 1000x.

      kilo = 1000x
      kibi = 1024x

      Problem solved, end of story.

    2. Re:I've said this before by Anonymous Coward · · Score: 1, Insightful
      Mebibytes does not sound silly and people do use it. Long megabytes? Yuck...

      Uh, sure. Whatever you say buddy. The fact is, no one outside of the uberzealot-correctness crowd uses kibi-, mebi-, or gibibytes. While I don't think the short/long megabyte idea is great, it's certainly better than the current 'standard' which IS fairly difficult to pronounce. Quite frankly, until they come up with a name that doesn't completely wreck the flow of trying to speak the unit name, no one's really going to adopt it - which is probably rather unfortunate. Try having an out loud conversation with someone using the 'correct' units by name. You'll find they're totally jarring and don't flow easily. This is a bad thing when the whole point is to be able to quickly relate a common idea.
  6. Re:But seriously by dtfinch · · Score: 4, Insightful

    Those are too hard to pronounce. Why not just distinguish them by prefixing the metric ones with the word "metric", as we do with tons and metric tons?

    kilobyte = 1024 bytes
    metric kilobyte = 1000 bytes

  7. Re:This needs an article? by CrackHappy · · Score: 2, Insightful

    Oh man, that just brought back memories. A bunch of geeks sitting at Round Table pizza for a BBS party all trying to get the highest in base 2 decimals. 512, 1024, 2048, 4096, 8192, etc. all shouting to be heard.

    No wonder you'd never see a woman at those parties, must have scared them off. of course, nowadays, you see women geeks much more often, thank God.

    --
    1f u c4n r34d th1s u r34lly n33d t0 g37 l41d Capitalization really works: i helped my uncle jack off a horse
  8. article sidesteps the entire issue by drfireman · · Score: 4, Insightful

    The only relevant issue is the meaning of words like kilobyte, megabyte, and gigabyte. Wiebe describes how you can arrive at two different answers for drive capacity depending on how you define the word "gigabyte," but does so completely uncritically. For example, he describes the drive manufacturer logic and writes that "the drive's claim of 123.5 GB is verified with this simple mathematical formula." But the issue is what the word "gigabyte" means, and the formula presented sheds no light on the word's conventional usage or etymology. I personally was raised to use these terms to correspond to numbers that are powers of two. Wiebe doesn't give me any point of reference to shed light on whether it's reasonable to use the meanings drive manufacturers do. (Of course I already know the answer, but that's beside the point.)

    Wiebe uses some other odd logic, exemplified in point 3.7. He writes that the consumer was never cheated, because a drive advertised as having a capacity of 123.5GB had just that in "decimal based" capacity. This is a bizarre way to characterize the complaints. Consumers who believe they were cheated aren't claiming they didn't get 123.5GB for any definition of the word gigabyte. They're claiming they didn't get 123.5GB by the conventional definition of the word as commonly used in connection with computers. In my view, they're right, although I don't personally get too upset about it.
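    For reference, the arithmetic behind the two readings of 123.5GB can be sketched in Python. The function name `advertised_gb_to_binary` is hypothetical, and the sketch ignores filesystem overhead entirely.

```python
def advertised_gb_to_binary(gb):
    """Convert a capacity in decimal gigabytes (10^9 bytes)
    to binary gigabytes (2^30 bytes, also written GiB)."""
    return gb * 10**9 / 2**30

binary_size = advertised_gb_to_binary(123.5)
print(round(binary_size, 1))  # 115.0
```

    So the same drive is 123.5 in one unit and about 115 in the other; the byte count is identical and only the definition of "gigabyte" differs.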

  9. Re:But seriously by itsme1234 · · Score: 2, Insightful

    Oh, ye - so you want:

    1 kg = 1024 g
    1 metric kg = 1000 g

    1 km = 1024 m
    1 metric km = 1000 m

    Thanks, but no thanks.

  10. Re:Mistake!! by Jugalator · · Score: 2, Insightful

    Hey Jimmy, assuming you're using FAT32 as your XP filesystem, which uses 73.8 MB of space for every gigabyte, not just 73.8 megs one time, that adds up to roughly 8,856MB of space used for the filesystem. Which on a labeled 123.5 GB drive, leaves you with roughly 115GB of space! Wow! The HD manufacturers were right!

    The OS *does* use a negligible amount of drive space in these days of 100+ GB hard drives. And you're confusing file systems with operating systems. Just because an OS allows you to use a file system that wastes resources doesn't mean the OS itself uses a lot of drive space.

    --
    Beware: In C++, your friends can see your privates!
  11. And yet... by arb · · Score: 3, Insightful

    ...he ignores the fact that HD manufacturers are happy using bytes which are 8 bits, all the while flouting the established convention that MB/GB refers to binary megabytes and binary gigabytes. Why don't they specify the size of their HDs in bits?

  12. Re:Naming reference by Piquan · · Score: 4, Insightful

    But personally I strongly reject this "kibibytes" attempt at CS revisionist history. Stick with what CS people have been using as measurements for decades, I say,

    Why shouldn't CS people stick to what the rest of the sciences have been using for decades, that "kilo" means 1000? This CS thing of making "kilo" stand for 1024 is an attempt at revisionist history.

    There's always another perspective.

  13. Re:But seriously by mindriot · · Score: 2, Insightful

    No, because "kilo" is, in fact, a metric prefix. So a simple kilobyte should have its standard meaning as the SI unit prefix implies. You might however call the other one "non-metric," "binary," or "bastard" :-)... problem is, no one will use such terms. It is understood as an unwritten rule that anything suffixed with "byte" implies that the prefixes "kilo," "mega" etc. refer to 2^10, 2^20 etc. factors. Even more interesting, I suppose the "general public" doesn't even know how much a gigabyte is anyway, you might as well call it a hogshead - the simple rule is that 20 Quux is less than 40 Quux, whatever the unit may be... so, since people will only compare the size of their hard drive (please, spare the obvious jokes here...) to that of other hard drives (and not memory or whatever), it would be good enough to ensure that at least for a given type of storage medium, all manufacturers calculate their unit prefixes the same way.

    I would think that any greater change (like writing MiB or MMB vs. MB) will only create more confusion. I just remember when, a couple of years back, German computer manufacturers were forced to specify things like floppy disk sizes and screen diagonals not only in inches, but in centimeters - ever tried to buy an 8.9 cm floppy? That just doesn't work. The computer business just has its own weird set of units, but in fact no one really cares (except for maybe some nit-pickers going for the lawsuit), and a change of prefixing would, while being scientifically correct, not serve any good purpose. (Before you say "then we Americans can continue using pounds and miles too!" - that's a totally different question in my opinion that bears issues of "compatibility" and "ease of use" etc. etc.)

  14. Re:But seriously by danheskett · · Score: 3, Insightful

    Or we could just beat the hard-disk manufacturers with a stick until they understand that most people expect 1 kilobyte to be 1024 bytes :P

    You are out of touch. If you conducted a scientific survey of 100 random adults who own PCs and asked them:

    "How many bytes are in a kilobyte?" you really think that more than 50 would answer "1024"?

    I'd be surprised if more than 10 did, personally.

    100% of the non-geek population equates kilo with base 10, not base 2.

  15. Computers and Cars by vraxoin · · Score: 3, Insightful

    This issue reminds me of a practice used in another industry. The auto industry commonly reports horsepower and torque for their cars as measured at the engine's crank/flywheel vs at the wheels. While the measurements themselves are an accurate reflection of an engine's general performance alone, you typically do not just buy an engine, you buy a system which is the car. When the engine's performance is measured within the context of the car--meaning at the wheels--then the truth is revealed. That revelation shows, on average, a loss of 10-20% when power is measured at the wheels vs the crank. Which spec do you think a manufacturer is going to release?