Slashdot Mirror


Ubuntu Will Switch To Base-10 File Size Units In Future Release

CyberDragon777 writes "Ubuntu's future 10.10 operating system is going to make a small, but contentious change to how file sizes are represented. Like most other operating systems using binary prefixes, Ubuntu currently represents 1 kB (kilobyte) as 1024 bytes (base-2). But starting with 10.10, a switch to SI prefixes (base-10) will denote 1 kB as 1000 bytes, 1 MB as 1000 kB, 1 GB as 1000 MB, and so on."

31 of 984 comments

  1. Cannonical is just trolling us by Hadlock · · Score: 5, Funny

    First, screwing with GUI buttons, now this? Mark Shuttleworth, I'm calling you out on your BS
     
    ;)

    --
    moox. for a new generation.
    1. Re:Cannonical is just trolling us by g-to-the-o-to-the-g · · Score: 5, Informative

      If you read closely, you'll see that the summary is kind of misleading. What Canonical is actually doing is using SI prefixes for base-10 units, and IEC prefixes for base-2 units.

      In other words, they will use 1kB for 1000 bytes and 1KiB for 1024 bytes. This is a good thing, it just means the UI should be consistent and you don't need to second-guess.

    2. Re:Cannonical is just trolling us by Blue+Stone · · Score: 4, Funny

      Kibibytes always makes me think of cat treats.

      Is that what we want? More lolcats in our hardrives? Fuxxoring up our filesizes?

      --
      Corporation, n. An ingenious device for obtaining individual profit without individual responsibility. - Ambrose Bierce
    3. Re:Cannonical is just trolling us by Reemi · · Score: 4, Insightful

      I am more confused by people mixing b (bit) and B (Byte).

    4. Re:Cannonical is just trolling us by polar+red · · Score: 5, Insightful

      when the C64 came out with 64K No-ONE doubted it had 65536 Bytes of RAM. If it came out now, there would be confusion, so the kibi-business introduced confusion. People who don't understand the difference between binary and decimal have no place in IT

      --
      Yes, I'm left. You have a problem with that?
    5. Re:Cannonical is just trolling us by Anonymous Coward · · Score: 4, Funny

      Actually 'B' is 'Bel' like in decibels - dBs. So kB is kilobels etc. This causes me no end of confusion as I wonder why hard-drive manufacturers are measuring drive capacity on a logarithmic scale compared to some unspecified standard size.

      I think we should decimalize completely, remove the confusing 8-bit byte and introduce a new 10-bit unit called a "dyke".

      That way there'd be no confusion.

    6. Re:Cannonical is just trolling us by TheRaven64 · · Score: 5, Insightful

      Other posters have pointed out that bits and bytes are not SI units, but they've not pointed out that we use 1024 because it's more useful. We use base 10 for physical quantities because it means that you can very easily do base-10 logarithms and most arithmetic on physical quantities is easier if you can do logarithms on the base that you use in your head.

      Storage is always indexed by some binary quantity, so you need to do base-2 logarithms. You can trivially calculate how much space a 32-bit address space gives you: 2^32 bytes; taking 10 off the exponent for each prefix gives you 2^22 KB, 2^12 MB, 2^2 GB, i.e. 4GB. Try doing that with 1KB = 1000B in your head. You can easily tell how much space your 32-bit filesystem can store if it is addressing 512B blocks (the size of most hard disk blocks). 512 is 2^9, so it's 2^9 x 2^32 bytes. Add the exponents and you get 2^41 bytes, or 2TB. What happens if we start using 4KB blocks instead? Well, 4 is 2^2, K means 2^10, so 2^12 x 2^32 = 2^44, or 16TB.
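The exponent arithmetic in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `pow2_size` and the prefix table are hypothetical, and the IEC spellings (KiB, MiB, ...) are used for what the comment writes as KB, MB, and so on.

```python
# Binary prefixes step by 2**10, so converting a power-of-two byte count
# only needs integer division and a remainder on the exponent.
PREFIXES = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]  # illustrative table


def pow2_size(exponent: int) -> str:
    """Express 2**exponent bytes using the largest binary prefix available."""
    steps = min(exponent // 10, len(PREFIXES) - 1)  # how many prefixes we can climb
    remainder = exponent - 10 * steps               # what is left of the exponent
    return f"{2 ** remainder} {PREFIXES[steps]}"


print(pow2_size(32))       # 32-bit byte-addressed space
print(pow2_size(9 + 32))   # 32-bit block numbers with 512-byte (2**9) blocks
print(pow2_size(12 + 32))  # 32-bit block numbers with 4 KiB (2**12) blocks
```

The three calls reproduce the 4 GB, 2 TB and 16 TB figures worked out by hand above.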

      Redefining KB makes these calculations harder. The only kind of calculations it makes easier are things that involve bytes and some other SI units that use the SI prefixes in the same equation. About the only other SI quantity that you ever see in an equation with bytes is seconds and you almost never talk about kiloseconds or megaseconds...

      --
      I am TheRaven on Soylent News
    7. Re:Cannonical is just trolling us by camperdave · · Score: 5, Funny

      When I make a cake, I don't use 1 cup of flower...

      I am glad for everyone who might eat your baking.

      --
      When our name is on the back of your car, we're behind you all the way!
    8. Re:Cannonical is just trolling us by The_Wilschon · · Score: 4, Insightful

      OTOH, if the OS is reporting GiB, then it ought to say GiB, not GB. Reporting that a "10 GB" (written on the box) hard disk has "9.3 GB" of space is confusing and misleading. If your definition of correctness in notation is adherence to internationally accepted standards for notation, then it is also incorrect. If you RTFA, then you will find that Ubuntu 10.10 is requiring that all applications either report "10GB" or "9.3 GiB", but not "9.3 GB" or "10 GiB". This is, in fact, a switch to correct and less misleading behavior. Whether or not it is more or less confusing may be a different matter.
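The "10 GB" versus "9.3 GiB" distinction can be checked with a small formatter. This is a minimal sketch, not the code Ubuntu uses; the function name `format_size` and its `binary` flag are assumptions made for illustration.

```python
def format_size(n_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI (base-10) or IEC (base-2) prefixes."""
    if binary:
        base, units = 1024, ["B", "KiB", "MiB", "GiB", "TiB"]
    else:
        base, units = 1000, ["B", "kB", "MB", "GB", "TB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the last unit.
        if value < base or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= base


disk = 10 * 1000**3  # a drive sold as "10 GB"
print(format_size(disk))               # SI prefixes
print(format_size(disk, binary=True))  # IEC prefixes
```

The same byte count comes out as 10.0 GB with SI prefixes and 9.3 GiB with IEC prefixes, which is the pairing the policy described above asks applications to respect.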

      --
      SIGSEGV caught, terminating

      wait... not that kind of sig.
    9. Re:Cannonical is just trolling us by swillden · · Score: 4, Insightful

      when the C64 came out with 64K No-ONE doubted it had 65536 Bytes of RAM

      No kid playing with his first or second computer, anyway. Old hands used to dealing with memory measured in kilowords (with the standard SI meaning of "kilo") would have had to ask. They might have had to ask how big a byte was, too. There's a reason standards call them octets, you know.

      You just think this is some kind of carved-in-stone standard because it's what you were first exposed to.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    10. Re:Cannonical is just trolling us by Dewin · · Score: 5, Funny

      Is that what we want? More lolcats in our hardrives? Fuxxoring up our filesizes?

      Mebi? Or mebi not.

      --
      Of course nobody reads the FAQ! If people read the FAQ, the Questions wouldn't be so Frequently Asked.
    11. Re:Cannonical is just trolling us by ObsessiveMathsFreak · · Score: 4, Insightful

      Maybe for all the physicists, chemists, and engineers; but kilo has never meant 10^3 for computer programmers, computer engineers or computer scientists. Same with mega-, giga- and so on. They have each had a very specific meaning in the base 2 number system, which is ultimately the most important base system for people working with computers.

      We don't have 10 hours a day, 10 days a week. We don't have 10 bits in a byte or 100 degrees in a circle. I'm a huge proponent of the SI system but only in areas where it is appropriate to apply it. Lengths, weights, magnetic flux density, all fine. But there are many applications and areas which are not appropriate to shoehorn into the decimal system. Binary computer memory sizes are one such application. It is not appropriate to group base 2 numbers using a base 10 units.

      --
      May the Maths Be with you!
    12. Re:Cannonical is just trolling us by TClevenger · · Score: 4, Insightful

      What do you mean never? "Kilo" has always meant 10^3 for HDDs, likewise for mega, giga, etc.

      Sorry, you're wrong; disks used base-two definitions, too. A 360K floppy is 362,496 bytes formatted, and a Seagate ST-225 20 megabyte hard drive had a little over 21,000,000 bytes formatted. It wasn't until some hard drive manufacturer couldn't quite hit a gigabyte that they redefined "gigabyte" so that they could call their 976MB drive "1 gigabyte."

    13. Re:Cannonical is just trolling us by AaxelB · · Score: 5, Insightful

      But there are many applications and areas which are not appropriate to shoehorn into the decimal system. Binary computer memory sizes are one such application. It is not appropriate to group base 2 numbers using a base 10 units.

      I agree entirely. However, SI prefixes *are* in base 10, and just redefining them in specific contexts to mean something in base 2 is unnecessarily confusing. Kilo is accepted to mean thousand, and redefining it in specific contexts to mean 2^10 is just unreasonable. To use your phrase, it's not appropriate to shoehorn this system of decimal prefixes into describing a naturally binary system (which is precisely what happened in CS).

      I understand it's how we've been doing things for decades, but why on earth are so many CS people arguing *against* decreasing ambiguity? I find the whole KiB thing to be a relatively elegant solution, which maintains the familiar letters so there's nothing new to learn, but makes it clear what units you're using. The only reason to resist it that I can see is just blind and unthinking resistance to change -- the exact same reason so many people resist the metric system and SI at all.

      You seem to be arguing "if it ain't broke, don't fix it", but I think it is a little broke and we should fix it.

  2. Thing is by davidjgraph · · Score: 4, Insightful

    Anyone who's too stupid to understand the difference, isn't going to care. Someone, somewhere, has too much time on their hands...

    1. Re:Thing is by beelsebob · · Score: 5, Insightful

      Actually, they are. This is most likely for the exact same reason Apple likely did it: reduced support costs. They don't need to deal with shit tons of people complaining that their 1000GB disk isn't 1000GB, it's only 931.3GB.

      Along with of course the most obvious reason – it's *correct* that way.

    2. Re:Thing is by beelsebob · · Score: 5, Insightful

      And they accomplish this by measuring wrong? Great effing job!
      Wrong? By whose standard? The SI standard, the ISO standard and the IEEE standard all agree on this point.

  3. ubuntu joins apple... by the+unbeliever · · Score: 4, Insightful

    Apple did this with Snow Leopard, which makes me a cranky geek.

    Why can't the OS manufacturers pressure the hard drive companies to market their sizes correctly? =(

    1. Re:ubuntu joins apple... by Shinobi · · Score: 5, Insightful

      HD manufacturers are presenting the sizes correctly. SI prefix = hard-defined base-10, it's just computer engineering and computer sciences that broke the established standard.

    2. Re:ubuntu joins apple... by The+Wild+Norseman · · Score: 5, Funny

      So yes, SI as base-10 precedes the standards-breaking use among computer engineers and computer scientists by well over 100 years.

      Wait a second. Are you talking 100 years, or 102.4 yirs?

      --
      "A government is a body of people usually -- notably -- ungoverned." -Shepherd Book
    3. Re:ubuntu joins apple... by hanabal · · Score: 4, Informative

      The kilo prefix is derived from the Greek word χίλιοι ("chilioi"), meaning thousand. It was originally adopted by Antoine Lavoisier and his group in 1795, and introduced into the metric system in France with its establishment in 1799.

      So while "SI" wasn't around, it was already an established standard.

  4. Really annoying by Maïdjeurtam · · Score: 5, Interesting

    I work mostly on OS X and this so-called feature annoys me to no end. I do not know the size of my files anymore, I have to go to the terminal just to know the size of a file (bash hasn't been polluted by this feature).

    I've been using computers for 20+ years and I do _not_ want to change how I think about file sizes, especially since I feel that base 10 is the wrong way to count. What's next? Imperial units for us Europeans?

    The most annoying? That nobody has hacked Snow Leopard to restore real units.

    --
    Instant Karma's gonna get you, Gonna knock you right on the head (John Lennon, 1970)
    1. Re:Really annoying by Culture20 · · Score: 4, Funny

      What's next? Imperial units for us Europeans?

      Hell no. Imperial units for file sizes. A byte will be twelve bits, a kilobyte will be 3 bytes, and a megabyte will be 5280 bytes. A petabyte will be 5.87849981x10^12 megabytes. There won't really be such things as terabytes or gigabytes, which will make drive manufacturers happy because most of their drives are measured in TB or GB.

    2. Re:Really annoying by Anonymous Coward · · Score: 5, Funny

      I feel that base 10 is the wrong way to count.

      You must have a horribly difficult time in this world.

  5. Just use the right prefix by mmontour · · Score: 5, Insightful

    As long as they use the correct prefix, I don't really mind whether they use base 2 or 10 to display the numbers.

    RAM sizes are naturally powers of 2 due to how the individual memory cells are addressed, so it makes sense for RAM capacity to always be listed in GiB.

    Hard drives, on the other hand, have nothing that is fundamentally based on a power of 2. They arbitrarily use a sector size of 512 (or 4096) bytes, but everything else (number of heads, number of tracks, average number of sectors per track) has no power-of-2 connection. Therefore there's nothing wrong with reporting their size in SI notation.

    The original shorthand of calling 1024 bytes a "K" was not too bad because it's only a 2.4% error. However the error gets worse as you go up each level, and by the time you're talking about a TB/TiB it's something that people actually care about.
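How the gap widens at each prefix level can be worked out directly. This is a rough sketch; the function name `prefix_gap_percent` is made up for illustration.

```python
NAMES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi"]


def prefix_gap_percent(level: int) -> float:
    """Percent by which 1024**level exceeds 1000**level."""
    return (1024**level / 1000**level - 1) * 100


for level, name in enumerate(NAMES, start=1):
    print(f"{name}: {prefix_gap_percent(level):.1f}%")
```

The gap starts at the 2.4% mentioned above for kilo versus kibi and grows to roughly 10% by the time you reach TB versus TiB.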

    1. Re:Just use the right prefix by tuomoks · · Score: 4, Informative

      Sorry, 512 or whatever base-2 sector size is not arbitrary - the disk controlling hardware / buffers / controllers / channels / etc and especially the transfer sizes, multipliers in headers, and so on are (still) base-2. If you ever do performance / capacity calculations or estimates for storage size, etc, you very fast find base-2 very handy.

      The disk size error is not a big deal - there is always an overhead that changes by storage type, file system, fixed physical characteristics, key / data compression used, replication, whatever - so? The public (and I think many in IT) really don't know, and don't need to know, more than whether they have enough or need more!

  6. Bye Ubuntu, was nice knowing you. by Culture20 · · Score: 5, Insightful

    I've used Ubuntu exclusively on my desktops for several years now. It's nice to know that I can always switch to another distro when they do something BAT SHIT INSANE like this: https://wiki.ubuntu.com/UnitsPolicy

    Change the GUI window buttons from right to left? Meh. Change the way file sizes are read so that User X and User Y see different file sizes using the same filesystem, even potentially the same remotely mounted disk?

    Now I have to draft a letter to our research department telling them to stay the hell away from Ubuntu because their data will potentially be wrong (unless they take pains to remember the kilo=/=kibi switch).

  7. Good move by the_other_chewey · · Score: 5, Insightful

    I'm surprised by the majority here that is against this. What kind of nerds exactly are you?
    SI prefixes are defined as base-10, period. Every other use is simply wrong.
    Being consistently wrong for a very long time doesn't make it better, it is just proof of
    an unwillingness to admit to a stupid initial mistake you didn't even make yourself.
    As nerds, you're supposed to be better than that.

    How can you be all for standards-compliance with browsers and rail against a much
    stronger, decades-old ISO standard (which is based on a centuries old definition from the
    beginning of the metric system - "kilo" has been 1000 for over 200 years)?

    On the other hand, you are the same crowd regularly writing about "mbit/s" while meaning "Mbit/s",
    thereby being off by just a tiny, unimportant, paltry factor of a billion.
    Seriously, what's wrong with you?

    -- an annoyed scientist

  8. Re:Annoying... by Kjella · · Score: 5, Informative

    Because the context is a problem every time you mix computers and what you're doing on a computer. Let's say you record a CD, 16 bits/sample @ 44.1kHz. That's a bitrate of 16 * 44.1 = 705.6 kbit/s, right? What if I want to send it over the LAN too? What if I need to allocate a memory buffer, is it still 705.6 kbit/s? And what if I want to store it to disk, do I need to allocate 705.6 kbit per second of music? Computers aren't remotely consistent with themselves: a 100 Mbit LAN is 100,000,000 bits/second. Hard drives are base 10 too, but they're hardly the only ones; floppies weren't even consistent with themselves, most being 1.44*1000*1024 bytes.

    Things get confusing all the time because a 1 MB, 1 kHz (1024*1024*1000) bus is not equal to a 1 kB, 1 MHz bus (1024*1000*1000), which is why everyone dealing with networks never used kilo = 1024. The 56k modem is 56,000 bit/s, ISDN is 64,000 bit/s, and so on right up to SATA 6 Gbit/s, which is 6,000,000,000 bit/s (and even more confusing because it uses 8b/10b encoding, but that's another story). So both inside and outside the machine we're switching between base 2 and base 10 all the time.

    A particularly confusing item was codecs. Should they follow the "size" standard (k = 1024) so a 128 kbit/s MP3 would take up 128 kbit of disk per second, or the network standard (k = 1000) so that a 128 kbit/s MP3 would take 128 kbit/s of network bandwidth? I think now most settled on k = 1000, that is to say if you encode a one second clip at 128 kbit/s it'll only take up 125 kbit (counted in units of 1024) on your disk. Confusing as fuck? Hell yeah. Let's just settle this and be done with it, with the i = base 2, without it base 10. Just forget the lame names, and let the prefixes do the talking. MB = megabyte, MiB = mebibyte. That's what I'm doing at least.
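The bitrate figures in this comment can be reproduced with a short calculation. This is a sketch under assumptions: the helper `bits_per_second` is hypothetical, and the CD figure is for a single channel, matching the 16 * 44.1 arithmetic in the comment.

```python
def bits_per_second(bits_per_sample: int, sample_rate_hz: int, channels: int = 1) -> int:
    """Raw PCM bitrate in bits per second."""
    return bits_per_sample * sample_rate_hz * channels


# One channel of CD audio: 16 bits/sample at 44.1 kHz.
cd_bits = bits_per_second(16, 44_100)
print(cd_bits / 1000)  # in kbit/s with k = 1000
print(cd_bits / 1024)  # in Kibit/s with k = 1024

# One second of a 128 kbit/s MP3, taking k = 1000 as codecs settled on.
mp3_bits = 128 * 1000
print(mp3_bits / 1024)  # the same second counted in units of 1024 bits
print(mp3_bits // 8)    # bytes on disk
```

With k = 1000 the CD channel is 705.6 kbit/s, and one second of the MP3 is 125 units of 1024 bits, which is the mismatch the comment describes.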

    --
    Live today, because you never know what tomorrow brings
  9. Re:Annoying... by growse · · Score: 4, Insightful

    Nothing? How many clocks per second does a 2GHz CPU run at?

    --
    There is nothing interesting going on at my blog
  10. Re:Mod parent up (or not) by FrozenGeek · · Score: 4, Interesting

    Well, when I was in university, studying computer science (who'd've guessed that a /. contributor studied comp sci?), KB was defined as 2^10 bytes, MB was 2^20, and GB was a pipe dream (hey, I graduated from uni in 1986). So, for us, kB WAS defined as 1024 bytes. That's how ALL of my textbooks, which I still have btw, defined kB.
    Perhaps it was not defined as 1024 bytes everywhere (comp sci types are notorious for having multiple standards), but it was defined as 1024 bytes in a fair number of places.

    --
    linquendum tondere