Ubuntu Will Switch To Base-10 File Size Units In Future Release
CyberDragon777 writes "Ubuntu's future 10.10 operating system is going to make a small, but contentious change to how file sizes are represented. Like most other operating systems using binary prefixes, Ubuntu currently represents 1 kB (kilobyte) as 1024 bytes (base-2). But starting with 10.10, a switch to SI prefixes (base-10) will denote 1 kB as 1000 bytes, 1 MB as 1000 kB, 1 GB as 1000 MB, and so on."
Anyone who's too stupid to understand the difference, isn't going to care. Someone, somewhere, has too much time on their hands...
Apple did this with Snow Leopard, which makes me a cranky geek.
Why can't the OS manufacturers pressure the hard drive companies to market their sizes correctly? =(
As long as they use the correct prefix, I don't really mind whether they use base 2 or 10 to display the numbers.
RAM sizes are naturally powers of 2 due to how the individual memory cells are addressed, so it makes sense for RAM capacity to always be listed in GiB.
Hard drives, on the other hand, have nothing that is fundamentally based on a power of 2. They arbitrarily use a sector size of 512 (or 4096) bytes, but everything else (number of heads, number of tracks, average number of sectors per track) has no power-of-2 connection. Therefore there's nothing wrong with reporting their size in SI notation.
The original shorthand of calling 1024 bytes a "K" was not too bad because it's only a 2.4% error. However the error gets worse as you go up each level, and by the time you're talking about a TB/TiB it's something that people actually care about.
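To put numbers on how that error grows, here is a minimal sketch that compares each binary prefix (1024^n) with its SI counterpart (1000^n). The function name and the prefix list are made up for illustration.

```python
# Relative gap between a binary prefix (1024**n) and the SI prefix (1000**n).
PREFIXES = ["k", "M", "G", "T"]  # levels 1 through 4


def binary_vs_si_gap(level: int) -> float:
    """Return how much larger 1024**level is than 1000**level, in percent."""
    return (1024 ** level / 1000 ** level - 1) * 100


gaps = {p: binary_vs_si_gap(i) for i, p in enumerate(PREFIXES, start=1)}

for prefix, gap in gaps.items():
    print(f"{prefix}: {gap:.2f}%")
```

The gap starts at 2.4% for kilo and is just under 10% by the time you reach tera.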
It was never defined that way!
"kilo" has always meant "1000". That is the way that IT is DEFINED.
There has never been a point since the introduction of the 1024 "binary k" prefix that you didn't have to second-guess. RAM was different from disk, before that communications was already using SI kilo (or I should say, what would become SI kilo, since they predated the codification of SI).
The "binary" prefixes have always been problematic and don't help new people entering the field to understand anything, so they ought to go, or at least be segregated out so that there can be no confusion.
Can you be Even More Awesome?!
I've been using computers for 20+ years and I do _not_ want to change how I think about file sizes, especially since I feel that base 10 is the wrong way to count.
How is it possible you survived working in IT for over 20 years without being able to adapt to radical changes? This sort of thing happens all the time. One moment you're working from LSB upward, then you're suddenly working from MSB downward. 8 bit changed into 16, into 32 and now into 64. Filenames can't be longer than 8 characters and now they can. A file can't be larger than 4 GB and now it can. And now finally, operating systems are beginning to understand SI units (which we've been using for all sorts of applications for hundreds of years) and *THAT* is a problem?
What's next? Imperial units for us Europeans?
A better comparison would be using metric units in the US, because metrics are based on SI and imperial units are more like the weird way bits and bytes are counted into kilobytes, megabytes etc.
Saying that 1024 is a kilo never made any sense to anyone. I'm really glad we're finally entering an age where computers represent data sizes in units people can understand.
Pretty good is actually pretty bad.
I am more confused by people mixing b (bit) and B (Byte).
I've used Ubuntu exclusively on my desktops for several years now. It's nice to know that I can always switch to another distro when they do something BAT SHIT INSANE like this: https://wiki.ubuntu.com/UnitsPolicy
Change the GUI window buttons from right to left? Meh. Change the way file sizes are read so that User X and User Y see different file sizes using the same filesystem, even potentially the same remotely mounted disk?
Now I have to draft a letter to our research department telling them to stay the hell away from Ubuntu because their data will potentially be wrong (unless they take pains to remember the kilo=/=kibi switch).
when the C64 came out with 64K No-ONE doubted it had 65536 Bytes of RAM. if it came out now, there would be confusion, so the kibi-business introduced confusion. people who don't understand the difference between binary and decimal have no place in IT
Yes, I'm left. You have a problem with that?
I'm surprised by the majority here that is against this. What kind of nerds exactly are you?
SI prefixes are defined as base-10, period. Every other use is simply wrong.
Being consistently wrong for a very long time doesn't make it better, it is just proof of
an unwillingness to admit to a stupid initial mistake you didn't even make yourself.
As nerds, you're supposed to be better than that.
How can you be all for standards-compliance with browsers and rail against a much
stronger, decades-old ISO standard (which is based on a centuries old definition from the
beginning of the metric system - "kilo" has been 1000 for over 200 years)?
On the other hand, you are the same crowd regularly writing about "mbit/s" while meaning "Mbit/s",
thereby being off by just a tiny, unimportant, paltry factor of a billion.
Seriously, what's wrong with you?
-- an annoyed scientist
Other posters have pointed out that bits and bytes are not SI units, but they've not pointed out that we use 1024 because it's more useful. We use base 10 for physical quantities because it means that you can very easily do base-10 logarithms and most arithmetic on physical quantities is easier if you can do logarithms on the base that you use in your head.
Storage is always indexed by some binary quantity, so you need to do base-2 logarithms. You can trivially calculate how much space a 32-bit address space gives you: 2^32 bytes. Take 10 off the exponent for each prefix and you get 2^22 KB, 2^12 MB, 2^2 GB, so 4GB. Try doing that with 1KB = 1000B in your head. You can easily tell how much space your 32-bit filesystem can store if it is addressing 512B blocks (the size of most hard disk blocks). 512 is 2^9, so it's 2^9 x 2^32 bytes. Add the exponents and you get 2^41 bytes, or 2TB. What happens if we start using 4KB blocks instead? Well, 4 is 2^2, K means 2^10, so 2^12 x 2^32 = 2^44, or 16TB.
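The exponent arithmetic above can be sketched in a few lines. This is only an illustration: the function names are invented, and the output is labelled with the explicit binary units (GiB, TiB) where the comment writes GB and TB in the binary sense.

```python
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def capacity_exponent(address_bits: int, block_size: int = 1) -> int:
    """Return n such that the addressable capacity is 2**n bytes.

    block_size must be a power of two (1 means byte addressing).
    """
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    # log2(block_size) + address_bits: multiplying powers adds exponents.
    return address_bits + block_size.bit_length() - 1


def describe(exponent: int) -> str:
    """Express 2**exponent bytes by taking 10 off the exponent per prefix."""
    steps = min(exponent // 10, len(BINARY_UNITS) - 1)
    return f"{2 ** (exponent - 10 * steps)} {BINARY_UNITS[steps]}"


print(describe(capacity_exponent(32)))        # 32-bit byte addressing
print(describe(capacity_exponent(32, 512)))   # 32-bit block numbers, 512 B blocks
print(describe(capacity_exponent(32, 4096)))  # 32-bit block numbers, 4 KiB blocks
```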
Redefining KB makes these calculations harder. The only kind of calculations it makes easier are things that involve bytes and some other SI units that use the SI prefixes in the same equation. About the only other SI quantity that you ever see in an equation with bytes is seconds and you almost never talk about kiloseconds or megaseconds...
I am TheRaven on Soylent News
Nothing? How many clocks per second does a 2GHz CPU run at?
There is nothing interesting going on at my blog
Exactly. Don't give in to the mistakes of HDD manufacturers and legalize their wrong advertising.
Doesn't the pint/quart/gallon differ according to geography? Pint, gallon and so on.
This article and this time of year piss me off.
You're exactly right. We don't suddenly re-define an established standard. And when it comes to physics, we don't suddenly re-define time...like every year when the stupid US government decides that it's magically an hour earlier or an hour later.
When I make a cake, I don't use 1 cup of flour and then decide to make bread, so I redefine the size of 1 cup to make reading the recipe easier...
There's no place like
Many computer nerds like to tout themselves as geniuses who have flexible minds. But the truth is that we're all afraid of change. And this switch from KiB to KB is change. It's not what you're used to, so it's going to confuse you.
But as a geek myself with an obsession for clear and precise terminology, I welcome the change. No longer will I wonder if someone's talking about KB vs. KiB, because it'll be consistent and explicit, at least on the computer systems developed by flexible-enough-minded people who are both willing to change and willing to correct a long-confusing problem.
It's true that the HD makers have taken advantage of this confusion. Back in the day when people almost always said KB when they meant KiB, HD makers used KB. But the fact is, once we adapt our terminology to be less ambiguous, we really can't be misled by them anymore, and their deceptive marketing practices will be moot (at least when it comes to bytes of storage).
So, to summarize, stop being a stick in the mud and learn to adapt to change. Computers are and always have been an aspect of change in our society. Get over it and get with the program.
If you went to the terminal and saw this file
file.big 17,179,869,184
I suspect that you would naturally say that that file is about 17 gigs. Actually, it is 16 GiB exactly.
However, just looking at the file, no one would ever instinctively say that file.big is 16 GiB. The reality is that base-10 is what people naturally use and so it makes sense for the user interface to reflect that.
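For anyone who wants to see both readings of that byte count side by side, here is a rough sketch of a size formatter for each convention. The function and constant names are invented for the example and are not taken from Ubuntu or any other tool.

```python
SI_UNITS = ("B", "kB", "MB", "GB", "TB")        # steps of 1000
IEC_UNITS = ("B", "KiB", "MiB", "GiB", "TiB")   # steps of 1024


def format_size(num_bytes: int, base: int, units: tuple) -> str:
    """Scale num_bytes down by `base` until it fits the current unit."""
    value = float(num_bytes)
    for unit in units[:-1]:
        if value < base:
            return f"{value:.2f} {unit}"
        value /= base
    return f"{value:.2f} {units[-1]}"


def format_si(num_bytes: int) -> str:
    return format_size(num_bytes, 1000, SI_UNITS)


def format_iec(num_bytes: int) -> str:
    return format_size(num_bytes, 1024, IEC_UNITS)


size = 17_179_869_184  # the file.big example
print(format_si(size), "vs", format_iec(size))
```

The same two functions reproduce the "10 GB on the box" case: 10,000,000,000 bytes comes out as 10.00 GB in SI units and 9.31 GiB in binary units.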
OTOH, if the OS is reporting GiB, then it ought to say GiB, not GB. Reporting that a "10 GB" (written on the box) hard disk has "9.3 GB" of space is confusing and misleading. If your definition of correctness in notation is adherence to internationally accepted standards for notation, then it is also incorrect. If you RTFA, then you will find that Ubuntu 10.10 is requiring that all applications either report "10 GB" or "9.3 GiB", but not "9.3 GB" or "10 GiB". This is, in fact, a switch to correct and less misleading behavior. Whether or not it is more or less confusing may be a different matter.
SIGSEGV caught, terminating
wait... not that kind of sig.
when the C64 came out with 64K No-ONE doubted it had 65536 Bytes of RAM
No kid playing with his first or second computer, anyway. Old hands used to dealing with memory measured in kilowords (with the standard SI meaning of "kilo") would have had to ask. They might have had to ask how big a byte was, too. There's a reason standards call them octets, you know.
You just think this is some kind of carved-in-stone standard because it's what you were first exposed to.
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
1kB was never defined as 1024 bytes. People just started calling 1024 bytes 1kB because it was close enough, and no one cared about being 2.4% off. Unfortunately, every time we leap another 10^3 we're off by another 2.4%, and by the time we get to 10^12 we're off by 10%.
Maybe for all the physicists, chemists, and engineers; but kilo has never meant 10^3 for computer programmers, computer engineers or computer scientists. Same with mega-, giga- and so on. They have each had a very specific meaning in the base 2 number system, which is ultimately the most important base system for people working with computers.
We don't have 10 hours a day, 10 days a week. We don't have 10 bits in a byte or 100 degrees in a circle. I'm a huge proponent of the SI system but only in areas where it is appropriate to apply it. Lengths, weights, magnetic flux density, all fine. But there are many applications and areas which are not appropriate to shoehorn into the decimal system. Binary computer memory sizes are one such application. It is not appropriate to group base 2 numbers using base 10 units.
May the Maths Be with you!
Well, it depends on what you are talking about. The situation is not as clear cut as you depict it.
1 kb on your disk is usually defined as 1024 bits... but 1 kb/s is usually defined as 1000 bits/second. As an example, a 1.5 Gb/s SATA interface is running with a 1.5 GHz clock, so it will transfer 1,500,000,000 bits per second (actually, the number of effective bits will be lower as it uses 8b/10b encoding).
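A quick sketch of what 8b/10b costs on that link, assuming the SI meaning of giga for the line rate. The function name is hypothetical, and the only overhead modelled is the encoding itself (10 bits on the wire for every 8 bits of data).

```python
def payload_bits_per_second(line_rate_bits_per_s: int) -> int:
    """8b/10b sends 10 line bits for every 8 data bits."""
    return line_rate_bits_per_s * 8 // 10


line_rate = 1_500_000_000                       # 1.5 Gb/s, giga = 10**9
data_bits = payload_bits_per_second(line_rate)  # effective data bits per second
data_bytes = data_bits // 8                     # effective data bytes per second

print(f"{data_bits / 10**9} Gb/s of data, {data_bytes / 10**6} MB/s")
```

That works out to 1.2 Gb/s of actual data, or 150 MB/s in decimal megabytes.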
What do you mean never? "Kilo" has always meant 10^3 for HDDs, likewise for mega, giga, etc.
Sorry, you're wrong; disks used base-two definitions, too. A 360K floppy is 362,496 bytes formatted, and a Seagate ST-225 20 megabyte hard drive had a little over 21,000,000 bytes formatted. It wasn't until some hard drive manufacturer couldn't quite hit a gigabyte that they redefined "gigabyte" so that they could call their 976MB drive "1 gigabyte."
But there are many applications and areas which are not appropriate to shoehorn into the decimal system. Binary computer memory sizes are one such application. It is not appropriate to group base 2 numbers using base 10 units.
I agree entirely. However, SI prefixes *are* in base 10, and just redefining them in specific contexts to mean something in base 2 is unnecessarily confusing. Kilo is accepted to mean thousand, and redefining it in specific contexts to mean 2^10 is just unreasonable. To use your phrase, it's not appropriate to shoehorn this system of decimal prefixes into describing a naturally binary system (which is precisely what happened in CS).
I understand it's how we've been doing things for decades, but why on earth are so many CS people arguing *against* decreasing ambiguity? I find the whole KiB thing to be a relatively elegant solution, which maintains the familiar letters so there's nothing new to learn, but makes it clear what units you're using. The only reason to resist it that I can see is just blind and unthinking resistance to change -- the exact same reason so many people resist the metric system and SI at all.
You seem to be arguing "if it ain't broke, don't fix it", but I think it is a little broke and we should fix it.
Computer memory, in an abstract sense, tends to be looked at in a hierarchical way:
* Registers
* Caches
* RAM
* Secondary storage (swap)
A filesystem is a datastructure, arguably just nominally imposed on a dedicated swap-space of sorts.
When you buy a gig of RAM, you expect 2^30 bytes, not 10^9 bytes. I've never understood why HD makers think that their "secondary storage" does not belong under the paradigm of "computer memory" when talking about sizing, despite the fact that all modern OS's use swap space, and filesystems are all data structures whose constituents tend to fall on word boundaries.
--TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
I disagree. It is not a matter of knowing about math. I should be able to interpret any measurement just by knowing what each unit stands for, I shouldn't need any deeper knowledge about math or history of computing. Because then I will be lost trying to interpret numbers from other fields, from other countries, etc and the whole point of the SI is to have a global standard.
Someone somewhere just ignored the proper definition of kilo and redefined it as 1024 and then most of us have continued that mistake. I see no problem in correcting this once and for all. It seems silly that trying to correct the problem would upset people.