Reaching Beyond Two-Terabyte Filesystems
Jeremy Andrews writes: "Peter Chubb posted a patch to the lkml, with which he's now managed to mount a 15 terabyte file (using JFS and the loopback device). Without the patch, Peter explains, "Linux is limited to 2TB filesystems even on 64-bit systems, because there are various places where the block offset on disc are assigned to unsigned or int 32-bit variables."
Peter works on the Gelato project in Australia. His efforts include cleaning up Linux's large filesystem support, removing 32-bit filesystem limitations. When I asked him about the new 64-bit filesystem limits, he offered a comprehensive answer and this interesting link. The full thread can be found here on KernelTrap.
Reaching beyond terabytes, beyond pentabytes, on into exabytes. I feel this sudden discontent with my meager 60 gigabyte hard drive..."
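For readers wondering where the 2TB figure in the summary comes from, here is a rough sketch of the arithmetic. It assumes 512-byte sectors and a sector number held in a 32-bit unsigned variable; neither detail is stated in the summary, and the helper name is made up for illustration.

```python
# Assumption: 512-byte sectors, sector numbers stored in 32-bit unsigned ints.
SECTOR_SIZE = 512
MAX_SECTORS_32 = 2**32

# Largest device addressable with a 32-bit sector number.
limit_bytes = MAX_SECTORS_32 * SECTOR_SIZE
limit_tib = limit_bytes / 2**40


def truncate_to_u32(sector):
    """Mimic what happens when a sector number lands in a 32-bit variable."""
    return sector & 0xFFFFFFFF


# A sector number at the far end of a 15 TB (decimal) device.
sector_15tb = (15 * 10**12) // SECTOR_SIZE
wrapped = truncate_to_u32(sector_15tb)

print(limit_tib)              # size limit in binary terabytes
print(sector_15tb, wrapped)   # the wrapped value points at the wrong sector
```

Under those assumptions the limit comes out at exactly 2 binary terabytes, and any sector number past it silently wraps around to a lower address.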
If all this should have a reason, we would be the last to know.
Aside from all sorts of quantum fiddly bit problems, I wonder just how long it will be before we can store the state of every neuron in a brain (doesn't have to be human, at least not at first) on a hard drive.
Of course, then what would you do with it?
AMCGLTD.COM. Where cats, science fictio
Switch to NTFS! Much bigger than 2TB
Petabytes, please!
Does this mean I can stop backing up all my pr0n to CDR?
No, it doesn't.
General Kernel stuff
Fix all kernel warnings
All kernel warnings? That's almost like being a fire-fighter in hell..
Roses are #FF0000, violets are #0000FF, all my base are belong to you
Well, we have here a RAIDed 60 TB array which runs well under Mac OS X. This is mainly because Darwin is based on FreeBSD. The BSD series comes from the professional/academic Unix world and has had automatic 64-bit support at all levels for 9 years or so.
It's not very surprising that Linux is lacking these features. It's more hobbyist style and still contains some serious design failures like the missing microkernel Mac OS X has had for some time now.
Many people here at slashdot bitch at the academic/professional world but with examples like this you see that professional, thoughtful design always pays off in time.
Owner of a Mensa membership card.
Gosh, some day Linux may actually catch up to BeOS, which can deal with 18 petabyte disks.
Now last time I looked the biggest common HD was a 180GB Seagate Barracuda, so they would still need nearly 100 of these babies to get to 15TB, costing well over $100,000, and that's before you get to the power/housing/cooling nightmare.
Or do they have some fancy way to store bits using thin air that the rest of us don't know of?
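A quick check of the drive count, as a sketch only. It assumes the 180 in the drive's name means decimal gigabytes, and it ignores RAID parity and spares; no price data is assumed.

```python
DRIVE_BYTES = 180 * 10**9   # assumption: 180 GB in decimal units


def drives_needed(target_bytes, drive_bytes=DRIVE_BYTES):
    """Smallest whole number of drives that covers target_bytes."""
    return -(-target_bytes // drive_bytes)   # ceiling division on integers


decimal_15tb = drives_needed(15 * 10**12)
binary_15tb = drives_needed(15 * 2**40)

print(decimal_15tb, binary_15tb)
```

Either way the count lands in the 80s to low 90s, so "nearly 100" is in the right neighbourhood before any redundancy is added.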
DWR is Ajax for Java
What's the "page widening bug"?
Simulate the human brain.
In an excruciatingly slow manner.
Then watch it go insane and try to take over the world.
Terminator 2, anyone?
what is that, 5 bytes? ;-P
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
Great name for a person with size issues.
According to this page at SGI, XFS supports filesystems that are "9 exabytes" big, which is roughly 1,000,000 terabytes. Under kernel 2.4, the limit is still 2TB, but when the block devices are converted to 64bit, it'll be larger...
Keep it up guys - until they create some sort of 'Linux kernel mailing list' the Slashdot front page is my only source for this information.
26^3 = 9 x 10^18 = 9 exabytes
check out the feature list.
Once upon a time, I saw a big company producing some classified devices for the Soviet military-industrial complex. Of course, the company had an accounting department. And there was a company accounting database. It was a single file about 80 MBytes long (the typical drive size in those days was 20-40 MB). To simplify access, the programmers who created the database software decided that all the data from time immemorial was to be kept in this file. The file grew with every operation, and since the data was thought to be needed forever there was no method to remove irrelevant entries.
The programmers didn't imagine that in a couple of years the database would be so big that it would not fit on any available HDD.
Maybe it will be a lesson for some people who are going to misuse the file system features?
Woohoo! Now I have enough space to download every pirated movie before it comes out in theatres! Woohoo! *rips up Star Wars tickets*
Canadian Cynic, canadian politics is less boring than you
mod this up... what the hell is a peNtabyte??
since it came out. I remember being able to have something on the order of a half dozen petabytes per disk volume, with trillions of files, and have disk searches take almost no time, all because of a proper filesystem layout. Until Linux can do that, then it isn't worth considering for use as a fileserver.
Only two filesystems, XFS and JFS, seem to really work with hard disks larger than 2 TB.
I attended the OPK Fest for the release of Windows XP about... maybe less than a year ago and found out some interesting tidbits about how the new filesystem (NTFS) is used within XP.
I know one big complaint about the W2K OS is that it takes forever to boot. Talk about truth. (I started cooking oatmeal for 2 minutes and it took W2K about the same amount of time to boot.) Then with the release of XP it suddenly got faster, much faster. Well, at the OPK Fest I learned 2 things.
1. There are 8 megs of space reserved for Windows use that are unmounted upon boot and are never really viewable unless you know the OS call. At the fest they explained that they used this space to optimize the boot time on XP. When you shut XP down, Windows saves a current mini-image of all the important files it needs to boot upon next startup. When the system boots it reads from this partition and uses that information to boot Windows, so it knows what programs need to be run, what services have been configured, etc. This in the end made XP boot up at almost double the speed (if not faster).
2. The second most intriguing feature they announced (well, the feature my jaw dropped over) was NTFS 5.0, which is also used on 2K. Even though there are some weird limitations that they talked about, NTFS 5 has the ability to handle hard disks as large as 16 pica bytes. (1000 gigs make a tera, 1000 teras make an exo, 1000 exos make a pica.) On a good day that is the whole Kazaa network with room to spare.
I don't know all the specifics as a lot of that is in-house info only, but still, that is a lot of formattable information. What would you do with 16000 exo bytes?
~Admrlnxn
"I got your mom in my trunk"
War Against Terror-bytes
but I worry about other data types.
For example, I grumble at the MS stupidity of putting all data files into one large container file in a database under Access in Windows. Which is why I never use it. I prefer discrete files. If one gets hosed, then it is easier to fix.
Obviously a database that is that big would run into other performance issues as well. Some of which are handled by Moore's law, and some of which aren't.
For similar reasons I tend to divide my drive into various partitions, regardless of which OS I use.
"It is a greater offense to steal men's labor, than their clothes"
Your harddrive penis is indeed small!
As you may know if you've been following recent IEC and IEEE standards (or if you've ever bothered to figure out exactly how large a terabyte is), what disk manufacturers call a terabyte and what this article calls a terabyte differ slightly.
When used in the standard way, the "tera" prefix means 1 * 10^12, so a terabyte would be 1 000 000 000 000 bytes. Unfortunately, computer systems don't use base 10 ("decimal"), they use base 2 ("binary"). When trying to express computer storage capacities, somebody noticed that the SI prefixes kilo, mega, giga, tera, and so on (meaning 10^3, 10^6, 10^9, 10^12,
This discrepancy causes some confusion. For instance, if you could afford to purchase such a 2 terabyte hard disk, you might well be annoyed when your system tells you your disk is almost 200 gigabytes (2 * (2^40 - 10^12)) smaller than you thought it would be (most systems would report a 2 terabyte disk as a 1.8 terabyte disk).
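The numbers in that last paragraph can be reproduced with a few lines. This is just a sketch of the arithmetic given there; the function names are made up for illustration.

```python
TB_DECIMAL = 10**12   # the SI "tera": what the disk label uses
TB_BINARY = 2**40     # the power-of-two "terabyte" most systems report in


def shortfall_bytes(n_tb):
    """Bytes 'missing' when n_tb decimal terabytes are expected to be binary ones."""
    return n_tb * (TB_BINARY - TB_DECIMAL)


def reported_binary_tb(n_tb):
    """Size the system shows for a disk sold as n_tb decimal terabytes."""
    return n_tb * TB_DECIMAL / TB_BINARY


print(shortfall_bytes(2))                # 199023255552, i.e. almost 200 gigabytes
print(round(reported_binary_tb(2), 2))   # 1.82
```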
The moral of the story is one of:
Interestingly the Slashdot community seems to think it should be a combination of 1 and 2.
IANAL, but I think Chubby Checker's got him beat hands down.
Danny.
I have written over 900 book reviews
Okay there were some remarks about storing neurons and stuff on HDs... but beside that, _WHAT_ the hell would need 2TB?
I'm still a high-school student so I haven't the faintest idea what goes on in enterprise-size companies.. so if someone could shed light on this I'd appreciate it
"The majority is always sane, Louis." -- Nessus
http://slashdot.jp
While not on the actual linux box, what about sizes of very large (e.g. > 2.1 TB) NFS mounts?
Imagine if the Truman Show (like in the movie) was recorded as one huge MPEG video - you could store it on one of these! :-)
You could fit movies of everything anyone's ever seen on a Beowulf cluster of these filesystems!
For those who wish to communicate with the rest of the world, the following calculations actually make sense:
For the uninitiated, these terms are described here
Even accounting for your typographical error, 2^63 != 9 * 10^18 (9223372036854775808 != 9000000000000000000)
Problem solved: Use lzip
MBA Managers won't notice ;-)
For the hardcore, we can build lzip into the FS. So we'll have Reiserfs, ext2, ext3, JFS, and lzipFS. Heck lzipFS might be faster than RAM!
A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
http://www-1.ibm.com/servers/eserver/clusters/s
"9 exabytes" big, which is roughly 1,000,000 terabytes
Very roughly, perhaps. 9 exabytes is actually 9,000,000 terabytes.
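The figures argued over in this sub-thread are easy to check. A small sketch, assuming decimal (SI) prefixes for "exabyte" and "terabyte":

```python
EXA = 10**18
TERA = 10**12

# 9 exabytes expressed in terabytes.
nine_eb_in_tb = 9 * EXA // TERA

# The signed 64-bit limit discussed earlier in the thread.
two_63 = 2**63
two_63_in_eb = two_63 / EXA       # in decimal exabytes
two_63_in_eib = two_63 // 2**60   # in binary (2^60-byte) units

print(nine_eb_in_tb)    # 9000000
print(two_63)           # 9223372036854775808
print(two_63_in_eib)    # 8
```

So 2^63 bytes is a bit over 9.2 decimal exabytes, or exactly 8 of the binary kind, and 9 exabytes is nine million terabytes.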
For those that haven't got hard disks this big, here's a list of names for sizes beyond megabytes and gigabytes.
From the Linux Kernel mailinglist on the status of XFS merge into 2.5:
I know it's been discussed to death, but I am making a formal request to you to include XFS in the main kernel. We (The Sloan Digital Sky Survey) and many, many other groups here at Fermilab would be very happy to have this in the main tree. Currently the SDSS has ~20TB of XFS filesystems, most of which is in our 14 fileservers and database machines. The D-Zero experiment has ~140 desktops running XFS and several XFS fileservers. We've been using it since it was released, and have found it to be very reliable.
Uh, so Peter Chubb says there is a 2 TB limit, but these science guys at Fermilab are using Linux with 20 TB of filesystems on the SGI XFS port.
Or is that _still_ at a meager 2GB limit?
Isn't this why typedefs like size_t, ssize_t, off_t, loff_t, and so on exist? Upping the filesystem size should just be a matter of changing whichever typedef is appropriate from an int to a long or a long long or something, right?
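One reason it may not be that simple, going by the quote in the story: the problem spots are places where an offset gets assigned to a plain 32-bit unsigned or int, so widening a typedef doesn't touch them. A rough sketch of the effect using Python's ctypes to stand in for C integer widths; the helper names and the 4096-byte block size are assumptions for illustration, not kernel code.

```python
import ctypes

BLOCK_SIZE = 4096   # assumption: 4 KB filesystem blocks


def byte_offset_via_u32(block):
    """Offset computed through a 32-bit unsigned intermediate (wraps silently)."""
    return ctypes.c_uint32(block * BLOCK_SIZE).value


def byte_offset_via_u64(block):
    """Same computation with a 64-bit intermediate."""
    return ctypes.c_uint64(block * BLOCK_SIZE).value


block = 2**20   # the first block past the 4 GB mark
print(byte_offset_via_u32(block))   # 0
print(byte_offset_via_u64(block))   # 4294967296
```

Changing the typedef fixes every variable declared with it, but each stray 32-bit intermediate like the first helper still has to be hunted down by hand.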
Here is a (somewhat incomplete) answer to the two questions everyone seems to have about 2TB of data:
1) Where would you store it?
Well, you could store it in a holographic Tapestry drive. The prototype, just unveiled a few months ago, stores 100GB in a removable disk, and that is nowhere near the max density of the technology. In their section on projects for the tech, they say that a floppy-sized disk should hold about 1TB in a couple years. Impressive.
2) What would you do with it?
Well, other than high-definition video or scientific experiments, nothing on your own PC, unless you are making a database of all the MP3s ever made or backing up the Library of Congress. But on a file server, you could easily use this much space. The 2TB limit will probably never affect most home users (realizes he will be quoted as an idiot in 10 years when 50TB HDs are standard). On the other hand, Tapestry will probably be useful in portable devices, esp video cameras.
I hereby place the above post in the public domain.
If you open the page you are reading right now, you will see that the page is widened. It's MS IE's fault since they don't follow the standard. Well, if you are reading this news in a different browser then I guess you won't notice it.
Reaching beyond terabytes, beyond pentabytes, on into exabytes
Woohoo! A filesystem on a tape drive, that's what I need.
When information is power, privacy is freedom.
It seems to me that it would be more practical to make the file storage and management system be *independent* of the OS. This would allow storage companies to get economies of scale by not having to worry about OS-specific issues.
The "native" disk storage could be used as a kind of cache. The "big fat" storage would be like a *service* that could be local or remote. The OS would not care. It simply makes an API call to the "storage service".
Table-ized A.I.
Looks like we'll have to come up with a different naming scheme. Someone's already trademarked the exabyte.
Couldn't it weaken the trademark to have Western Digital or Seagate making a '9 exabyte' hard drive? Or HP or Sony making an 'exabyte-class' tape drive? Wouldn't a judge find (in favor of Exabyte) that the consumer would easily be confused?
*The USPTO are idiots.*
Give me my freedom, and I'll take care of my own security, thank you.
With this story posting, it's official:
Timothy is, without doubt, the worst Slashdot story poster in the universe.
Try running lmbench on MacOS X, or even on plain FreeBSD. Now try on Linux, and then ask yourself why you got your ass whipped.
There is a microkernel Linux called MkLinux. It works just like MacOS X. Nobody uses this because Linux users don't fall for the hype. Microkernels are slow by nature.
Linus has the guts to say NO to the latest cool trendy features. He has the guts to say NO to supporting stuff that nobody will be using for the next few years, and then say YES when the time is right.
Microkernels suffer from communication overhead. No matter how low you make it, you still won't beat ZERO. So you are doomed to slow performance, and only look good when the non-microkernel OS that you compare against is lame. Then you suffer the complexity of managing the messages, because when you glue simple components together the GLUE IS NOT FREE. Implementing UNIX semantics forces you to send lots of messages or, as is the case with MacOS X, to just give up and put a BSD kernel on top of a now-pointless microkernel layer.
Remember all the hell when the world moved from 16 bit to 32 bit? All sorts of lazy code was broken. And here we go again. This isn't a Linux thing or a Windows thing; it's just the basic nature of human beings.
The good news is, once we move to a 64-bit processor, that's it. We'll correct the code one more time and that's the end of it, since 64 bit ints are sufficient for any imaginable program.
OK, lots of usable storage space is good, but what about the time epoch? Won't it run out in 2036 or something?
When's that going to be fixed?
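For reference, the rollover date can be worked out directly. This sketch assumes the usual Unix convention of a signed 32-bit count of seconds since 1970-01-01 UTC, which the comment above doesn't spell out.

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit seconds counter can hold.
last_32bit_second = 2**31 - 1

rollover = datetime.fromtimestamp(last_32bit_second, tz=timezone.utc)
print(rollover.isoformat())   # 2038-01-19T03:14:07+00:00
```

Under that assumption the counter runs out in January 2038 rather than 2036.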
Hi,
Jamie McCarthy here. Since IE is not open source, nor will it ever be, I simply cannot fix this bug. Yes, I *know* I can simply write a filter for it in Slashcode, but that just takes too much of my time. Yes, I know I use IE (for Mac) too, just like 88% of Slashdot, but I'm a fucking hypocrite, so what the hey? Anyways, just thought I'd keep you updated on the Page Widening Featu^H^H^H^H^H Bug on Slashdot.
Love,
Jamie
Actually it's rumored to be based on SQL Server.
Your numbers would yield 105 petabytes, not 105 exabytes. Huge difference.
Microkernels may be slightly slower by nature than monolithic kernels like Linux, but the difference is rapidly becoming a nonissue with increasing processor speeds and better kernel designs.
In the meantime, microkernels are allowing for a host of new and useful features that monolithics just can't do: user-mounted file systems, increased security at the kernel level, dramatically increased ease and speed of development of kernel-level components, the ability to load entire separate operating systems interfacing with the same or separate hardware with no external software... to name a few.
Eventually speed will no longer be considered a primary goal, in fact, it is slowly but surely becoming trivial. Microkernels will win out if monolithic/Linux advocates can only use the speed argument to try to show the superiority of their kernel.
If you're not part of the solution, you're part of the precipitate.
Intelligent people have no problem with the idea that a kilobyte has 1,024 characters. Hard drive manufacturers always have, but they are hardly paragons worthy of emulation.
Stamp out the kibibyte nonsense now, before it gets any further.
When things go larger and larger, I get confused.
Okay, I know what an exabyte is, but what are the still-larger ones?
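For what it's worth, here is a small sketch that lists the standard names in order, with the decimal (SI) prefix next to the binary (IEC) one argued about elsewhere in this thread. The function name is made up for illustration.

```python
SI_PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]
IEC_PREFIXES = ["kibi", "mebi", "gibi", "tebi", "pebi", "exbi", "zebi", "yobi"]


def prefix_table():
    """Return (si_name, si_bytes, iec_name, iec_bytes) rows, smallest first."""
    rows = []
    for i, (si, iec) in enumerate(zip(SI_PREFIXES, IEC_PREFIXES), start=1):
        rows.append((si + "byte", 10 ** (3 * i), iec + "byte", 2 ** (10 * i)))
    return rows


for si_name, si_bytes, iec_name, iec_bytes in prefix_table():
    print(f"{si_name:>10} = 10^{len(str(si_bytes)) - 1:<2}  {iec_name:>9} = {iec_bytes}")
```

After exa come zetta (10^21) and yotta (10^24).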
Muchas Gracias, Señor Edward Snowden !
Why the *$&% was the parent modded up? The day any sane person (as opposed to a hypocephilic metriphile) uses kibibyte, mebibyte, gibibyte or any of those thrice-accursed neologisms is the day that the world begins to end.
Without metric (more correctly called SI), you would not have the terms kilo, mega, giga, and so on to abuse. These SI prefixes are so useful precisely because they are standardized in their meaning and are widely recognized.
Intelligent people have no problem with the idea that a kilobyte has 1,024 characters.
s/Intelligent/Ignorant
That kilo is a long-standardized term meaning 1000 in every field except for computer science should concern you. I suppose you are unfamiliar with the terms kilogram, kilometer, kilohertz, and the many other standard SI units (or prefix + unit combinations). The exact meaning of kilo as 1000 has great weight throughout the world. When engineers, physicists, mathematicians, and indeed people in general first use computers, it is most reasonable to expect kilo to mean 1000. To not accept this logical system agreed on by many of the world's most important standards bodies (including those in the computing and engineering fields) seems stubborn.
Would you have the world redefine kilometer as being 1024 meters?
Would you blatantly ignore IEC/ANSI C standards when coding to the point of confusion?
Rather than attacking the personality or state of mind of those in favor of this proposal (which by extension includes IEC and IEEE members), would you care to provide a reasoned argument against these standards?
I should also mention that what you seem attached to seems far from uniform in its use.