Linux Breaks 100 Petabyte Ceiling

512? That can't be right. by fonebone · 2001-11-06 21:20 · Score: 4, Funny

The 144 Petabyte figure is obtained by raising two to the power of 48, and multiplying it by 512.

Hm, that can't be right, I swear I heard it was supposed to be two raised to the power of 50, multiplied by 128.. hm.

--
when the rain comes, they run and hide their heads. they might as well be dead.

Re:Forgot my Greek by BitwizeGHC · 2001-11-06 21:27 · Score: 3, Informative

1e3 terabytes, or 1e6 gigabytes.

--
N4st0r, trixx0r h0bb1tz0rz! Th3y st0l3 0ur pr3c10uzz!

One Long Video by CritterNYC · 2001-11-06 21:29 · Score: 4, Funny

This would be handy for over 8200 years of DVD video.

--
Portable versions of Firefox, GIMP, LibreOffice, etc

Re:One Long Video by lala · 2001-11-06 21:55 · Score: 5, Funny

Great!
Finally they can release the uncut version of '2001: A Space Odyssey'

Re:One Long Video by CritterNYC · 2001-11-07 05:02 · Score: 3, Funny

Oh, dear god. Please just put that suggestion down and back away.... SLOWLY!

--
Portable versions of Firefox, GIMP, LibreOffice, etc

Somewhat misleading by nks · 2001-11-06 21:33 · Score: 5, Interesting

The IDE driver supports such ridiculously large files, but no filesystem that I know of currently does, not to mention the buffer management code in the kernel.

So does Linux support 18pb files? Kind of -- pieces of it do. But the system as a whole does not.

Re:Somewhat misleading by geirt · 2001-11-06 23:15 · Score: 3, Interesting

The glibc limits the file size to 64 bit (9 million terabytes), so unless the POSIX LFS api changes, that is the current maximum file size regardless of the file system (on x86 that is).

A 9 million terabyte file size limit isn't a large problem for me ....

--

RFC1925

Or in other words... by PD · 2001-11-06 21:34 · Score: 4, Interesting

2.197265625 trillion Commodore 64's.

98.7881981 billion 1.44 meg floppy disks.

1.44 million 100 gig hard drives

or

3.5 trillion 4K ram chips (remember those?)

--
If tits were wings it'd be flying around.

XFS by starrcake · 2001-11-06 21:40 · Score: 5, Informative

http://oss.sgi.com/projects/xfs/features.html

XFS is a full 64-bit filesystem, and thus, as a filesystem, is capable of handling files as
large as a million terabytes.

2^63 = 9 x 10^18 = 9 exabytes

In future, as the filesystem size limitations of Linux are eliminated XFS will scale to the
largest filesystems

Re:Ok... by astrophysics · 2001-11-06 21:43 · Score: 3, Insightful

There are about 10^10 solar masses of mass in a large galaxy like our own. At ~10^33 g per solar mass and 10^23 atoms per gram, that's 10^66 ~ 2^219 particles in our galaxy. Believe me, scientists will make use of as much computing power, RAM, and storage space as they can get their hands on. If only the limiting factor were operating system limitations rather than the more practical realities of funding and hardware costs.

Allright... I'll bite. by Bowie J. Poag · 2001-11-06 21:43 · Score: 4, Funny

"144 PB should be enough for anybody."

- Bowie J. Poag, November 7, 2001

--
Bowie J. Poag

Article got it wrong on BeOS - 18 EXAbytes! by Snard · 2001-11-06 21:46 · Score: 5, Informative

Just a side note: BeOS has support for files up to 18 exabytes, not 18 petabytes, as stated in the article. This is roughly 18,000 petabytes, or 2^64 bytes.

Just wanted to set the record straight.

--
- Mike

OK this is great... by TheMMaster · 2001-11-06 21:46 · Score: 5, Insightful

Now, I can really imagine someone that buys a 144Pb drive (array) and will use IDE?? I would personally go for SCSI there ;-)

What I am really wondering is: is there at the current moment ANY company/application/whatever that requires this amount of storage? I thought that even a large bank could manage with a few TB's
Not intended as a flame, just interested

but still, this is a Good Thing (r)

--
Fighting for peace is like fucking for virginity

Random statistics.... by tunah · 2001-11-06 21:47 · Score: 4, Funny

Let's say you have this 144 petabyte drive. Okay it's friday, time to back up.

So you whip out your two hundred million cd recordables, and start inserting them. Let's say you get 1 frisbee for each 25 700Mb CDs.

This leaves you with eight million frisbees.

That's a stack 13 kilometres high.

So who needs this on a desktop OS again?

--
Free Java games for your phone: Tontie, Sokoban

Re:Random statistics.... by ColaMan · 2001-11-06 22:33 · Score: 5, Funny

So you whip out your two hundred million cd recordables, and start inserting them. Let's say you get 1 frisbee for each 25 700Mb CDs.

Silly Moo!

You back it up to your *other* 144 petabyte drive!

--

You are in a twisty maze of processor lines, all alike.
There is a lot of hype here.

Re:Random statistics.... by Anonymous Coward · 2001-11-06 22:59 · Score: 4, Informative

Suppose you copy at full PCI bus speed: 133 Megabytes per second. Said backup would take about 34 years.

144 PB, not really by tap · 2001-11-06 21:53 · Score: 5, Insightful

Sounds like all they are saying is that the new IDE driver can support 48 bit addressing. With 2^48 sectors of 512 bytes, you get 144 PB. But there are a LOT of other barriers to huge filesystems or files.

For instance, the Linux SCSI driver has always supported 32 bit addressing, good enough for 2 terabytes on a single drive. But until recently, you couldn't have a file larger than 2 gigabytes (1024x smaller) in Linux. I think that the ext2 filesystem still has a limit of 4 TB for a single partition.

So while the IDE driver may be able to deal with a hard drive 144 PB in size, you would still have to chop it into 4 TB partitions.
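
As a rough check on the figures in the comment above, the sketch below computes how many bytes a given number of sector-address bits can reach, assuming 512-byte sectors. The function name `lba_capacity_bytes` is made up for illustration, and the 28-bit row is included on the assumption that it is the older ATA addressing limit behind the 128 GB barrier mentioned elsewhere in the thread.

```python
SECTOR_BYTES = 512  # sector size assumed throughout the thread


def lba_capacity_bytes(address_bits, sector_bytes=SECTOR_BYTES):
    """Bytes addressable with `address_bits` bits of sector addressing."""
    return (2 ** address_bits) * sector_bytes


# 28-bit: older ATA limit (assumption), 32-bit: SCSI figure from the comment,
# 48-bit: the new IDE driver.
for bits in (28, 32, 48):
    size = lba_capacity_bytes(bits)
    print(f"{bits}-bit: {size:,} bytes = {size / 2**40:,.3f} TiB")
```

With 48 bits the result is 2^57 bytes, roughly 1.44 x 10^17, which is where the 144 PB headline number comes from. None of this says anything about filesystem or block-layer limits, which the comment points out are a separate matter.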

Re:Ok... by kkenn · 2001-11-06 21:53 · Score: 3, Informative

Well, it's good to see that Linux has caught up, but the article is not correct that Linux is the first OS to support 48-bit ATA; FreeBSD has had this support for over a month now.

See for example: this file which is one of the files containing the ATA-6r2 code, committed to FreeBSD on October 6.

Uh, no? by srichman · 2001-11-06 22:02 · Score: 3, Informative

Correct me if I'm wrong, but isn't this very very misleading? The article states that the Linux IDE subsystem can now support single ATA drives up to 144 petabytes (i.e., Linux ATA now has 48 bit LBA support), but my understanding is that many other aspects of the Linux kernel limit the maximum file size to much less.

I'm looking at the Linux XFS feature page, which states:

Maximum File Size
For Linux 2.4, the maximum accessible file offset is 16TB on 4K page size and 64TB on 16K page size. As Linux moves to 64 bit on block devices layer, file size limit will increase to 9 million terabytes (or the system drive limits).
Maximum Filesystem Size
For Linux 2.4, 2 TB. As Linux moves to 64 bit on block devices layer, filesystem limits will increase.

My understanding is that the 2TB limit per block device (including logical devices) is firm (regardless of the word size of your architecture), and unrelated to what Mr. Hedrick did. Am I wrong? Does this limit disappear if you build the kernel on a 64-bit architecture?

And, on 32-bit architectures, there's no way to get the buffer cache to address more than 16TB.

Somebody will probably correct me ... by King Of Chat · 2001-11-06 22:08 · Score: 3, Interesting

... but a couple of years ago, I was investigating OODBMSs. The sales bloke for (I think it was) Objectivity claimed that CERN were using their database for holding all the information from the particle detector things - which I can see being a shedload of data (3d position + time + energy). He was suggesting figures of 10 petabytes a year for database growth (so it must be frigging huge by now).

Of course, this was probably salescrap. Does anyone know the truth on this?

--
This sig made only from recycled ASCII

Re:Somebody will probably correct me ... by Anonymous Coward · 2001-11-06 22:47 · Score: 5, Insightful

Of course, this was probably salescrap. Does anyone know the truth on this?

The BABAR experiment at SLAC is using Objectivity for data storage. Unfortunately, I cannot find a publicly available web page about computing at BABAR right now.

The amount of data BABAR produces is on the order of tens of terabytes per year (maybe a hundred), and even storing this amount in Objectivity is not without problems. The LHC, which is currently under construction, will generate much more data than BABAR, but even if they reach 10 petabytes per year one day, I very much doubt that they will be able to store this in Objectivity.

1st desktop OS? Well, not quite. by mr · 2001-11-06 22:35 · Score: 5, Informative

Before you start thumping your chest about how superior or cutting edge *Linux is, go look at these two links
A slashdot story pointing out how without the FreeBSD ATA code, the Linux kernel would be 'lacking'
The FreeBSD press release announcing the code is stable

If The Reg had actually researched the story, Andy would have noticed it is not a 'first' but more a 'dead heat' between the 2 leading software libre OSes. Instead, The Reg does more hyping of *Linux.

--
If it was said on slashdot, it MUST be true!

Pebibytes? by Rabenwolf · 2001-11-06 22:44 · Score: 4, Informative

And this is even more impressive in pebibytes, too.

Well, according to the IEC standard, one petabyte is 10^15 (or 1e+15) bytes, while one pebibyte is 2^50 (or 1.125899e+15) bytes.

So 144 petabytes is 1.44e+17 bytes or 127.89769 pebibytes. Can't say that's more impressive tho. :P
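
For anyone who wants to redo the conversion, here is a minimal sketch using the IEC definitions quoted above (peta = 10^15, pebi = 2^50). It also converts the other reading of the headline number, 2^48 sectors of 512 bytes, which is an assumption about how the 144 figure was derived rather than something this comment states. The helper name `to_pebibytes` is made up.

```python
PETABYTE = 10 ** 15   # SI prefix peta
PEBIBYTE = 2 ** 50    # IEC binary prefix pebi


def to_pebibytes(n_bytes):
    """Convert a byte count to pebibytes."""
    return n_bytes / PEBIBYTE


decimal_reading = 144 * PETABYTE   # "144 petabytes" taken literally
binary_reading = 2 ** 48 * 512     # 48-bit addressing, 512-byte sectors

print(to_pebibytes(decimal_reading))  # roughly 127.9
print(to_pebibytes(binary_reading))   # 2**57 / 2**50
```

The second value is exactly 128, since 2^57 / 2^50 = 2^7, so which figure you get depends on which byte count you start from.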

Re:working with large files by Effugas · 2001-11-06 22:56 · Score: 4, Informative

SSH has done quite a bit of work to support +2GB files. As always, the following will and always has worked:

cat file | ssh user@host "cat > file"

More recent builds of SCP will also support +2GB, so:

scp file user@host:/path
or
scp file user@host:/path/file

will both work.

In fact, probably the best way to sync two directories is rsync. Rsync's major weakness is that it's *tremendously* slow for large numbers of files, and I believe it has to read every byte of a large file before it can incrementally transfer it (so you're looking at 2GB+ of reading before transferring). The following will do rsync over ssh:

rsync -e ssh file user@host:/path/file
rsync -e ssh -r path user@host:/path

For incremental log transfers, I actually had a system built that would ssh into the remote side, determine the filesize of the remote file, and then tail from the total file size minus the size of the remote file. It was a bit messy, but it was incredibly reliable. Did have problems when the remote logs got cycled, but it wasn't too ugly to detect that remote filesize was smaller than localfilesize. Just a shell script, after all.

SFTP should, as far as I know, handle 2GB+ without a hitch.

Both SCP and SSH of course have compression support via the -C flag; alternatively you can pipe SSH through gzip.

Email me for further info; there are some SSH docs on my home page as well. Good luck :-)

--Dan
www.doxpara.com
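
The incremental log transfer described above can be sketched roughly as follows. This is not the original shell script: it is a local-only Python illustration that assumes both files are reachable as ordinary paths, leaves out the ssh transport entirely, and uses made-up names (`bytes_to_send`, `sync_log`). Rotation is guessed from the source log being smaller than the copy, which is one reading of the check the comment mentions.

```python
import os


def bytes_to_send(source_path, dest_size):
    """Return (rotated, data): the part of the source log the copy lacks.

    If the source is smaller than the copy, assume the log was rotated
    and return the whole source file instead of just its tail.
    """
    source_size = os.path.getsize(source_path)
    rotated = source_size < dest_size
    offset = 0 if rotated else dest_size
    with open(source_path, "rb") as f:
        f.seek(offset)
        return rotated, f.read()


def sync_log(source_path, dest_path):
    """Bring dest_path up to date with source_path; return bytes written."""
    dest_size = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    rotated, data = bytes_to_send(source_path, dest_size)
    # Append the new tail normally; start over if the log was rotated.
    mode = "wb" if rotated else "ab"
    with open(dest_path, mode) as f:
        f.write(data)
    return len(data)
```

A size comparison cannot notice a rotation if the new log has already grown past the size of the old copy, so a real setup would likely need an extra check such as comparing the first few bytes of both files.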

Example... by mirko · 2001-11-06 23:02 · Score: 3, Informative

Here is a recent article which may answer your question:

Large-Scale Video Archiving?

BTW, it may also re-open the debate:

Is Storage Capacity Outstriping Backup Capability?

--
Trolling using another account since 2005.

Reality check... by Anonymous Coward · 2001-11-06 23:09 · Score: 5, Informative

Does anybody realize that, even with a data rate on the order of 1GB/s, much higher than what current platters can do, it takes about 5 years to fill such a disk?

I'm already fed up of the time it takes to back up large disks to tape. Drive transfer rate has not improved at the rate of disk capacity in the last few years and is becoming a bottleneck. It was unimportant when the backup time of a single disk was well below one hour (our Ultrium tapes give about 40Gb/hour).

Just figure that if you want to transfer 144PB in about one day, you need a transfer rate of the order of 1TB/s. Electronics is far from there since it means about 10 terabits/second. Even fiber is not yet there. Barring a major revolution, magnetic media and heads can't be pushed that far. At least it is way further than the foreseeable future.

Don't get me wrong, it is much better to have more address bits than needed to avoid the painful limitations of 528 Mb, 1024 cylinders etc... But, as somebody who used disks over 1Gb on mainframes around 1984-1985, I easily saw all the limitations of the early IDE interfaces (with the hell of CHS addresses and its ridiculously low bit numbers once you mixed the BIOS and interface limitations) and insisted on SCSI on my first computer (now CHS is history thanks to LBA, but the transition has been sometimes painful).

However, right now big data centers don't always use the biggest drives because they can get more bandwidth by spreading the load over more drives (they are also slightly wary of the greatest and latest because reliability is very important). Backing up starts to take too much time.

In short, the 48 bit block number is not a limit for the next 20 years or so. I may be wrong, but I'd bet it'll take at least 15 years, perhaps much more because it is too dependent on radically new technologies and the fact that the demand for bandwidth to match the increase in capacity will become more prevalent. Increasing the bandwidth is much harder since you'll likely run into noise problems, which are fundamental physical limitations.
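
The two transfer-rate figures in the comment above are easy to reproduce. The sketch below assumes decimal units (144 PB = 144 x 10^15 bytes, 1 GB/s = 10^9 bytes per second) and a 365.25-day year; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

CAPACITY_BYTES = 144 * 10 ** 15  # 144 PB, decimal reading


def fill_time_years(capacity_bytes, bytes_per_second):
    """Years needed to write the whole disk at a constant rate."""
    return capacity_bytes / bytes_per_second / SECONDS_PER_YEAR


def rate_for_deadline(capacity_bytes, seconds):
    """Bytes per second needed to move the whole disk within `seconds`."""
    return capacity_bytes / seconds


# Filling the disk at 1 GB/s: about 4.6 years.
print(fill_time_years(CAPACITY_BYTES, 10 ** 9))

# Rate needed to move everything in one day, in TB/s: about 1.67.
print(rate_for_deadline(CAPACITY_BYTES, SECONDS_PER_DAY) / 10 ** 12)
```

Both results agree with the comment's "about 5 years" and "of the order of 1TB/s" to within rounding.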

Waiting for the obligitory... (sp?) by Talez · 2001-11-06 23:16 · Score: 3, Funny

It is a start by Zeinfeld · 2001-11-06 23:21 · Score: 5, Interesting

The announcement is pretty irrelevant, all it says is that there is a Linux driver for the new disk drive interface that supports bigger disks.

The real advance here is that the disk drive weenies have at last realised that they need to come out with a real fix for the 'big drive' problem and not yet another temporary measure.

Despite the fact that hard drives have increased from 5 Mb storage to 100Gb over the past 20 years the disk drive manufacturers have time after time proposed new interface standards that have been obsolete within a couple of years of their introduction.

Remember the 2Gb barrier? Today we are rapidly approaching the 128Gb barrier.

What annoys me is that the disk drive manufacturers seem to be unable to comprehend the idea of 'automatic configuration'. Why should I have to spend time telling my BIOS how many cylinders and tracks my drive has? I have a couple of older machines with somewhat wonky battery backup for the settings; every so often the damn things forget what size their boot disk is. Like just how many days would it take to define an interface that allowed the BIOS to query the drive about its own geometry?

Of course in many cases the figures you have to enter into the drive config are fiddled because the O/S has some constraint on the size of drives it handles.

We probably need a true 64 bit Linux before people start attaching Petabyte drives for real. For some reason file systems tend to be rife with silly limitations on file sizes etc.

Bit saving made a lot of sense when we had 5Mb hard drives and 100kb floppy drives. It does not make a lot of sense to worry about a 32bit or 64 bit file size field when we are storing 100kb files.

If folk go about modifying Linux, please don't let them just deal with the drives of today. Insist on at least 64 bits for all file size and location pointers.

We are already at the point where terabyte storage systems are not unusual. Petabyte stores are not exactly commonplace but there are several in existence. At any given time there are going to be applications that take 1000 odd of the largest disk available in their day. Today that means people are using 100Tb stores, it won't be very long before 100Pb is reached.

--
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/

Just how much is 144 PB? by Hektor_Troy · 2001-11-06 23:42 · Score: 5, Interesting

144 Petabytes doesn't sound like a lot. When putting it into writing:

144,000,000,000,000,000 or 144*10^15

it's impossible to comprehend.

Here's a way to visualise it - although it's also mind-boggling:

Take a sheet of paper with the squares on it. If you put a single byte in each 5mm by 5mm (1/5" by 1/5") square and use both sides, you'd need:

3,600,000 km^2 of paper to have room for those 144 PB. That's roughly 1,325,525 square miles for you people who don't use the metric system.

So when people say "it doesn't sound like a lot", you know how to get them to understand that it really IS a lot.

--
We do not live in the 21st century. We live in the 20 second century.
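
A quick way to rerun the graph-paper estimate above, as a sketch under stated assumptions: 144 x 10^15 bytes, one byte per 5 mm square, and an approximate factor of 0.386102 square miles per square kilometre. The `sides` parameter and the function name are made up for illustration.

```python
BYTES = 144 * 10 ** 15
SQUARE_SIDE_M = 0.005          # one 5 mm grid square per byte
KM2_PER_M2 = 1e-6
SQ_MILES_PER_KM2 = 0.386102    # approximate conversion factor


def paper_area_km2(n_bytes, sides=1):
    """Sheet area in km^2 needed when writing on `sides` sides of the paper."""
    squares_per_side = n_bytes / sides
    return squares_per_side * SQUARE_SIDE_M ** 2 * KM2_PER_M2


one_sided = paper_area_km2(BYTES, sides=1)
two_sided = paper_area_km2(BYTES, sides=2)

print(one_sided, one_sided * SQ_MILES_PER_KM2)
print(two_sided, two_sided * SQ_MILES_PER_KM2)
```

Under these assumptions the 3,600,000 km^2 figure corresponds to writing on one side only, and using both sides would halve the paper needed. The square-mile value computed here (about 1.39 million for one side) also comes out somewhat different from the one quoted in the comment.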

Re:Big deal by HalJohnson · 2001-11-07 01:00 · Score: 4, Informative

Typically I wouldn't even waste time answering such an obvious troll, but maybe you haven't realized what open source is all about, let me make it succinct.

This obviously mattered to the people who implemented it. If you'd rather see development move in a different direction, by all means, write some code that you feel is useful.

See, the people who implemented this probably don't give a damn what you feel is important, they care about what they feel is important.

It's really very simple, put up or shut up.

Limit is for a single IDE disk by wowbagger · 2001-11-07 01:02 · Score: 3, Informative

This limit is for a SINGLE IDE disk. Now, if you use Logical Volume Management (which is in the standard 2.4 kernel, no patches required) you can combine multiple disks into one.

Since my machine has 2 IDE controllers, with 2 buses each, and 2 drives per bus, you could make a system with eight 144 PB drives, put an XFS partition on it, and have 1152.92 PB of storage.

And for meaningless statistics' sake: I make my MP3s (from CDs that I own, thankyouverymuch) at an average of 160 kb/sec. At that rate, the specified drive array would store 1826693 YEARS of MP3s. None of which would be Britney Spears.

--
www.eFax.com are spammers

1.44 petabytes is half a lifetime by mrogers · 2001-11-07 02:30 · Score: 5, Insightful

The size of information storage devices and the bandwidth of networks are approaching meaningful limits: the size and bandwidth of human experience. Tor Norretranders claims in his book The User Illusion that the amount of information absorbed by the senses is around 11 Mbits per second. In other words, a totally immersive virtual experience with sight, sound, smell, taste, touch and motion could be transmitted over a standard Ethernet connection. An entire day of a human life could be recorded in perfect detail (with no compression) on a 120 GB disk. So there is a limit to how much information you could ever want to store. In your entire life you will experience less than 3.5 petabytes of information. 1.44 petabytes will never seem small to a human being.

However, there might one day be information processing systems to which 1.44 petabytes is a small amount of information. In a sense, these systems will have a richer experience of the world than human beings. I wonder if human consciousness would seem marvellous or valuable to such a machine.
