Linux Breaks 100 Petabyte Ceiling
*no comment* writes: "Linux has broken through the 100 petabyte ceiling, and has done it at 144 petabytes." And this is even more impressive in pebibytes, too.
The IDE driver supports such ridiculously large files, but no filesystem that I know of currently does, not to mention the buffer management code in the kernel.
So does Linux support 18PB files? Kind of -- pieces of it do. But the system as a whole does not.
2.197265625 trillion Commodore 64's.
98.7881981 billion 1.44 meg floppy disks.
1.44 million 100 gig hard drives
or
3.5 trillion 4K ram chips (remember those?)
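For anyone who wants to check the comparisons above, here is a rough sketch. The unit sizes are my assumptions, not the poster's: 144 PB taken as 144 * 10^15 bytes, a Commodore 64 as 64 KiB, a floppy as 1,457,664 usable bytes, and a "4K" chip as 4096 bytes. Under those assumptions the first three figures match, while the 4K figure comes out nearer 35 trillion.

```python
# Assumed sizes in bytes; none of these are stated in the thread.
TOTAL_BYTES = 144 * 10**15  # 144 PB, decimal

UNIT_BYTES = {
    "c64": 64 * 1024,        # Commodore 64 with 64 KiB of RAM
    "floppy": 1_457_664,     # usable bytes on a formatted 1.44 MB floppy
    "hdd_100gb": 100 * 10**9,
    "ram_4k": 4096,          # "4K" read as 4096 bytes (could also mean 4 Kbit)
}


def equivalents(total_bytes=TOTAL_BYTES):
    """Return how many of each unit it takes to hold total_bytes."""
    return {name: total_bytes / size for name, size in UNIT_BYTES.items()}


if __name__ == "__main__":
    for name, count in equivalents().items():
        print(f"{name:>10}: {count:,.0f}")
```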
If tits were wings it'd be flying around.
Is 1 petabyte 1000^5 or 1024^5 (i.e. is it 10^15 or 2^50)?
:-)
If 1kB = 1024 Bytes, then I've always assumed that 1MB = 1024kB (instead of 1000kB), 1GB = 1024MB, and so on.
Normally this doesn't make that much difference, but when you consider the cost of a 16 (144-128) petabyte hard drive, the difference becomes more important.
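One way to see where both 144 and 128 come from, assuming the limit is 48-bit sector addressing with 512-byte sectors (that assumption is mine; nobody in the thread states it): the same byte count reads as about 144 in powers of ten and exactly 128 in powers of two.

```python
# Assumptions (not from the thread): 48-bit sector addresses, 512-byte sectors.
LBA_BITS = 48
SECTOR_BYTES = 512


def max_capacity_bytes(lba_bits=LBA_BITS, sector_bytes=SECTOR_BYTES):
    """Largest addressable capacity for the given address width."""
    return (2 ** lba_bits) * sector_bytes


if __name__ == "__main__":
    total = max_capacity_bytes()
    print(f"bytes:            {total:,}")
    print(f"decimal PB (10^15): {total / 10**15:.3f}")
    print(f"binary PiB (2^50):  {total / 2**50:.0f}")
```

So the "16 petabyte" gap is just the two unit conventions applied to one number.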
... but a couple of years ago, I was investigating OODBMSs. The sales bloke for (I think it was) Objectivity claimed that CERN were using their database for holding all the information from the particle detector things - which I can see being a shedload of data (3d position + time + energy). He was suggesting figures of 10 petabytes a year for database growth (so it must be frigging huge by now).
Of course, this was probably salescrap. Does anyone know the truth on this?
This sig made only from recycled ASCII
From my perspective, while obscenely large limits on file system sizes are no bad thing, I'm more interested by the prospect for scalability in the context of realistic problems. I see much larger challenges in establishing systems to maximally exploit locality of reference. I'd also like to see memory mapped IO extended to allow direct use to be made of entire large scale disks in a single address space using a VM-like strategy ... but I guess this will only be deemed practicable once we're all using 64 bit processors. Are there any projects to approximate this on 32 bit architectures?
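Not a project, but the usual workaround on a small address space is to map only a window of the file at a time. A minimal sketch, assuming an ordinary file and Python's standard `mmap` module; the helper name `read_through_window` is made up for illustration, and the caller is assumed to keep `offset + length` within the file.

```python
import mmap
import os
import tempfile


def read_through_window(path, offset, length):
    """Read `length` bytes at `offset` by mapping only the region around it."""
    gran = mmap.ALLOCATIONGRANULARITY
    start = (offset // gran) * gran   # mmap offsets must be granularity-aligned
    skip = offset - start             # bytes between window start and the data
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), skip + length,
                       access=mmap.ACCESS_READ, offset=start) as view:
            return view[skip:skip + length]


if __name__ == "__main__":
    # Demo on a small temporary file; a real use would point at a huge file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    with open(path, "wb") as f:
        f.seek(8 * 1024 * 1024)
        f.write(b"needle")
    print(read_through_window(path, 8 * 1024 * 1024, 6))
    os.remove(path)
```

Only the window consumes address space, so the file can be far larger than what a 32-bit process could map in one go, at the cost of remapping as you move around.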
The real advance here is that the disk drive weenies have at last realised that they need to come out with a real fix for the 'big drive' problem and not yet another temporary measure.
Despite the fact that hard drives have increased from 5MB of storage to 100GB over the past 20 years, the disk drive manufacturers have time after time proposed new interface standards that have been obsolete within a couple of years of their introduction.
Remember the 2GB barrier? Today we are rapidly approaching the 128GB barrier.
What annoys me is that the disk drive manufacturers seem to be unable to comprehend the idea of 'automatic configuration'. Why should I have to spend time telling my BIOS how many cylinders and tracks my drive has? I have a couple of older machines with somewhat wonky battery backup for the settings, and every so often the damn things forget what size their boot disk is. Just how many days would it take to define an interface that allowed the BIOS to query the drive about its own geometry?
Of course in many cases the figures you have to enter into the drive config are fiddled because the O/S has some constraint on the size of drives it handles.
We probably need a true 64 bit Linux before people start attaching Petabyte drives for real. For some reason file systems tend to be rife with silly limitations on file sizes etc.
Bit saving made a lot of sense when we had 5MB hard drives and 100KB floppy drives. It does not make a lot of sense to worry about a 32 bit or 64 bit file size field when we are storing 100KB files.
If folk go about modifying Linux, please don't let them just deal with the drives of today. Insist on at least 64 bits for all file size and location pointers.
We are already at the point where terabyte storage systems are not unusual. Petabyte stores are not exactly commonplace but there are several in existence. At any given time there are going to be applications that take 1000 odd of the largest disk available in their day. Today that means people are using 100TB stores; it won't be very long before 100PB is reached.
Looking for an Information Security student project suggestion?
Try http://dotcrimeManifesto.com/
I figure that at ATA-100 speeds, it would take 49 years to read the entire file.
144 * 2^50 # n bytes
/ (100 * 2^20) # bytes/sec ATA-100
= 1.44 * 2^30 # n I/O seconds
/ (60*60*24*365) # ~ secs/year
= 49.03 # n I/O years
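The same arithmetic as a quick script, for anyone who wants to plug in a different transfer rate. It keeps the assumptions used above: binary units (2^50 bytes per petabyte, 2^20 bytes per megabyte), a sustained 100 MB/s, and 365-day years.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def years_to_read(total_bytes, bytes_per_second):
    """Years of continuous I/O needed to read total_bytes at a fixed rate."""
    return total_bytes / bytes_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    total = 144 * 2**50   # 144 petabytes, binary units
    rate = 100 * 2**20    # ATA-100 peak rate, assumed to be sustained
    print(f"{years_to_read(total, rate):.2f} years")
```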
144 Petabytes doesn't sound like a lot. When putting it into writing:
144,000,000,000,000,000 or 144*10^15
it's impossible to comprehend.
Here's a way to visualise it - although it's also mind-boggling:
Take a sheet of paper with the squares on it. If you put a single byte in each 5mm by 5mm (1/5" by 1/5") square and use both sides, you'd need:
3,600,000 km^2 of paper surface to have room for those 144 PB. That's roughly 1,390,000 square miles for you people who don't use the metric system.
So when people say "it doesn't sound like a lot", you know how to get them to understand that it really IS a lot.
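A quick check of the paper figure, as a sketch under the stated assumptions: one byte per 5mm by 5mm square and 144 * 10^15 bytes in total. By this arithmetic the printed surface is 3.6 million km^2, which is about 1.39 million square miles; since both sides are used, the sheets themselves would cover half of that.

```python
TOTAL_BYTES = 144 * 10**15
SQUARE_MM2 = 5 * 5                    # area of one 5mm x 5mm square
MM2_PER_KM2 = 10**12                  # (10^6 mm per km) squared
KM2_PER_SQ_MILE = 1.609344 ** 2       # from the exact mile-to-km definition


def paper_surface_km2(total_bytes=TOTAL_BYTES):
    """Printed surface needed at one byte per square, in km^2."""
    return total_bytes * SQUARE_MM2 / MM2_PER_KM2


if __name__ == "__main__":
    surface = paper_surface_km2()
    print(f"printed surface: {surface:,.0f} km^2")
    print(f"               = {surface / KM2_PER_SQ_MILE:,.0f} square miles")
    print(f"sheet area, both sides used: {surface / 2:,.0f} km^2")
```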
We do not live in the 21st century. We live in the 20 second century.
Those were problems with the amount of data that can be physically stored in a device -- not a problem with the size of a file. This achievement has only passing significance, as there is currently no way to include that much storage in one device!
It's been a long time.