Large File Problems in Modern Unices
david-currie writes "Freshmeat is running an article that talks about the problems with the support for large files under some operating systems, and possible ways of dealing with these problems. It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."
Who needs more than 512k of RAM??
The problem is nonexistent in the BSDs, which use the large file (64 bit) versions anyway. And you have to use a certain -D flag if your OS (like Linux) doesn't use the 64 bit versions. Whoopdiedoo. Not so hard. Recompile and be happy.
Question answered, move along, nothing to see here :)
Cover your eyes and click this link!
Databases, Movie files, Backup files (think dumps to tapes). Animations, 3D modelling.... Lots of things need a > 2GB file size. Iain
---- "I would be careful in separating your weirdness, a good quirky quantum weirdness, from the disturbed weirdnes
Video. Raw, uncompressed, high-quality video with a sound channel is fucking HUGE. Look how big DivX files are, and they're compressed many, many times over.
And compressing video on-the-fly isn't feasible if you're going to be tweaking with it, so that's why people use raw video.
-Mark
Ever heard of something like movie-editing? You can get huge files really fast.
-- we're dressed in green, and we're feeling mean
Real analytical work can easily produce files this large. Output for analyses of structures with more than half a million elements and several million degrees of freedom can EASILY produce output of over two gigs. Yes, these results can and should be split, but sometimes it makes sense to keep them together as a matter of convenience. Plus, there IS a small performance hit when dealing with multiple files on most of the major FEA packages.
vmware uses files as virtual disks. 2GB would be a really, really small disk. UML does the same, using the loop device feature of Linux. Again, a filesystem in a file. Again, 2GB is not much. Simulating 20GB would need 10 files.
Feels like 64kbyte segments somehow...and I really don't want to have those back.
Come on. Even Bill Gates admitted that half a meg ain't enough.
640K, on the other hand, should be enough for anyone...
-Mark
my data warehouse at work is 600GB and grows at a rate of 4GB per day.
the production database that drives the sites is like 100GB
welcome to last week. 2GB is tiny.
A year spent in artificial intelligence is enough to make one believe in God.
I said this to some unix 'so called experts' in 95, and they said, oh why do you need >2gig
I can just laugh at them now...
Liberty freedom are no1, not dicks in suits.
I am not agreeing (or disagreeing) with the original post, but having a database > 2 GB has nothing to do with having a single file over 2 GB. A db != a file system (except for MySQL perhaps).
I can think of some:
And that's just without thinking twice...there are probably many more reasons why people would want files >2 GB.
Every expression is true, for a given value of 'true'
--Bill Gates
We are seeing problems with off_t growing from 32 to 64 bits. We are also going to see this when we start going to a 64 bit time_t, as well (albeit not as badly - off_t is probably used more than time_t is.)
However, the pain is coming - remember we have only about 35 years before a 64 bit time_t is a MUST.
I'd like to see the major distro vendors just "suck it up" and say "off_t and time_t are 64 bits. Get over it."
Sure, it will cause a great deal of disruption. So did the move from aout to elf, the move from libc to glibc, etc.
Let's just get it over with.
www.eFax.com are spammers
So my wife says to me, "Honey, do I look fat in this filesystem ?"
I replied, "Sweetie, I married you for your trust fund not your cluster size."
Oh, you're still not convinced, well see it this way: when in the future will you ever need to burn a DVD?
Well? A typical one sided DVD-R holds around 4 GB of data (somewhat more), if you use both sides, you can get more than 8 GB of data on it. That's way bigger than 2 GB, no? Now, how big must your image be before you burn it on there? well?
Right...
Yes. Just like "matrices" is the plural of "matrix". Not that the words have a similar etymology - according to dictionary.com it's, in the authors' words, "A weak pun on Multics".
Switch back to Slashdot's D1 system.
Two words:
Video Editing
Daniel
Carpe Diem
I have most all of my older system images available to inspect. The loopback devices under Linux are tailor made for this type of thing.
I am puzzled as to why you mention the seek times. Surely you would agree that the seek time should be only inversely geometrically related to size, the particular factors depending on the filesystem. Any deviation from the theoretical ideal is the fault of a particular OS's implementation. My experience is that this is not significant.
(user dmanny on wife's machine, ergo posting as AC)
We don't have this problem: 4 petabyte maximum file size, 1 terabyte tested at present. http://www-1.ibm.com/servers/aix/os/51spec.html
On the Windows side many people like to save every message they send or receive to cover their ass just in case. This is very popular among US Government employees. Some people who get a lot of email can have their personal folders file grow to 2GB in a year or less. At this level MS recommends breaking it up since corruption can occur.
Can anyone give a good reason for needing files larger than 2gb?
Forensic analysis of disk images. And yes, from experience I can tell you that half the file tools on RedHat (like, say, Perl) aren't compiled to support >2GB files.
Because the sentences mean different things.
/Janne
"It is an interesting problem that some distro-compilers have to face."
talks about the problem facing distro compilers, whereas
"It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."
talks about the article addressing these problems.
Trust the Computer. The Computer is your friend.
It has a nice small 1gb filesystem limit. I have partitioned my hard disk into 64 little chunks and it runs very slowly, and unstably, but it's completely open source and I'm happy.
"oh yes, those were the days." - misty eyed smile - "when i was young and filesizes were small. you should have seen it. today's youth is so spoiled that they don't even learn assembly language any more. i tell you, you're all going to die because of your large files, yes, die!" - madly waves his cane in the air - "2gb, that's more than anybody will ever need and you are greedy for even more! the holy bit will punish you for this, it will!" - dies of a heart attack.
In a perfect world everything is small and manageable. Unfortunately, some databases need tables BIGGER than 2gb. Even splitting that table into multiple files still leaves you with files larger than two gb. Try adding more tables? OK. Now they've grown to over 2gb, and the more tables there are, the more complicated everything gets. I still need to back these suckers up, and a backup vendor that I won't name can't help me because their software wasn't large file ready (for Linux). So let's get into the game with this and make it the default so we don't need to worry about these problems in the future. Linux IS an enterprise solution.....(my $.02)
We use a Unidata database here for an ERP system. Each database is more than 2gb apiece (more like 20 gb) of relatively small files; when the directories are tarred for backup, the archives are usually over 2gb, which means that gzip won't compress them. Unless I'm missing something, I don't see an alternative to files larger than 2gb in this case. Sure, on the personal computing level the closest thing you probably get is ripping DVDs, but there are other things out there, and I realize this is tiny in comparison to some places.
The seek times alone within these files must be huge
Who modded that as Insightful? Sure, if you are using a filesystem designed for floppy disks, it might not work well with 2GB files. In the old days, when the metadata could fit in 5KB, a linked list of disk blocks could be acceptable. But any modern filesystem uses tree structures, which make a seek faster than it would be to open another file. Such a tree isn't complicated; even the minix filesystem has it.
If you are still using FAT... bad luck for you. AFAIK Microsoft was stupid enough to keep using linked lists in FAT32, which certainly did not improve the seek time.
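To put rough numbers on the linked-list versus tree argument, here is a minimal sketch. It assumes 4 KB clusters for the FAT-style chain and a simplified ext2-like layout (12 direct pointers, 4-byte block pointers) for the tree; the function names and layout parameters are illustrative and not taken from any real filesystem driver. It only counts lookups and ignores caching, which real implementations use to soften the cost of walking a chain.

```python
def fat_chain_hops(offset, cluster_size=4096):
    """Chain links to follow from the first cluster to reach `offset` (no caching)."""
    return offset // cluster_size


def indirect_levels(offset, block_size=4096, direct=12, ptr_size=4):
    """Levels of indirect blocks to read in a simplified ext2-like layout."""
    ptrs_per_block = block_size // ptr_size
    block = offset // block_size
    if block < direct:
        return 0  # reachable straight from the inode
    block -= direct
    level = 1
    span = ptrs_per_block  # blocks reachable at the current level
    while block >= span:
        block -= span
        span *= ptrs_per_block
        level += 1
    return level


offset = 1_400_000_000  # a seek to roughly 1.4 GB
print("linked-list hops:", fat_chain_hops(offset))
print("tree levels:     ", indirect_levels(offset))
```

Under these assumptions the chain walk grows linearly with the offset, while the tree depth stays at a handful of levels however far into the file you seek.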
Do you care about the security of your wireless mouse?
I just wonder why we don't learn from past limits and remove these limits "forever". E.g. 1 month ago I received a question about the possibility of building a 10 TB Linux cluster (physicists are crazy ;-)).
;-)
There surely MUST be some way to do this - I just imagine some file (e.g. defined in LSB) which would define these limits for the COMPLETE system (from kernel, filesystems, utils to network daemons). I know there are efforts toward things like this, but if we'd say (for example) that a distribution in 2004 won't be marked "LSB compatible" if ANY of its programs use any other limits, I think it would create enough pressure on Linux vendors.
Just a crazy idea
1) Splitting up a big file turns an elegant solution into an inelegant nightmare.
2) Instead of 10 different applications writing code to support splitting up an otherwise sound model, why not have 1 operating system have provisions for dealing with large files?
3) You are going to need the bigger files with all those 32 bit wchar_ts and 64 bit time_ts you got!
This is my sig.
Maybe in your problem domain that's true. I work with retailer data mines and we hit the 2GB file limit, oh, 4-5 yrs ago? We've been forced to partition databases, causing maintenance issues, scalability issues, and the like, just because of the size of a B-tree index.
True, it looks like the optimal solution is lower-level partitioning, rather than expanding the index to 64bits (tests showed that the latter is slower), but that still means that the practical limit of 1.5-1.7 GB per file (because you have to have some safety margin) is far too constraining. I know installations who could have 200GB files tomorrow if the tech was there (which it isn't, even with large file support).
I am also guessing that numerical simulations and bioinformatics apps can probably produce output files (which would then need to be crunched down to something more meaningful to mere humans) in the TB range.
Computing power will never be enough: there will always be problems that will be just feasible with today's tech that will only improve with better, faster technology.
the datafile size averages 8GB in the warehouse.
A year spent in artificial intelligence is enough to make one believe in God.
"of the kinds" really adds nothing to the meaning here, nor does "have to"
Thus we have:
The same sentence, but much cleaner!
Thanks! I'll be here all week.
My Ass hurts.
Over Christmas and New Years, I helped my wife run a simulation of 1000 different patients for an academic pharmacokinetics paper. The run took ten days and had an input file of about 1.5 GB. If her computer was faster, or she had access to more computers, she would have wanted to simulate more patients and would easily have needed support for files larger than 4 GB. As CPUs get faster and hard disks get larger, there will be much more demand for these large files as well as more than 4 GB per process.
What a fool believes, he sees, no wise man has the power to reason away.
I remember reading in the BeOS Bible that the BeOS filesystem could contain files as large as 18 petabytes. Makes you wonder two things: What's the biggest filesystem that you could use with a BeOS machine? And why don't other OSs have filesystems like this? Especially with those awesome extended attributes. I weep for the loss of the BeOS filesystem...
*slight crashing sound*
I'm not a specialist on this matter, so maybe you can enlighten me, where I am wrong or misunderstood you.
> fragmentation: large files increase the fragmentation of most file systems
What kind of fragmentation?
Small files lead to more internal fragmentation.
Large files are more likely to consist of more fragments, but when splitting this data into small files, those files are fragments of the same data.
>entropy pollution
What kind of entropy? Are you speaking of compression algorithms?
Compression ratios are actually better with large files than small files, because similarities between files across file boundaries can be found. That is why gzip (or bzip2) is run on a single large tar file. (Simple test: zip many files with compression, then zip the same files without compression and compress the resulting archive.)
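The simple test can also be tried without creating any files. This is a minimal sketch using Python's zlib module (DEFLATE, the same algorithm gzip and zip use); the sample "files" are made-up log lines, chosen only because they resemble each other.

```python
import zlib


def separate_vs_solid(files):
    """Return (total size compressing each file alone, size compressing them as one stream)."""
    separate = sum(len(zlib.compress(data)) for data in files)
    solid = len(zlib.compress(b"".join(files)))
    return separate, solid


# Hypothetical sample data: 200 small, similar "log files".
sample_files = [
    f"host=web{i:03d} status=OK user=alice path=/var/log/app.log\n".encode() * 5
    for i in range(200)
]

separate, solid = separate_vs_solid(sample_files)
print(f"compressed one by one:    {separate} bytes")
print(f"compressed as one stream: {solid} bytes")
```

The single stream comes out smaller because the compressor can refer back to strings it already saw in earlier files, and it pays the per-stream overhead only once.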
>data pollution
How should limiting file size improve that situation? Then people tend to store data in lots of small files. What a success. People will waste space, whether there is a file size limit or not.
>These limits are there for very good reasons and in my opinion they are even much too big.
Actually, they are there for historical reasons.
And should a DB spread all its tables over thousands of files instead of having only one table in one file and mmapping this single file into memory? Should a raw video stream be fragmented into several files to circumvent a file limit?
>[...] original K&R Unix [...] was much faster than modern systems
Faster? In what respect?
"Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
In my previous job we regularly processed credit data files >2 GB. All the data is processed serially (as someone else mentioned), so seek time is not an issue (nor is it an issue in a binary data file - seek to 1.4GB. Done. Next.).
The real issue we ran up against was compression... we wanted to have the original and interim data files available on-disk for a while in case of reprocessing. The processing would generally take up 10x as much space as the original data file, so you compressed everything. Except that gzip can't handle files >2GB (at the time an alpha could, but we didn't want to touch it). Nor can zip. So we had to use compress. Yay. (bzip could handle it, but was decided against by the powers that be).
Compression of large files is still an issue, unless you want to split them up. Unless you download a beta version gzip still can't handle it. As I understand it zip won't ever be able to do it. There are some fringe compressors that can handle large files, but, well, they're fringe.
-D_FILE_OFFSET_BITS=64 and -D_LARGEFILE_SOURCE
This maps off_t and all the file access calls to their 64-bit variants transparently, so most code only needs a recompile. You only have to name the explicit types and calls such as off64_t and open64(), or pass O_LARGEFILE to open(), if you use the transitional interface enabled by -D_LARGEFILE64_SOURCE instead. And I believe most large file support is really available only past glibc 2.2
So legacy applications that use glibc fs calls have to be recompiled to take advantage of this, and may need source level changes wherever they keep offsets in an int or a long. Won't work on older kernels either.
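For anyone wondering where the 2GB figure comes from, here is a small sketch of the arithmetic. It uses Python's struct module as a stand-in for a signed 32-bit versus a signed 64-bit off_t; the helper name is made up for illustration and has nothing to do with glibc itself.

```python
import struct

TWO_GIB = 2**31


def fits_in_signed(value, bits):
    """True if `value` can be stored in a signed integer of the given width."""
    fmt = {32: "<i", 64: "<q"}[bits]
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False


print(fits_in_signed(TWO_GIB - 1, 32))  # last offset a signed 32-bit off_t can hold
print(fits_in_signed(TWO_GIB, 32))      # one byte further no longer fits
print(fits_in_signed(TWO_GIB, 64))      # a 64-bit off_t has plenty of room
```

A program built without the flags keeps the narrow type, which is why a recompile is the minimum needed.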
I sure hope that was a joke. Because otherwise it would be one of the most clueless comments I have seen.
Sure, splitting data into a lot of smaller files is going to reduce the fragmentation slightly, but it is not going to improve your performance, because the price of accessing different files is going to be higher than the price of the fragmentation.
In the next two arguments you managed to make two opposite statements both incorrect. That is actually quite impressive.
First you say large files increase the entropy of the data stored on the disk. Which is wrong as long as you compare to the same data stored in different files. Of course if the number of files on the disk is constant smaller files will lead to less entropy, but most people actually want to store some data on their disks.
Then you say large files are highly redundant, which is the opposite of having a large entropy as claimed in your previous argument. And in reality the redundancy does not tend to increase with filesize, but might of course depend on the format of the file.
All in all you are saying that people shouldn't store much data on their disks, and the little data they do store should be as compact as possible, while still allowing it to be compressed even further when doing backups. You might as well have said people shouldn't use their disks at all.
Finally, claiming older Unix versions were faster is ridiculous. First of all, they ran on different hardware, and surely on that hardware they were slower than today's systems. And even if you managed to port an ancient Unix version to modern hardware, I'm sure it wouldn't beat modern systems at today's tasks. Which DVD player would you suggest for K&R Unix?
Do you care about the security of your wireless mouse?
Some numbers for *uncompressed* video:
NTSC/YUY2/stereo: ~111GB for a cinema movie (1hr 45min)
PAL/YUY2/stereo: ~125GB for same
HDTV/surround: ~908GB for same
With huffyuv (very low CPU usage, lossless) you should be able to cut that by a factor of 2-3. But it's still *huge*
Kjella
Live today, because you never know what tomorrow brings
One of the ways to keep errors from creeping into programs is to put limits on things so high that you can never reach them in the practical world.
The 31 bit limit on time_t overflows in this century - 63 bits outlasts the probable life of the Universe so it is unlikely to run into trouble.
That is the best argument I know for a 64 bit file size; in the long run it is one less thing to worry about.
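The two limits are easy to check. A quick sketch, assuming time_t counts seconds since 1970-01-01 UTC and averaging a year to 365.25 days:

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1   # largest value of a signed 32-bit time_t
INT64_MAX = 2**63 - 1   # largest value of a signed 64-bit time_t
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def last_second_32bit():
    """Last moment a signed 32-bit time_t can represent, in UTC."""
    return datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def years_covered_64bit():
    """Approximate span of a signed 64-bit time_t, in years after 1970."""
    return INT64_MAX / SECONDS_PER_YEAR


print(last_second_32bit().isoformat())       # 2038-01-19T03:14:07+00:00
print(f"{years_covered_64bit():.2e} years")  # on the order of 10**11 years
```

So the 31 bit counter runs out in January 2038, while the 63 bit one lasts for hundreds of billions of years.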
There is something innate in the education, learning, and daily working of a programmer that makes them not want to use 'too big' of a number for a certain task.
It either
A) Wastes Memory Space
B) Wastes Code Space
C) Wastes Pointer Space
D) Or Violates some other tenet the programmer believes
So, When they go out and create a file structure, or something similar, they don't feel like exceeding some 'built-in' restriction to their way of thinking.
And usually, at the time, it's such a big number that the programmer can't think of an application to exceed it.
Then, one comes along and blows right through it.
I've been amused by all the people jumping on the 'it don't need to be that big' bandwagon. I can think of many applications that would need ext3 or whatever to handle big files. They include:
A) Database Servers
B) Video Streaming Servers
C) Video Editing Workstations
D) Photo Editing Workstations
E) Next Big Thing (tm) that hasn't come out yet.
As a rock-and-roll Physicist once said, No matter where you go, there you are.
I had a problem with HP-UX apparently not wanting to transfer files larger than 2GB via NFS (when the NFS server is on HP-UX 11.0). I had to back up a Solaris computer's hard disk using dd across NFS. This usually worked when the NFS server was Solaris. However, last Friday it failed, when the server was set up on HP-UX. I had to resort to my little Blade 100 as the NFS server, and I had no problems with it.
I have noticed that on the SAME DAY some folks have asked questions about the 2 GB filesize limit in HP-UX on comp.sys.hp.hpux !! Apparently, HP-UX default tar and cpio don't support files over 2 GB, either. Not even in HP-UX 11i. I never thought HP-UX stank this bad...
How does Linux on x86 stack up? I decided not to use it for this backup, since I had my Blade 100, but would it have worked? Oh, btw, is there finally a command on Linux like "share" (exists in Solaris) to share directories via NFS, or do I still need to edit /etc/exports and then restart the NFS daemon (or send SIGHUP)?
Sigged!
PAL: Max 720x576x25fps interlaced (50 Hz)
NTSC: Max 640x480x29.97fps interlaced (60 Hz)
No, they don't have the same frequency, nor the same number of scanlines. Some European TVs will take PAL-60, which is like PAL only at 60Hz, though. Also I don't think the color space works in the same way, but I'm not sure about that one. That was why I used YUY2 (16bit) for both.
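For anyone who wants to redo the arithmetic, here is a rough sketch using the frame sizes above, 2 bytes per pixel for the 16-bit format, and a 1hr 45min running time. It counts video only (no audio track) and reports binary gigabytes, so it lands in the same ballpark as the earlier figures rather than matching them exactly. The function name is made up for illustration.

```python
def raw_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Size in bytes of uncompressed video, ignoring audio and container overhead."""
    return width * height * bytes_per_pixel * fps * seconds


RUNTIME = 105 * 60   # 1hr 45min, in seconds
GIB = 2**30          # binary gigabyte

ntsc = raw_video_bytes(640, 480, 2, 29.97, RUNTIME)
pal = raw_video_bytes(720, 576, 2, 25, RUNTIME)

print(f"NTSC: {ntsc / GIB:.1f} GiB")
print(f"PAL:  {pal / GIB:.1f} GiB")
```

Either way, a single movie is dozens of times larger than a 2GB file limit allows.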
Kjella
Live today, because you never know what tomorrow brings
For most programs, it would require little more
than to change the typedef that defines __time_t
in bits/types.h.
For stupidly written programs that assume the
size of __time_t or that use __time_t in unions,
each will need to be addressed individually to
make sure things still work correctly.
*sigh* back to work...
A much bigger problem is that Linux filesystems have a capacity limit of 2TB.
Many servers now have the physical capacity of over 2TB on a filesystem storage device.
Unfortunately this is still a very significant limitation.
This problem is much more commonly encountered than file size limitations.
Maurice W. Hilarius Voice: (778) 347-9907
Once upon a time (prior to 1978) there was no lseek() call in Unix. The value for the offset was 16 bits. Larger seeks were handled by using a different value for "whence" (the third argument to seek()) which caused seeks to occur in 512-byte increments. This resulted in a maximum seek of 16,777,216 bytes, with an arbitrary seek() often requiring two calls, one to get to the right 512-byte block and a second to get to the right byte within the block. (Thank goodness they haven't done any such silliness to break the 2GB barrier.)
When Research Edition 7 Unix came out, it introduced lseek() with a 32-bit offset. 2,147,483,648 bytes should be enough for anyone, hmmm? :-).
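The two-call seek is easy to model. This is a sketch of the arithmetic only, not of the historical system call: it assumes the 16-bit offset was signed, which is consistent with the 16,777,216 byte figure, and the function names are made up.

```python
BLOCK = 512
INT16_MAX = 2**15 - 1   # largest value of a signed 16-bit offset


def two_call_seek(offset):
    """Split a byte offset into (512-byte blocks, bytes within the block)."""
    blocks, remainder = divmod(offset, BLOCK)
    if blocks > INT16_MAX:
        raise OverflowError("offset not reachable with a 16-bit block count")
    return blocks, remainder


def addressable_bytes():
    """Number of distinct byte positions reachable with the two calls."""
    return (INT16_MAX + 1) * BLOCK


print(two_call_seek(1_000_000))   # (1953, 64)
print(addressable_bytes())        # 16777216
```

The first number goes to the block-granular seek and the second to an ordinary relative seek, which is exactly the two-step dance a 32-bit lseek() made unnecessary.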
Wrong. The C99 standard says in section 6.3.1.8 paragraph 1:
Here, the common real type is unsigned int, and the description of the addition and subtraction operators (section 6.5.6) does not specify a different type for the result when both operands have arithmetic type.
If you disagree, please cite relevant parts of the standard to support your case.