Slashdot Mirror


Large File Problems in Modern Unices

david-currie writes "Freshmeat is running an article that talks about the problems with the support for large files under some operating systems, and possible ways of dealing with these problems. It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."

4 of 290 comments

  1. Not really that groundbreaking... by CoolVibe · · Score: 4, Interesting

    The problem is nonexistent in the BSDs, which use the large-file (64-bit) versions anyway. If your OS (like Linux) doesn't use the 64-bit versions by default, you have to compile with a certain -D flag (-D_FILE_OFFSET_BITS=64 with glibc). Whoopdiedoo. Not so hard. Recompile and be happy.
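
    To make the 2 GB boundary concrete: a file offset kept in a signed 32-bit integer tops out at 2^31 - 1 bytes, while a 64-bit one has room to spare. A rough sketch, using Python's struct module as a stand-in for a 32-bit and a 64-bit C off_t (the helper name fits_in_offset is made up for illustration):

```python
import struct

def fits_in_offset(offset, bits):
    """Return True if `offset` fits in a signed integer of `bits` width.

    Hypothetical helper: '<i' and '<q' stand in for a 32-bit and a
    64-bit off_t respectively.
    """
    fmt = {32: "<i", 64: "<q"}[bits]
    try:
        struct.pack(fmt, offset)   # raises struct.error if out of range
        return True
    except struct.error:
        return False

last_ok_32 = 2**31 - 1     # largest offset a signed 32-bit value can hold
three_gib = 3 * 2**30      # an offset inside a 3 GiB file

print(fits_in_offset(last_ok_32, 32))   # fits
print(fits_in_offset(three_gib, 32))    # does not fit
print(fits_in_offset(three_gib, 64))    # fits
```

    That is all the -D flag changes from the program's point of view: which of the two widths the offset type gets.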

  2. Re:Why large files by Anonymous Coward · · Score: 5, Interesting

    Real analytical work can easily produce files this large. Output for analyses of structures with more than half a million elements and several million degrees of freedom can EASILY produce output of over two gigs. Yes, these results can and should be split, but sometimes it makes sense to keep them together as a matter of convenience. Plus, there IS a small performance hit when dealing with multiple files on most of the major FEA packages.

  3. Re:Why large files by CoolVibe · · Score: 5, Interesting

    Raw video can easily exceed 2 GB in size. Why raw video? Because (like others said) it's easier to edit. Then you encode to MPEG-2, which will shrink the size somewhat (usually still bigger than 2 GB; ever dumped a DVD to disk?), so it'll be "small" enough to burn onto a DVD or somesuch. Oh, editing 3 hours of raw wave data also chews away at disk space. Also, since you need to READ the data back from the media to see if it looks nice, you need support for those big files there as well. Right, now why don't we need files bigger than 2 GB again? Well?

    Oh, you're still not convinced? Well, see it this way: will you ever need to burn a DVD in the future?

    Well? A typical single-sided DVD-R holds 4.7 GB of data (about 4.38 GiB); if you use both sides, you can get more than 8 GB on it. That's way bigger than 2 GB, no? Now, how big must your image be before you burn it on there? Well?

    Right...
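
    For a sense of scale, here is some back-of-the-envelope arithmetic. The video parameters are assumptions picked for illustration (standard-definition PAL frames, 2 bytes per pixel, no audio), not figures from the comment above:

```python
# Assumed raw video format: 720x576 frames, 2 bytes per pixel, 25 frames/s.
WIDTH, HEIGHT = 720, 576
BYTES_PER_PIXEL = 2
FPS = 25

LIMIT_32BIT = 2**31 - 1            # largest size a signed 32-bit offset can address
DVD_SINGLE_SIDE = 4_700_000_000    # nominal single-sided DVD-R capacity in bytes

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
seconds_to_limit = LIMIT_32BIT / bytes_per_second

print(f"raw video rate: {bytes_per_second / 1e6:.1f} MB/s")
print(f"2 GB limit reached after {seconds_to_limit:.0f} seconds")
print(f"DVD image exceeds the limit: {DVD_SINGLE_SIDE > LIMIT_32BIT}")
```

    Under those assumptions the 2 GB ceiling is hit in under two minutes of footage, and a full single-sided disc image is more than twice the limit.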

  4. Re:Wrong point of view. by Yokaze · · Score: 4, Interesting

    I'm not a specialist on this matter, so maybe you can enlighten me as to where I am wrong or have misunderstood you.

    > fragmentation: large files increase the fragmentation of most file systems
    What kind of fragmentation?

    Small files lead to more internal fragmentation.
    Large files are more likely to consist of more fragments, but if you split the data into small files instead, those files are themselves fragments of the same data.

    >entropy pollution
    What kind of entropy? Are you speaking of compression algorithms?

    Compression ratios are actually better with large files than with small files, because similarities between files across file boundaries can be found. That is why gzip (or bzip2) is run on a single large tar file. (Simple test: zip many files with compression, then zip the same files without compression and compress the resulting archive, and compare the sizes.)
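
    The same test can be sketched in a few lines. This is only an illustration: zlib stands in for gzip/zip (all use DEFLATE), and the "files" are made-up, deliberately similar log-like records held in memory rather than real files:

```python
import zlib

def make_files(count=200):
    """Build small, similar in-memory 'files' (made-up log-like records)."""
    template = "host=server{n:03d} status=OK service=fileserver uptime={u}\n"
    return [
        (template.format(n=i, u=i * 37) * 5).encode("ascii")
        for i in range(count)
    ]

files = make_files()

# Case 1: compress every file on its own and add up the results.
separate_total = sum(len(zlib.compress(f)) for f in files)

# Case 2: concatenate first (like tar), then compress once.
combined_total = len(zlib.compress(b"".join(files)))

print("raw bytes:            ", sum(len(f) for f in files))
print("compressed separately:", separate_total)
print("compressed as one:    ", combined_total)
```

    How big the gap is depends entirely on how similar the files are; with unrelated data the advantage shrinks to little more than the saved per-file overhead.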

    >data pollution
    How should limiting file size improve that situation? People then tend to store data in a lot of small files. What a success. People will waste space, whether there is a file size limit or not.

    >These limits are there for very good reasons and in my opinion they are even much too big.

    Actually, they are there for historical reasons.
    And should a DB spread all its tables over thousands of files instead of having only one table in one file and mmapping this single file into memory? Should a raw video stream be fragmented into several files to circumvent a file limit?
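
    A minimal sketch of the one-table-one-file idea, assuming a toy fixed-width row layout (the format, file name and helper names are invented for illustration; no real database stores rows this way):

```python
import mmap
import os
import struct
import tempfile

# Hypothetical row layout: 4-byte unsigned id + 16-byte name, no padding.
RECORD = struct.Struct("<I16s")

def write_table(path, rows):
    """Write all rows back to back into a single file."""
    with open(path, "wb") as f:
        for row_id, name in rows:
            f.write(RECORD.pack(row_id, name.encode("ascii")))

def read_row(path, index):
    """Map the whole file and read one row directly by its offset."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            row_id, raw = RECORD.unpack_from(m, index * RECORD.size)
    return row_id, raw.rstrip(b"\x00").decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    table = os.path.join(tmp, "table.dat")
    write_table(table, [(i, f"row{i}") for i in range(1000)])
    result = read_row(table, 742)

print(result)
```

    Row lookup is a single multiplication plus a read from the mapping. Split the table over many files and every lookup first has to work out which file to open.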

    >[...] original K&R Unix [...] was much faster than modern systems

    Faster? In what respect?

    --
    "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"