Slashdot Mirror


Large File Problems in Modern Unices

david-currie writes "Freshmeat is running an article that talks about the problems with the support for large files under some operating systems, and possible ways of dealing with these problems. It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."

290 comments

  1. Re:Why large files by mr.henry · · Score: 3, Funny

    Who needs more than 512k of RAM??

  2. Not really that groundbreaking... by CoolVibe · · Score: 4, Interesting

    The problem is nonexistent in the BSDs, which use the large file (64 bit) versions anyway. And you have to use a certain -D flag if your OS (like Linux) doesn't use the 64 bit versions by default. Whoopdiedoo. Not so hard. Recompile and be happy.
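
To make the limit concrete: a signed 32-bit file offset tops out at 2^31 - 1 bytes, which is where the 2 GB ceiling comes from. On glibc-based Linux systems the flag in question is typically -D_FILE_OFFSET_BITS=64, which widens off_t to 64 bits. The following is only a sketch of the arithmetic in Python; the helper name max_signed_offset and the 7 GB example size are illustrative assumptions, not part of any library.

```python
def max_signed_offset(bits: int) -> int:
    """Largest file offset representable in a signed integer of the given width."""
    return (1 << (bits - 1)) - 1

LIMIT_32 = max_signed_offset(32)   # 2_147_483_647 bytes, just under 2 GiB
LIMIT_64 = max_signed_offset(64)   # 9_223_372_036_854_775_807 bytes

dvd_rip = 7 * 10**9                # hypothetical 7 GB file, e.g. a DVD rip

print(dvd_rip > LIMIT_32)    # True: does not fit in a 32-bit off_t
print(dvd_rip <= LIMIT_64)   # True: fits easily in a 64-bit off_t
```

A program built with the 32-bit offset typically fails at the point where the file would grow past LIMIT_32.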

    1. Re:Not really that groundbreaking... by Anonymous Coward · · Score: 1, Informative

      What happens when you need more than 2^64 bytes of storage? Cheat with granularity? The same problem still exists and isn't solved. Your train of thought is the same one that allowed 32 bits to be used in the first place. Recursive expansion would be the only real solution.

    2. Re:Not really that groundbreaking... by Anonymous Coward · · Score: 0

      No kidding. I just ripped a DVD to disk and ended up with a 7 GB file. I would have been fux0red had I been running Linux instead of FreeBSD, cause I didn't even think of it beforehand.

    3. Re:Not really that groundbreaking... by statusbar · · Score: 2, Funny

      2^64 = 17,179,869,184 gigabytes!

      17,179,869,184 gigabytes ought to be enough for ANYBODY!

      --jeff++

      --
      ipv6 is my vpn
    4. Re:Not really that groundbreaking... by Citizen+of+Earth · · Score: 1

      17,179,869,184 gigabytes ought to be enough for ANYBODY!

      That would be 16 exabytes, but I think that, alas, realistically you will only be getting eight exabytes since you will probably deal with signed integers at some stage in most systems. (Are we allowed to use the term 'exabyte' since it's a registered trademark of a corporation? Does this trademark have any substance given that it is a common English word?)
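
The figures in this subthread can be checked with a few lines of integer arithmetic. This is a sketch using binary units (1 GiB = 2^30 bytes, 1 EiB = 2^60 bytes), which is the convention the numbers above appear to assume; the variable names are illustrative.

```python
GIB = 1 << 30   # bytes per gibibyte
EIB = 1 << 60   # bytes per exbibyte

unsigned_range = 1 << 64   # byte count addressable by an unsigned 64-bit offset
signed_range = 1 << 63     # byte count addressable by a signed 64-bit offset

print(f"{unsigned_range // GIB:,} GiB")   # 17,179,869,184 GiB
print(unsigned_range // EIB)              # 16
print(signed_range // EIB)                # 8
```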

  3. video, mp3's, even dvds are beyond 2gb by xintegerx · · Score: 2, Informative

    Question answered, move along, nothing to see here :)

    1. Re:video, mp3's, even dvds are beyond 2gb by bns_robson · · Score: 2, Funny

      Your link doesn't work. I get a DNS failure looking up host 578.291.762.662

  4. Re:Why large files by tgeerts · · Score: 1

    Video + Audio >= 2GB

  5. Re:Why large files by voodoopriestess · · Score: 3, Informative

    Databases, Movie files, Backup files (think dumps to tapes). Animations, 3D modelling.... Lots of things need a > 2GB file size. Iain

    --
    ---- "I would be careful in separating your weirdness, a good quirky quantum weirdness, from the disturbed weirdnes
  6. Re:Why large files by Big+Mark · · Score: 5, Insightful

    Video. Raw, uncompressed, high-quality video with a sound channel is fucking HUGE. Look how big DivX files are, and they're compressed many, many times over.

    And compressing video on-the-fly isn't feasible if you're going to be tweaking with it, so that's why people use raw video.

    -Mark

  7. Re:Why large files by Ogion · · Score: 2, Insightful

    Ever heard of something like movie-editing? You can get huge files really fast.

    --
    -- we're dressed in green, and we're feeling mean
  8. Unices? by ToKsUri · · Score: 0, Offtopic

    Unices plural for unix?

    1. Re:Unices? by moonbender · · Score: 2, Informative

      Yes. Just like "matrices" is the plural of "matrix". Not that the words have a similar etymology - according to dictionary.com it's, in the authors' words, "A weak pun on Multics".

      --
      Switch back to Slashdot's D1 system.
    2. Re:Unices? by Anonymous Coward · · Score: 0

      Speech impediment, insensitive clod!

    3. Re:Unices? by Looke · · Score: 1

      Geeks seem to have a weird fascination for strange spellings. "-ces" is the traditional plural ending of Latin words ending in "x". Obviously, "Unix" does not originate from Latin, and "Unices" is thus nothing but a (bad) joke. (The same applies to "emacsen", and there are a few others around as well.)

    4. Re:Unices? by Anonymous Coward · · Score: 0

      No. An army of women named Unice, all with files stuffed up their butts.

    5. Re:Unices? by Anonymous Coward · · Score: 0

      Like the ever popular viri or virii

      For some reason that one just makes me cringe in the same way wierd or definately does.

      Unices just sounds kind of cool though. go figure :)

    6. Re:Unices? by david-currie · · Score: 1

      I'd never heard emacsen, but VAXen is commonly used for multiple VAX machines, I believe.

    7. Re:Unices? by N1KO · · Score: 1

      Virus comes from Latin.

    8. Re:Unices? by Anonymous Coward · · Score: 0

      And its Latin plural was never viri, or virii

    9. Re:Unices? by Q+Who · · Score: 1

      Just like "matrices" is the plural of "matrix".

      "Matrices" is a plural form of "matrix." The other one is "matrixes."

    10. Re:Unices? by bunratty · · Score: 1
      Oh, that brings up a pet peeve of mine -- when people call a matrix a "matricee"! When I hear someone say that word, I roll my eyes and think "this guy has no idea what he's talking about!"

      Getting back on topic, maybe the plural for Unix should be Unixen, like the plural for Vax is Vaxen?

      --
      What a fool believes, he sees, no wise man has the power to reason away.
    11. Re:Unices? by Anonymous Coward · · Score: 0

      I think that Vaxen as plural of Vax is similar to vixen as plural of fox. Boxen makes more sense than Unixen.

    12. Re:Unices? by Anonymous Coward · · Score: 0

      There is no plural, as virus is a plural word in Latin already.

    13. Re:Unices? by yuri+benjamin · · Score: 1

      There is no plural, as virus is a plural word in Latin already

      Really? What declension does the word virus belong to?
      I seem to recall that some declensions in Latin have both the singular and plural ending in -us, but it's ages since I studied Latin, over a decade ago. I'm not even sure any more how to spell declension.

      --
      You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
    14. Re:Unices? by superyooser · · Score: 1
      Rule of Grammar: When a word ends in x, you make it plural by adding es.
      Examples: tax => taxes; sex => sexes; fox => foxes; box => boxes

      The word ox, with its plural oxen, is a freak of English grammar. It is the exception, not the rule.

      Examples of this bogus pluralization applied to similar words:

      I hate doing my taxen.
      Both sexen have positive and negative characteristic qualities.
      The hunters shot three foxen in the woods.
      Your boxen will be shipped in 4-6 weeks.
      Both en and es have the same number of keystrokes and bits. en has no advantage, except the appearance of 1337ness to people who don't know better. So please stop using it and trying to one-up the dictionary. (This goes for virii and Unices too.) I know it's only being used with geeky words so far, but that only makes the rules of pluralization even more complicated.

      The English language is convoluted enough without deliberately introducing more irregularities.

  9. Re:Why large files by Anonymous Coward · · Score: 5, Interesting

    Real analytical work can easily produce files this large. Output for analyses of structures with more than half a million elements and several million degrees of freedom can EASILY produce output of over two gigs. Yes, these results can and should be split, but sometimes it makes sense to keep them together as a matter of convenience. Plus, there IS a small performance hit when dealing with multiple files on most of the major FEA packages.

  10. Re:Why large files by amigaluvr · · Score: 0

    Good answer. A 2gb movie would have to go for nearly 4 hours and that includes audio. Explain?

    a 20mb mp3 can go well into an hour. Explain?

    If you really need a movie which hits that many hours you would be breaking it up into cd sized chunks anyway

  11. 640K is enough for you! by Anonymous Coward · · Score: 0

    Title says it all... Who are *YOU* to decide that *we* do not need 2GB files?

    1. Re:640K is enough for you! by SoSueMe · · Score: 1

      Who are we to tell them what they have to accommodate?
      Don't like the way a particular *NIX works? Don't use it.
      Try something else.

    2. Re:640K is enough for you! by Anonymous Coward · · Score: 0

      We're the users. We're the ones who will pay for and use their products. Do you think that when you go to a vendor with the requirements for your million dollar project, that the vendor can just throw away the parts it doesn't like? Of course not.

  12. Re:Why large files by hbackert · · Score: 4, Informative

    vmware uses files as virtual disks. 2GB would be a really, really small disk. UML does the same, using the loop device feature of Linux. Again, a filesystem in a file. Again, 2GB is not much. Simulating 20GB would need 10 files.
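
The file count is a ceiling division of the disk size by the per-file limit. As a rough sketch (the function name files_needed is hypothetical): with an exact 2 GiB cap, 20 GiB splits into 10 files, but if the cap is the signed 32-bit maximum of 2^31 - 1 bytes, the one-byte shortfall pushes it to 11.

```python
GIB = 1 << 30

def files_needed(total_bytes: int, max_file_bytes: int) -> int:
    """Number of files required to hold total_bytes, using ceiling division."""
    return -(-total_bytes // max_file_bytes)

print(files_needed(20 * GIB, 2 * GIB))       # 10
print(files_needed(20 * GIB, 2 * GIB - 1))   # 11
```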

    Feels like 64kbyte segments somehow...and I really don't want to have those back.

  13. Re:Why large files by Big+Mark · · Score: 2, Funny

    Come on. Even Bill Gates admitted that half a meg ain't enough.

    640K, on the other hand, should be enough for anyone...

    -Mark

  14. data warehouse, and any database for that matter by CrudPuppy · · Score: 5, Insightful

    my data warehouse at work is 600GB and grows at a rate of 4GB per day.

    the production database that drives the sites is like 100GB

    welcome to last week. 2GB is tiny.

    --
    A year spent in artificial intelligence is enough to make one believe in God.
  15. Its funny how some lamers dont listen... by cheekyboy · · Score: 3, Insightful

    I said this to some unix 'so called experts' in 95, and they said, oh why why do you need >2gig

    I can just laugh at them now...

    --
    Liberty freedom are no1, not dicks in suits.
    1. Re:Its funny how some lamers dont listen... by FooBarWidget · · Score: 1

      No you can't, both Linux and FreeBSD support files > 2 GB. Apparently you've laughed all for nothing.

    2. Re:Its funny how some lamers dont listen... by Anonymous Coward · · Score: 0

      'doze couldn't even give you a >2gb PARTITION in 95.
      Now that's funny.

    3. Re:Its funny how some lamers dont listen... by Anonymous Coward · · Score: 0

      Your mom was showing her >2gb partition to everyone in '95.

      Now that is funny.

    4. Re:Its funny how some lamers dont listen... by abirdman · · Score: 1

      In 1995, IIRC the biggest hard drive in common use was (quite a bit) less than a gig. It's not surprising the designers decided to use the smaller, more efficient 32 bit addressing. As in all other things related to computers in general (didn't billg once ask why anyone would need more than 640K of RAM, or is that urban myth?), needs change over time.

      As hardware develops, the software develops to address it. I remember someone who was shocked that Lotus 123 could create a spreadsheet larger than a (320K 5 1/4" DS) diskette, because how could you save it?

      We've come a long way. We'll go a lot further. 64 bit file sizes will seem small and quaint to our children's children.

      Now, If I could just tar my Linux file system over to my son's spare 60 gig hard drive through SAMBA, I'd have cheap, fast, effective backup. But I have to do it 2 gigs at a time. Grrr... gotta go check up on that BSD stuff.

      --
      Everything I've ever learned the hard way was based on a statistically invalid sample.
    5. Re:Its funny how some lamers dont listen... by Anonymous Coward · · Score: 0

      Now, If I could just tar my Linux file system over to my son's spare 60 gig hard drive through SAMBA, I'd have cheap, fast, effective backup. But I have to do it 2 gigs at a time. Grrr... gotta go check up on that BSD stuff.

      Very much doubt it. Recently I backed up some data into a tar file that came to 5.5GB on my GNU/Linux system, then transferred and uncompressed it onto another system fine. Seems to be no problem to me...

    6. Re:Its funny how some lamers dont listen... by abirdman · · Score: 1

      Mine stops with an error at 2 gigs, believe me. Guess I've got to go jiggle the wires or something. The problem might be SAMBA. Or I have to upgrade from RH 7.2

      --
      Everything I've ever learned the hard way was based on a statistically invalid sample.
    7. Re:Its funny how some lamers dont listen... by Wolfrider · · Score: 1

      --I have had the same problem under SuSE 7.3, even with the latest Linus kernel. The problem is fixed in Debian / Knoppix:

      www.knopper.net
      www.knoppix.net

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
  16. Re:Why large files by Timesprout · · Score: 1

    For when Jaron Lanier decides to update his website with 10,000,000 lines of script

    --
    Do not try to read the dupe, thats impossible. Instead, only try to realize the truth
    What truth?
    There is no dupe
  17. Re:Why large files by amigaluvr · · Score: 1

    Oh I see now raw video is larger than I thought, oops

  18. Re:Why large files by Anonymous Coward · · Score: 0

    Ever heard of something called data processing? Banks, credit card companies, public utilities, etc. all have huge databases that get processed all the time, and involve working as well as final output files in the range of 2 to 20 gigabytes in size.

    Seek times are irrelevant in these situations, since the files are processed serially.

    Get out into the real world and see what real industry does with real computers.

  19. Re:Why large files by AvitarX · · Score: 1

    Maybe high quality audio+video for say...
    making a movie will be larger than that.

    I guess a lot of the editing would probably be done scene by scene, and then you could merge and compress them on the fly so that at no point you use more than 2gb, but it seems that if you make a 2 hour dvd it would be nice to keep the 4gb image file on your hard drive if you planned to reburn it.

    Not a scattering of scenes from which it would recreate the image on the fly.

    It is kind of a dumb question, when we have computers being marketed as home dvd makers, to ask why we would need that big of a file.

    --
    Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
  20. Re:data warehouse, and any database for that matte by hector13 · · Score: 2, Insightful
    my data warehouse at work is 600GB and grows at a rate of 4GB per day. the production database that drives the sites is like 100GB welcome to last week. 2GB is tiny.
    And you store this "production database" as one file? didn't think so (or at least I hope you don't).

    I am not agreeing (or disagreeing) with the original post, but having a database > 2 GB has nothing to do with having a single file over 2 GB. A db != a file system (except for MySQL perhaps).

  21. Wrong point of view. by Krapangor · · Score: 0, Interesting
    There is not a problem with support of large files in Unix system, there is a problem with incompetent people using too large files in Unix systems.
    It's an old and well known problem that programmers and users tend to keep very large files for laziness and logical errors.
    However it's also an old and well known fact that large files are bad for performance per se due to several reasons:
    • fragmentation: large files increase to fracmentation of most file systems, at least of any system with uses single indexed trees/B-trees and nonlinear hashes
    • entropy pollution: large files increase to overall entropy on the harddisk leading to worse compression ratios for backup and maintenance
    • data pollution: the use of large files tempts users to store all kinds of redundant, reducible, linear and irrelevant data wasting storage space and I/O time
    So I don't see why admins should provide a "work-around" for the filesize limits. These limits are there for very good reasons and in my opinion they are even much to big. You should always remember that the original K&R Unix had only 12 bits for file size storage and was much faster than modern systems, in fact it did run on 2,2 MHz processors and 32 kB of RAM which wouldn't be sufficient for even a Linux of Windows XP bootloader.
    Think about it.
    --
    Owner of a Mensa membership card.
    1. Re:Wrong point of view. by Anonymous Coward · · Score: 1, Interesting

      As others have noted, there are plenty of good reasons to have files greater than two gigs including video editing and scientific research. The file size limits aren't there for a very good reason at all. Someone years ago had to weigh whether to make small files take up a huge amount of room by using 64 bit addresses that would allow multi-terabyte files to exist against using 32 bit addresses that would make small files smaller and create a 2 gb file limit. At the time, it made perfect sense because nobody was using files anywhere near 2 gb... But now they are.

    2. Re:Wrong point of view. by KDan · · Score: 4, Insightful

      Two words:

      Video Editing

      Daniel

      --
      Carpe Diem
    3. Re:Wrong point of view. by N1KO · · Score: 1

      In a couple of years, will todays large files be considered large? Ten years ago having hundreds of 4MB files on a pc would've been considered crazy. Now everyone with an mp3 player is used to it.

    4. Re:Wrong point of view. by heby · · Score: 5, Funny

      "oh yes, those were the days." - misty eyed smile - "when i was young and filesizes were small. you should have seen it. today's youth is so spoiled that they don't even learn assembly language any more. i tell you, you're all going to die because of your large files, yes, die!" - madly waves his cane in the air - "2gb, that's more than anybody will ever need and you are greedy for even more! the holy bit will punish you for this, it will!" - dies of a heart attack.

    5. Re:Wrong point of view. by Anonymous Coward · · Score: 1, Insightful

      the use of large files tempts users to store all kinds of redundant, reducible, linear and irrelevant data wasting storage space and I/O time

      As opposed to a million 4k files that are each 1k of header?

    6. Re:Wrong point of view. by cvande · · Score: 5, Insightful

      In an ideal world everything is small and manageable. Unfortunately, some databases need tables BIGGER than 2gb. Even splitting that table into multiple files still finds you with files larger than two gb. Try adding more tables? OK. Now they've grown to over 2gb, and the more tables, the more complicated everything gets. I still need to back these suckers up and a backup vendor that I won't name can't help me because their software wasn't large file (for Linux) ready. So let's get into the game with this and make it the default so we don't need to worry about these problems in the future. Linux IS an enterprise solution.....(my $.02)

    7. Re:Wrong point of view. by costas · · Score: 4, Insightful

      Maybe in your problem domain that's true. I work with retailer data mines and we've hit the 2GB file limit, oh, 4-5 yrs ago? We've been forced to partition databases, causing maintenance issues, scalability issues, and the like, just because of the size of a B-tree index.

      True, it looks like the optimal solution is lower-level partitioning, rather than expanding the index to 64bits (tests showed that the latter is slower), but that still means that the practical limit of 1.5-1.7 GB per file (because you have to have some safety margin) is far too constraining. I know installations who could have 200GB files tomorrow if the tech was there (which it isn't, even with large file support).

      I am also guessing that numerical simulations and bioinformatics apps can probably produce output files (which would then need to be crunched down to something more meaningful to mere humans) in the TB range.

      Computing power will never be enough: there will always be problems that will be just feasible with today's tech that will only improve with better, faster technology.

    8. Re:Wrong point of view. by Q+Who · · Score: 1

      Lmao...

      Your other trolls are nice too, but this one is hilarious... "entropy pollution", hehe :)

      "Linux of Windows XP bootloader", this one is amazing. I wonder whether it's a typo, or intentional...

    9. Re:Wrong point of view. by Yokaze · · Score: 4, Interesting

      I'm not a specialist on this matter, so maybe you can enlighten me, where I am wrong or misunderstood you.

      > fragmentation: large files increase to fracmentation of most file systems
      What kind of fragmentation?

      Small files lead to more internal fragmentation.
      Large files are more likely to consist of more fragments, but when splitting this data into small files, those files are fragments of the same data.

      >entropy pollution
      What kind of entropy? Are you speaking of compression algorithms?

      Compression ratios are actually better with large files than small files, because similarities between files across file boundaries can be found. That is why gzip (or bzip2) compresses a single large tar file. (Simple test: zip many files with compression, then zip them without compression and compress the resulting archive, and compare the sizes.)
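
A scaled-down version of that test can be sketched with Python's zlib module. The 200 small "files" here are made-up byte strings standing in for similar records; compressing each one separately mimics per-member zip compression, while compressing the concatenation mimics gzip over a tar file.

```python
import zlib

# Hypothetical stand-ins for 200 small, similar files.
files = [
    (f"record {i}: customer=alice status=ok region=eu\n" * 5).encode()
    for i in range(200)
]

# Compress each file on its own, as zip does per archive member.
separate = sum(len(zlib.compress(data)) for data in files)

# Compress the concatenation as one stream, as gzip does on a tar file.
solid = len(zlib.compress(b"".join(files)))

print(separate, solid)   # the single stream should come out much smaller
```

The exact byte counts depend on the zlib version and the data, so treat the comparison as illustrative, not as a benchmark.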

      >data pollution
      How should limiting file size improve that situation? Then, people tend to store data in lots of small files. What a success. People will waste space, whether there is a file size limit or not.

      >These limits are there for very good reasons and in my opinion they are even much to big.

      Actually, they are there for historical reasons.
      And should a DB spread all its tables over thousands of files instead of having only one table in one file and mmapping this single file into memory? Should a raw video stream be fragmented into several files to circumvent a file limit?

      >[...] original K&R Unix [...] was much faster than modern systems

      Faster? In what respect?

      --
      "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
    10. Re:Wrong point of view. by kasperd · · Score: 2, Interesting

      I sure hope that was a joke. Because otherwise it would be one of the most clueless comments I have seen.

      Sure, splitting data into a lot of smaller files is going to reduce the fragmentation slightly, but it is not going to improve your performance, because the price of accessing different files is going to be higher than the price of the fragmentation.

      In the next two arguments you managed to make two opposite statements both incorrect. That is actually quite impressive.

      First you say large files increase the entropy of the data stored on the disk. Which is wrong as long as you compare to the same data stored in different files. Of course if the number of files on the disk is constant smaller files will lead to less entropy, but most people actually want to store some data on their disks.

      Then you say large files are highly redundant, which is the opposite of having a large entropy as claimed in your previous argument. And in reality the redundancy does not tend to increase with filesize, but might of course depend on the format of the file.

      All in all you are saying that people shouldn't store much data on their disks, and the little data they do store should be as compact as possible, while still allowing it to be compressed even further when doing backups. You might as well have said people shouldn't use their disks at all.

      Finally, claiming older Unix versions were faster is ridiculous; first of all, they ran on different hardware. And surely on that hardware they were slower than today's systems. And even if you managed to port an ancient Unix version to modern hardware, I'm sure it wouldn't beat modern systems in today's tasks. Which DVD player would you suggest for K&R Unix?

      --

      Do you care about the security of your wireless mouse?
    11. Re:Wrong point of view. by Daytona955i · · Score: 1

      We do too learn assembly... I specifically learned about the MIPS architecture. Hated it but they do still teach it in CS classes. We touched on it a bit in Programming Language Concepts and then in Systems Architecture I and II, we actually had to write assembly code. I remember the happy day when I got my one assignment to work; we had to grab the keyboard interrupts and display them. None of my non-CS friends could understand why I was so happy to have text that I typed appear on the screen.
      -Chris

    12. Re:Wrong point of view. by smoondog · · Score: 1

      There is not a problem with support of large files in Unix system, there is a problem with incompetent people using too large files in Unix systems.

      You are a troll. It is not up to administrators to decide how big a file needs to be. I do scientific research and deal regularly with datasets larger than 300GB. Single files often in the range of 2GB-10GB. For me to split up my data would create an enormous headache, and would be very slow.

      -Sean

    13. Re:Wrong point of view. by mickwd · · Score: 1

      And the amazing thing is, everyone else seems to be taking it seriously.

      Is it just me, or is Slashdot getting much less informed as the user count continues to increase ?

    14. Re:Wrong point of view. by Anonymous Coward · · Score: 0

      that trend should be obvious to blind freddy

    15. Re:Wrong point of view. by Simon+Brooke · · Score: 1
      Is it just me, or is Slashdot getting much less informed as the user count continues to increase ?

      It's not just you.

      --
      I'm old enough to remember when discussions on Slashdot were well informed.
    16. Re:Wrong point of view. by ProtonMotiveForce · · Score: 1

      Are you trolling? That's the biggest load of shit I've read in this thread. You are by far the worst offender of the nimwits coming out of the woodwork whining that nobody needs files bigger than 2GB.

      You even mention K&R Unix and claim it was faster than modern systems and use that as some kind of yardstick?

      Jesus Christ, that's stupid. It's not 1975 any more, and none of your blathering has any relevance to the modern day. Technology progresses, take your dinosaur ass to a VMS shop and bore us all with your claims of how advanced VMS is, but don't tell people what they need and don't need, and certainly don't bandy about the term "incompetent" when you're so obviously projecting.

    17. Re:Wrong point of view. by orangesquid · · Score: 1

      At least 2GB is better than the Multics large file support situation! Files were limited to the size of segments, which were at most 255K 36-bit words, which is equivalent to roughly one megabyte! The Multics designers didn't expect that most users would ever need files larger than this. The first database product (ever!), MRDS, was severely limited, so Multics programmers created a (kludgy) workaround. Modern operating systems are designed differently and thus aren't limited to such (small) file sizes.

      We have conquered this problem before, by redesigning filesystems to allow files bigger than segments, and we can conquer it again by allowing files bigger than the addressable range of a 32-bit processor's full word.

      --
      --TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
    18. Re:Wrong point of view. by binford2k · · Score: 1

      ;; signal/noise ratio is getting worse; I now read posts at +3 or above

      Heh, how ironic that your post is only at 2 now ;)

    19. Re:Wrong point of view. by Anonymous Coward · · Score: 0

      When you say the tech isn't there yet, are you referring to your DB software, or what? I'm running Mac OS 10.2 and the biggest file I currently have is 55GB. I've played with files that were over 100GB with no problems at all, and the stated maximum file size is 2TB.

      I certainly understand what a pain it can be to switch systems, especially in your line of work, but if OS X can handle files in the TB range, surely some DB and analysis software (not to mention other operating systems) must be able to deal with them as well. I'd start looking towards migrating to something that'd work better for you in the long run.

    20. Re:Wrong point of view. by Wolfrider · · Score: 1

      --How about freaking TAR BACKUP FILES, you narrow minded moron?!

      --Have you tried backing up your 60GB Windoze partition to a compressed tar file, and gotten stung by that paltry 2Gig file limit under an older distro?? That pissed me right off!!

      --Now I use Knoppix, and no worries. :)

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
  22. Re:Why large files by Idaho · · Score: 2, Insightful
    Can anyone give a good reason for needing files larger than 2gb?

    I can think of some:

    • A/V streaming/timeshifting
    • Backups of large filesystems (since there exist 320 GB harddisks now, I don't think I should create 160 .tgz files just to back it up, do I?)
    • Large databases. E.g. the slashdot posts table will be easily >2 GB, or so I'd guess. Should the DB cut it in two (or more) files, just...because the OS doesn't understand files >2 GB? I don't think so...

    And that's just without thinking twice...there are probably many more reasons why people would want files >2 GB.

    --
    Every expression is true, for a given value of 'true'
  23. that's not a good question by xintegerx · · Score: 0, Redundant

    Don't moderate up ignorance.

    That's whining... But I see his point--the only reason right now is for video files. If you want to get your video from your camcorder, it's not going to go straight to CDRW or DVD, it's going to your HARD DRIVE storage. You are going to edit it, right?
    Since you probably want to have the best quality, a single file will take a lot of space. (No I don't do this video thing, but I did my own research. Many people do have video, and for computer editing there is no reason to cap a file size.)

    Ok fine, I guess he kind of has a point in that question....

    1. Re:that's not a good question by Anonymous Coward · · Score: 0

      Video is certainly not the only thing that will make gargantuan files. Massive databases and simulation data will eat 2gb like a twinkie.

  24. Comment removed by account_deleted · · Score: 1

    Comment removed based on user account deletion

  25. 640 K ought to be enough for anybody by cyber_rigger · · Score: 3, Funny

    --Bill Gates

    1. Re: 640 K ought to be enough for anybody by Looke · · Score: 1

      Yeah, we should all switch to OpenOffice. I had a 20,000 row, 13 MB Excel file, which I resaved in OpenOffice Calc format. It came out at a sweet 640 KB ;-)

    2. Re: 640 K ought to be enough for anybody by Anonymous Coward · · Score: 0

      Said during the days of 8 bit computing, when microsoft switched to a 16 bit architecture.
      People said: why would anyone *ever* need 1 MB? Bill explained it was actually 640KB in DOS, which at that time would be enough for all the uses (since before that one was limited to 8 bits)

    3. Re: 640 K ought to be enough for anybody by commanderfoxtrot · · Score: 1

      OpenOffice is good in that all of its files are easily taken apart by hand. They are ZIP archives of XML files, and the compression saves, as you say, a lot of space.

      I had a problem with OpenOffice last week when it didn't want to open my saved report. Or the last 6 hourly backups either :-(. Luckily I was able to get the bare text out and redo the report in LaTeX which worked (as usual) like a dream.

      Does anyone know when the **MAJOR** OpenOffice multitasking bug will be fixed? Basically AFAIK, OO calls sched_yield all the time so running e.g. seti will stop OO from doing anything.

      --
      http://blog.grcm.net/
    4. Re: 640 K ought to be enough for anybody by acid_zebra · · Score: 1

      Of course, when you moved the files between formats, you lost any and all undo information. Excel stores this undo information in the file, so when you change a lot of things in this file, over time it will grow. Of course, OO also compresses the files, which also helps a lot.

      --
      -- No Sig is a Good Sig
  26. It will happen with time_t, too by wowbagger · · Score: 5, Informative

    We are seeing problems with off_t growing from 32 to 64 bits. We are also going to see this when we start going to a 64 bit time_t, as well (albeit not as badly - off_t is probably used more than time_t is.)

    However, the pain is coming - remember we have only about 35 years before a 64 bit time_t is a MUST.

    I'd like to see the major distro vendors just "suck it up" and say "off_t and time_t are 64 bits. Get over it."

    Sure, it will cause a great deal of disruption. So did the move from aout to elf, the move from libc to glibc, etc.

    Let's just get it over with.
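
    For anyone who wants to check the "about 35 years" figure, here is a rough sketch of the arithmetic in Python. The helper name last_representable is made up for illustration, and it assumes time_t counts whole seconds since 1970-01-01 UTC; it is only meant for 32-bit widths, since a 64-bit range is far beyond what Python's datetime can represent.

```python
from datetime import datetime, timedelta, timezone

# Assumption: time_t counts whole seconds since the Unix epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits, signed=True):
    """Last UTC instant a time_t of the given width can hold."""
    max_seconds = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return EPOCH + timedelta(seconds=max_seconds)


print(last_representable(32))                # signed 32-bit time_t
print(last_representable(32, signed=False))  # unsigned 32-bit time_t
```

    The signed case lands in January 2038; making the same 32 bits unsigned pushes the limit out to 2106.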

    1. Re:It will happen with time_t, too by koreth · · Score: 1
      First of all, it's a Y2038 problem rather than a Y2106 problem because time_t is signed in many places. Simply switching to an unsigned time_t (who uses time_t to represent pre-1970 values?) will buy us an extra 68 years with minimal application grief, but the underlying problem will still be there.

      It boggles my mind that Sun, for example, went to the trouble of building a whole host of interfaces and a porting process for 64-bit file offsets (see the lf64 and lfcompile64 manpages on Solaris) and yet they didn't bother to increase the size of time_t at the same time. If everyone is going to be recompiling their apps anyway, why not fix it all in one go?

      On the application side, it should be noted that this isn't a problem for code written in Java, whose equivalent of time_t is already 64-bit (in milliseconds, granted, but that only eats about 10 of the extra 32 bits.) Obviously the Java VM won't be able to make up for the underlying OS not supporting large time values, but at least the applications won't have to change.

      First one to start whining about Java's year-584544016 problem gets whacked with a wet noodle.
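
      For the curious, the year quoted above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: a 365.25-day year, a counter starting in 1970, and a made-up helper name last_year. The quoted figure matches a full 64 bits of milliseconds; treating the value as a signed 64-bit quantity (as a Java long is) gives roughly half that.

```python
import math

# Assumption: a "year" is 365.25 days, expressed here in milliseconds.
MS_PER_YEAR = 1000 * 86400 * 36525 // 100


def last_year(bits, signed):
    """Approximate calendar year in which a millisecond counter runs out."""
    max_ms = 2 ** (bits - 1) if signed else 2 ** bits
    return 1970 + max_ms // MS_PER_YEAR


print(last_year(64, signed=False))  # full 64 bits of milliseconds
print(last_year(64, signed=True))   # signed 64-bit milliseconds
print(math.log2(1000))              # bits spent on millisecond resolution
```

      The last line is where "about 10 of the extra 32 bits" comes from: a factor of 1000 costs just under 10 bits.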

    2. Re:It will happen with time_t, too by Ozric · · Score: 1

      Ongoing support, would be my guess.

    3. Re:It will happen with time_t, too by stripes · · Score: 1
      First one to start whining about Java's year-584544016 problem gets whacked with a wet noodle.

      I remember seeing a Sun press release about Java being Y2K compliant, how long it would last, and that Sun promised to fix it at least 3000 years before it became a problem. Or something like that. It amused me greatly at the time.

    4. Re:It will happen with time_t, too by Spoing · · Score: 1

      Only 35 years? Phew! Talk about cutting it close!

      --
      A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
    5. Re:It will happen with time_t, too by Anonymous Coward · · Score: 0

      Rather than going to 64 bits, one can change it to be unsigned, which is the same size, and gives a few years grace for people to transition to 64-bit time_t.

    6. Re:It will happen with time_t, too by John+Sullivan · · Score: 1

      A lot of the Y2K problem was caused by code written within the last 35 years. It wasn't a disaster, but it was a big problem, and a huge drain on many companies' resources as they tried to fix things up at the last minute.

      --
      This is my World Wide Web of Whatever
    7. Re:It will happen with time_t, too by Spoing · · Score: 1
      A lot of the Y2K problem was caused by code written within the last 35 years. It wasn't a disaster, but it was a big problem, and a huge drain on many companies' resources as they tried to fix things up at the last minute.

      Agreed, somewhat. Unlike the Y2K problems, the time issue in Unix systems is fairly transparent to the implementation. Update the OS, and the problem goes away in many cases. Y2K problems, where 2 digits were used instead of 4, were usually program specific.

      Because of that, in 35 years few programs will be impacted.

      --
      A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
    8. Re:It will happen with time_t, too by John+Sullivan · · Score: 1

      Hmm, not really. The OS and applications often exchange time_t's, and they're often stored as 4-byte quantities in data files. Update the OS and you either break binary compatibility, or have to dual the API which provides no benefit to applications until they are updated too. Update the app and you may still have an enormous data conversion problem. Which defines the scale of the problem - there's an awful *lot* of bad application code out there. In mission critical systems. Even in places where the people using it day to day have forgotten or never known it exists.

      Plus history teaches us that organisations *will* wait until the last minute before updating mission critical systems. Ignore that lesson and you're certainly doomed to a repeat of Y2K.

      --
      This is my World Wide Web of Whatever
  27. huh? by inviagrated_amnesiac · · Score: 0, Redundant

    wtf kind of sentence construction is this:

    "It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."

    Why not:

    "It is an interesting problem that some distro-compilers have to face."

    1. Re:huh? by KDan · · Score: 1

      It's certainly something that George Orwell would have frowned upon, but it's not incorrect sentence construction per se.

      PS: Read that Orwell article if you haven't yet, it's really very good

      --
      Carpe Diem
    2. Re:huh? by JanneM · · Score: 2, Informative

      Because the sentences mean different things.

      "It is an interesting problem that some distro-compilers have to face."

      talks about the problem facing distro compilers, whereas

      "It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."

      Talks about the article addressing these problems. /Janne

      --
      Trust the Computer. The Computer is your friend.
    3. Re:huh? by david-currie · · Score: 1

      Because it's not an interesting problem. It's a fucking boring problem if _you_ have to deal with it. But it's interesting to read about because it's the kind of thing you probably haven't thought about if you don't compile distributions. I meant what I wrote.

    4. Re:huh? by Anonymous Coward · · Score: 0

      More words is meaning smarter! Especially big words. 1up, yatta!

    5. Re:huh? by RumpRoast · · Score: 2, Interesting
      Actually you changed the meaning of that sentence. I think really we object to:
      "It's an interesting look into some of the kinds of less obvious problems that distro-compilers have to face."

      "of the kinds" really adds nothing to the meaning here, nor does "have to"

      Thus we have:

      "It's an interesting look into some of the less obvious problems that distro-compilers face."

      The same sentence, but much cleaner!

      Thanks! I'll be here all week.

      --

      My Ass hurts.
    6. Re:huh? by david-currie · · Score: 1

      Now this I can accept. I promise to think about what I write next time. ;)

  28. A woman's perspective . . . by pariahdecss · · Score: 5, Funny

    So my wife says to me, "Honey, do I look fat in this filesystem ?"
    I replied, "Sweetie, I married you for your trust fund not your cluster size."

    1. Re:A woman's perspective . . . by egreB · · Score: 1

      Now, THAT's about the best thing I've read all day (-8 Thank you, my friend, you just made it to the Quotes-section of my door!

  29. Re:Why large files by CoolVibe · · Score: 5, Interesting
    raw video can easily exceed 2 GB in size. Why raw video? Because (like others said) it's easier to edit. Then you encode to MPEG2, which will shrink the size somewhat (usually still bigger than 2 GB, ever dumped a DVD to disk?), so it'll be "small" enough to burn onto a DVD or somesuch. Oh, editing 3 hours of raw wave data also chews away at the disk size. Also, since you need to READ the data from the media to see if it looks nice, you need to have support for those big files as well. Right, now why don't we need files bigger than 2 GB again? Well?

    Oh, you're still not convinced, well see it this way: when in the future will you ever need to burn a DVD?

    Well? A typical one sided DVD-R holds around 4 GB of data (somewhat more), if you use both sides, you can get more than 8 GB of data on it. That's way bigger than 2 GB, no? Now, how big must your image be before you burn it on there? well?

    Right...

  30. Re:Why large grapes by edox. · · Score: 1

    Dont be the good old fox .)

    --
    quote:port 17 udp
  31. Re:Why large files by Anonymous Coward · · Score: 0

    ought to be

  32. 64KB memory segments by KDan · · Score: 1

    Oh come on, those were fun, when you had to load into memory and uncompress a file larger than that :-)

    Oh the fond memories :-)

    Daniel

    --
    Carpe Diem
  33. How large are we talking? by httpamphibio.us · · Score: 1

    It doesn't give a specific filesize in the article...

    --
    sig.
    1. Re:How large are we talking? by voodoopriestess · · Score: 1

      2^32 Bytes (aka 2GB).

      Iain

      --
      ---- "I would be careful in separating your weirdness, a good quirky quantum weirdness, from the disturbed weirdnes
    2. Re:How large are we talking? by kasperd · · Score: 1

      2^32 Bytes (aka 2GB).

      Make that 2^31.

      --

      Do you care about the security of your wireless mouse?
    3. Re:How large are we talking? by NoOneInParticular · · Score: 1
      Ah, this 2^31 brings back memories of the time I had a box for scientific work with appr 4Gb of addressable memory (most of it RAM, but also some swapspace), and wanted to view some kind of lame proprietary video format, with proprietary viewer. When starting up the application it would complain I had less than 4 MB of memory (while in fact I had a thousandfold of that).

      Hmm, the programmers seemed to store the information in an int, so by allocating 2 MB of memory (through Matlab, zeros(10000,10000) is quite a chunk), I could finally convince the application that I did not have negative memory, but actually enough to display the movie.

      But then the video was lame.
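
      A minimal sketch of the suspected bug, in Python. It assumes, as guessed above, that the viewer kept the free-memory byte count in a signed 32-bit int; as_int32 is a made-up helper that mimics that storage, not anything from the actual application.

```python
def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 2 ** 32 if value >= 2 ** 31 else value


GIB = 2 ** 30
for gib in (1, 2, 3):
    # Anything from 2 GiB upward shows up as a negative amount of memory.
    print(gib, "GiB ->", as_int32(gib * GIB), "bytes as seen through a 32-bit int")
```

      Under that assumption, tying up enough memory elsewhere to push the free amount back under 2^31 bytes makes the number positive again, which would explain why the Matlab trick worked.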

    4. Re:How large are we talking? by kasperd · · Score: 1

      When starting up the application it would complain I had less than 4 MB of memory

      I recall some story about a similar bug in MS Basic. If it was used on a PC with more than 512KB of RAM it would say there was not enough RAM. In DOS RAM was measured with a 16bit number counting units of 16 bytes.

      --

      Do you care about the security of your wireless mouse?
    5. Re:How large are we talking? by Anonymous Coward · · Score: 0

      the lengths we go to watch pr0n...

  34. Re:Why large files by Anonymous Coward · · Score: 0

    Can anyone give a good reason for needing files larger than 2gb?
    A msg-id history file on a newsserver with long retention.

  35. Q: Why large files? A: Disk images too by Anonymous Coward · · Score: 2, Interesting
    While almost all the examples given are good, I don't think anyone has mentioned complete disk images. I have recently had to do this in order to recover from a hardware issue (drive cable failure resulted in loss of the MBR, nasty) and on a TiVo unit that had a bad drive.

    I have most all of my older system images available to inspect. The loopback devices under Linux are tailor made for this type of thing.


    I am puzzled as to why you mention the seek times. Surely you would agree that the seek time should be only inversely geometrically related to size, the particular factors depending on the filesystem. Any deviation from the theoretical ideal is the fault of a particular OS's implementation. My experience is that this is not significant.

    (user dmanny on wife's machine, ergo posting as AC)

  36. Funny...in AIX... by cshuttle · · Score: 4, Informative

    We don't have this problem: 4 petabyte maximum file size, 1 terabyte tested at present. http://www-1.ibm.com/servers/aix/os/51spec.html

    1. Re:Funny...in AIX... by n3m6 · · Score: 2, Insightful

      whenever something like this comes up. somebody just has to say "we dont' have a problem, we use X"

      that's just so lame. we have XFS and JFS. you can keep your AIX and your expensive hardware with you.

      thanks.

    2. Re:Funny...in AIX... by Lu+Xun · · Score: 1

      Ok, but how many Library of Congresses is that?

      --
      That's not a soda... it's a caffeine delivery device!
    3. Re:Funny...in AIX... by Anonymous Coward · · Score: 1

      > we have XFS and JFS. you can keep your AIX and your expensive hardware with you.

      whenever something like this comes up. somebody just has to say "we dont' have a problem ($), we use X" that's just so lame.

      I guess its also a well known fact that cheap-ass hardware held together with spit and rubber bands has no scaling limits, especially dealing with terabyte+ files.

      By the way.... you do know where JFS came from, right????? And XFS? If not, your lamer points just went through the roof.

    4. Re:Funny...in AIX... by SN74S181 · · Score: 1

      Expensive hardware?

      My only AIX hardware cost $35 at the auction a few weeks ago. Granted it only has 128 MB, and its PPC chip is less than 200 MHz, but hey.

      I don't run AIX on it, of course. It runs NetBSD.

    5. Re:Funny...in AIX... by Anonymous Coward · · Score: 0

      Funny AIX had JFS long before your pet OS did.

  37. Have you ever seen some people's email? by alen · · Score: 4, Insightful

    On the Windows side many people like to save every message they send or receive to cover their ass just in case. This is very popular among US Government employees. Some people who get a lot of email can have their personal folders file grow to 2GB in a year or less. At this level MS recommends breaking it up since corruption can occur.

    1. Re:Have you ever seen some people's email? by nentwined · · Score: 5, Funny

      I agree with MS on this one. government employees shouldn't be allowed to hold their positions for longer than a year. DOWN WITH GOVERNMENTAL CORRUPTION! ... :)

      --
      heaven
    2. Re:Have you ever seen some people's email? by sqrlbait5 · · Score: 2, Informative

      Yeah, but if you're using NTFS, where there doesn't appear to be a max file size, you still get the 2GB limit on Outlook files. Every damn version of Outlook has had this 2GB limit, but OutlookXP doesn't actually fix the problem, just warns the user at 1.87GB. We have people hitting their limit all the time at work, but that's because they like to send artwork and whatnot and not clear out their folders.

      --
      LDAA #$80 BITA 0x40 BNE END
    3. Re:Have you ever seen some people's email? by Anonymous Coward · · Score: 0

      Huge zip archives of porn is now considered "artwork"?

    4. Re:Have you ever seen some people's email? by kasperd · · Score: 2, Insightful

      2GB in a year or less.

      They probably don't write emails but instead write Word documents and attach them to empty emails.

      --

      Do you care about the security of your wireless mouse?
    5. Re:Have you ever seen some people's email? by sean23007 · · Score: 1

      Don't call that just a Windows phenomenon. There are many cases where it is a good idea to save every email you get. Then again, there are others where it is a good idea to destroy all the evidence. Either situation can happen to you regardless of what OS you use.

      Just saying, is all.

      --

      Lack of eloquence does not denote lack of intelligence, though they often coincide.
    6. Re:Have you ever seen some people's email? by spongman · · Score: 1

      ouch. you're not using exchange, i take it?

    7. Re:Have you ever seen some people's email? by jonathanbearak · · Score: 1

      i thought they don't guarantee anything past 70mb.
      of course, if they switched to something like maildir, there goes half the job market for mcse's.

    8. Re:Have you ever seen some people's email? by ediron2 · · Score: 1
      This is very popular among US Government employees. Some people who get a lot of email can have their personal folders file grow to 2GB in a year or less. At this level MS recommends breaking it up since corruption can occur.
      Hmm... is that the filesystem or the Government that gets corrupt when it gets too fat?

      I also laughed at the thought of a convicted monopolist like MS recommending this breakup.

    9. Re:Have you ever seen some people's email? by InfiniteWisdom · · Score: 1

      > Have you ever seen some people's email?

      Stop being so nosy and stick to reading your OWN e-mail!!!

    10. Re:Have you ever seen some people's email? by benzapp · · Score: 1

      Now that MS has file system level encryption, you would think they would simply store each email as a file. I can think of no other reason why storing every email in a single file makes sense.

      If NTFS is so good, it should be able to handle tens of thousands of small files no problem.

      --
      I don't read or respond to AC posts
    11. Re:Have you ever seen some people's email? by Anonymous Coward · · Score: 0

      Most sane e-mail packages store e-mails individually in a proper filesystem layout.

    12. Re:Have you ever seen some people's email? by egreB · · Score: 1

      Do not make jokes about that - it's actually quite true. I've seen a lot of mails with no content or "See the attached file" where an attached MS Word document contains the .. content.

      Sometimes, the reason for this is because the content is formatted especially in Word, but most of the time it's just a letter or something.

      I've gotten some of these. I just reply and request the information in an open format.

    13. Re:Have you ever seen some people's email? by kasperd · · Score: 2, Funny

      Do not make jokes about that - it's actually quite true.

      Guess what..... I wasn't joking.

      I just reply and request the information in an open format.

      So do I. Sometimes I send the reply in a .dvi file. I got a surprise the day a friend of mine managed to read the .dvi file I had attached.

      --

      Do you care about the security of your wireless mouse?
    14. Re:Have you ever seen some people's email? by egreB · · Score: 1

      Guess what..... I wasn't joking.

      Heh.. Sorry (-8 That'll teach me to read posts twice before replying to them. English isn't my native language, so I misunderstood the tone. It could have been a joke, though, if people didn't actually send their mails as MS Word attachments. Oh well.

      As for sending replies in DVI-format, that's a good idea. I'll do that next time (-8

  38. Re:Why large files by bourne · · Score: 2, Interesting

    Can anyone give a good reason for needing files larger than 2gb?

    Forensic analysis of disk images. And yes, from experience I can tell you that half the file tools on RedHat (like, say, Perl) aren't compiled to support >2GB files.

  39. Switch to gnu/hurd by Anonymous Coward · · Score: 3, Funny

    It has a nice small 1gb filesystem limit. I have partitioned my hard disk into 64 little chunks and it runs very slowly and unstably, but it's completely open source and I'm happy.

    1. Re:Switch to gnu/hurd by /dev/trash · · Score: 1

      Is this true? If so I don't think I'll ever try it out.

    2. Re:Switch to gnu/hurd by shepd · · Score: 1

      It's 2GB, but yes, it is true, HURD is the pinnacle of what happens when you just let people do what the hell they like without any management whatsoever. All you programmers might hate your managers, but honestly, without them, you'd end up with HURD-like projects -- a decade late, and still half a decade to go.

      --
      If you could be told what you can see or read, then it follows that you could be told what to say or think - BoC
  40. Re:Why large files by Anonymous Coward · · Score: 0

    Well, the applications I support store and interpret Seismic data. One survey can routinely be in the >100GB range. The visualisation apps we make are often asked to load 2-20GB in memory alone (that's why we still use Sun and SGI systems to do it, though we are actively pursuing Linux too). So 64-bit filesystems and files are kinda important to us.

  41. Umm, scientific computing by Anonymous Coward · · Score: 1, Insightful

    Many large-scale computing projects easily generate hundreds of gigabytes and even terabytes of data. They are writing to RAID systems and even parallel file systems to improve their IO.

    Think beyond the little toy that you use. These projects are using Unix (Solaris, Linux, BSD and even MacOSX) on clusters of hundreds or thousands of nodes.

  42. Re:Why large files by benevold · · Score: 2, Insightful

    We use a Unidata database here for an ERP system. Each database is more than 2gb apiece (more like 20 gb) of relatively small files; when the directories are tarred for backup they are usually over 2gb, which means that gzip won't compress them. Unless I'm missing something, I don't see an alternative to files larger than 2gb in this case. Sure, on the personal computing level the closest thing you probably get is ripping DVDs, but there are other things out there, and I realize this is tiny in comparison to some places.

  43. Wrong. by I+Am+The+Owl · · Score: 1

    You obviously have never done any work with video before. Most DV will eat up 2GB easy with 15min of footage or less.

    --

    --sdem
  44. Re:Why large files by Anonymous Coward · · Score: 0
  45. Its not the size of the file... by bananaape · · Score: 1, Funny

    Its how you use it.

  46. MOD UP by xintegerx · · Score: 1

    I e-mailed somebody on the Board of Higher Ed of my State for some answers, and they simply replied

    Please call me at #-###-###-###.

    Thanks

    He has a really good point if mail programs put archives in one big zip-equivalent file, because these CAN get huge.

    1. Re:MOD UP by DAldredge · · Score: 1, Insightful

      Thats not why he wanted you to call him. If he answered your questions via email there would have been a record of what he had said.

    2. Re:MOD UP by Webmonger · · Score: 1

      I sometimes do tech support, and I often find it much easier to arrive at the right answer over the phone than via email. The immediate feedback, the ability to get clarification, to discuss alternatives all make it my preferred method.

      Though your theory may be correct in this case. Who can say?

  47. Re:Why large files by Veteran · · Score: 1

    I have run into problems trying to compress a tar archive of my home directory which has been around since 1995 when I switched to Linux. The two gig limit runs into trouble here.

  48. Re:Why large files by kasperd · · Score: 3, Insightful

    The seek times alone withinr these files must be huge

    Who modded that as Insightful? Sure, if you are using a filesystem designed for floppy disks, it might not work well with 2GB files. In the old days where the metadata could fit in 5KB a linked list of diskblocks could be acceptable. But any modern filesystem uses tree structures which makes a seek faster than it would be to open another file. Such a tree isn't complicated, even the minix filesystem has it.

    If you are still using FAT... bad luck for you. AFAIK Microsoft was stupid enough to keep using linked lists in FAT32, which certainly did not improve the seek time.
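
    To put rough numbers on the linked list versus tree point, here is a Python sketch. All parameters are assumptions picked for illustration (4 KB clusters and blocks, 12 direct pointers, 4-byte block pointers), not the layout of any particular filesystem, and both function names are made up. It also ignores caching, which real implementations rely on.

```python
def fat_links_followed(offset, cluster_size=4096):
    """Links walked in a FAT-style cluster chain to reach byte `offset`."""
    return offset // cluster_size


def indirect_blocks_read(offset, block_size=4096, direct=12, ptr_size=4):
    """Indirect blocks read to reach byte `offset` in a block-pointer tree."""
    per_block = block_size // ptr_size   # pointers held by one indirect block
    block = offset // block_size         # logical block number within the file
    if block < direct:
        return 0
    block -= direct
    for depth in (1, 2, 3):              # single, double, triple indirect
        if block < per_block ** depth:
            return depth
        block -= per_block ** depth
    raise ValueError("offset beyond what three levels of indirection can map")


offset = 2 ** 31 - 1                     # last byte below the 2 GB mark
print(fat_links_followed(offset), "chain links vs",
      indirect_blocks_read(offset), "indirect blocks")
```

    With these numbers a seek to the end of a 2 GB file walks over half a million chain links in the linked-list case, against two indirect blocks in the tree case.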

    --

    Do you care about the security of your wireless mouse?
  49. Re:Why large files by martinschrder · · Score: 1

    Bitmap files for image setters can easily become huge. Think of 500x100(cm)x1000x1000(pixels).

  50. AutoZone's 1TB DB by Anonymous Coward · · Score: 0

    I'm posting AC on this one.

    I can tell you that, to my knowledge, the AutoZone corporation has databases which exceed a terabyte in size. Yes, that's 1 terabyte. When you consider the sheer number of AutoZone retail locations, combined with their giant inhouse catalog, sales records going back umpteen years, customer data, etc. it's not hard to imagine such a large database.

    I'm not saying that one huge database is the way to go. But I am saying that AFAIK it's in practice. 2 gigs is nothing when it comes to file size.

  51. Why not to learn from past? by Libor+Vanek · · Score: 2

    I just wonder why we don't learn from past limits and remove these limits "forever". E.g. 1 month ago I received a question about the possibility of building a 10 TB Linux cluster (physicists are crazy ;-)).

    There surely MUST be some way to do this - I just imagine some file (e.g. defined in LSB) which would define these limits for the COMPLETE system (from kernel, filesystems, utils to network daemons). I know there are efforts toward things like this, but if we'd say (for example) that a distribution in 2004 won't be marked "LSB compatible" if ANY of its programs use any other limits, I think it will create enough pressure on Linux vendors.

    Just a crazy idea ;-)

    1. Re:Why not to learn from past? by n3m6 · · Score: 1

      there is no spoon and there is always a limit.

      the problem is where its sticking at . ;)

    2. Re:Why not to learn from past? by Libor+Vanek · · Score: 1

      The point of my posting is to have limits in only one file for complete system and need to just change it there.

  52. The O/S should do it and do it well. by tjstork · · Score: 3, Interesting

    1) Splitting up a big file turns an elegant solution into an inelegant nightmare.

    2) Instead of 10 different applications writing code to support splitting up an otherwise sound model, why not have 1 operating system have provisions for dealing with large files.

    3) You are going to need the bigger files with all those 32 bit wchar_t and 64 time_ts you got!

    --
    This is my sig.
  53. Re:data warehouse, and any database for that matte by CrudPuppy · · Score: 2, Informative

    the datafile size averages 8GB in the warehouse.

    --
    A year spent in artificial intelligence is enough to make one believe in God.
  54. Re:Why large files by Perl-Pusher · · Score: 1

    Science Data usually consist of huge multidimensional arrays. I have seen satellite data in huge netcdf files that are very close if not slightly larger than that.

  55. Re:Why large files by markz · · Score: 1

    database dumps - one of our smaller database dumps is 2.3 GB compressed. The dumps are the easiest method of backup and distribution - locally and (very) remotely.

  56. Re:Why large files by bunratty · · Score: 2, Interesting

    Over Christmas and New Years, I helped my wife run a simulation of 1000 different patients for an academic pharmacokinetics paper. The run took ten days and had an input file of about 1.5 GB. If her computer was faster, or she had access to more computers, she would have wanted to simulate more patients and would easily have needed support for files larger than 4 GB. As CPUs get faster and hard disks get larger, there will be much more demand for these large files as well as more than 4 GB per process.

    --
    What a fool believes, he sees, no wise man has the power to reason away.
  57. Re:Why large files by gbitten · · Score: 1

    Another example of where large files are useful is database files. At my job, the DB machine (Solaris) doesn't have sufficient disk space to generate the DB dump. The biggest dump is 11GB and I wasn't able to put it on a Linux box (RH 6.2), so I used FreeBSD 4.2 with success.

  58. BeOS Filesystem by SixArmedJesus · · Score: 2

    I remember reading in the BeOS Bible that the BeOS filesystem could contain files as large as 18 petabytes. Makes you wonder two things: What's the biggest filesystem that you could use with a BeOS machine? And why don't other OSs have filesystems like this? Especially with those awesome extended attributes. I weep for the loss of the BeOS filesystem...

    --

    *slight crashing sound*
    1. Re:BeOS Filesystem by Yokaze · · Score: 4, Informative

      Mine is bigger than yours :)

      Linux XFS: 9 exabytes

      Also supports extended attributes.

      --
      "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
    2. Re:BeOS Filesystem by SixArmedJesus · · Score: 1

      Hmmmm... yes, yours is bigger. Thanks for the info. As far as the BeOS filesystem, though, I really liked the attributes because they could be any size, AND they could be arbitrary binary data. So, for example, file icons could be kept in the attributes. Or sounds (although how that would be useful, I dunno). Another good example of this was the default BeOS text editor. It could handle colors, bold, italic, and underline, much like a word processor, but it was still a plain text file. The style data was kept in the attributes. Also, I think my real disappointment with the whole XFS thing is that there are no file managers that can handle those extended attributes. In BeOS, the file manager utilized these attributes with ease. Icons were displayed, you could layout a manager window so that it would display the various types of data, and you could basically use the filesystem as a database. And the tools for it were so easy to use and readily available. I just haven't seen it with Linux yet. Although, if I found one, I'd surely give it a go.

      Another quick note, I've read somewhere that FreeBSD's UFS2 also has extended attributes. It would be interesting to know how that would compare to both the BeOS FS and XFS as far as file size and what types of attributes it supports.

      Thanks again for the info!

      --

      *slight crashing sound*
    3. Re:BeOS Filesystem by Anonymous Coward · · Score: 0

      It's 18,000! (THOUSAND!) petaBytes for BeOS.
      2^64 = 18446744073709551616. ...Almost enough.
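
      The petabyte and exabyte figures thrown around in this thread differ mostly because of decimal versus binary units. A quick Python sketch of the conversion, where the in_units helper and the unit table are just illustrative:

```python
# Decimal (PB, EB) and binary (PiB, EiB) unit sizes in bytes.
UNITS = {"PB": 10 ** 15, "PiB": 2 ** 50, "EB": 10 ** 18, "EiB": 2 ** 60}


def in_units(n_bytes):
    """Express a byte count in each of the units above."""
    return {name: n_bytes / size for name, size in UNITS.items()}


for name, amount in in_units(2 ** 64).items():
    print(f"2^64 bytes = {amount:,.1f} {name}")
```

      So 2^64 bytes is about 18,000 decimal petabytes, or exactly 16 binary exabytes.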

  59. Re:Why large files by joto · · Score: 1
    Can anyone give a good reason for needing files larger than 2gb?

    Yes. Sometimes you need to store a lot of data. Even DVDs hold 4.3 GB of data these days. But that's not even much compared to the amount of data we handle in seismic research. I would believe astronomers, particle physicists and lots of other people also routinely handle ridiculous amounts of data.

    By the way, in producing the DVD, you would naturally work with uncompressed data. How would you handle that?

    The seek times alone withinr these files must be huge, and it smacks a bit of inefficienecy

    And because it is inefficient, we should not support it? As a matter of fact, any file larger than one disk-block is inefficient. Maybe we should stop supporting that as well?

    sure its just as bad to have an app use hundreds of say 4kb files or so, but two GIGABYTES???

    As I've said, it's not really that much, depending on the application.

  60. Re:Why large files by Zathrus · · Score: 2, Interesting

    In my previous job we regularly processed credit data files >2 GB. All the data is processed serially (as someone else mentioned), so seek time is not an issue (nor is it an issue in a binary data file - seek to 1.4GB. Done. Next.).

    The real issue we ran up against was compression... we wanted to have the original and interim data files available on-disk for a while in case of reprocessing. The processing would generally take up 10x as much space as the original data file, so you compressed everything. Except that gzip can't handle files >2GB (at the time an alpha could, but we didn't want to touch it). Nor can zip. So we had to use compress. Yay. (bzip could handle it, but was decided against by the powers that be).

    Compression of large files is still an issue, unless you want to split them up. Unless you download a beta version gzip still can't handle it. As I understand it zip won't ever be able to do it. There are some fringe compressors that can handle large files, but, well, they're fringe.

  61. Re:Why large files by imnoteddy · · Score: 1
    Databases.

    The computer aided design databases for an automobile, when you have 3D models for the parts, the tooling, plant layout, etc. is in the low terabyte range. As another example, Boeing dedicates about 14 terabytes to commercial airplane geometry data storage.

    Or Astronomy. A planning document talks about a project generating 300 terabytes per year.

    --
    No electrons were harmed creating this post, though some may have been subjected to electrical and/or magnetic fields.
  62. Re:Why large files by Markus+Landgren · · Score: 1

    Last time I wrote a 7 gig file it was an image of a hard disk. Lots of other stuff (video) can get large too. Anyway, there is an error in the headline. 2 gigs is not a limit in modern unices, only in ancient or otherwise really crappy unices.

  63. Somewhat cumbersome, even on Linux by topologist · · Score: 2, Informative
    To enable LFS (Large File Support) in glibc (which not all filesystems support), you need to recompile your application with
    -D_FILE_OFFSET_BITS=64 and -D_LARGEFILE_SOURCE

    This forces all file access calls to their 64-bit variants, and you'll explicitly need to use structs like off64_t instead of off_t where needed. And I believe most large file support is really available only past glibc 2.2

    Additionally you need to use O_LARGEFILE with open etc. So legacy applications that use glibc fs calls have to be recompiled to take advantage of this, and may need source level changes. Won't work on older kernels either.
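
    For anyone wondering where the 2 GB figure comes from in the first place: it is what a signed 32-bit file offset can hold. A small sketch in Python, using the `struct` module only as a stand-in for fixed-width C integers (the helper `fits` is hypothetical and this is not how glibc itself is implemented):

```python
import struct

MAX_32BIT_OFFSET = 2 ** 31 - 1  # largest value of a signed 32-bit offset
SEVEN_GB = 7 * 2 ** 30          # e.g. the 7 GB images mentioned in this thread


def fits(fmt, offset):
    """Return True if offset can be stored in the fixed-width type fmt."""
    try:
        struct.pack(fmt, offset)
        return True
    except struct.error:
        return False


# "<i" is a signed 32-bit integer, "<q" a signed 64-bit integer.
print(fits("<i", MAX_32BIT_OFFSET))      # last byte position below 2 GiB
print(fits("<i", MAX_32BIT_OFFSET + 1))  # one byte further
print(fits("<q", SEVEN_GB))              # same idea with a 64-bit offset
```

    Any interface that passes offsets around in the 32-bit form cannot name a position past 2 GiB - 1, no matter what the filesystem underneath can store.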

    1. Re:Somewhat cumbersome, even on Linux by topologist · · Score: 1

      Okay, I didn't see the link to the page which talks about all of this at the end of the article. Oops.

  64. snicker by Rhinobird · · Score: 1

    maybe the plural for Unix should be Unixen

    Sudden thought of "Linuxen the HOOOOOUUUSSSSE, bizzach!"

    --
    If Mr. Edison had thought smarter he wouldn't sweat as much. --Nikola Tesla
  65. Yep... by Kjella · · Score: 2, Informative

    Some numbers for *uncompressed* video:

    NTSC/YUV2/stereo: ~111gb for a cinema movie (1hr 45min)
    PAL/YUV2/stereo: ~125gb for same

    HDTV/surround: ~908gb for same

    With huffyuv (very low CPU usage, lossless) you should be able to cut that by a factor of 2-3. But it's still *huge*

    Kjella

    --
    Live today, because you never know what tomorrow brings
    1. Re:Yep... by kasperd · · Score: 1

      NTSC/YUV2/stereo: ~111gb for a cinema movie (1hr 45min)
      PAL/YUV2/stereo: ~125gb for same

      How did you reach those numbers? AFAIK NTSC and PAL use the same line frequency and have the same number of pixels per line, so that would lead to the same size.

      --

      Do you care about the security of your wireless mouse?
  66. Re:Why large files by Anonymous Coward · · Score: 0

    what about next-generation video discs, should they spilt their scenes into 4k files as well?

  67. there is no such thing as a double-sided DVD-R by Anonymous Coward · · Score: 0

    Nor double-layer, for that matter. DVD-Rs max out at 4.7GB period, end of story. There are two-sided DVD-RAMs though. But the two sides are used separately, so no double-sized images. Heck, no images at all since DVD-RAM is fully random access.

    My (Windows) machine has no problem with >4GB files, BTW. Stupid WinZip can't expand files to that size (or zip them from that size) though.

    1. Re:there is no such thing as a double-sided DVD-R by psm321 · · Score: 1
    2. Re:there is no such thing as a double-sided DVD-R by Anonymous Coward · · Score: 0

      PKZip can.

  68. Error Prevention by Veteran · · Score: 2, Interesting

    One of the ways to keep errors from creeping into programs is to put limits on things so high that you can never reach them in the practical world.

    The 31 bit limit on time_t overflows in this century - 63 bits outlasts the probable life of the Universe so it is unlikely to run into trouble.

    That is the best argument I know for a 64 bit file size; in the long run it is one less thing to worry about.
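
    To put numbers on the time_t remark, here is a quick sketch. It assumes the usual Unix epoch of 1970-01-01 UTC and a 365.25-day year; the variable names are just for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Last second representable with 31 value bits (a signed 32-bit time_t).
last_31bit_second = EPOCH + timedelta(seconds=2 ** 31 - 1)

# Rough span covered by 63 value bits (a signed 64-bit time_t).
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_63bit = (2 ** 63 - 1) / SECONDS_PER_YEAR

print(last_31bit_second.isoformat())
print(f"{years_63bit:.3g} years")
```

    The first value falls in January 2038; the second is on the order of hundreds of billions of years.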

    1. Re:Error Prevention by jhines · · Score: 1

      The next significant problem with time will come in the year 9999, when the four digit field that lazy programmers have used for thousands of years overflows. Didn't they learn their lessons the first time around?

      Digital took a bug report on this for Vax/VMS and promised a fix, some time in a future release.

    2. Re:Error Prevention by Anonymous Coward · · Score: 1, Insightful

      > limits on things so high that you can never reach them in the practical world.

      The 2 GByte limit came from a time when 14 inch disks held 30 MByte and disk space and RAM were too precious to waste an extra 32 bits when these would always be all zero for the foreseeable future.

      The concept of a hard drive that was as large as 2 GByte was just silly - it would fill the whole computer room, and in any case this is a limit on each file, not on the file system.

    3. Re:Error Prevention by Thing+1 · · Score: 2, Interesting
      One of the ways to keep errors from creeping into programs is to put limits on things so high that you can never reach them in the practical world.
      Anyone ever thought of a variable-bit filesystem?

      Start with 64-bit, but make it 63-bit. If the 64th bit is on, then there's another 64-bit value following which is prepended to the value (making it a 126-bit address -- again, reserve one bit for another 64-bit descriptor).

      Chances are it won't ever need the additional descriptors since 64-bits is a lot, but it would solve the problem once-and-for-all.

      --
      I feel fantastic, and I'm still alive.
    4. Re:Error Prevention by Ben+Hutchings · · Score: 1

      This is not a problem for file-systems as stored on disk so much as it is a problem for the file-system API. Passing around and manipulating arbitrarily long numbers in memory is substantially slower than using fixed-length numbers and could result in a big performance penalty for file operations.

    5. Re:Error Prevention by Istealmymusic · · Score: 1

      I believe this is known as BER encoding (Perl's unpack uses the "w" format specifier to decode these types of integers). For each byte (or in your example, qword), the MSB is set if another unit follows, unset if not. Compresses quite well, but practically, it's not worth it. Reading a fixed-size integer is an O(1) operation; BER integers are read much slower and mess up alignment.
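
      A minimal sketch of the byte-wise scheme described above (base-128 digits, most significant first, high bit set on every byte except the last). The function names `ber_encode` and `ber_decode` are made up for this example. The decoder has to walk the bytes one at a time, which is the cost being pointed out.

```python
def ber_encode(value):
    """Encode a non-negative integer as a BER compressed integer."""
    if value < 0:
        raise ValueError("only non-negative integers can be encoded")
    groups = [value & 0x7F]          # final byte: continuation bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: bit set
        value >>= 7
    return bytes(reversed(groups))   # most significant group first


def ber_decode(data):
    """Decode one integer from data; return (value, bytes_consumed)."""
    value = 0
    for used, byte in enumerate(data, start=1):
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:          # continuation bit clear: last byte
            return value, used
    raise ValueError("truncated BER integer")


encoded = ber_encode(2 ** 64)        # one past what fits in 64 bits
print(len(encoded), ber_decode(encoded)[0] == 2 ** 64)
```

      Small values stay small (anything below 128 takes one byte), and there is no upper limit, but the length is only known after scanning for the terminating byte.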

      --
      "The lesson to be learned is not to take the comments on slashdot too literally." --Vinnie Falco, BearShare
  69. Re:Why large files by Anonymous Coward · · Score: 0

    We say: "Bill Gates said something criminally stupid and short-sighted."

    You say: "Bill Gates says he didn't." (read the link).

    Gates said it... along with a great many other moronic things. Get over it.

  70. Re:Why large files by perfects · · Score: 1

    Bill Gates now claims that he was misquoted. What he really said was that "640K should be more than enough memory for anybody's toaster."

  71. Re:Why large files by wideBlueSkies · · Score: 1

    That tarball of 2002 stock quotes used to feed your stock research system.

    The database files themselves, in the system.

    --
    Huh?
  72. Re:Funny...in IRIX/XFS... by Anonymous Coward · · Score: 1, Informative

    Other filesystems don't either :

    http://www.sgi.com/software/xfs/techinfo.html

    "Max. File Size
    Designed to scale to 9 million TB with current hardware supporting scalability to 8000 TB on IRIX. Linux-64, 2 TB Max File Size. Solaris and Windows NT undergoing scalability testing"

    "Max. File System Size
    Designed to scale to 18 million TB with current hardware supporting scalability to 8000 TB on IRIX. Linux-64, 500 file systems of 2 TB each. Solaris and Windows NT undergoing scalability testing."

    Unfortunately, it's not just a problem with the filesystem, but also and most often a problem with the applications. So, AIX does have this problem just as much as any other. Unless you've tested all the applications available for AIX.

  73. Re:Why large files by lib · · Score: 0

    One not-so-everyday reason....
    Research.

    Right now I'm doing data-cache research that requires reference traces that are post-processed for various statistics (aka. every load & store is written to a file and then examined by other apps).

    These files are HUGE. Some of the benchmarks we're running have well over a billion memory references. For each reference you have 4 to 8 bytes for the address and various additional bytes for additional statistics.

    On the low side these files are ~ 4GB :(

  74. It's all about efficiency. by OS24Ever · · Score: 2, Insightful

    There is something innate in the education, learning, and daily working of a programmer that makes them not want to use 'too big' of a number for a certain task.

    it either

    A) Wastes Memory Space
    B) Wastes Code Space
    C) Wastes Pointer Space
    D) Or Violates some other tenet the programmer believes in

    So, When they go out and create a file structure, or something similar, they don't feel like exceeding some 'built-in' restriction to their way of thinking.

    And usually, at the time, it's such a big number that the programmer can't think of an application to exceed it.

    Then, one comes along and blows right through it.

    I've been amused by all the people jumping on the 'it don't need to be that big' bandwagon. I can think of many applications that would need big files on ext3 or whatever. They include:

    A) Database Servers
    B) Video Streaming Servers
    C) Video Editing Workstations
    D) Photo Editing Workstations
    E) Next Big Thing (tm) that hasn't come out yet.

    --

    As a rock-in-roll Physicist once said, No matter where you go, there you are.

    1. Re:It's all about efficiency. by dvdeug · · Score: 2, Insightful

      There is something innate in the education, learning, and daily working of a programmer that makes them not want to use 'too big' of a number for a certain task.

      We have code for infinite precision integers. The problem is, if it were used for filesystem code, you still couldn't do real-time video or DVD burning, because the computer would be spending too long handling infinite precision integers.

      As long as you're careful with it, setting a "really huge" number, and fixing it when you reach that limit is usually good enough.

  75. A few more words: by JohnnyBigodes · · Score: 1

    - Backups to a single file (no, I don't want to copy a fscking whole directory structure, thank you very much).
    - Video editing.
    - Large sound editing (multi-channel).
    - Ever tried to create a DVD ISO image? there you go...
    - Speaking of DVD's, *you* try dumping one to your harddisk with 2GB files.
    - Disk images (ever had to Ghost around a boot-disk or boot-DVD with a disk image?)
    - 3D animation files (probably included in the "video editing" section).

    want me to go on? the list is bigger...

  76. Please mod parent up. by wideBlueSkies · · Score: 1

    Please mod this guy up as interesting or informative.

    --
    Huh?
  77. I can't believe this...superSynchronicity??? by haggar · · Score: 2, Interesting

    I had a problem with HP-UX apparently not wanting to transfer via NFS (when the NFS server is on HP-UX 11.0) files larger than 2GB. I had to backup a Solaris computer's hard disk using DD across NFS. This usually worked when the NFS server is Solaris. However, last friday it failed, when the server was setup on HP-UX. I had to resort to my little Blade 100 as the NFS server, and I had no problems with it.

    I have noticed that on the SAME DAY some folks have asked questions about the 2 GB filesize limit in HP-UX on comp.sys.hp.hpux !! Apparently, HP-UX default tar and cpio don't support files over 2 GB, either. Not even in HP-UX 11i. I never thought HP-UX stank this bad...

    How does Linux on x86 stack up? I decided not to use it for this backup, since I had my Blade 100, but would it have worked? Oh, btw, does Linux finally have a command like "share" (it exists in Solaris) to share directories via NFS, or do I still need to edit /etc/exports and then restart the NFS daemon (or send SIGHUP)?

    --
    Sigged!
    1. Re:I can't believe this...superSynchronicity??? by Arethan · · Score: 1

      the command that is equivalent to 'share' is 'exportfs', it can usually be found in /usr/sbin/.

      It allows you to push NFS exports to the kernel and nfsd without having to edit /etc/exports. Thus, they do not persist across reboots. However, you cannot use exportfs until nfsd is running, and nfsd will auto kill itself if /etc/exports is completely empty. So you must share at least 1 directory tree in /etc/exports before you can use exportfs.

      I believe Solaris has this same problem with share though. I don't remember these days, it's been a while since my SCSA cert. (Heh, i guess that's what man pages are for :)

    2. Re:I can't believe this...superSynchronicity??? by haggar · · Score: 1

      Thanks.
      And no, Solaris doesn't have this kind of problem. In Solaris, you have (a more general) /etc/dfs/* for sharing filesystems. Even if there is no fs shared in /etc/dfs/dfstab, nfsd and mountd will happily run. This autokill thing is really stupid.

      --
      Sigged!
    3. Re:I can't believe this...superSynchronicity??? by haggar · · Score: 1

      Oh yeah, so how does Linux cope with > 2 GB files transferred via NFS TO a Linux server? So far, only Solaris seems to support our solution. I have not tried Linux because the test takes a fairly considerable time, and if transferring large files via NFS isn't supported, I'd better not even try.

      --
      Sigged!
    4. Re:I can't believe this...superSynchronicity??? by Arethan · · Score: 1

      *shrug*
      I can't comment on the autokill thing. That's how the NFS implementation in all the distros I've ever used worked. Unix is Unix, except for all the different little quirks. ;)

      BTW: I just checked RedHat 8.0 for the autokill "feature". Looks like they fixed it. I just ran "/etc/init.d/nfs start" and it started, didn't bitch about /etc/exports only containing a single '#', and allowed me to then add an export using exportfs and I mounted it successfully afterwards. I even tried it after completely deleting /etc/exports. Still ran fine.

      Long story short, the autokill is gone in RedHat 8. That's good in my book. I never liked that behaviour either. As for the >2GB files. I'm running a test right now just for shits and giggles, but I don't see why it wouldn't work. As long as the file system can handle the destination file's size, nfs3 should behave admirably. (BTW, Solaris 8 uses nfs ver3 last I checked.)

    5. Re:I can't believe this...superSynchronicity??? by haggar · · Score: 1

      Yes, Unix is Unix, yet time and time again I see that Solaris satisfies me a bit more than the rest of the crop. It's all small things, I know, but the fact that my crappy little Blade 100 turned out to be effectively more powerful than a HP L2000 was quite shocking. What a difference an OS makes...
      It's really good that this autokill is gone in RH 8.0. Still, if I think about the fact that Solaris 2.7 had it, it feels as if it took Linux really long to get its act together on this little detail. On the other hand, Sun invented NFS, so it's no wonder they have it done right, even in the little details. It's no wonder that Solaris 8 has NFS ver. 3.

      I'll be honest with you: if I manage to get a Linux server do the NFS server job for this backup procedure, it'll get a huge plus in my book, and it will get a lot of visibility with some large customers. But last time I had to work with Linux as NFS server, something was wrong with the locking.

      --
      Sigged!
    6. Re:I can't believe this...superSynchronicity??? by Arethan · · Score: 1

      For the record, my little test was successful. I moved a 3.7GB file across NFS between Linux boxes. Filesizes and md5sums match between the original and the NFS copy. The source and destination were both running RedHat 8, though Linux has used the NFS ver3 for quite a while now. Definitely as of the 2.2 kernel, probably back into the 1.x series even.

      I will agree with you, Sun really has their act together when it comes to making a good commercial Unix. They have nice hardware, and their OS is pretty solid. They have a few memory leaks in some libraries (as of 2.8), but nothing that can't be worked around, and most applications that care take the bugs into account. Of course, Sun has been in the biz for quite a while. ;)

      Anyhow, always nice chatting with fellow Slowaris er..Solaris junkie. ;)

      Cheers!

    7. Re:I can't believe this...superSynchronicity??? by haggar · · Score: 1

      OK, I will then do my test here in the lab. If it works, it will be the second supported solution, apart from Solaris 8 and 9.

      probably back into the 1.x series even
      Umm... even if that was so (I tend to doubt that.. 1.x? 1.2.x was the first I saw doing anything useful), back then there wasn't anything but ext2, which surely didn't yet support large files. The patch for ext2 for such support came out somewhere near the end of 1999. I think RedHat 6.0 was the first to support it, if I recall correctly. But c'mon... Linux 1.x, I'm almost getting nostalgic. It didn't even have ext2, just the crappy old extfs...

      --
      Sigged!
    8. Re:I can't believe this...superSynchronicity??? by Wolfrider · · Score: 1

      --Eh, it was probably intended as a "security feature." Just ends up being annoying for the REAL root.

      --
      .
      == WolfriderV6 == I'm willing to admit that *I just might* be wrong... Are you??
    9. Re:I can't believe this...superSynchronicity??? by haggar · · Score: 1

      I see. OK. I am not sure how it would improve security, but I can imagine that it was the thought behind it. It's not like it's some bug, more like a deliberate decision, isn't it.

      --
      Sigged!
  78. At microsoft's level ... by Anonymous Coward · · Score: 0

    most people recommend breaking them up.

  79. Cripes! by Hubert_Shrump · · Score: 1

    That's three words.

    I didn't realize Daniel was so big, though.

    Has he considered going lossy?

    --
    Keep your packets off my GNU/Girlfriend!
  80. PAL & NTSC by Kjella · · Score: 2, Informative

    PAL: Max 720x576x25fps interlaced (50 Hz)
    NTSC: Max 640x480x29.97fps interlaced (60 Hz)

    No, they don't have the same frequency, nor scanlines. Some European TVs will take PAL-60, which is like PAL only at 60Hz, though. Also I don't think the color space works in the same way, but not sure about that one. That was why I used YUV2 (16bit) for both.
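
    A rough way to reproduce the movie sizes from the frame parameters above. This is only a sketch: the 48 kHz 16-bit stereo audio track is an assumption (the posts don't state the audio format), and the helper names are invented for the example.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=2):
    """Size of uncompressed video; 2 bytes per pixel for 16-bit YUV."""
    return round(width * height * bytes_per_pixel * fps * seconds)


def pcm_audio_bytes(seconds, rate_hz=48_000, channels=2, bytes_per_sample=2):
    """Size of uncompressed PCM audio (assumed: 48 kHz, 16-bit, stereo)."""
    return rate_hz * channels * bytes_per_sample * seconds


RUNTIME_SECONDS = 105 * 60  # 1hr 45min
GIB = 2 ** 30

ntsc_total = raw_video_bytes(640, 480, 29.97, RUNTIME_SECONDS) \
    + pcm_audio_bytes(RUNTIME_SECONDS)
pal_total = raw_video_bytes(720, 576, 25, RUNTIME_SECONDS) \
    + pcm_audio_bytes(RUNTIME_SECONDS)

print(f"NTSC: {ntsc_total / GIB:.1f} GiB")
print(f"PAL:  {pal_total / GIB:.1f} GiB")
```

    Under these assumptions both totals land in the same ballpark as the ~111gb and ~125gb figures, with PAL the larger of the two; either way it is dozens of times past a 2 GB file limit.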

    Kjella

    --
    Live today, because you never know what tomorrow brings
  81. Re:Why large files by Proneax · · Score: 1

    I remember like 4 or 5 years ago talking to my friend's dad, who works at kodak, and he would fill an entire 2gb jazz drive with one picture.

  82. Re:data warehouse, and any database for that matte by SwissCheese · · Score: 1

    Even our Exchange private information store is somewhere around 10GB, and we are a small company by most standards

  83. Only 35 years... by Kjella · · Score: 1

    And that big y2k problem that was supposed to bring down mankind? How many years did it take to fix that? I very much doubt we started in 1965 ;)

    Prediction: First distro to "suck it up" will be around 2035 or so. Personally, I think this is as far down on the priority list as you can get. Besides, with open source, is it really that problematic to grep the source for "time_t" and fix it? I don't think so.

    Kjella

    --
    Live today, because you never know what tomorrow brings
    1. Re:Only 35 years... by Dan+Ost · · Score: 3, Informative

      For most programs, it would require little more
      than to change the typedef that defines __time_t
      in bits/types.h.

      For stupidly written programs that assume the
      size of __time_t or that use __time_t in unions,
      each will need to be addressed individually to
      make sure things still work correctly.

      --

      *sigh* back to work...
    2. Re:Only 35 years... by edhall · · Score: 1

      The FreeBSD folks have already done a considerable amount of work on this, even to the point of making time_t 64 bits for both kernel and userland and testing for issues. Enough is known that the main worry now is how to handle the change in ports, some of which need a fair amount of work to move away from 32-bit time_t. But at the rate things are going, I'd expect that they will make the transition to 64-bit time_t for FreeBSD 6.0. I've no idea how they will handle the legacy issues (ports and pre-6.0 binaries) though.

      -Ed
    3. Re:Only 35 years... by Anonymous Coward · · Score: 0

      I worked at a company that had old computers from the 60s and 70s. They were for digitizing clothing patterns. The Y2K bug never showed up on those computers, so I guess people were working on the problem at around 1965.

  84. whatever by Anonymous Coward · · Score: 0

    George Orwell may have wrote some nifty stories but the guy was no linguist mmmkay.

  85. Re:data warehouse, and any database for that matte by Anonymous Coward · · Score: 0
    my data warehouse at work is 600GB and grows at a rate of 4GB per day.

    That sounds like fun with backups.

  86. row partitions by axxackall · · Score: 1
    I agree that 2 GB limit is obsolete today, especially for projects with large databases and with video editing tasks.

    However, I would recommend staying away from > 2GB files in a database environment. Even if your FS supports large files, you still lose performance on the "double-driver": first your kernel provides a partition, then it provides a file-system over it. But if you need such big files, why would you need a file-system? Just use row partitions!

    Of course you still need large files for video, but massive concurrent performance overhead is not a typical problem in such a case.

    --

    Less is more !
    1. Re:row partitions by lenski · · Score: 1
      I agree, 2GB is "inadequate" for current large system applications, and for "new media", etc.

      On the other hand, I question whether having >2GB flat files is a reasonable way to organize big data. (Movies have "scenes", music is often divided into "songs", or "movements" for the classically minded, plays have "acts", and so on. Hierarchy and subdivision come naturally in many domains of activity.)

      On the gripping hand, as 64-bit CPUs become more common (MMmm... Hammer...), I expect a relatively natural though not necessarily monotonic progression toward 64-bit addressing in flat files.

    2. Re:row partitions by iamacat · · Score: 1
      I don't think most database programmers can write better space allocation, I/O buffering or virtual memory code than good OS programmers. Did any of you guys write a database buffering code and used something better than a simple LRU list? Like taking physical disk layout into account? If you did, and it performed better than the OS on realistic benchmarks, why not write a reusable device driver that will improve performance of everything, not just the database?

      Now it's possible that somehow you have a very good knowledge of your application-specific disk usage pattern and can get a speed up that outweighs user-mode overhead, system swapping your buffers in and out of memory and so on. In this case, you better use a dedicated disk rather than just a partition. Otherwise, your I/O scheduling code will have interesting interactions with system's swapfile and other normal filesystem activity.

      Even then you run a risk that OS code will one day improve and outperform your homegrown changes. Most programmers are better off just tuning their code to work well with OS native filesystem, virtual memory and so on.

    3. Re:row partitions by pstemari · · Score: 1
      Physical disk layout is no longer available with modern devices. Database layout across multiple physical devices is precisely what a good DBA is trained to handle.

      As far as buffer management and filespace allocation inside a tablespace, that's precisely what Oracle or DB2 specialize in, using very sophisticated cross-process buffering techniques and cache hit scoring. None of that is home-grown. It's why you spring the big bucks for a serious database.

    4. Re:row partitions by Anonymous Coward · · Score: 0
      64-bit addressing in flat files

      Why wait for Hammer?
      typedef unsigned long long off_t


      Sure, you lose a little speed dealing with those 64bit integers, but not as much as you would think (with a modern CPU).
  87. Re:Why large files by Anonymous Coward · · Score: 0

    Gates said it...

    Why? Because you say so? Maybe he said it, maybe he didn't. Show me the source of that quote. Until you can, your assertion is worth exactly nothing. People who deliberately spread misinformation disgust me, doubly so if they claim to be technical types. I'd have you taken outside and flogged if I could.

    -- a random AC

  88. Re:Why large files by addps4cat · · Score: 1

    Hey everyone lets keep beating a dead horse and telling him the million and one ways that you need files greater than 2gb. Half of these posts just say "movies" anyway. So stop repeating yourselves.

    --
    Don't eat shrimp candy, just a heads up.
  89. three words by Nick+Mitchell · · Score: 1

    hate jar jar

  90. What the hell? by White_Lightning · · Score: 1

    Why'd they even mention DOS? All DOS programs are statically linked. There are no DLLs or anything like them (except overlays). The only thing close would be DOS Extenders. So, what does DOS have to do with it?

    1. Re:What the hell? by mabinogi · · Score: 1

      the problem is not DLLs specifically, static libraries cause problems too....

      when the header file says off_t, and the library thinks off_t is 32 bits and the program linking to the library thinks it's 64 bits, then you have a problem.

      The same sort of problem would presumably occur when a DOS library was compiled in large mode, but the program linking to it used small, or vice versa....
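
      A toy illustration of that mismatch, using Python's `struct` module to stand in for two C compilations of the same header. The record layout (an offset followed by a 32-bit length) is hypothetical and only there to show the effect.

```python
import struct

# Hypothetical record shared through a header: { off_t offset; int length; }
LIB_LAYOUT = "<ii"  # library was built with a 32-bit off_t
APP_LAYOUT = "<qi"  # application was built with a 64-bit off_t

# The library fills in the record using its own idea of the layout.
record = struct.pack(LIB_LAYOUT, 1000, 42)

# The two sides do not even agree on how big the record is.
print(struct.calcsize(LIB_LAYOUT), struct.calcsize(APP_LAYOUT))

# The application reads the first 8 bytes as its 64-bit offset and
# silently swallows the length field into the upper half.
(misread_offset,) = struct.unpack_from("<q", record)
print(misread_offset)
```

      Nothing fails at link time in a case like this. The program just sees a nonsense offset, which is why it matters that the library and everything linked against it are built the same way.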

      --
      Advanced users are users too!
    2. Re:What the hell? by White_Lightning · · Score: 1

      Wouldn't that be a programmers error then? Either when writing the header file or using the wrong link library?

    3. Re:What the hell? by mabinogi · · Score: 1

      Yes, but if the programmer doesn't know how the library was compiled, then it's the distributor of the library's fault.

      Which is why this article is putting the emphasis on getting the distros to ensure that they provide a consistent platform.

      --
      Advanced users are users too!
    4. Re:What the hell? by White_Lightning · · Score: 1

      I suppose I should have read the full article instead of just skimming over it.

  91. Admittedly, I had problems with the need for... by constantnormal · · Score: 1

    ... 64-bit addressing before thinking this through. I couldn't see the significant advantage for more than a very tiny fraction of apps in being able to address more than a few gigabytes.

    Now I can't wait for OS X to have 64-bit support for the IBM 970 processors (I do realize that it will take several releases before default 64-bit operation is practical).

    When compared to clustered 32-bit filesystems, I would think that a "pure" 64-bit filesystem would have a number of very practical advantages.

    I could easily see the journalled filesystem becoming one of the first 64-bit subsystems in OS X, right after VM.

    1. Re:Admittedly, I had problems with the need for... by Anonymous Coward · · Score: 0

      64-bit addressing has little to do with 64-bit file lengths. Except of course if you want to memory map a huge file...

      Actual 64-bit code is currently still of limited use, since for most values 32 bits are sufficient and handling 64-bit values for those things where it is necessary (like file offsets) using 32-bit instructions doesn't have a significant effect on performance.

      BTW: OS X has always had a 64-bit off_t, so much of the problems affecting other systems don't apply to it (or other 4.4 BSD -derivatives).

      I'm a bit skeptical about transitions to a extended 64-bit instruction set and ABI, considering the mess made by Irix and Solaris... I'd say the Alpha went in the right direction - a new 64-bit architecture, not designed to be an extension to an old 32-bit architecture, with software emulation for old binaries (as well as i386 binaries on NT). Too bad it didn't gain a decent market.

  92. Large filesystem lack more of a problem by mauriceh · · Score: 3, Interesting

    A much bigger problem is that Linux filesystems have a capacity limit of 2TB.
    Many servers now have the physical capacity of over 2TB on a filesystem storage device.
    Unfortunately this is still a very significant limitation.
    This problem is much more commonly encountered than file size limitations.

    --
    Maurice W. Hilarius Voice: (778) 347-9907
    1. Re:Large filesystem lack more of a problem by Xilman · · Score: 1
      A much bigger problem is that Linux filesystems have a capacity limit of 2TB.
      Many servers now have the physical capacity of over 2TB on a filesystem storage device.
      Unfortunately this is still a very significant limitation.
      This problem is much more commonly encountered than file size limitations

      An interesting observation, but not one I've ever made. I'm much more likely to want to store over 2Gb of data in a single file than to want a 2Tb file system. Indeed, I don't have 2TB of disk to make into a file system, but I create large files relatively often.

      Paul

      --
      Lasciate ogne speranza, voi ch'intrate
    2. Re:Large filesystem lack more of a problem by mauriceh · · Score: 2, Informative

      Roughly 50% of of the servers we build at present have over 1TB of storage.
      Roughly 30% have over 2TB.

      With a 3Ware 7500-12 IDE RAID card and 11x200GB disks we hit 2.1TB.

      This costs about $6,000 in a server, so is a fairly popular option.

      Next month Maxtor ships their 300GB drives (MAYBE, Maxtor have been lying about their release schedules lately). Once that happens, it will be a very common problem.
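
      For what it's worth, the array above sits almost exactly on the limit. The sketch below assumes the 2TB ceiling comes from 32-bit sector numbers with 512-byte sectors, which the posts here don't actually state, and it compares raw disk capacity without accounting for RAID parity.

```python
SECTOR_BYTES = 512
MAX_SECTORS = 2 ** 32  # assumption: sector numbers are 32 bits wide

limit_bytes = SECTOR_BYTES * MAX_SECTORS
array_bytes = 11 * 200 * 10 ** 9  # 11 disks of 200 GB (decimal) each

print(f"limit: {limit_bytes:,} bytes")
print(f"array: {array_bytes:,} bytes")
print("over the limit" if array_bytes > limit_bytes else "under the limit")
```

      Disk vendors count in powers of ten while the limit is a power of two, so the two numbers are closer together than the "2TB" label suggests.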

      --
      Maurice W. Hilarius Voice: (778) 347-9907
  93. Re:data warehouse, and any database for that matte by hector13 · · Score: 1

    These are files on a regular partition (i.e., ext2 or somesuch)?? It still sounds totally inefficient to me. I have nothing against large files, but I would hope a db would be using something more efficient, or at least using its own filesystem (making the 2GB limit irrelevant).

  94. I miss BeFS... by jonr · · Score: 1

    18 EXAbyte file sizes, real journals, live queries...
    *SOB*
    J.

  95. Re:Why large files by Sayjack · · Score: 1

    Backup files, exporting a huge oracle database to a file. And, when I record divx quality video through my ATI card I can go through the GB like crazy.

    A better question is, Who doesn't need largefile support?

    As for the seek time... not everything is accessed like a random access file. I imagine that the backup data will be read in sequentially. The video file would mostly be handled sequentially, other than when jumping to a chapter, fast forwarding, or reversing.

    --

    -- Good judgement comes with experience. -- Experience comes with bad judgement.

  96. Re:Why large files by AJWM · · Score: 1

    Can anyone give a good reason for needing files larger than 2gb?

    Video/movie files, for one thing. Even compressed (e.g. DV or MPEG) those things are huge. A 2 GB file at professional DV compression (50 Mb/sec) is about 4 minutes' worth. (DV is similar to MJPEG, so it's still lossy.) Uncompressed or losslessly compressed video (critical for machine vision or image analysis apps) chews even more space.

    I know I've wanted to be able to just dump a mini-DV tape (about 13 GB) directly to a single disk file for later editing.

    Other fields also use huge data sets - seismic data analysis for example. Filesystems designed for supercomputer clusters (eg PVFS) have unlimited size on the total filesystem (tens of terabytes is not unusual) although the individual file size may still be limited by the underlying OS or hardware word size.

    Then there's creating a .zip or .tgz of a collection of big files. Or creating the equivalent of an ISO image of a DVD. And so on.

    --
    -- Alastair
  97. Re:Why large files by AJWM · · Score: 1

    The seek times alone within these files must be huge,

    Depends on how your inodes are laid out, how big you have to get for triple indirect blocks, etc.

    Shouldn't be any worse (and maybe better) than trying to seek through an equivalent collection of smaller files -- you've got to do all those directory searches, etc. (Exact comparisons will depend greatly on the filesystem and parameters chosen when the FS was created.)

    --
    -- Alastair
  98. Re:Why large files by drinkypoo · · Score: 1

    There is a need for a virtualizing filesystem which supports multiple volumes, offline and not, and files stored in segmented form to fit. It would be insanely handy in a clustering environment; the whole cluster could store the file (with some redundancy) and access it in a shared fashion. This would substantially improve the ease of working with inanely large data sets in a clustered scenario.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  99. Re:Why large files by drinkypoo · · Score: 1
    Ostensibly your filesystem driver will be caching much of the list information in memory, thus for the uses to which fat32 is applied, it is still a reasonable method. There's a reason it's called fat32, it's a direct descendant.

    Anyway those using a M$ OS which does not support NTFS are fooling themselves. If you are using some form of windows prior to Windows 2000, then you are getting a terrible experience which is nothing like the real OS -- NT. NTFS is a pretty good filesystem with journaling, ACLs, and implicit support for encryption and compression. Fat32 is shite.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  100. to be fair by Anonymous Coward · · Score: 0

    1995 was just after the Alpha was first released IIRC. Back then (and still today to a lesser extent), the phrase "if you need 64-bit pointers (or 64-bit pointers into a file), get a 64-bit machine" made at least some sense.

    1. Re:to be fair by Anonymous Coward · · Score: 0

      It never made any sense to tell someone to buy a new, incompatible machine just because their file got too big.

    2. Re:to be fair by Anonymous Coward · · Score: 0

      Just after? Uh... I bought a DEC Alpha Multia in 1995. And believe me, Alpha had been around a lot by then. This box cost me 75 bucks from eBay (yes), duh. By then DEC was already not really DEC... it was just Digital and on its way to becoming Compaq and HP.

  101. gzip handles large files fine by Anonymous Coward · · Score: 0

    I've compressed files larger than 2 gb with gzip on linux 2.4 without any problems at all. We have some 20 gig and larger database tables which compress nicely to around 5 gig.

    1. Re:gzip handles large files fine by Whelkman · · Score: 1

      gzip works over 4 GB but loses the ability to accurately report uncompressed file sizes (minor).
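
      A minimal sketch of why that happens, assuming the gzip trailer layout from RFC 1952, where the 32-bit ISIZE field holds the uncompressed size modulo 2^32. The function name `gzip_isize` is hypothetical and only there for illustration.

      ```c
      #include <stdint.h>

      /* Hypothetical helper: the value gzip's 32-bit ISIZE trailer field
         would hold for an input of `uncompressed_bytes` bytes
         (the size modulo 2^32, per RFC 1952). */
      uint32_t gzip_isize(uint64_t uncompressed_bytes)
      {
          return (uint32_t)(uncompressed_bytes & 0xFFFFFFFFu);
      }
      ```

      Under this sketch a 5 GiB input is recorded as 1 GiB, which is what any listing based on that field alone would report. The compressed data itself is unaffected.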

  102. Re:Why large files by Anonymous Coward · · Score: 0
    Raw video is about 1G a minute.

    When I encode 120 minutes of video into MPEG2 at around 640x480 it reaches about 3.5G.

  103. Re:Why large files by UnknownSoldier · · Score: 1

    > NTFS is a pretty good filesystem with journaling,

    That's only partially true -- it doesn't journal data, only meta-data.

  104. RTFPP by xintegerx · · Score: 1

    I was giving an example because the parent was 0, Offtopic at the time.

    The example was that officials do worry about e-mail so they would either save it like he said or avoid typing it like I said. The point is that they would consider it important and that they would save e-mails that were sent.

    1. Re:RTFPP by DAldredge · · Score: 1

      They have to save it for the same reason they do not like sending it. Open Records laws. It is much easier to take 2 or 3 different stands on an issue if those you talk to have no record...

  105. Re:Why large files by drinkypoo · · Score: 1

    Sure, but that's good enough to save people in almost all cases. I've never, EVER lost data on NTFS5 due to a crash (which has happened plenty) or a power failure (only twice since I started using it). FAT32, on the other hand... Or ext2 for that matter, it doesn't matter. A partially journaling filesystem gets the job done well enough for basically any purpose. If it's not good enough for you, perhaps a filesystem is not the best place to store your data in the first place; I'd consider a clustered replicating RDBMS :P

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  106. These jokes are (minus infinity, really dumb) by Anonymous Coward · · Score: 0

    At least come up with something funny, like "In Soviet Russia, filesystem seeks YOU."

  107. The "l" in lseek() by edhall · · Score: 3, Informative

    Once upon a time (prior to 1978) there was no lseek() call in Unix. The value for the offset was 16 bits. Larger seeks were handled by using a different value for "whence" (the third argument to seek()), which caused seeks to occur in 512-byte increments. This resulted in a maximum seek of 16,777,216 bytes, with an arbitrary seek() often requiring two calls, one to get to the right 512-byte block and a second to get to the right byte within the block. (Thank goodness they haven't done any such silliness to break the 2GB barrier.)

    When Research Edition 7 Unix came out, it introduced lseek() with a 32-bit offset. 2,147,483,648 bytes should be enough for anyone, hmmm? :-).
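
    The two-call scheme described above can be sketched as follows. This only illustrates the arithmetic, not the historical kernel interface: the struct and function names are hypothetical, and the 512-byte block size and signed 16-bit limit are taken from the description above.

    ```c
    #define OLD_BLOCK_SIZE 512L

    /* Hypothetical pair of arguments for the two seek() calls:
       one counted in 512-byte blocks, one in bytes within the block. */
    struct split_seek {
        int blocks;     /* must fit in a signed 16-bit offset */
        int remainder;  /* 0..511 */
    };

    /* Split a byte offset into a block count and a byte remainder. */
    struct split_seek split_offset(long offset)
    {
        struct split_seek s;
        s.blocks = (int)(offset / OLD_BLOCK_SIZE);
        s.remainder = (int)(offset % OLD_BLOCK_SIZE);
        return s;
    }

    /* Returns 1 if the offset is reachable with a signed 16-bit block count. */
    int reachable_with_16_bits(long offset)
    {
        return offset >= 0 && offset / OLD_BLOCK_SIZE <= 32767L;
    }
    ```

    For example, byte 1,000,000 becomes block 1953 plus 64 bytes, and the first unreachable offset is 32768 x 512 = 16,777,216.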

    -Ed
  108. Needs to be signed... by wowbagger · · Score: 1

    The time_t type must be signed, so that you can represent negative time differences. If you make time_t unsigned, when you try to do things like saying "if this file is older than that file" you will get a very large positive time, rather than a negative time. Not good.

    1. Re:Needs to be signed... by koreth · · Score: 1
      No, the type of time_t - time_t must be signed. That doesn't imply that time_t must be signed. For example, (unsigned int) - (unsigned int) is int, not unsigned int.

      And anyway, "if (time_t > time_t)" works fine with unsigned values.

    2. Re:Needs to be signed... by Ben+Hutchings · · Score: 2, Informative
      No, the type of time_t - time_t must be signed. That doesn't imply that time_t must be signed. For example, (unsigned int) - (unsigned int) is int, not unsigned int.

      Wrong. The C99 standard says in section 6.3.1.8 paragraph 1:

      Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise.

      Here, the common real type is unsigned int, and the description of the addition and subtraction operators (section 6.5.6) does not specify a different type for the result when both operands have arithmetic type.

      If you disagree, please cite relevant parts of the standard to support your case.

    3. Re:Needs to be signed... by koreth · · Score: 1
      I stand corrected. My assertion was based on the fact that, well, it works:

      unsigned int a = 1, b = 2;
      int c = a - b;

      That consistently results in c == -1, at least on every C compiler I've used for the last 20-odd years. But if that behavior isn't actually part of the standard, then I guess we'd need to define some standard macros for unsigned time_t math to produce correct results portably.

    4. Re:Needs to be signed... by doug363 · · Score: 1

      You'd probably want to cast a and b to signed long longs before doing the subtraction to ensure that you got the right answer. Even then, an int wouldn't necessarily be large enough to hold the answer. (i.e. what if a=-1u and b=0? You get -1, when it should be 4 billion or so...)

    5. Re:Needs to be signed... by Ben+Hutchings · · Score: 1

      In fact the result is implementation-defined (or an implementation-defined signal is raised) when you convert a value of unsigned type to the corresponding signed type and the value is out of range. Usually you'll just get a negative result though.

    6. Re:Needs to be signed... by ejasons · · Score: 1

      unsigned int a = 1, b = 2;
      int c = a - b;

      That consistently results in c == -1, at least on every C compiler I've used for the last 20-odd years. But if that behavior isn't actually part of the standard, then I guess we'd need to define some standard macros for unsigned time_t math to produce correct results portably.

      This works because your system uses two's complement arithmetic. It could fail on a system with a different arithmetic system. It's unlikely that you'll ever work on such a system, but they could exist (and, more importantly, have existed), which is why the result of that conversion is implementation-defined in the C standard!
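
      A minimal sketch of a portable alternative, assuming timestamps are held in a 32-bit unsigned type (an assumption for illustration; time_t itself is whatever the platform defines). Both function names are hypothetical. Widening to a 64-bit signed type before subtracting avoids relying on how an out-of-range unsigned value converts to int.

      ```c
      #include <stdint.h>

      /* Unsigned subtraction is well defined but wraps modulo 2^32. */
      uint32_t wrapped_diff(uint32_t a, uint32_t b)
      {
          return a - b;
      }

      /* Widen first, then subtract: the true signed difference always fits. */
      int64_t signed_diff(uint32_t a, uint32_t b)
      {
          return (int64_t)a - (int64_t)b;
      }
      ```

      With a = 1 and b = 2, wrapped_diff yields 4294967295 while signed_diff yields -1. Plain ordering comparisons such as a > b need no conversion at all.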
  109. Re:Why large files by LarsG · · Score: 1

    Can anyone give a good reason for needing files larger than 2gb?

    DVD .iso images. :)

    --
    If J.K.R wrote Windows: Puteulanus fenestra mortalis!
  110. Re:Why large files by Anonymous Coward · · Score: 0

    Exactly, and MPEG-2 compressed video can still exceed 2GB in size.

    I have a friend that does some video work on the side -- he takes people's home movies and such and puts them on DVD (he turned a hobby/present into a small business). Consequently, he uses a Mac. One hour of uncompressed (or only losslessly compressed) video at DVD quality (720x480) @ 24.97 fps with stereo audio takes up about 15-16GB. If you then want to bump up to something like 720p (1280x720, I think), you're talking about 40-42GB per hour.

    Of course, standard TV at 320x240 resolution with stereo audio would only take up approx. 3.5GB uncompressed, which is still larger than 2GB.

  111. obvious by larsl · · Score: 1

    I would have snapped up puppy.mil in an instant.

  112. Re:Why large files by Anonymous Coward · · Score: 0

    You just made my day :)

  113. The even-handed OS coverage by Anonymous Coward · · Score: 0

    I'd rather that the article didn't even bother with the lip service to other OSes; as it is, the article reads:

    "The BSDs handle large file support without any problems. Solaris and Linux have some problems. Here's all the problems with linux, with nary a mention of what happens with Solaris, or how other OSes have managed to deal gracefully with it."

    If you're gonna write a lignux article, make it a lignux article. Jeez.

  114. Re:Why large files by mccalli · · Score: 1
    Yes, I can give two.
    • Virtual PC (or VMWare or whatever), whereby various different OS installations are contained within their own virtual file systems (usually a single file of over three gig).
    • Video capture, whereby raw footage from my digital camcorder is dumped down onto the hard drive ready to be edited. Those files can be pretty vast as well.

    Cheers,
    Ian

  115. Re:Why large files by kasperd · · Score: 1

    Ostensibly your filesystem driver will be caching much of the list information in memory

    Caching the tables in physical memory does of course help, but it doesn't remove the linear scan through a linked list. This linear scan takes time even if done in RAM. To improve performance the Linux driver for this filesystem caches a number of already resolved positions, I think this cache holds 8 entries. I found out about that once I needed simultaneous sequential access to 20 files on the same FAT32 filesystem. Performance was horrible. I had two options, either do access in very large blocks to keep the number of listscans low, or increase the cachesize and recompile my kernel. I don't remember which of the two options I chose.
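
    The linear scan can be sketched like this. It is a simplified model, not the Linux driver's code: the FAT is a plain array in which each entry holds the number of the next cluster, the end-of-chain marker value and all names are assumptions made for illustration, and no position cache is modeled.

    ```c
    #include <stdint.h>

    #define END_OF_CHAIN 0x0FFFFFFFu  /* assumed end marker for this sketch */

    /* Walk the chain from `first_cluster` to the cluster holding byte
       `offset`. `*steps` receives the number of links followed, which
       grows linearly with the offset. Returns END_OF_CHAIN if the
       offset lies past the end of the file. */
    uint32_t cluster_for_offset(const uint32_t *fat, uint32_t first_cluster,
                                uint64_t offset, uint32_t cluster_size,
                                unsigned *steps)
    {
        uint32_t cluster = first_cluster;
        uint64_t hops = offset / cluster_size;

        *steps = 0;
        while (hops > 0 && cluster != END_OF_CHAIN) {
            cluster = fat[cluster];
            hops--;
            (*steps)++;
        }
        return cluster;
    }
    ```

    Without a remembered position, every read deep into a large file repeats this walk from the first cluster, which is consistent with the slowdown described above once more files are open than the cache has entries.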

    --

    Do you care about the security of your wireless mouse?
  116. Re:Why large files by Admiral+Burrito · · Score: 1

    I recently tried recording a one hour TV show with xawtv, to AVI (MJPEG, 640x480, 15 fps, 16-bit stereo sound). It appeared to record okay, and ended up 5 gigs. But I could only play the first few minutes of it with aviplay. Something (either xawtv, aviplay, or the AVI file format itself) has a 2 (or 4 (unsigned)) GB limit.

  117. You TWIT. Plural of virus is "viruses". by Anonymous Coward · · Score: 0

    It is not already a plural. Read a book (I suggest a dictionary, English or Latin).

  118. I don't understand it either - by Anonymous Coward · · Score: 0

    That part of writing an OS from scratch is trivial.

  119. Re: tar/cpio and file 2 GB by Anonymous Coward · · Score: 0

    By the Unix standard tar and cpio will never support files bigger than 2 GB. Maybe new utilities called tar64 and cpio64 will. AIX backup/restore supports files > 2 GB. Maybe HP-UX dump/restore can do the same.
    Good luck.

  120. you probably use a firewall or something by xintegerx · · Score: 1

    I would guess that a router or firewall or any device, maybe even a cable modem would filter that. If you think you're accessing through a firewall, that's probably why.

    It works on Win98/Internet Explorer 5 with a direct connection to the cable modem.

  121. Re:Why large files-Pinnacle by Anonymous Coward · · Score: 0

    " Ever heard of something like movie-editing? You can get huge files really fast."

    Heard of it. Live it. That's why the original stays on the source machine (DV), while a lower quality version (preview) is loaded into the computer doing the editing. Once editing is done, the software pulls the higher quality originals from the source, then assembles and processes them appropriately, outputting in the format desired. Keeps file size manageable.

  122. Not in Solaris 8 and above by jsimon12 · · Score: 1

    Old news, Solaris 2.6 and 7. Solaris 8 is 64 by default. I hope they are not still developing for 2.6 :)

  123. Fake quote- mod parent down. by Anonymous Coward · · Score: 0

    I don't get why people continue to quote this. I've never ever seen a cite (besides "well, I heard it from my friend"), and Bill Gates has said he didn't say it. Two strikes is enough for me.

    "Mod me down, please."
    --cyber_rigger

  124. Re:Why large files by drinkypoo · · Score: 1
    Don't you mean, increase the cache size, make modules... :)

    Oh well. Anyway, I know that a linked list just plain isn't as efficient as a tree, but as you say there are ways to speed things up. I would assume that the windows driver probably throws away quite a bit of memory trying to make fat32 fast, microsoft has always been more than willing to squander memory willy-nilly. In fact, Mechwarrior IV:Vengeance used to have a habit of squandering it permanently, or until the process terminated... From what I hear, Excel still does, but I don't spend much time in there consecutively.

    Also, I don't see any reason you couldn't build a tree in memory or in a cache (perhaps you build it in memory and design it so that you can swap most of it out automatically?). That would be a really funky way to do things on Chicago but it would be quite reasonable on any flavor of NT, or of course on your favorite open-source operating system. A non-trivial job to be sure but obviously not impossible. At least that way it would only be slow once per boot.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  125. Why is open() concerned? by dfgdfgdfg · · Score: 1
    Why does it make a difference for open() whether the size of off_t is 32 or 64 bits? Shouldn't only lseek() be affected?

    I thought only a few programs used lseek(), e.g. databases. Wouldn't most programs read files sequentially, without using off_t at all?

    --
    -- 1.e4 c6 2.d4 d5 3.Sc3 de4: 4.Se4: Sd7 5.Sg5 Sgf6 6.Ld3 e6 7.S1f3 h6 8.Se6:
  126. Re:Why large files by kasperd · · Score: 1

    Don't you mean, increase the cache size, make modules... :)

    I think FAT was compiled in my kernel at that time.

    perhaps you build it in memory and design it so that you can swap most of it out automatically?

    I wonder who really wants to spend a lot of time improving FAT performance when there are so many other filesystems that will always perform better than FAT.

    --

    Do you care about the security of your wireless mouse?
  127. Except for astronomical calculations.. by A55M0NKEY · · Score: 1

    Except for scientific calculations, where there will probably never be a reasonable limit on the size or precision of numbers needed, I doubt anyone would need more than 64 bits for any scalar type, be it a char or an int or a double or whatever. Why not use 64 bits for everything and accept the wasted space for storing chars but not ever have to worry about running out of numbers? Even if you waste 7/8 of the space on your hard drive to store 8 byte long chars, the available storage has gone up exponentially by using a 64 bit address space. Increasing the size of your data 8 times is negligible, negligible enough to not even bother with 1 byte chars.

    --

    Eat at Joe's.

  128. Re:Why large files by drinkypoo · · Score: 1
    I wonder who really wants to spend a lot of time improving FAT performance when there are so many other filesystems that will always perform better than FAT.

    Well, mostly Microsoft, I'm thinking. Also fat32 is a handy filesystem because just about everyone can read it these days. I'm about to set up a PC for my girlfriend's aunt, it's just a K6-2 300. It'll have 256mb ram, and minimal (1.2Gb) disk, because that's what I have lying around. I'm putting Windows 98 SE on the disk, and knoppix will be provided on a CD so she can play with linux, assuming I can get it to stop making idiot assumptions about refresh rates without requiring her to insert a floppy as well. That is god damned idiotic. But anyway I digress, the best FS for that OS is FAT32, so I'm going to use it, all data will be stored on a fat32 volume. I imagine this is becoming a fairly common scenario. Also of course many geeks multiboot to win98 for games, the only filesystem they'll have in their PC readable by all operating systems is FAT32 and they will likely be keeping media there.

    --
    "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
  129. Re:Use Microsoft Windows NT and forget about the by John+Sullivan · · Score: 1
    Use Microsoft Windows NT and forget about the little things.

    I realise you're just a troll, but I'd like to point out that Win32 also has two forms of file API for most functions - one that can do 64-bit and one that is limited to (or at least encourages you to use) 32-bit offsets. For 64-bit access to work in any given application, you're relying on the language runtime making the correct mapping, and/or the end developer choosing the right set of functions to use. In many cases the easiest and most obvious ones will limit you to 32 bits - so many applications will not work with such large files.

    This is a problem which has affected pretty much every system - even in an OS where *only* 64-bit file APIs exist, you'll still find an occasional app which tries to fit a file location into a 32-bit variable.
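
    A minimal sketch of that failure mode, using fixed-width types so it behaves the same on any platform; the function names are hypothetical. The trouble starts when a 64-bit file position is stored in a 32-bit variable, so a range check before narrowing is one way to catch it.

    ```c
    #include <stdint.h>

    /* Returns 1 if a 64-bit file position can be stored in a signed
       32-bit variable without loss, 0 otherwise. */
    int fits_in_32_bits(int64_t position)
    {
        return position >= 0 && position <= INT32_MAX;
    }

    /* Narrow a position, or return -1 to signal that it does not fit. */
    int32_t narrow_position(int64_t position)
    {
        return fits_in_32_bits(position) ? (int32_t)position : -1;
    }
    ```

    An application that skips the check and simply assigns the value ends up with a wrong position for anything at or beyond 2 GB, regardless of which API the OS offers.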

    --
    This is my World Wide Web of Whatever
  130. Re: Painfull, in ACHES by RoboProg · · Score: 1

    Oh, but you do. AIX 4 definitely wants -D_LARGE_FILES, or bad things happen, and watch your longs and long-longs (and their aliases) carefully. (A buddy and I recently had to comb through exactly this problem in an app.)

    --
    Yow! I'm supposed to have a plan?
  131. The Human Genome... by Anonymous Coward · · Score: 0

    is ~3.15 billion nucleotides long. Yeah, you can compress it (2 bits per nucleotide = 4 nucleotides/byte), but it makes it a pain in the ass to work with.
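
    A rough sketch of the 2-bit packing mentioned above. The A/C/G/T to 0..3 mapping and the function names are assumptions for illustration; real formats also have to deal with unknown bases, which is part of what makes packed data awkward to work with.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Assumed mapping: A=0, C=1, G=2, T=3. Returns -1 for anything else. */
    static int base_code(char base)
    {
        switch (base) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
        }
    }

    /* Pack `n` bases into `out` (4 per byte, first base in the low bits).
       `out` must hold (n + 3) / 4 bytes. Returns 0 on success, -1 on an
       unrecognized base. */
    int pack_bases(const char *bases, size_t n, uint8_t *out)
    {
        size_t i;
        for (i = 0; i < (n + 3) / 4; i++)
            out[i] = 0;
        for (i = 0; i < n; i++) {
            int code = base_code(bases[i]);
            if (code < 0)
                return -1;
            out[i / 4] |= (uint8_t)(code << (2 * (i % 4)));
        }
        return 0;
    }

    /* Read base number `i` back out of a packed buffer. */
    char unpack_base(const uint8_t *packed, size_t i)
    {
        return "ACGT"[(packed[i / 4] >> (2 * (i % 4))) & 3];
    }
    ```

    At one byte per base, ~3.15 billion bases exceed the 2 GB (2^31-byte) limit; packed four to a byte they come to roughly 790 MB.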

  132. Re:Why large files by asparagus · · Score: 1

    I know I've wanted to be able to just dump a mini-DV tape (about 13 GB) directly to a single disk file for later editing.

    That's the way I edit. With the size of modern hard drives, it's a waste of time to do a traditional log/capture session. Instead, just dump everything to disk and then break it up from there. FCP even has a feature or two designed towards this direction (Start/stop detection). Hopefully they'll fix the subclip bug in version 4.

    I tell FCP to partition my files, though. The only >2GB files I currently have are my Toast DVD images. I try not to use >2GB files in general, though... there are still some mysterious HFS+ bugs floating around that I've been trying to avoid.

    -Brett

  133. Re:Unices? All your boxen are belong to us!!! by andrewjjenkins · · Score: 1

    Your boxen will be shipped in 4-6 weeks.
    Sweet! I didn't know I'd get free boxen for reading your post!
    This guy would have a field day with "All your boxen are belong to us"

  134. Anybody else still have the T-shirt? by Ripsaw · · Score: 1
    Way back in January of 1995 a group called the Large File Summit was formed to standardize large file access in Unix systems.

    This group produced three notable results:

    • A specification, which was ultimately submitted to X/Open,
    • A declaration that 2**64 bytes is a "bubbabyte", and
    • A really cool T-shirt.

    I still have my T-shirt -- how about you?

  135. Benefits of File Size Caps by Flamesplash · · Score: 1

    I used to be a student admin for Clemson's College of Engr. and Science. We had several CAD tools that the Engr. students would use. There was this one tool that you could specify a duration the simulation was supposed to last, otherwise if the field was blank it would run forever. Besides that little bit of badness the field was blank by default, so many an unsuspecting student would run their simulations and they would run forever creating these huge output files, which the students also didn't know about.

    The killer here, is that if you quit the program the wrong way ( something like Close instead of Quit ) the program would keep going, even after the student would log out.

    So now you have N students who are all generating infinite files. However, the files would hit the 2GB limit and stop eating up space. ( Thank You )

    The only other nastiness of this is that once we found the file, if you simply removed it, the program (still running after log out) was finally able to add more data. So you had to track down where the program was running and kill it first.

    I was in charge of backups, and man oh man was this annoying for them.

    --
    "Not knowing when the dawn will come, I open every door." - Emily Dickinson