Slashdot Mirror


Linux Kernel Archives Struggles With Git

NewsFiend writes "In May, Slashdot discussed Kerneltrap's interesting feature about the Linux Kernel Archives, which had recently upgraded to multiple 4-way dual-core Opterons with 24 gigabytes of RAM and 10 terabytes of disk space. KernelTrap has now followed up with kernel.org to learn how the new hardware has been working. Evidently the new servers have been performing flawlessly, but the addition of Linus Torvalds' new source control system, git, is causing some heartache by having increased the number of files being archived sevenfold."

6 of 45 comments

  1. same reason I dislike Subversion by kwoff · · Score: 2, Interesting

    `grep -r`ing source code under Subversion takes much longer than with CVS, due to all the .svn files.
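One way around the slowdown described above is to prune the metadata directories before searching. The following is a minimal sketch, not a replacement for grep: the function name `grep_tree` and the default `skip_dirs` are made up for illustration. Recent versions of GNU grep also accept `--exclude-dir`, which may be the simpler route where it is available.

```python
import os


def grep_tree(root, needle, skip_dirs=(".svn", "CVS")):
    """Yield (path, line_number, line) for every line containing needle.

    Directories named in skip_dirs are never entered. The defaults are an
    assumption for illustration; adjust them to the VCS in use.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into metadata directories.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as fh:
                    for lineno, line in enumerate(fh, 1):
                        if needle in line:
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                # Unreadable files are skipped rather than aborting the search.
                continue
```

Because the pruning happens before any file is opened, the cost of the search no longer grows with the number of pristine copies kept under `.svn`.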

  2. reiser4 + VCS? by OmniVector · · Score: 2, Interesting

    (slightly) offtopic. wasn't reiser4 supposed to have 'plugin' support, so things like version control could be built directly into the file system? consider the prospect of being able to type, say:

    touch bar
    echo 'foo' > bar
    revisions bar
    (output of revision history)
    cp bar/revision/1 bar-version-1.0.backup

    granted, yes, the storage requirements and cpu usage might be horrific, but i think something like this is inevitable in file systems, and i certainly welcome the day it becomes a reality.
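The idea above can be mimicked in user space to get a feel for it. This is only a toy sketch and has nothing to do with how reiser4 plugins work: the class `VersionedFile` and its methods are hypothetical, and every revision is kept as a full copy, which is exactly the storage cost the comment worries about.

```python
class VersionedFile:
    """Toy in-memory file that keeps every write as a numbered revision."""

    def __init__(self):
        self._revisions = []  # revision N is stored at index N - 1

    def write(self, data):
        """Store data as a new revision and return its 1-based number."""
        self._revisions.append(data)
        return len(self._revisions)

    def revisions(self):
        """Return the list of available revision numbers."""
        return list(range(1, len(self._revisions) + 1))

    def read(self, revision=None):
        """Return the latest contents, or the contents of one revision."""
        if revision is None:
            return self._revisions[-1] if self._revisions else ""
        if not 1 <= revision <= len(self._revisions):
            raise IndexError(f"no such revision: {revision}")
        return self._revisions[revision - 1]


# Rough equivalent of the shell session in the comment.
bar = VersionedFile()
bar.write("")                   # touch bar
bar.write("foo\n")              # echo 'foo' > bar
history = bar.revisions()       # revisions bar
backup = bar.read(revision=1)   # cp bar/revision/1 bar-version-1.0.backup
```

A real implementation would presumably store deltas rather than full copies, but that choice is outside what the comment describes.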

    --
    - tristan
  3. Filesystem? by RealBorg · · Score: 5, Interesting

    Maybe kernel.org should finally consider moving to a more appropriate filesystem than ext3, preferably reiserfs, since it is optimized to handle a lot of small files. Tail packing not only saves disk space but, more importantly, saves a lot of memory in the block cache.
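The space argument for tail packing can be illustrated with back-of-the-envelope arithmetic. The sketch below is an idealized model, not a description of reiserfs internals: the 4096-byte block size is an assumption, the function name `allocated_bytes` is made up, and metadata overhead is ignored entirely.

```python
def allocated_bytes(file_sizes, block_size=4096, pack_tails=False):
    """Estimate the bytes of block storage consumed by a set of files.

    Without tail packing, every file is rounded up to whole blocks.
    With tail packing (idealized), the partial last blocks of all files
    are assumed to share blocks with each other.
    """
    if not pack_tails:
        return sum(
            ((size + block_size - 1) // block_size) * block_size
            for size in file_sizes
        )
    full = sum((size // block_size) * block_size for size in file_sizes)
    tails = sum(size % block_size for size in file_sizes)
    tail_blocks = (tails + block_size - 1) // block_size
    return full + tail_blocks * block_size


# Hypothetical workload: 1000 files of 300 bytes each.
sizes = [300] * 1000
unpacked = allocated_bytes(sizes)
packed = allocated_bytes(sizes, pack_tails=True)
```

Under these assumptions the many-small-files case is where the two numbers diverge most, and the same ratio applies to how much block cache the data occupies once it has been read.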

  4. Re:File System Scalabilty? by jbolden · · Score: 2, Interesting

    I agree he's given this a lot of thought. Linus wouldn't have such non-mainstream views if he didn't care. Bad ideas can be well thought out.

    Next, I'm not ignoring speed: you can scale a database system up infinitely large. Since database systems support ACID transactions (i.e. line/file source code locking during a transaction), you can have multiple merges going on at once, and thus effective speed is much, much better. For example, Amazon.com uses Oracle as their backend. Think about the number of users and how snappy Amazon feels. Do you really believe that worldwide kernel development is even a small fraction of what Amazon.com has to handle in terms of volume?
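The atomicity part of the ACID argument can be sketched with SQLite from the Python standard library. This is a stand-in only: it is not Oracle or Rational, the `files` table and the `apply_merge` helper are hypothetical, and a `None` content is used here as a made-up marker for an unresolved conflict.

```python
import sqlite3


def apply_merge(conn, changes):
    """Apply all (path, content) changes in one transaction.

    Returns True if every change was committed, False if a conflict
    caused the whole batch to be rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for path, content in changes:
                if content is None:  # hypothetical conflict marker
                    raise ValueError(f"conflict in {path}")
                conn.execute(
                    "INSERT OR REPLACE INTO files(path, content) VALUES (?, ?)",
                    (path, content),
                )
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, content TEXT)")
apply_merge(conn, [("a.c", "v1")])
# The second merge fails halfway, so its change to a.c is not kept.
apply_merge(conn, [("a.c", "v2"), ("b.c", None)])
```

Whether this property scales to the merge volume of kernel development is the commenter's claim, not something the sketch demonstrates.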

    I don't see any reason whatsoever for low system requirements for the database servers. However, the clients can run on junk hardware under a database system quite easily. Again, think Amazon.

    Finally, decentralization: I don't really believe that is needed at all. What Linus seems to want is the ability for people to:

    1) Create forks without him knowing about them
    2) Merge parts of those forks back into his trees at will

    Again, high-end database-based CM systems support that. The trees can be on different schemas within the same database or exported at the database level to different servers. Merges are specific to the schema.

    Seriously, I have yet to hear of anything Linus wants, as a programmer or as a project lead, that Rational doesn't do. Decentralization is the best example of this. He certainly claims to need it, but I have yet to understand why.

  5. Re:perhaps this might help by jd · · Score: 2, Interesting
    Actually, I believe Fedora Core 3 has most of the other filesystems compiled in, you just won't get the main partition formatted with them.


    Since the "smart" way to run such a server is to have the main FS on one disk and the data on another (this avoids tracking the head back and forth), the data partition can be just about anything.


    Now, the fact that the maintainers have said they are using Ext3 is rather more convincing to me. Foolish beyond belief, but convincing. I would rather use a "less reliable" FS like XFS and a RAID array to deal with errors, as I would have the performance benefit with no significant risk.


    I also regard Red Hat's obsession with Ext3 (even though Linux is all about choice, and it is choice that makes Linux different) as unhealthy. SGI, for some time, produced XFS-aware installer replacements for Red Hat Linux, and it would have been very easy for Red Hat to roll the differences in.


    (In fact, it would likely help a lot, as they could likely have worked out some kind of sponsorship deal with SGI, where SGI helped fund some of Red Hat's work, in exchange for Red Hat promoting SGI's software for Linux.)


    Lastly, how are the Linux developers going to encourage development and innovation, if they use an entirely "safe" off-the-shelf distribution? I don't particularly want kernel.org to crash, but nor do I want people to turn away on the grounds that even the Linux kernel developers don't trust their own work.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  6. Re:Well, what filesystem are they using? ext3 OK by anon+mouse-cow-aard · · Score: 2, Interesting
    Gripes about ext3 performance are probably outdated.

    We did some tests comparing reiser3, xfs, and ext3 with the dir_index option on 2.6 kernels. We were writing thousands (OK, tens of thousands) of small files into a couple of directories (specialized app, you don't want to know).

    When directories got large, ext3 with the hash lookups (between 800 and 1500 creations per second on newish hardware) ran much faster than xfs, and several orders of magnitude faster than ext3 without the directory hashing. reiser3 was slower than xfs.
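A rough version of this kind of measurement can be sketched as follows. It is not the poster's test: the function name, file count, payload size, and file naming are all assumptions, and because nothing is synced to disk the result mostly reflects directory lookup and page cache behavior.

```python
import os
import tempfile
import time


def creations_per_second(directory, count=10_000, payload=b"x" * 128):
    """Create `count` small files in one directory and return the rate."""
    start = time.perf_counter()
    for i in range(count):
        # Zero-padded names keep every file in the same single directory.
        with open(os.path.join(directory, f"f{i:08d}"), "wb") as fh:
            fh.write(payload)
    elapsed = time.perf_counter() - start
    return count / max(elapsed, 1e-9)  # guard against a zero interval


# Example run in a scratch directory on whatever filesystem hosts it.
with tempfile.TemporaryDirectory() as scratch:
    rate = creations_per_second(scratch, count=1_000)
```

To compare filesystems, the scratch directory would need to live on a mount of each filesystem under test, and the count would need to be large enough for directory size to matter.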

    We were thinking of going with xfs anyway, because it was so attractive that the directories would shrink when files were deleted (whereas ext3 directories stay big, with holes in them), but xfs would crash on us after a couple of days. So in March we chose ext3. We have approximately 9 million files in a single file system at the moment. It seems to work OK, but the system crashes every three weeks or so. We think we might have tortured it too much, and can reasonably keep only about 2 million files on-line, so we'll see if that helps.

    of course, ymmv.