Slashdot Mirror


The Hairy State of Linux Filesystems

RazvanM writes "Do the OSes really shrink? Perhaps the user space (MySQL, CUPS) is getting slimmer, but how about the internals? Using as a metric the number of external calls between the filesystem modules and the rest of the Linux kernel I argue that this is not the case. The evidence is a graph that shows the evolution of 15 filesystems from 2.6.11 to 2.6.28 along with the current state (2.6.28) for 24 filesystems. Some filesystems that stand out are: nfs for leading in both number of calls and speed of growth; ext4 and fuse for their above-average speed of growth and 9p for its roller coaster path."

36 of 187 comments

  1. Do the number of calls really matter? by dbIII · · Score: 4, Interesting

    In the case of NFS for instance, hasn't there been a performance improvement? Isn't that the thing that matters?

    1. Re:Do the number of calls really matter? by epiphani · · Score: 5, Informative

      Two things of note with NFS...

      1. NFSv4 support was added. v4 is complex and has a lot of authentication stuff in it that wasn't in v3.

      2. SunRPC is "part" of the NFS tree, but is effectively just a transport layer. It is completely abstracted, hence the numbers of symbols. It could be used for other stuff, so it pushes up that number too.

      --
      .
    2. Re:Do the number of calls really matter? by Smidge207 · · Score: 4, Interesting

      Yes, that sounds like "slimming down" to me. At least, I can understand what the article is trying to get at. It seems like we went through a period of early operating system development over the past few decades where the stress was on throwing everything in, including the kitchen sink. It's at least interesting that Linux distros are putting in some amount of effort into pulling excess functionality out of the default installation while computers continue to become bigger, faster, stronger.

      And I think it is pointing at something similar to what is going on with OSX, and it is a trend. We've hit some kind of a milestone, I think, where most of our computer functionality is "good enough" for most of what we actually use them for. Something about the development of computer systems right now reminds me of... whenever it was... 10 years ago?... when people were using their computers mostly for word-processing, and their computers were good enough for that, so there wasn't a huge drive to accomplish a particular thing. Then people discovered that they could rip CDs into MP3s and share them, and there grew this whole new focus on multimedia and the Internet.

      Now we have those things handled, and it seems like the answer to "what's next?" is making both hardware and software smaller and less bloated. We're getting smart phones that are becoming something more like a real portable computer, and we're getting things like netbooks. I predict you're also going to start seeing better use of embedded systems, like maybe DVRs are just going to be built into TVs soon. Not sure on that one, but I think you're going to see things shrinking, devices being consolidated, and a renewed focus on making things more efficient and refined.

      Meh. It's rambling time...

      =Smidge=

      --
      Is it just my observation, or is eldavojohn an idiot?
    3. Re:Do the number of calls really matter? by Kjella · · Score: 4, Interesting

      Ever since.... well, the first abstraction there's been a holy flamewar of abstractions versus spaghetti code. The one side of the war claims that by building enough layers each layer is simple, well-understood with well-defined interactions and thus fairly bugfree. The other side claims that abstractions wrap things in so many layers that the whole code is like an onion without substance, separating cause from effect so it's difficult to grasp, and that these layers seriously hurt performance. The answer is usually to keep it simple if possible, complex if necessary. If calls went up and performance went up it's probably necessary, but in isolation an increase in cross calls would be a bad thing.

      --
      Live today, because you never know what tomorrow brings
    4. Re:Do the number of calls really matter? by Anonymous Coward · · Score: 5, Funny

      Hi. I'm the infamous Anonymous Coward, and it's time we had a talk.

      For years now, I've been enhancing the discussion on Slashdot through interesting interjections and humorous anecdotes (often about homosexual African Americans), but I feel things just aren't working out.

      It takes me an awful lot of time, researching composing and spell chekcing the many hundreds of valuable posts I make a day, and although I don't request anything in return all I ever see is abuse. You moderate my comments down for absolutely no good reason.

      I've had enough.

      From this point on I'm just not going to bother. It's over.

      I've been feeling this way for a while, slowly I've put less and less effort in my posts, repeating the same ideas over and over and, now, even started repeating whole posts verbatim.

      It's been fun, Slashdot, but I'm disillusioned. You broke my heart, and I am never going to give you the benefit of my insight again.

      Be happy.
      Love and regrets,
      Anon.

    5. Re:Do the number of calls really matter? by Anonymous Coward · · Score: 5, Funny

      Hahaha disregard that! I suck cocks!

  2. At least Reiser by Spamhead · · Score: 5, Funny

    got to make one call...

    --
    Everybody Wang-Chung tonight!
    1. Re:At least Reiser by oboreruhito · · Score: 4, Funny

      I doubt it. Too RISCy.

    2. Re:At least Reiser by GF678 · · Score: 5, Informative

      Off topic, but just in case anyone is curious as to how Hans Reiser is doing in prison...

      Not particularly well so far: http://www.kcbs.com/pages/3634907.php?

    3. Re:At least Reiser by mollymoo · · Score: 3, Interesting

      Being dead doesn't sound too bad to me. The process of dying almost always sucks and I don't want to be dead, but once I am dead I can guarantee you I won't give a shit about it.

      --
      Chernobyl 'not a wildlife haven' - BBC News
  3. What? by svnt · · Score: 5, Interesting

    While OSes may be "sliming down" as the article says, what does the removal of standard db packages from Ubuntu have to do with filesystem-related kernel calls?

    The article doesn't seem to mention the possibility that more functionality may be pushed into the kernel from userspace, which might make sense in other situations, but I don't think that argument would hold up here.

    I am struggling to make the connection between the summary and the so-called article. The fact that they are not stripping/locking fs functionality means that OSes aren't shrinking? That's the hypothesis?

  4. Where's NTFS ? by Anonymous Coward · · Score: 5, Funny

    You are kidding arent you ?

    Are you saying that this linux can run on a computer without windows underneath it, at all ? As in, without a boot disk, without any drivers, and without any services ?

    That sounds preposterous to me.

    If it were true (and I doubt it), then companies would be selling computers without a windows. This clearly is not happening, so there must be some error in your calculations. I hope you realise that windows is more than just Office ? Its a whole system that runs the computer from start to finish, and that is a very difficult thing to acheive. A lot of people dont realise this.

    Microsoft just spent $9 billion and many years to create Vista, so it does not sound reasonable that some new alternative could just snap into existence overnight like that. It would take billions of dollars and a massive effort to achieve. IBM tried, and spent a huge amount of money developing OS/2 but could never keep up with Windows. Apple tried to create their own system for years, but finally gave up recently and moved to Intel and Microsoft.

    Its just not possible that a freeware like the Linux could be extended to the point where it runs the entire computer fron start to finish, without using some of the more critical parts of windows. Not possible.

    I think you need to re-examine your assumptions.

    1. Re:Where's NTFS ? by Orion+Blastar · · Score: 5, Funny

      Dude Microsoft is giving up on NTFS for WinFS with Windows 7.0. Get your facts straight before you start to character assassinate an operating system. WinFS was to be a part of Vista, but Microsoft removed it before the retail version in order to meet deadlines.

      Did you know that Linux has limited NTFS support? I usually have to create a FAT32 partition to copy files between Windows XP and Linux. NTFS is usually read only or not available. Pfffssssttt!

      Just like wine, Microsoft will not release a finished product before its time.

      --
      Remember, Slashdot does not have a -1 disagree moderation, and no, troll, flamebait, and overrated are not substitutes.
    2. Re:Where's NTFS ? by jaavaaguru · · Score: 4, Funny

      In Linux, the open office might be the default for editing your wordfiles, and you might prefer ubuntu brown over the grassy knoll of the windows desktop, but mark my words young man - without the windows drivers sitting below the visible surface, allowing the linus to talk to the hardware, it is without worth.

      And so, by choosing your linux as an alternative to windows on the desktop, you still need a windows licence to run this operating system through the windows drivers to talk to the hardware. Linux is only a code, it cannot perform the low level function.

      My point being, young man, that unless you intend to pirate and steal the Windows drivers and services, how is using the linux going to save money ? Well ? It seems that no linux fan can ever provide a straight answer to that question !

      May as well just stay legal, run the Windows drivers, and run Office on the desktop instead of the linus.

    3. Re:Where's NTFS ? by quickOnTheUptake · · Score: 5, Insightful

      > Did you know that Linux has limited NTFS support? I usually have to create a FAT32 partition to copy files between Windows XP and Linux. NTFS is usually read only or not available.

      Have you heard of NTFS-3G?

      The NTFS-3G driver is a freely and commercially available and supported read/write NTFS driver for Linux, FreeBSD, Mac OS X, NetBSD, Solaris, Haiku, and other operating systems. It provides safe and fast handling of the Windows XP, Windows Server 2003, Windows 2000, Windows Vista and Windows Server 2008 file systems.

      --
      Mod points: Guaranteed to remove your sense of humor.
      Side effects may include gullibility and temporary retardation
    4. Re:Where's NTFS ? by Kjella · · Score: 4, Interesting

      Sometimes I wish there was a way to make my own meta-mod, like "don't include mods from the people that modded this up ever again". The same copy-paste has been in tons of stories now, and it's not funny anymore because it's the EXACT same thing. I'd even rather hear one more variation on our insensitive clod overlords from Soviet Russia.

      --
      Live today, because you never know what tomorrow brings
    5. Re:Where's NTFS ? by ADRA · · Score: 5, Informative

      1. The AC was a satire. In fact, I remember reading those exact lines at least once before. It's actually quite funny, so props to the original troll for making something really nice to read.

      2. ntfs-3g should be all you need to handle read/writes in Linux these days. I think it's layered on top of fuse, so you'll probably need that as well. (Side note, glad Linus finally caved on allowing fuse into his kernel releases)

      3. WinFS is a meta-layer on top of NTFS, so not in itself a disk file-system.

      --
      Bye!
  5. Thoughts by stevied · · Score: 4, Informative

    Thoughts:

    - This is measuring, I believe, calls to different functions; a call to one function from multiple places is only counted once. So it's really a measure of the diversity of external calls.

    - Size and complexity aren't necessarily the same thing. It's actually possible that as common functionality is abstracted out of filesystems, they get smaller but make more external calls. There was a point a few years ago when this was happening at quite a rapid pace in the fs code, I don't know if it is still true.

    - Journalled filesystems and networked filesystems are pretty complex creatures by their nature, the quoted numbers don't seem unreasonable. NFS in particular implements (IIRC) protocol versions 2, 3 and 4, and 4 had a lot of new stuff.
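
    To make the first point concrete, here is a minimal sketch in C of a "distinct external calls" count. It is not the tool the article used; `count_distinct_calls` is a hypothetical helper, and the idea that call sites arrive as a flat list of symbol names is an assumption for illustration.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: count how many distinct function names appear
     * in a list of call sites. Calling the same function from several
     * places contributes only one to the total. */
    size_t count_distinct_calls(const char *const *calls, size_t n)
    {
        size_t distinct = 0;

        for (size_t i = 0; i < n; i++) {
            int seen = 0;

            /* Has this name already appeared earlier in the list? */
            for (size_t j = 0; j < i; j++) {
                if (strcmp(calls[i], calls[j]) == 0) {
                    seen = 1;
                    break;
                }
            }
            if (!seen)
                distinct++;
        }
        return distinct;
    }
    ```

    Feeding it five call sites that only touch three different functions gives 3, so the figure tracks the diversity of the interface a module uses, not how often it calls out.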

    1. Re:Thoughts by morgan_greywolf · · Score: 4, Interesting

      In fact, if you think about it, the greater the number of different functions a filesystem driver uses, the less functionality it needs to have within itself. I also don't think the number of external calls is a significant measure of anything related to the size or performance, really. It all depends on what calls are being made and for what purpose.

      If anything, as you imply, it's a measure of complexity. But even that might not really be the case if you stop and think about it. As more stuff is abstracted out, less code goes into the filesystem itself, and the driver becomes simpler, not more complex.

      I think this was a really poor choice of metric and that almost renders this entire article moot.

  6. Yes/no by EmbeddedJanitor · · Score: 3, Insightful

    The number of calls in the interface do matter because they increase complexity. This makes fs maintainability and development a bit harder from version to version as it gets less clear what each call should do. Many of the calls are optional, or can be performed by defaults, which does help to simplify things.

    There is little calling overhead from using multiple calls. Of course these interface changes are all done for a good reason: performance, stability, security.

    --
    Engineering is the art of compromise.
    1. Re:Yes/no by Yokaze · · Score: 4, Interesting

      > The number of calls in the interface do matter because they increase complexity.

      That is only true if similar functionality is provided and the function calls are of similar complexity (e.g. number of parameters, complexity of arguments).

      To my limited knowledge, work has been done to extract more common functionality from file-systems. Should that be the case, it would increase the number of function calls, but reduce the overall complexity.

      --
      "Between strong and weak, between rich and poor [...], it is freedom which oppresses and the law which sets free"
    2. Re:Yes/no by Suzuran · · Score: 4, Insightful

      Function calls are not free. Especially in kernel space. Everything costs time. You need to do the most you can with the least instructions. 100 lines of inline code will probably run faster than a function call.

    3. Re:Yes/no by ckaminski · · Score: 4, Informative

      Okay, you MIGHT, just MIGHT have a point with a microkernel architecture, or if the filesystems are implemented in user space (Fuse), but it is irrelevant for kernel modules in Linux - you're not crossing interrupt boundaries, so calling a kernel function is just as cost effective as rolling your own.

    4. Re:Yes/no by Daniel+Phillips · · Score: 4, Informative

      > Function calls are not free. Especially in kernel space. Everything costs time. You need to do the most you can with the least instructions. 100 lines of inline code will probably run faster than a function call.

      Never having been one to accept unsupported claims at face value, I just tested that assertion on a Pentium M here, with a small C program that either calls a function to increment a counter, or directly increments the counter a number of times. I compiled with -O0 to be sure gcc does not change around my code at all. Just the instructions, thanks. Funny thing? A hundred increments runs within 1% of the speed of 100 calls to a function to do the increment. And yes I unrolled those calls to isolate the cost of what I was measuring. So... rather surprisingly, the cost of these function calls is as close as doesn't matter, to exactly zero.

      Loops on the other hand... cost a huge amount. I won't get into details. But Intel clearly does something to optimize function calls in microcode, or probably even hardware. Function calls just don't cost what you think they do. In many cases, the function call will cost less by not trashing as much of that incredibly valuable L1 instruction cache.
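
      For anyone who wants to repeat this kind of test, here is a rough sketch of such a benchmark in C. It is not the program described above: it uses a loop instead of unrolled calls, so loop overhead is included in both measurements. The names `run_inline`, `run_calls` and `seconds_for` are made up for illustration, and `__attribute__((noinline))` assumes GCC or Clang.

      ```c
      #include <stdint.h>
      #include <time.h>

      /* volatile so that every increment really happens in memory,
       * whatever optimization level is used */
      static volatile uint64_t counter;

      /* noinline (GCC/Clang extension) keeps this a real call */
      __attribute__((noinline)) static void increment(void)
      {
          counter++;
      }

      /* Increment the counter directly, with no function call. */
      uint64_t run_inline(uint64_t iterations)
      {
          counter = 0;
          for (uint64_t i = 0; i < iterations; i++)
              counter++;
          return counter;
      }

      /* Increment the counter through a function call each time. */
      uint64_t run_calls(uint64_t iterations)
      {
          counter = 0;
          for (uint64_t i = 0; i < iterations; i++)
              increment();
          return counter;
      }

      /* CPU time in seconds that one of the variants takes. */
      double seconds_for(uint64_t (*fn)(uint64_t), uint64_t iterations)
      {
          clock_t start = clock();
          fn(iterations);
          return (double)(clock() - start) / CLOCKS_PER_SEC;
      }
      ```

      Comparing `seconds_for(run_inline, n)` with `seconds_for(run_calls, n)` only means something for a large `n`, since `clock()` is coarse, and the result will vary by processor and compiler.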

      --
      Have you got your LWN subscription yet?
    5. Re:Yes/no by hackerjoe · · Score: 3, Insightful

      What's your point? Processors can pipeline across branches just fine, and the main effect of cache is to give a performance boost to smaller code -- code that separates and reuses functions rather than inlining them willy-nilly.

      Inlining can still be a win sometimes, but compilers will do that for you automatically anyway...

    6. Re:Yes/no by ckaminski · · Score: 3, Interesting

      Nevermind the fact that modern processors can cache the entirety of the Linux kernel.

      Simplicity of code is nearly always better than premature and not necessarily useful optimizations.

    7. Re:Yes/no by Zan+Lynx · · Score: 3, Interesting

      I believe I read somewhere or other that branch predictors need a certain number of instructions between the branch instruction and the branch target in order to do a good job. If the only instruction in the loop is a single increment, that might explain the problem. Unrolling the loop so it has more instructions might fix it.

    8. Re:Yes/no by Z34107 · · Score: 3, Interesting

      > So... rather surprisingly, the cost of these function calls is as close as doesn't matter, to exactly zero.

      If the compiler knows the relative address of the function ahead of time, they are really fast.

      Try replacing your direct function call with a function pointer instead. Assign the function pointer the address of your function during runtime. It will be many orders of magnitude slower.

      Not sure why this is; just something I discovered the hard way.
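
      A minimal sketch in C of the two cases, for anyone who wants to measure the difference themselves. This is an illustration, not the code I originally hit the problem with; the names are hypothetical, `__attribute__((noinline))` assumes GCC or Clang, and the pointer is declared `volatile` so the compiler has to load it on every call instead of turning it back into a direct call.

      ```c
      #include <stdint.h>

      static volatile uint64_t counter;

      /* noinline (GCC/Clang extension) keeps this a real call */
      __attribute__((noinline)) static void increment(void)
      {
          counter++;
      }

      /* Volatile function pointer, assigned at run time. */
      static void (*volatile increment_ptr)(void);

      /* Direct call: the target is known when the code is compiled. */
      uint64_t run_direct(uint64_t iterations)
      {
          counter = 0;
          for (uint64_t i = 0; i < iterations; i++)
              increment();
          return counter;
      }

      /* Indirect call: the target is read from the pointer each time. */
      uint64_t run_indirect(uint64_t iterations)
      {
          increment_ptr = increment;
          counter = 0;
          for (uint64_t i = 0; i < iterations; i++)
              increment_ptr();
          return counter;
      }
      ```

      Both functions do the same work, so timing them over the same large iteration count isolates the cost of the indirection. How big the gap is will depend on the processor's branch prediction.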

      --
      DATABASE WOW WOW
    9. Re:Yes/no by ultranova · · Score: 3, Insightful

      > Funny thing? A hundred increments runs within 1% of the speed of 100 calls to a function to do the increment. And yes I unrolled those calls to isolate the cost of what I was measuring. So... rather surprisingly, the cost of these function calls is as close as doesn't matter, to exactly zero.

      Not at all surprisingly, since 100 function calls and 100 integer additions will take so little time on a modern processor - and, I suspect, would even on an 8088 - that they amount to a rounding error. The machine's clock doesn't have sufficient resolution to measure them. You'd need a hundred million for a meaningful comparison.

      --
      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  7. not following by Eil · · Score: 5, Insightful

    What's your argument here? That filesystem code in the kernel shouldn't be growing more sophisticated over time?

    This rings of the armchair-pundit argument that the kernel is getting more and more "bloated" and a breath later crying out that there still aren't Linux hardware drivers for every computing device ever made.

    1. Re:not following by doshell · · Score: 4, Insightful

      I have a good idea to get the drivers while still eliminating the bloat.

      Have an option to compile the kernel during installation, based on detected devices.

      And then whenever you buy a new webcam, replace your graphics card, or whatever, the kernel must be recompiled. People will love that.

      Also: the Linux kernel is modular. This means you don't actually hold in memory, at every time, the drivers to all the hardware devices known to man. Only those your machine actually needs to run. The remaining are just files sitting on your hard disk, ready to be loaded should the need arise. This is an adequate way to keep down the bloat while not inconveniencing the user every time a new piece of hardware pops up.

      --
      Score: i, Imaginary
  8. The state by hkb · · Score: 5, Funny

    The state of Linux filesystems may be in disarray, but it's nothing to kill your wife over...

    *rimshot*

    --
    /* Moderating all non-anonymous trolls up since 2004 */
  9. Re:Is this a story? by hedwards · · Score: 4, Informative

    I don't like that it was restricted to just Linux FSes; comparing them against ones available for other OSes would have given it at least some context. Based upon the article, it sounds like Linux is being trounced. But one doesn't really know, because there isn't a comparison to other OSes to give any clue whatsoever.

  10. Check out Tux3 by Daniel+Phillips · · Score: 4, Informative

    While Tux3 is not yet ready to run on your desktop, and won't be for a good many months, it is relatively trim at around 6K lines, and is expected to be somewhere around 10K complete with versioning, recovery and proper code comments. Of course, that will still be significant growth in a few months, and nothing says it won't just keep growing. But Tux3 is starting much smaller than its peers, and already has a pretty good range of "big filesystem" features. One of our guiding principles is to keep it tight, therefore leaving fewer places for bugs to hide.

    --
    Have you got your LWN subscription yet?
    1. Re:Check out Tux3 by The+Master+Control+P · · Score: 3, Funny

      > But Tux3 is starting much smaller than its peers, and already has a pretty good range of "big filesystem" features.

      Let us count the number of fast, slim projects that have been sucked down this way...

      Programmer: "This shit is bloated. I'm starting a new project that will be slim and fast"
      <type type type>
      <build build make>
      User 1: "This is really nice and fast, but I need feature X"
      <add add add>
      Users 2 & 3: "I'd use it, but I really love $OTHER_PROGRAM's Y"
      Programmer: "Grrr..."
      <add add add>
      User 4: "I've heard that Z is doing $SPIFFY, why doesn't this do that?"
      <type type add add add add build>
      User 5: "This is all big and slow and bloated... I'm going with N instead..."
      Programmer: "Fuck you! Fuck you all motherfuckers!"

      Not to say that Tux3 will go or is going this way. Indeed, as long as people stick around who remember the guiding principle of keeping it small it shouldn't. Best of luck!

      /ext2fs fanatic
      //I can shredses the files... yes I can, and I know it will workses...

  11. Goofy metric, too. by Ungrounded+Lightning · · Score: 4, Insightful

    Unless I've misread it, TFA's definition of "size" for a filesystem is "how many distinct external/kernel subroutines does it call?"

    That seems to be a very strange metric. Seems to me that a well-designed filesystem will have little code and make extensive (re)use of well-defined services elsewhere (rather than reinventing those wheels). This "slims" the filesystem's own unique code down to its own core functionality.

    Now maybe, if the functions are hooks in the OS for the filesystem, a larger number of them implies a poorer abstraction boundary. But it's not clear to me, absent more explanation, that this is what TFA is claiming.

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way