Slashdot Mirror


Understanding Memory Usage On Linux

Percy_Blakeney writes "Have you ever wondered why a simple text editor on Linux can use dozens of megabytes of memory? A recent blog posting explains how the output of the ps tool is misleading and how you can get a better idea of how much memory a process really uses."

61 of 248 comments

  1. Not only shared libraries by pontus · · Score: 5, Informative

    Nice article.
    It could also have mentioned mappings on /dev. For example, the X server, on a system with a 256MB graphics adapter, will map all that memory into its address space, making X look huge, even though it's not using all that much system RAM. This will show up as a device-backed mapping in the maps file.
    On a related note, X also looks big because it's holding pixmaps belonging to various applications (Firefox comes to mind).
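For anyone who wants to look for such mappings, here is a minimal Python sketch that totals the address space in a maps file by what backs each mapping. It assumes the usual Linux /proc/<pid>/maps line layout (address range, permissions, offset, device, inode, optional path). The sample lines, the paths in them and the helper names classify and mapped_kb_by_kind are made up for illustration.

```python
from collections import defaultdict

# Hypothetical maps excerpt: a binary, a 256MB device mapping, a heap, an anonymous block.
SAMPLE_MAPS = """
00400000-0048a000 r-xp 00000000 08:01 123 /usr/bin/X
e0000000-f0000000 rw-s e0000000 00:06 456 /dev/mem
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
b7e00000-b7e21000 rw-p 00000000 00:00 0
"""


def classify(path):
    """Label a mapping by its backing store, judging only from the path column."""
    if not path:
        return "anonymous"
    if path.startswith("/dev/"):
        return "device"
    if path.startswith("["):
        return "special"  # pseudo-paths such as [heap] or [stack]
    return "file"


def mapped_kb_by_kind(maps_text):
    """Sum the mapped address space (in kB) per backing kind."""
    totals = defaultdict(int)
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # blank or malformed line
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        totals[classify(path)] += (end - start) // 1024
    return dict(totals)


print(mapped_kb_by_kind(SAMPLE_MAPS))
```

In the sample, the device mapping alone accounts for 262144 kB of address space, which is the effect described above: a large mapped size that says little about system RAM in use.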

    1. Re:Not only shared libraries by OhHellWithIt · · Score: 2, Interesting

      Nice article, indeed. It didn't tell me what I'd hoped it would, though, that there is a tool to give me memory stats the way Data General's AOS/VS II tools did. In the DG world, one component of process memory organization mapped a process' space into four areas: shared code, unshared code, shared data, and unshared data. The DG equivalent of "ps" reported shared memory and unshared memory, and it made short work of determining which processes were pigs and which weren't. I haven't found that in either Unix or Windows. It would be really useful in troubleshooting my wife's accursed W2K system.

      --
      "Who controls the past controls the future. Who controls the present controls the past." -- George Orwell
    2. Re:Not only shared libraries by ratboy666 · · Score: 5, Informative

      The "problem" is the concept of a COW (copy-on-write) page, coupled with the semantics of mmap().

      In a nutshell: I can use mmap() to map /dev/zero into memory, for (pretty much) as big as I want. 200MB? It's now mine.

      I can have a pointer to this memory.

      The problem? The memory doesn't exist. What I have is a pointer, and a guarantee that enough backing store exists to satisfy it.

      If I read through that pointer, I will see zeros. It *is* /dev/zero after all. However, I can write into the memory. If I write something, the page that is changed is copied and replaced; taking memory AT THAT TIME. Sparsely.

      The mmap() call can map a file (backing store) and allow data to be shared. Memory does not need to be used until the data is read (or written). And this time, the backing store doesn't even need swap (because the file is the backing store).
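A minimal Python sketch of the "memory doesn't exist until written" behaviour, assuming Linux with /proc mounted. An anonymous private mapping stands in for mapping /dev/zero here (an assumption made for brevity), and the helper names resident_bytes and demo are made up for illustration.

```python
import mmap

PAGE = mmap.PAGESIZE


def resident_bytes():
    """Resident set size of this process, read from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as statm:
        return int(statm.read().split()[1]) * PAGE


def demo(map_mib=64, touch_mib=32):
    """Map map_mib of zero-filled memory, write to touch_mib of it, report RSS growth."""
    size = map_mib * 1024 * 1024
    # Anonymous private mapping: address space is reserved, no pages are allocated yet.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    before = resident_bytes()
    # Writing one byte per page forces the kernel to supply a real page each time.
    for offset in range(0, touch_mib * 1024 * 1024, PAGE):
        region[offset] = 1
    after = resident_bytes()
    region.close()
    return size, after - before
```

Calling demo() returns the mapped size and how much the resident set grew. The growth should track the part that was written to, not the full size of the mapping.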

      All of which means non-changeable may be altered. Changeable may be non-existent or shared. Try to teach that to your DG tools.

      A page of code that is shared - may become a page of code that is private. A page of data that is unwritten doesn't have to exist. Even if it is read! A page of data that is written may STILL be shared.

      "ps" and the other tools could walk through typical process maps, counting up pages, and figuring out what each was for, but that may be a bit too intensive. The pages aren't "cross referenced" for that purpose. Besides, the page could be COWd, and then swapped. Should THAT count against the memory of the application? Maybe, maybe not.

      So "ps" by default gives you an idea of the "big picture" for each process.

      Ratboy

      --
      Just another "Cubible(sic) Joe" 2 17 3061
  2. A 'simple' editor ? by Anonymous Coward · · Score: 4, Funny

    How can they diminish EMACS like that ?

    1. Re:A 'simple' editor ? by aurb · · Score: 5, Funny

      They said 'text editor', not 'operating system'.

    2. Re:A 'simple' editor ? by the_greywolf · · Score: 2, Funny

      speaking of which... has anyone ported a text editor to it yet? it would seem EMACS is a good OS, but with no VI, i just can't figure out how it's useful...

      --
      grey wolf
      LET FORTRAN DIE!
  3. The only thing running by 0xABADC0DA · · Score: 4, Insightful

    Try statically linking a program that uses just a few glibc calls and it's pushing 800k. Now add in libc++, Qt/gtk, Xlib, kde, boost, xml, etc and you're talking a lot of memory. This is what gets me about people who say "well Java performs okay now, but it uses so much memory".

    A typical C/C++ based app uses just as much memory, it's just shared between processes. And for that matter, startup time of the first thing using kde/gnome isn't all that great either. Isn't it about time some effort was put into making Java or Mono part of the system, so it can be shared like C apps do?

    1. Re:The only thing running by chris+macura · · Score: 2, Informative
      A typical C/C++ based app uses just as much memory, it's just shared between processes...

      That's the point. Nobody cares about how much actual memory a C/C++ app touches.

      Making Java "part of the system" won't help much either because the libraries aren't the same. You could argue that at the bottom of the pyramid it's still libc that's being used, but we still have and need all the wrappers on top of the library to make it compatible with Java code.

      So until people find it normal to run more than one or two Java applications at once, Java will be deemed a memory hog. It's sort of a rut that Java is in right now, because nobody would really run more than two Java applications at once. (My computer, granted a 5-year-old 1.7GHz P4 with 386MB of RAM, can barely handle Eclipse at any reasonable speed. God forbid I also run something else.)

    2. Re:The only thing running by CyricZ · · Score: 3, Insightful

      What do you mean, making it "part of the system"? Are you suggesting that they be embedded within the Linux or BSD kernels, for instance? I would hope not, because for serious use that is a recipe for disaster.

      Part of the problem with Java is that each VM has traditionally had its own copy of the Java class library. When you consider how huge the standard library is these days for Java, it's no wonder that even a small Java program consumes so much memory. And running several programs, each duplicating data from the others, is wasteful.

      Apple has had for years a JVM that shares classes between numerous virtual machine instances. It thus reduces unnecessary memory consumption.

      --
      Cyric Zndovzny at your service.
    3. Re:The only thing running by Max+Threshold · · Score: 4, Interesting

      The problem is that the bulk of Java's libraries aren't shared. At least, that's how it looks to pmap.

      Sun JVM running a simple "Hello World" program that sleeps 1000ms between messages in an endless loop:

      mapped: 260888K writeable/private: 199604K shared: 54652K

      It wouldn't be hard to create a launcher that would run them all on the same virtual machine. Such a launcher would be a candidate for the system integration you suggest. After all, if you needed to run Windows apps on your Linux box, you wouldn't run multiple instances of VMWare, would you?

    4. Re:The only thing running by jsight · · Score: 3, Informative

      Update your knowledge.

      Java has concurrent GCs now that do not freeze the entire VM while being run. And I've never seen the GC go "out of whack" and hang permanently (though I've seen many apps do this due to poor thread/resource management).

    5. Re:The only thing running by swillden · · Score: 5, Interesting

      The problem is that the bulk of Java's libraries aren't shared. At least, that's how it looks to pmap.

      There's clearly something pmap is missing. I just tried exactly what you said, and on my system, pmap reports:

      mapped: 209820K writeable/private: 169696K shared: 33660K

      So then I ran 200 copies. pmap reports the same stats for each of them, but that's clearly absurd. 170MB writable/private multiplied by 200 instances is 34GB of RAM. But my laptop has 2GB of RAM, only 136KB of swap is being used, and 800MB of my RAM is being used for cache. So those 200 java instances, plus everything else I have running at the moment (which is quite a bit), are consuming about 1.2GB of RAM. That's quite a bit, but nothing like what pmap would lead me to believe.

      Thinking about it, I think I can see what's going on. pmap is showing the mapped size of each process. Each JVM individually mmaps a huge amount of memory, because it maps in all of the Java libraries. However, mmapped pages don't consume any physical memory (RAM or swap) unless they're actually used. This trivial program only uses a tiny portion of the libraries, so only a very small part of the mapped pages is actually read. Each JVM also mmaps a big block of anonymous memory for use as its heap but, again, the mapped address space doesn't consume RAM/swap if it's never touched, and this trivial app doesn't use much heap.

      My conclusion: pmap *also* overestimates memory usage, because some portion of the mapped address space isn't actually in use. RSS, on the other hand, only measures memory that is actually in use, but doesn't distinguish between memory that is shared and memory that is not. VSZ is the most pessimistic measure, since it includes all mapped memory, shared and unshared.

      Looking again at my 200 Java processes confirms this. Each has an RSS of 6.7MB, which is too much to be "correct" (in the sense that 200*6.7MB is more RAM than my whole system is using), but not much too much, which tells me that a lot of that 6.7MB RSS is unshared.
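The arithmetic behind that reasoning, as a short Python check. It only uses the rounded figures quoted in this comment (170MB writable/private per JVM, 6.7MB RSS, 200 instances, about 1.2GB actually in use); the variable names are made up.

```python
instances = 200
private_mb = 170        # pmap "writeable/private" per JVM, rounded as in the comment
rss_mb = 6.7            # RSS reported per JVM
ram_in_use_gb = 1.2     # what the whole system was actually using

pmap_total_gb = instances * private_mb / 1000   # if every mapped private page were real
rss_total_gb = instances * rss_mb / 1000        # if no resident page were shared

print(f"pmap writable/private summed: {pmap_total_gb:.1f} GB")
print(f"RSS summed:                   {rss_total_gb:.2f} GB")
print(f"actually in use:              {ram_in_use_gb:.1f} GB")
```

Summing pmap's figure overshoots by more than an order of magnitude, while summing RSS overshoots only slightly, which fits the conclusion that part of each JVM's RSS must be shared.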

      Looking at the pmap output in more detail, I see that most of the memory is mapped in three big anon blocks -- probably the heaps used by the generational garbage collector. The libraries are smaller and they're (duh) read-only which I'm pretty sure means the libs *are* shared across multiple instances of the JVM, because I believe that multiple processes that mmap the same file in read-only mode only get a single shared copy.

      That means the bulk of the actual memory usage is writable, not libraries, and it's all unused heap space. Assuming the generational GC does the obvious thing and unmaps the whole "dead" generation block, the bulk of the heap space will usually be unused... and the JVM actually will "give back" heap that it no longer needs, at least in part. RSS should show that. Hmm... how to construct a test case to verify it...

      Bottom line, I think: Java apps do use a little more actual memory than C/C++ apps, and trivial Java apps do use a lot more actual memory than trivial C/C++ apps, but it's not nearly as bad as pmap shows because the GC will always have a lot of extra memory mapped that has never been touched (assuming it does unmap and remap the dead generation, and it would be stupid not to).

      Enough of my rambling, semi-informed speculation. Anyone who knows more about how this stuff works, please weigh in and correct me.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    6. Re:The only thing running by mi · · Score: 2, Insightful
      Update your knowledge. Java has concurrent GCs now that do not freeze the entire VM while being run.
      The latest 1.4.2 still freezes the entire VM. I support a Java application for a living -- keep your evangelism to newbies...

      Maybe 1.5 will bring some wonderful improvements in this area, but so far it apparently has not -- see another response to your posting.

      And I've never seen the GC go "out of whack" and hang permanently (though I've seen many apps do this due to poor thread/resource management).
      Of course, anything GC does is triggered by the application, and some apps are better than others with resource management.

      But the "you need not worry about memory management" was one of the top items in Java evangelism, and now the 'net is full of advice on how to manage memory in Java to avoid GC-related problems. Oops.

      Some developers, though, continue to believe that "don't worry" hype. Hard to blame them, because without it, there are even fewer advantages to Java.

      --
      In Soviet Washington the swamp drains you.
  4. Re:My own favorite is 'top'. by bumby · · Score: 5, Insightful

    ...no process is using more than 10% of the available resources...

    But hey, 10 processes are using 10%...

    --
    Hey! That's my sig you're smoking there!
  5. man page update by suso · · Score: 4, Insightful

    How about going one step further than just blogging about it and actually submitting a documentation update to the ps man page. That way future confusion of the ps output could be avoided. Of course I guess people have to actually read the man page (In honor of slashdot, I didn't read it before posting this comment ;-)

    1. Re:man page update by TallMatthew · · Score: 3, Informative
      How about going one step further than just blogging about it and actually submitting a documentation update to the ps man page. That way future confusion of the ps output could be avoided.

      Because what ps reports is the truth, from a certain point of view.

  6. Emacs by Touisteur · · Score: 2, Funny

    If you wanna see a real OS with memory hogs, run Emacs...

  7. Re:Extra, extra, read all about it by mrjb · · Score: 4, Funny

    His next story will cover that.

    --
    Visit http://ringbreak.dnd.utwente.nl/~mrjb/growingbettersoftware to download your free copy of the book
  8. Linux file & memory management shines by carribeiro · · Score: 2, Informative

    Linux (and to be fair, Unix-like systems in general) shines at file & memory management. Many people don't know, but executable files are not 'loaded' in the Windows sense - they're just mapped into memory. This design improves performance and helps the system behave better under swapping (not thrashing, mind you). Things like mem mapped files are integral in the way the system is designed and implemented. That's one of the very reasons why a Linux machine usually runs faster and more reliably than an equivalent Windows machine... even if it has less memory. The Apache tuning example is great, and it shows how much performance you can squeeze out of a good design.

    1. Re:Linux file & memory management shines by Anonymous Coward · · Score: 5, Informative
      Many people don't know, but executable files are not 'loaded' in the Windows sense - they're just mapped into memory.

      Apparently, some people don't know that modern NT-based Windows versions also behave in exactly the same manner.

    2. Re:Linux file & memory management shines by JesseMcDonald · · Score: 4, Informative

      Actually, modern runtime linkers use a table of offsets rather than embedding the relocated symbol addresses directly into the executable code, and the relocations themselves are handled by mapping the file contents into virtual memory at the necessary addresses. With those two techniques combined, it is almost never necessary for the in-memory version of the executable to differ from its on-disk representation where the code and constant-data sections are concerned. When a typical application begins execution, nearly all of its virtual memory will be mapped directly onto the executable file and its shared libraries, and loaded on demand. The initialized-data section must be copied into virtual memory, and the uninitialized-data section and the stack are typically allocated as they are accessed on a page-by-page basis. Aside from a handful of housekeeping data for the linker and the C libraries, the rest of the virtual memory consists of read-only memory-mapped files.
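To illustrate the read-only memory-mapped file idea in isolation, here is a small Python sketch using the standard mmap module. It is not what a runtime linker does; it only shows a file being mapped read-only and its bytes being read straight out of the mapping. The temporary file, its contents and the names map_readonly and demo are made up for illustration.

```python
import mmap
import os
import tempfile


def map_readonly(path):
    """Map a whole file read-only; pages are fetched from the file as they are touched."""
    with open(path, "rb") as handle:
        return mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)


def demo():
    payload = b"constant data " * 4096  # stand-in for a code or constant-data section
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    try:
        view = map_readonly(path)
        try:
            first = view[:14]             # reading goes through the mapping
            same = view[:] == payload     # in-memory bytes match the on-disk bytes
            try:
                view[0] = 0               # a read-only mapping rejects writes
                writable = True
            except TypeError:
                writable = False
        finally:
            view.close()
    finally:
        os.unlink(path)
    return first, same, writable


print(demo())
```

Because the mapping is read-only and identical to the file, the kernel can drop such pages under memory pressure and reread them from the file later, with no swap involved.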

      --
      "The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
    3. Re:Linux file & memory management shines by Anonymous Coward · · Score: 3, Informative

      Please do yourself a favor and educate yourself before making any future bogus claims.

      The following two articles respectively deal with executable and library loading in Windows:
      http://msdn.microsoft.com/msdnmag/issues/02/02/PE/
      http://msdn.microsoft.com/msdnmag/issues/02/03/Loader/

  9. Re:Before you start bitch about Firefox memory lea by CyricZ · · Score: 5, Interesting

    Firefox does indeed suffer from some very serious memory management-related issues. For anyone who doesn't have an ideological connection to the project, it's obvious why that is.

    About 8 months back I attempted to embed Gecko within an existing graphical user interface toolkit. Having heard so much from the open source community about how easy it was to do, I thought it would go rather quickly. Of course, it did not. The lack of up-to-date documentation (if such documentation exists at all) and of solid examples were some of the big problems.

    But the overall architecture struck me as the worst part of Mozilla. Like it or not, it's overly complicated and convoluted in many areas. I admit that it's not easy to build well-designed software, but they so completely missed the boat it's unbelievable. However, it does make it obvious as to why many people complain about Firefox and Seamonkey running so slowly, in addition to suffering from huge memory consumption.

    As for the embedding of Gecko, I said to hell with it. I took a page from Apple, and used KHTML instead. The loss in portability by not going with Gecko was well worth the far quicker development time, the lower memory consumption, the increased responsiveness, and the higher degree of stability of KHTML.

    --
    Cyric Zndovzny at your service.
  10. "Tuning Apache" is also excellent by volts · · Score: 4, Informative

    Devin's blog also has an excellent posting on Apache performance. "Tuning Apache, part 1" (and the comments) is the sort of succinct empirical advice it is always nice to find.

  11. Re:Before you start bitch about Firefox memory lea by Anonymous Coward · · Score: 5, Insightful

    Typical Slashdot response, blame the users for the browser's bloat. 99% of the users of Firefox are not programmers and wouldn't have the slightest clue what is going on. They just want to look at porn without popups or getting infected with spyware via IE's ActiveX vulnerabilities. Asking them to download some script, set environment variables, and then file bug reports is unrealistic since most of them can't even tell the difference between a web browser and a web site. That's what beta testers are supposed to be doing but we all know that 90% of the beta testers never bother to file any bug reports, even when the browser crashes.

  12. Forgot to mention startup times... by soboroff · · Score: 2, Insightful

    All those shared libraries are also part of the reason that KDE and GNOME can take so long to start up, and why more memory and a higher-RPM hard disk can speed things up. It does make me laugh sometimes that Emacs is now one of Linux's fastest-starting desktop apps.

  13. Re:Before you start bitch about Firefox memory lea by CyricZ · · Score: 3, Interesting

    Indeed. You're completely correct. If the Mozilla crew want to take on Internet Explorer, then they can't have the general public debugging their software for them. That just won't fly.

    Part of the problem is that it's far too easy for bugs to creep into Mozilla. The code is a small step above horrible, and the architecture isn't much better. A lack of up-to-date documentation leads to programmers not knowing which XPCOM interfaces are deprecated, and which aren't.

    You can look at browsers like Konqueror and Opera, which offer a very comparable feature set to Firefox, yet do not suffer from the drawbacks. Not only that, but Konqueror and Opera are often described as feeling far more responsive, while being extremely stable. It's things like that which really impress the average Jill and Joe. Excessive memory usage will just perplex them, and likely result in them going back to Internet Explorer.

    --
    Cyric Zndovzny at your service.
  14. Re:My own favorite is 'top'. by Splab · · Score: 5, Insightful

    Top will show you the same as ps does, ps calls /proc//statm and asks what's going on. The problem on Linux is the copy-on-write principle, which saves heaps of memory but makes it virtually impossible to figure out what belongs to what. The thing is, when you fork, it maps the memory and marks everything as copy on write; when something needs to write to part of the memory, it then makes a private copy for the writing process.
    However, asking the process how much memory it has allocated will show all memory, including stuff that is marked copy on write - that is, I could have 100 processes showing they each use 1.4MB of memory, because they all share the same library, but in fact it's the same copy they are all using, so I'm only using 1.4MB instead of 140MB (+PCB et al.)
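One way to tell the shared part from the private part is the per-mapping breakdown in /proc/<pid>/smaps, assuming a kernel that provides that file. The Python sketch below sums the Shared_* and Private_* counters and redoes the 100-process estimate. The sample text is invented (a 1400 kB shared library plus a 120 kB private heap), and the function names are hypothetical.

```python
# Hypothetical smaps excerpt: one fully shared library and one private heap.
SAMPLE_SMAPS = """\
b7d00000-b7e5e000 r-xp 00000000 03:02 1234 /lib/libc.so.6
Size:               1400 kB
Rss:                1400 kB
Shared_Clean:       1400 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
0804a000-08068000 rw-p 00000000 00:00 0 [heap]
Size:                120 kB
Rss:                 120 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:       120 kB
"""

FIELDS = ("Rss", "Shared_Clean", "Shared_Dirty", "Private_Clean", "Private_Dirty")


def smaps_totals(smaps_text):
    """Add up the selected counters (in kB) over every mapping in the text."""
    totals = dict.fromkeys(FIELDS, 0)
    for line in smaps_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in totals:
            totals[key] += int(rest.split()[0])
    return totals


def private_and_shared(totals):
    """Split resident memory into (private kB, shared kB)."""
    private = totals["Private_Clean"] + totals["Private_Dirty"]
    shared = totals["Shared_Clean"] + totals["Shared_Dirty"]
    return private, shared


totals = smaps_totals(SAMPLE_SMAPS)
private_kb, shared_kb = private_and_shared(totals)
processes = 100
naive_kb = processes * totals["Rss"]                # adding up every process's RSS
realistic_kb = shared_kb + processes * private_kb   # shared pages counted once
print(naive_kb, realistic_kb)
```

For a live process, the same function can be fed the contents of /proc/<pid>/smaps, given permission to read it.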

  15. A practical measure and perspective. by twitter · · Score: 2, Insightful
    Using the pmap -d trick gives some insight but the amount of swap space used is what actually slows down system response. The author notes that a user who mostly runs either KDE or Gnome will pay a greater marginal cost for running the one Gnome or KDE application that's different. That's true, but it's also hard to avoid and it often does not matter, even on a modest system with 256 MB RAM. You would think that running konqueror, kontact, gimp and gnumeric on Enlightenment or Window Maker would suck down resources. It does, but it might not be enough to get you into swap space. Just run top and see. A low resource window manager can use fewer resources than a full Gnome or KDE Window Manager, despite the magic of shared libraries. DSL and Feather GNU Linux distributions run on P1s because they come with very low resource programs, which may or may not share many libraries. A little swap use does not hurt, but things get slow when too much gets in there.

    The whole discussion should be grounded in the reality of alternatives. A typical M$ system will grind its way into swap space on start up, before the user loads anything! The very latest and greatest Linux distros run well on Pentium IIs and the like, which XP refuses to install on.

    --

    Friends don't help friends install M$ junk.

    1. Re:A practical measure and perspective. by Lussarn · · Score: 2, Insightful

      Sure, I've been running both Linux and Windows for quite some time. And I've NEVER encountered a 1-minute swap session after closing an application in Linux. On Windows it happens every day if I run a big program like a new game (as that's what I have Windows for).

      I can only imagine those swap sessions on a 233MHz machine. Linux does handle memory way better.

    2. Re:A practical measure and perspective. by CastrTroy · · Score: 2, Informative

      I run a P2, 266 at home, with 256 MB of RAM. KDE 3.4 runs pretty slow. I've turned off a lot of the eye candy, but still the response time is quite slow. Windows 2000, on the other hand, is quite speedy. I can't speak for Windows XP, because I don't run it. The problem is that this isn't really a fair comparison, as the Windows 2000 UI is more comparable to something like Sawfish. Well, the look is similar, but even straight X Windows has a better feature set. So, I could use Sawfish, but if I start up a KDE program, it takes forever just to start up.

      --

      Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
    3. Re:A practical measure and perspective. by Trelane · · Score: 3, Informative
      Linux is somewhere in between the two. It doesn't go to swap quite as early as Solaris, and also not as late as FreeBSD.
      It's also quite informative to note that the swappiness of the Linux kernel may be changed dynamically, via /proc/sys/vm/swappiness.
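A small Python sketch for checking that setting, assuming the /proc/sys/vm/swappiness path named above exists on the machine. The function names are made up, and changing the value (not done here) needs root.

```python
from pathlib import Path

SWAPPINESS = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text):
    """The file holds a single integer followed by a newline."""
    return int(text.strip())


def read_swappiness():
    """Current swappiness, or None where the file is not available."""
    try:
        return parse_swappiness(SWAPPINESS.read_text())
    except OSError:
        return None


print("vm.swappiness =", read_swappiness())
# As root, writing a new number to the same file changes it on the fly, e.g.:
#   SWAPPINESS.write_text("10\n")
```

Lower values make the kernel less eager to push process pages out to swap, higher values more eager.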
      --
      Given enough personal experience, all stereotypes are shallow.
  16. Re:Before you start bitch about Firefox memory lea by arkanes · · Score: 4, Informative
    About 8 months back I attempted to embed Gecko within an existing graphical user interface toolkit. Having heard so much from the open source community about how easy it was to do, I thought it would go rather quickly.

    I'm kinda curious who you heard that from. Embedding Mozilla when you've got an already existing binding (such as for Gtk) is trivial, but writing the binding from scratch is no easy task. Gecko is a beast and the need to integrate its own drawing layer with yours makes it hard to integrate as an embedded browser. In its defense, it was never designed or intended for such a purpose. KHTML is only easier if you're using Qt (and you *did* obey the license, right?), otherwise you need to provide mappings from all the Qt primitives used by KHTML to your own. Easier than embedding Gecko, but still not trivial.

  17. Linux is no memory hogging Operating System by cciRRus · · Score: 3, Interesting

    What I really wanna understand is the memory usage in Windows.

    --
    w00t
  18. Overdue by Chris+Pimlott · · Score: 4, Interesting


    A nice article, been looking for more information on this. So often you read items in program FAQs or such giving a disclaimer on how ps memory usage is misleading, but they offer no better way. Okay, so ps memory usage information is practically useless; now what am I supposed to use?

    I was hoping for a bit more, though; like, say, a small program that lets you see both the aggregate virtual memory total as well as the memory used specifically by the program. Add a few options for how to handle the only-one-app-using-a-library situation. Doesn't seem like it'd be that hard, and very useful.

  19. Re:My own favorite is 'top'. by diegocgteleline.es · · Score: 3, Insightful

    It's not that there isn't a perfect tool; the problem is that, as I see it, it's impossible to solve properly.

    Take a shared library. For whatever reason, process 1 uses only the first half of the library. Thanks to demand-loading, only that half is loaded in mem, and that's what accounts as RSS for that process, say 10 MB.

    Now a process 2 is launched and it uses the other half of the library. Now the whole library is loaded in memory, and even if the first process is not using and has not requested to use the second half, its RSS will grow because somebody else uses other parts of the library.

    I don't think it's something you can or want to "solve": That's a consequence of the design ideas behind shared libraries. Deal with it.

  20. Re:My own favorite is 'top'. by lasindi · · Score: 2, Interesting

    Something will bog down my machine, I will run 'top' and discover that no process is using more than 10% of the available resources. OK, so why is my machine bogging?

    The "feature" that I find annoying about top, though it's really rather necessary for a CLI program, is that only the most CPU-intensive programs at a given instant get to the top. This isn't a problem with truly CPU-intensive programs that are constantly running. But all too often there's a program that's spiking to 30% or more CPU intermittently, and so the program might flash at the top every now and then, but for the most part it's low on the list where you can't see it. I'm not saying that top is bad, it's a very nice command line tool that works well; I'm just saying that the CLI has its limitations, and thus top does too. I find that KSysGuard works pretty well for this, since the processes all stay in the same place, and you can see when a process flashes 40% or whatever in the CPU column, and then kill it. You can use ps for this as well to an extent, but it's much harder (hit ps over and over and scroll up (or worse, use 'less' or 'more') to see how much CPU is being used by each process).

    --
    I have discovered a truly remarkable proof of this theorem that this sig is too small to contain.
  21. Re:Extra, extra, read all about it by lasindi · · Score: 5, Insightful

    So, people don't know how to interpret the output of ps? And that's a Slashdot frontpage story?

    Slashdot isn't only about breaking tech news; it's about keeping geeks generally informed. Many Linux geeks (including myself) probably learned something from the article that they didn't know. It's a well-written, informative article, and I'm glad Slashdot posted it because otherwise I probably would have never seen it. Not every Slashdotter already knows everything there is to know about Linux like you apparently do, and I imagine this isn't quite "common knowledge," so it's helpful for some of us.

    What have I done wrong in my settings to deserve such trivial items?

    No one forced you to click on "Read More." Sorry that you wasted a couple seconds reading the summary and realizing you already knew all about ps, but you didn't need to waste even more of your time trolling.

    --
    I have discovered a truly remarkable proof of this theorem that this sig is too small to contain.
  22. Re:Extra, extra, read all about it by tgv · · Score: 2, Informative

    Because there is not just one moderator. Everybody can moderate. So there are always a few people who think that's funny. But by not being an Anonymous Coward, but logging in instead, you can set a threshold to all posts, which will exclude most of them...

  23. Ignorance about Java by Simon+Brooke · · Score: 2, Interesting
    Actually, a JVM is even less stable than Windows. It was not designed as a real OS. The garbage-collection, for example, will freeze the entire VM for as long as it needs to run -- and sometimes it goes out of whack and hangs permanently...

    The JVM serving this page currently has an uptime of 32 days. But in the past it's had uptimes of over 200 days. Neither it, nor any of the other Tomcat servers I run, has ever gone out of whack. Java (Tomcat, Weblogic and others) powers the web servers of many of the world's biggest websites, serving millions of pages of dynamic content every day. If it was unreliable, that wouldn't be happening.

    --
    I'm old enough to remember when discussions on Slashdot were well informed.
  24. Sure by arvindn · · Score: 2, Funny
    "Have you ever wondered why a simple text editor on Linux can use dozens of megabytes of memory? "

    Of course. EMACS - eight megabytes and constantly swapping.

  25. Also applies to shared memory segments by AtariDatacenter · · Score: 2, Insightful

    Because there is nothing quite like seeing you've got 20 Oracle instances at 1gb each on a 4gb box. :)

  26. Memory Management by johnnyb · · Score: 3, Informative

    On a related note, if anyone is curious how memory management library calls such as "malloc" work, you might check out my article on the subject.

  27. This issue with PS hides a huge Java issue by egarland · · Score: 2, Interesting

    The architecture of Java doesn't allow it to share library memory space like this. The effect of this is that Java programs appear to use about the same amount of memory as compiled programs when, in fact, they are using quite a bit more. This is why running a Java program that takes up 25 megs of memory can seem to suck the life out of a computer while a compiled executable using 25 megs doesn't. Java is probably really using about 10x more memory.

    It's also why systems running a Java framework with multiple programs executing in the same Java process do so much better than ones where everything is in its own process. This is Java's sweet spot, where these JVM architecture disadvantages have the least impact.

    This is my understanding of how Java's libraries work. Someone let me know if I'm missing something here.

    --
    set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
  28. top by Kupek · · Score: 5, Informative

    Run top. Check out the column that says SHR. Subtract it from VIRT if you want to know the virtual memory usage of a process excluding shared libraries, or subtract it from RES if you want to know the physical memory usage of a process excluding shared libraries. Problem solved.
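The same subtraction can be done from /proc/<pid>/statm, whose first three fields are total size, resident and shared, counted in pages. A minimal Python sketch follows, assuming 4 kB pages for the made-up sample line; the function names and the sample numbers are illustrative only.

```python
import os

# Hypothetical statm contents: size resident shared text lib data dt (all in pages).
SAMPLE_STATM = "52455 1715 1200 10 0 8000 0"


def statm_kb(statm_text, page_kb=4):
    """Turn the first three statm fields into top-style columns, in kB."""
    size, resident, shared = (int(field) for field in statm_text.split()[:3])
    return {
        "VIRT": size * page_kb,
        "RES": resident * page_kb,
        "SHR": shared * page_kb,
        "VIRT-SHR": (size - shared) * page_kb,
        "RES-SHR": (resident - shared) * page_kb,
    }


def own_process():
    """The same figures for the running interpreter, or None without /proc."""
    path = "/proc/self/statm"
    if not os.path.exists(path):
        return None
    with open(path) as statm:
        return statm_kb(statm.read(), os.sysconf("SC_PAGE_SIZE") // 1024)


print(statm_kb(SAMPLE_STATM))
print(own_process())
```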

    I don't like how he phrases that what ps reports is "wrong." It's not wrong, or even "wrong." It reports exactly what Linux tells it (through the proc filesystem). It just might not be what you expect it to be, which means you don't understand the tools and the system. When ps reports that a process' virtual memory usage is xKb, that is correct. In the address space for the process, xKb have been allocated. Shared or not, they're still in the address space.

  29. It's also why Linux is so good at multi user by Colin+Smith · · Score: 3, Interesting

    The first person to use a system might load 128MB worth of libraries and applications. The second and all subsequent users may only use 15-30MB worth of RAM each. For example, a 1GB RAM system could handle 30 concurrent users rather than 8.
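    The arithmetic behind those figures can be checked with a rough sketch. It assumes the numbers quoted above (128 megabytes for the first user, 15 to 30 for each additional one, 1024 in total) and ignores the kernel, caches and everything else; max_users is a hypothetical helper.

```python
def max_users(total_mb, first_user_mb, extra_user_mb):
    """Users that fit when only the first one pays for the shared libraries."""
    if total_mb < first_user_mb:
        return 0
    return 1 + (total_mb - first_user_mb) // extra_user_mb


TOTAL_MB = 1024

# Without sharing, every user costs the full 128 megabytes.
print(TOTAL_MB // 128)               # 8
# With sharing, additional users cost 30 (pessimistic) or 15 (optimistic).
print(max_users(TOTAL_MB, 128, 30))  # 30
print(max_users(TOTAL_MB, 128, 15))  # 60
```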

    --
    Deleted
    1. Re:It's also why Linux is so good at multi user by Doctor+Memory · · Score: 2, Interesting

      /me flashes back to the day, when I was one of sixty-four CS students editing, compiling and testing on a single VAX running VMS. With one (1) megabyte of memory. That's right, dear friends, roughly a quarter of the cache of your average disk drive today. Makes me wonder how much memory my box would use if I killed X and ran everything from the console...

      --
      Just junk food for thought...
  30. Re:My own favorite is 'top'. by jallen02 · · Score: 4, Informative

    Load and CPU usage are different things. Load is a very tricky topic. The gist of it is that it is the average number of processes that were waiting to do some amount of processing. It is then smoothed with an exponentially damped moving average to give you a rough picture of what is happening. So let's say you have an SMTP server with a dozen processes all trying to access the disk while the disk is also busy updating its locate database. Your disk is hammered. Your processor is not. But you have so many processes competing for I/O that it bogs down the process scheduling eventually, which can make everything sluggish. Your CPU usage might not be heavy, but that doesn't mean the system isn't bogged down trying to do other things. CPU usage is an important part of system load, but not the only thing going into it.
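    To make the averaging concrete, here is a floating-point sketch of an exponentially damped moving average of the kind usually used to describe the load figures. The 5-second sampling interval and 60-second window are assumptions for illustration (the kernel does its own version in fixed-point arithmetic), and update_load and simulate are made-up names.

```python
import math

SAMPLE_INTERVAL = 5.0  # assumed seconds between samples


def update_load(load, waiting, window_seconds=60.0):
    """Blend one new sample of waiting processes into the running average."""
    decay = math.exp(-SAMPLE_INTERVAL / window_seconds)
    return load * decay + waiting * (1.0 - decay)


def simulate(samples, window_seconds=60.0):
    """Feed a sequence of waiting-process counts through the average."""
    load = 0.0
    for waiting in samples:
        load = update_load(load, waiting, window_seconds)
    return load


# A dozen processes stuck waiting barely move the average after one sample,
# but the figure creeps toward 12 if they stay stuck.
print(round(simulate([12]), 2))
print(round(simulate([12] * 1000), 2))
```

    This is why a short burst hardly registers while a sustained I/O pile-up shows up as a high load even with an idle CPU.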

    Jeremy

  31. Aren't they still resource hogs? by dzfoo · · Score: 2, Insightful

    >> Have you ever wondered why a simple text editor on Linux can use dozens of megabytes of memory?

    Correct me if I'm wrong but... doesn't the fact that KEdit uses a lot of libraries that consume resources and impact system performance -- whether shared or not -- still mean that it is a hog? I mean, if a seemingly simple application is consuming "dozens of megabytes of memory", saying "oh, it's OK, because most of it is being shared and already committed", does not really excuse it. What if those libraries are not currently being used by any other process?

    In order for the shared memory to lessen the impact on the system, the user must be running some other processes that share the same libraries. This to me is a *BIG*, and unwarranted, assumption by the developer, as evidenced by his example of someone running the Gnome environment but running a single KDE application.

          -dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
  32. More tips by typical · · Score: 5, Informative

    The thing is, when you fork, the kernel maps the memory and marks everything as copy-on-write. Only when a process needs to write to part of that memory does the kernel make a private copy of that part for it.
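    A small sketch of the visible effect, for a Unix-like system: after fork() the child writes to a buffer and the parent's copy stays untouched. It only demonstrates the semantics; the page-level copying is done by the kernel and cannot be observed from here. cow_demo is a hypothetical name.

```python
import os


def cow_demo():
    """Fork, let the child overwrite a buffer, and compare both views."""
    data = bytearray(b"parent")
    read_end, write_end = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's private copy only.
        os.close(read_end)
        data[:] = b"child!"
        os.write(write_end, bytes(data))
        os._exit(0)

    # Parent: collect what the child saw, then look at our own buffer.
    os.close(write_end)
    child_view = os.read(read_end, 64)
    os.close(read_end)
    os.waitpid(pid, 0)
    return bytes(data), child_view


parent_view, child_view = cow_demo()
print(parent_view, child_view)
```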

    A couple other tips:

    * Each thread in a process shows up as consuming the same amount of memory (either this only happens under Linuxthreads or I don't have any threaded applications running on my system).

    * Device mappings show up as consumed memory (which generates plenty of XFree86/xorg complaints). If you want to find out how much memory xorg/X11 is actually using (bytes in cached pixmaps on behalf of each process and sans device mappings), try this program (contains a tiny program that lists how much memory X is using for other programs by caching pixmaps and a perl script that lists how much memory X is using sans device mappings).

    * The article mentions the fact that shared libraries show up in every application's memory usage. So, for example, glibc alone adds 1.5MB to the memory usage of every process. But Win folks may not realize how significant this is. Most Windows applications ship with their own copies of almost all shared libraries used, which means that there is a huge amount of wasted memory under Windows that *actually affects you*. Under Linux, instead of shipping shared libraries with applications, folks have built tools to automatically download the latest shared libraries and use those across multiple applications. Result -- only one copy of the library need be in memory at a time. This means that it's actually reasonable to run a box with 128MB of memory and three remote users using the thing. You simply can't pull that under Windows and expect usability.

    * This may not sound significant, but Linux's VM is (anecdotal evidence, of course) really solid. When I run out of memory under Windows, performance rapidly degrades -- bring an application to the foreground, and the system just starts churning. Under Linux, you can push a ways into VM and things generally keep functioning pretty well (this is one of the causes of people talking about "applications loading faster under WINE than Windows" when they're trying to prove that WINE is 'faster' than Windows -- good disk I/O and VM code).
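    For the device-mapping tip, here is a rough sketch of how one might separate device-backed mappings from everything else in text laid out like /proc/<pid>/maps (address range, permissions, offset, device, inode, path). Treating every path under /dev/ as a device mapping is a simplification, and both the sample lines and the function name are invented for illustration.

```python
def split_device_mappings(maps_text):
    """Return (device_bytes, other_bytes) for text in /proc/<pid>/maps format."""
    device = other = 0
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue  # not a mapping line
        start, end = (int(value, 16) for value in parts[0].split("-"))
        path = parts[5].strip() if len(parts) > 5 else ""
        if path.startswith("/dev/"):
            device += end - start
        else:
            other += end - start
    return device, other


# Invented sample: a small binary, a 256MB device mapping, and a stack.
SAMPLE_MAPS = """\
08048000-08050000 r-xp 00000000 03:01 1234 /usr/bin/example
a0000000-b0000000 rw-s e0000000 00:0e 5678 /dev/mem
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]
"""

device_bytes, other_bytes = split_device_mappings(SAMPLE_MAPS)
print(device_bytes // (1024 * 1024), "MB device-backed;", other_bytes, "bytes otherwise")
```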

    --
    Any program relying on (nontrivial) preemptive multithreading will be buggy.
    1. Re:More tips by Alioth · · Score: 2, Interesting

      The VMM on Windows in particular is a bit nasty when it comes to how it decides to push pages out to swap. The Windows VMM only looks in the CPU's TLB to look for candidate pages to trim from a process's working set, and since this is only 64 page table entries - and recently used ones at that - it's not hard to get Windows into the situation where a process can have a gigantic working set, virtually all unused - but the VMM can't swap it out to let a process that actually needs that memory to get at it (and you get a swapping storm).

    2. Re:More tips by runderwo · · Score: 2, Informative

      The TLB is nothing more than a page table cache. On IA-32, a program has no control over, or ability to view, the contents of the TLB other than to flush it by reloading CR3. Saying you can look into it is like claiming you can look into L1 or L2 cache via your program (notwithstanding exceptions such as cache-as-RAM during firmware initialization). The only way you can know the contents of such caches is to know what memory accesses are performed in what order, so if you have that knowledge, then yes you could "look" into the cache using it. But I don't see how that is useful for a mechanism such as you claimed the Windows VMM implements.

    3. Re:More tips by Kupek · · Score: 2, Informative

      Each thread in a process shows up as consuming the same amount of memory (either this only happens under Linuxthreads or I don't have any threaded applications running on my system).

      Under LinuxThreads, each thread had its own PID. Under NPTL (Native POSIX Thread Library) all threads from the same process share the same PID, but each thread has a unique TID (which you can get with the Linux-specific call gettid()). Running getconf GNU_LIBPTHREAD_VERSION from a prompt should tell you what library and version you're running for pthread support.

      Anyway, this is a roundabout way of saying you're right. Since LinuxThreads uses a unique PID for each thread, if you queried the kernel for memory info, it would tell you that each process (thread) was individually consuming xKb. That's non-intuitive behavior, but I think the blame belonged to LinuxThreads, not the kernel; LinuxThreads was abusing the concept of a PID. Thankfully this has been changed in NPTL.
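      A quick sketch of the PID/TID distinction as seen from a threaded program. It assumes Python 3.8 or later, where threading.get_native_id() returns the kernel's thread ID; thread_ids is a made-up helper.

```python
import os
import threading


def thread_ids(count=4):
    """Return a list of (pid, native thread id) pairs, one per worker thread."""
    results = []
    lock = threading.Lock()
    barrier = threading.Barrier(count)

    def worker():
        ids = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(ids)
        # Keep every thread alive until all have reported, so that the
        # kernel cannot hand a finished thread's ID to a later one.
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


for pid, tid in thread_ids():
    print("pid", pid, "tid", tid)
```

      With a threading library that gives all threads one PID, the first column is identical on every line and only the second differs.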

    4. Re:More tips by timeOday · · Score: 2, Interesting
      I agree, OOM Killer is just slightly better than a spontaneous reboot. (Or maybe worse, since it's less obvious what's going on.)

      The problem is, there's just no good way to handle low memory conditions.

  33. Loading unneccessary libraries by arth1 · · Score: 2, Informative

    What gets me is how some distro builders see a security warning about setuid/setgid binaries using lazy so loading, and decide that using -Wl,-z,now is a good thing to add. Excuse me, but that will pull in EVERY library at link time, whether used or not, often leading to some MAJOR bloat.
    Yes, it "fixes" the "problem", but so would using rpath to DSOs not writable by users or ensuring that LD_LIBRARY_PATH doesn't point to user writable directories. Without the load time bloat.

    Regards,
    --
    *Art

  34. Nostalgia is ... by newandyh-r · · Score: 3, Interesting

    When you can remember running 60 users on a mainframe with about 1MB RAM and a processor no faster than a 386.

  35. Re:My own favorite is 'top'. by jbert · · Score: 3, Interesting

    The closest I've come to dealing with it was writing exmap.

    This is a (moderately ugly) gtk+ tool which uses a loadable kernel module to work out which pages are used by more than one process. If a page is used by N processes, each process is credited with PAGE_SIZE/N bytes.
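    The accounting rule itself is easy to sketch. This is only an illustration of the PAGE_SIZE/N idea, not exmap's code: the page-to-process table is invented input, where the real tool gets that information from its kernel module, and a 4096-byte page is assumed.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes


def proportional_usage(page_users):
    """Credit each process with PAGE_SIZE/N bytes for a page used by N processes.

    page_users maps a page identifier to the set of PIDs that have it mapped.
    """
    usage = defaultdict(float)
    for pids in page_users.values():
        if not pids:
            continue
        share = PAGE_SIZE / len(pids)
        for pid in pids:
            usage[pid] += share
    return dict(usage)


# Invented example: three pages shared among four processes.
pages = {
    "page-a": {1, 2},        # shared by two processes
    "page-b": {1},           # private to process 1
    "page-c": {1, 2, 3, 4},  # shared by all four
}
print(proportional_usage(pages))
```

    The per-process figures add up to exactly the number of pages in use times PAGE_SIZE, which is what makes this measure easier to reason about than summing each process's full size.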

    I believe it "solves" the problem you describe above. The biggest problem is that it provides a little too much information, so perhaps I should simplify it a bit.

    (Known problems with current 0.8 version: some of the tests fail intermittently and some systems with pre-linked elf binaries can cause errors. Should fix up both with the next release).

  36. Re:My own favorite is 'top'. by Corgha · · Score: 2, Informative

    The "feature" that I find annoying about top, though it's really rather necessary for a CLI program, is that only the most CPU-intensive programs at a given instant get to the top. [...] I find that KSysGuard works pretty well for this, since the processes all stay in the same place

    This has nothing to do with CLI vs GUI programs, and everything to do with what you're choosing to sort by. You can change the sort order in top.

    If you sort by PID or process name or something else less volatile than CPU percentages, the processes all stay in the same place in top, too. However, if you're looking for programs that are using a lot of CPU over time, it's probably worth sorting by cumulative CPU time instead.

    Read the man page or the interactive help (hit "?").

  37. Agreed - excellent article by a16 · · Score: 2, Interesting

    I just wanted to add my confirmation that the Apache article is an excellent tip.

    I had been experiencing issues reaching the max clients on a busy Apache server serving around 6 Mbit/sec of images at peak times, and had been forced to increase the maximum child process setting to a very large number to cope with the peak daily periods.

    Having just made the changes recommended in that article, i.e. changing the keep-alive timeout to around 2 seconds rather than the default of 15, we've gone from an average of 100+ child processes to a constant 20-30.

    I'd advise anyone experiencing problems hitting their max client setting (the example he gives is a slashdotting, in my case it's serving loads of individual images) to try this setting out.

  38. 200 instances and 170 megs by gini_ · · Score: 2, Informative

    That is how it should be read I think. To start 200 instances of your Java proggie you pretty much did the same thing as starting 200 threads in single virtual machine. These threads show in ps output as operating system processes and they map entire address space of virtual machine which is why their sizes are identical.

    Memory usage of Java actually scales very nicely with silly number of threads. A couple of months ago I created a small server which opened lots of listener sockets in their own threads.

    With one thread, the size of the virtual machine was about 40 megs, which is pretty much for a simple application, but when I created more server threads the amount of added memory was very small. With 100 listener threads it was like 60 megs, with 400 it was 80 megs, and finally with 3000 server threads the amount of used memory was only 290 megs!
    It is true that these threads were not actually doing anything except listening on their sockets, but I think it is very impressive nevertheless.

  39. mod parent Overrated by Darkforge · · Score: 2, Informative

    That's just not true, as swillden points out in this comment to the current story. Nobody should follow your suggestion.

    Based on your over-simplified claim (which I'll call "wrong") the 43 java threads on my Tomcat box are using 3.0GB of RAM total, minus 426MB shared, which is impossible on a box with 256MB of RAM and 512MB swap.

    More generally, the problem with ps (and top) is that they fail to highlight the most important piece of information: the amount of unshared memory each process is using, or, as TFA calls it, the "marginal cost" of each process.

    Instead, they give you the total memory available to each process. That number is irrelevant to a user of that process. It won't tell you, for example, how much memory you'd save if you killed off any given process. It won't even tell you how much total memory (shared+unshared) that process is using... as others have pointed out, ps's number includes unused copy-on-write device-mapped memory.
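    One way to approximate that "marginal cost" on kernels that provide /proc/<pid>/smaps is to add up the Private_Clean and Private_Dirty lines. The sketch below does that for a block of text in smaps layout; the sample is invented and heavily shortened, and marginal_cost_kb is a hypothetical name.

```python
def marginal_cost_kb(smaps_text):
    """Return (private_kb, shared_kb) summed over text in /proc/<pid>/smaps format."""
    private = shared = 0
    for line in smaps_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("Private_Clean", "Private_Dirty"):
            private += int(rest.split()[0])
        elif key in ("Shared_Clean", "Shared_Dirty"):
            shared += int(rest.split()[0])
    return private, shared


# Invented sample: one mostly shared file mapping and one private heap region.
SAMPLE_SMAPS = """\
08048000-08050000 r-xp 00000000 03:01 1234 /usr/bin/example
Size:                 32 kB
Rss:                  32 kB
Shared_Clean:         28 kB
Shared_Dirty:          0 kB
Private_Clean:         4 kB
Private_Dirty:         0 kB
b7e00000-b7f00000 rw-p 00000000 00:00 0
Size:               1024 kB
Rss:                 200 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:       200 kB
"""

private_kb, shared_kb = marginal_cost_kb(SAMPLE_SMAPS)
print(private_kb, "kB private,", shared_kb, "kB shared")
```

    The private figure is roughly what would come back if the process were killed; the shared figure would only be freed if nothing else were still using those pages.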

    ps is at best deceptive, if not actually wrong.

    --

    When I moderate, I only use "-1, Overrated". That way, I never get meta-moderated!