Slashdot Mirror


Understanding Memory Usage On Linux

Percy_Blakeney writes "Have you ever wondered why a simple text editor on Linux can use dozens of megabytes of memory? A recent blog posting explains how the output of the ps tool is misleading and how you can get a better idea of how much memory a process really uses."

22 of 248 comments

  1. Re:Before you start bitch about Firefox memory lea by CyricZ · · Score: 5, Interesting

    Firefox does indeed suffer from some very serious memory management-related issues. For anyone who doesn't have an ideological connection to the project, it's obvious why that is.

    About 8 months back I attempted to embed Gecko within an existing graphical user interface toolkit. Having heard so much from the open source community about how easy it was to do, I thought it would go rather quickly. Of course, it did not. The lack of up-to-date documentation (if such documentation exists at all) and of solid examples was among the biggest problems.

    But the overall architecture struck me as the worst part of Mozilla. Like it or not, it's overly complicated and convoluted in many areas. I admit that it's not easy to build well-designed software, but they so completely missed the boat it's unbelievable. However, it does make it obvious as to why many people complain about Firefox and Seamonkey running so slowly, in addition to suffering from huge memory consumption.

    As for the embedding of Gecko, I said to hell with it. I took a page from Apple, and used KHTML instead. The loss in portability by not going with Gecko was well worth the far quicker development time, the lower memory consumption, the increased responsiveness, and the higher degree of stability of KHTML.

    --
    Cyric Zndovzny at your service.
  2. Re:The only thing running by Max+Threshold · · Score: 4, Interesting

    The problem is that the bulk of Java's libraries aren't shared. At least, that's how it looks to pmap.

    Sun JVM running a simple "Hello World" program that sleeps 1000ms between messages in an endless loop:

    mapped: 260888K writeable/private: 199604K shared: 54652K

    It wouldn't be hard to create a launcher that would run them all on the same virtual machine. Such a launcher would be a candidate for the system integration you suggest. After all, if you needed to run Windows apps on your Linux box, you wouldn't run multiple instances of VMWare, would you?

  3. Re:Before you start bitch about Firefox memory lea by CyricZ · · Score: 3, Interesting

    Indeed. You're completely correct. If the Mozilla crew want to take on Internet Explorer, then they can't have the general public debugging their software for them. That just won't fly.

    Part of the problem is that it's far too easy for bugs to creep into Mozilla. The code is a small step above horrible, and the architecture isn't much better. A lack of up-to-date documentation leads to programmers not knowing which XPCOM interfaces are deprecated, and which aren't.

    You can look at browsers like Konqueror and Opera, which offer a very comparable feature set to Firefox, yet do not suffer from the drawbacks. Not only that, but Konqueror and Opera are often described as feeling far more responsive, while being extremely stable. It's things like that which really impress the average Jill and Joe. Excessive memory usage will just perplex them, and likely result in them going back to Internet Explorer.

    --
    Cyric Zndovzny at your service.
  4. Re:Not only shared libraries by Anonymous Coward · · Score: 1, Interesting

    Is this the same for FreeBSD (and other Unix-like systems), by any chance?

    especially if DRM/DRI is used I would guess.

  5. Linux is no memory hogging Operating System by cciRRus · · Score: 3, Interesting

    What I really wanna understand is the memory usage in Windows.

    --
    w00t
  6. Overdue by Chris+Pimlott · · Score: 4, Interesting


    A nice article, been looking for more information on this. So often you read items in program FAQs or such giving a disclaimer on how ps memory usage is misleading, but they offer no better way. Okay, so ps memory usage information is practically useless; now what am I supposed to use?

    I was hoping for a bit more, though; like, say, a small program that lets you see both the aggregate virtual memory total as well as the memory used specifically by the program. Add a few options for how to handle the only-one-app-using-a-library situation. Doesn't seem like it'd be that hard, and very useful.
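
    A minimal sketch of the kind of small program asked for here, assuming a Linux kernel that exposes per-mapping counters in /proc/<pid>/smaps. The function name summarize_smaps and the SAMPLE text are made up for illustration; only the field names (Size, Rss, Shared_*, Private_*) come from the kernel's output.

```python
import re
from collections import Counter

# Per-mapping counters reported in /proc/<pid>/smaps, all in kB.
FIELD = re.compile(
    r"^(Size|Rss|Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):\s+(\d+) kB$"
)

def summarize_smaps(text):
    """Sum the smaps counters over every mapping of one process."""
    totals = Counter()
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(2))
    return {
        "virtual_kb": totals["Size"],    # everything mapped (what VSZ counts)
        "resident_kb": totals["Rss"],    # pages currently in RAM
        "shared_kb": totals["Shared_Clean"] + totals["Shared_Dirty"],
        "private_kb": totals["Private_Clean"] + totals["Private_Dirty"],
    }

# Made-up two-mapping excerpt in smaps format, for illustration only.
SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:                464 kB
Rss:                 424 kB
Shared_Clean:        424 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
080bc000-080c2000 rw-p 00073000 03:02 13130      /bin/bash
Size:                 24 kB
Rss:                  24 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        24 kB
"""

if __name__ == "__main__":
    print(summarize_smaps(SAMPLE))
```

    For a live process, pass in the contents of /proc/<pid>/smaps instead of SAMPLE. The private total is the closest thing here to "memory used specifically by the program"; how to charge the shared total when only one app uses a library is left open, as in the comment above.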

  7. Re:Not only shared libraries by OhHellWithIt · · Score: 2, Interesting

    Nice article, indeed. It didn't tell me what I'd hoped it would, though, that there is a tool to give me memory stats the way Data General's AOS/VS II tools did. In the DG world, one component of process memory organization mapped a process' space into four areas: shared code, unshared code, shared data, and unshared data. The DG equivalent of "ps" reported shared memory and unshared memory, and it made short work of determining which processes were pigs and which weren't. I haven't found that in either Unix or Windows. It would be really useful in troubleshooting my wife's accursed W2K system.

    --
    "Who controls the past controls the future. Who controls the present controls the past." -- George Orwell
  8. Re:My own favorite is 'top'. by lasindi · · Score: 2, Interesting

    Something will bog down my machine, I will run 'top' and discover that no process is using more than 10% of the available resources. OK, so why is my machine bogging?

    The "feature" that I find annoying about top, though it's really rather necessary for a CLI program, is that only the most CPU-intensive programs at a given instant get to the top. This isn't a problem with truly CPU-intensive programs that are constantly running. But all too often there's a program that's spiking to 30% or more CPU intermittently, and so the program might flash at the top every now and then, but for the most part it's low on the list where you can't see it. I'm not saying that top is bad. It's a very nice command line tool that works well; I'm just saying that the CLI has its limitations, and thus top does too. I find that KSysGuard works pretty well for this, since the processes all stay in the same place, and you can see when a process flashes 40% or whatever in the CPU column, and then kill it. You can use ps for this as well to an extent, but it's much harder (hit ps over and over and scroll up (or worse, use 'less' or 'more') to see how much CPU is being used by each process).
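
    One way around the "flashes by and disappears" problem is to remember each process's highest reading instead of only the current one. The sketch below assumes you already have periodic samples as {process name: CPU %} dictionaries (for instance, collected by running ps in a loop). The names peak_cpu and rank_by_peak and the sample numbers are hypothetical.

```python
def peak_cpu(samples):
    """Return {name: highest CPU % seen} across all samples."""
    peaks = {}
    for sample in samples:
        for name, cpu in sample.items():
            # Keep the largest value ever observed for this process.
            peaks[name] = max(peaks.get(name, 0.0), cpu)
    return peaks

def rank_by_peak(samples):
    """Sort processes by their peak, so intermittent spikers stay on top."""
    return sorted(peak_cpu(samples).items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # Made-up samples: "indexer" spikes once, "editor" is steady but low.
    samples = [
        {"editor": 2.0, "indexer": 0.5},
        {"editor": 3.0, "indexer": 40.0},
        {"editor": 2.5, "indexer": 0.0},
    ]
    for name, cpu in rank_by_peak(samples):
        print(f"{name:10s} {cpu:5.1f}%")
```

    Ranking by peak rather than by the instantaneous value keeps the spiking process in a stable position, which is the property described above for KSysGuard.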

    --
    I have discovered a truly remarkable proof of this theorem that this sig is too small to contain.
  9. Re:The only thing running by swillden · · Score: 5, Interesting

    The problem is that the bulk of Java's libraries aren't shared. At least, that's how it looks to pmap.

    There's clearly something pmap is missing. I just tried exactly what you said, and on my system, pmap reports:

    mapped: 209820K writeable/private: 169696K shared: 33660K

    So then I ran 200 copies. pmap reports the same stats for each of them, but that's clearly absurd. 170MB writable/private multiplied by 200 instances is 34GB of RAM. But my laptop has 2GB of RAM, only 136KB of swap is being used, and 800MB of my RAM is being used for cache. So those 200 java instances, plus everything else I have running at the moment (which is quite a bit), are consuming about 1.2GB of RAM. That's quite a bit, but nothing like what pmap would lead me to believe.

    Thinking about it, I think I can see what's going on. pmap is showing the mapped size of each process. Each JVM individually mmaps a huge amount of memory, because it maps in all of the Java libraries. However, mmapped pages don't consume any RAM or swap unless they're actually touched. This trivial program only uses a tiny portion of the libraries, so only a very small part of the mapped pages is actually read. Each JVM also mmaps a big block of anonymous memory for use as its heap but, again, the mapped address space doesn't consume RAM/swap if it's never touched, and this trivial app doesn't use much heap.

    My conclusion: pmap *also* overestimates memory usage, because some portion of the mapped address space isn't actually in use. RSS, on the other hand, only measures memory that is actually in use, but doesn't distinguish between memory that is shared and memory that is not. VSZ is the most pessimistic measure, since it includes all mapped memory, shared and unshared.
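
    The gap between mapped address space and touched pages can be observed from a single process. This is a rough sketch, assuming Linux and its /proc/self/statm file (whose first two fields are total and resident size in pages); the function names and the 64 MiB region size are arbitrary choices for illustration.

```python
import mmap
import os

def parse_statm(text, page_size):
    """Return (virtual_bytes, resident_bytes) from /proc/<pid>/statm text."""
    fields = text.split()
    return int(fields[0]) * page_size, int(fields[1]) * page_size

def current_usage():
    """Read this process's own virtual and resident sizes (Linux only)."""
    with open("/proc/self/statm") as f:
        return parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))

def demo(size=64 * 1024 * 1024):
    """Map an anonymous region, then touch every page, sampling usage."""
    page = os.sysconf("SC_PAGE_SIZE")
    before = current_usage()
    region = mmap.mmap(-1, size)         # anonymous mapping, nothing touched yet
    mapped = current_usage()
    for offset in range(0, size, page):  # write one byte into every page
        region[offset] = 1
    touched = current_usage()
    region.close()
    return before, mapped, touched

if __name__ == "__main__" and os.path.exists("/proc/self/statm"):
    for label, (vsz, rss) in zip(("before", "mapped", "touched"), demo()):
        print(f"{label:8s} VSZ={vsz >> 20} MiB  RSS={rss >> 20} MiB")
```

    If the reasoning above is right, the virtual size should jump as soon as the region is mapped, while the resident size should only grow once the pages are written to.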

    Looking again at my 200 Java processes confirms this. Each has an RSS of 6.7MB, which is too much to be "correct" (in the sense that 200*6.7MB is more RAM than my whole system is using), but not much too much, which tells me that a lot of that 6.7MB RSS is unshared.

    Looking at the pmap output in more detail, I see that most of the memory is mapped in three big anon blocks -- probably the heaps used by the generational garbage collector. The libraries are smaller and they're (duh) read-only which I'm pretty sure means the libs *are* shared across multiple instances of the JVM, because I believe that multiple processes that mmap the same file in read-only mode only get a single shared copy.

    That means the bulk of the actual memory usage is writable, not libraries, and it's all unused heap space. Assuming the generational GC does the obvious thing and unmaps the whole "dead" generation block, the bulk of the heap space will usually be unused... and the JVM actually will "give back" heap that it no longer needs, at least in part. RSS should show that. Hmm... how to construct a test case to verify it...

    Bottom line, I think: Java apps do use a little more actual memory than C/C++ apps, and trivial Java apps do use a lot more actual memory than trivial C/C++ apps, but it's not nearly as bad as pmap shows because the GC will always have a lot of extra memory mapped that has never been touched (assuming it does unmap and remap the dead generation, and it would be stupid not to).

    Enough of my rambling, semi-informed speculation. Anyone who knows more about how this stuff works, please weigh in and correct me.

    --
    Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
  10. Ignorance about Java by Simon+Brooke · · Score: 2, Interesting
    Actually, a JVM is even less stable than Windows. It was not designed as a real OS. The garbage-collection, for example, will freeze the entire VM for as long as it needs to run -- and sometimes it goes out of whack and hangs permanently...

    The JVM serving this page currently has an uptime of 32 days. But in the past it's had uptimes of over 200 days. Neither it, nor any of the other Tomcat servers I run, has ever gone out of whack. Java (Tomcat, Weblogic and others) powers the web servers of many of the world's biggest websites, serving millions of pages of dynamic content every day. If it was unreliable, that wouldn't be happening.

    --
    I'm old enough to remember when discussions on Slashdot were well informed.
  11. Re:Before you start bitch about Firefox memory lea by Anonymous Coward · · Score: 1, Interesting
    It's things like that which really impress the average Jill and Joe. Excessive memory usage will just perplex them, and likely result in them going back to Internet Explorer.

    I beg to differ

    1. Average Joe uses Win32 and doesn't even know how to see how much memory a program is using
    2. Average Joe is already used to crashing apps and just accepts it as a part of life. Stability will not impress Average Joe
  12. This issue with PS hides a huge Java issue by egarland · · Score: 2, Interesting

    The architecture of Java doesn't allow it to share library memory space like this. The effect of this is that Java programs appear to use about the same amount of memory as compiled programs when, in fact, they are using quite a bit more. This is why running a Java program that takes up 25 megs of memory can seem to suck the life out of a computer while a compiled executable using 25 megs doesn't. Java is probably really using about 10x more memory.

    It's also why systems running a Java framework with multiple programs executing in the same Java process do so much better than ones where everything is in its own process. This is Java's sweet spot, where these JVM architecture disadvantages have the least impact.

    This is my understanding of how Java's libraries work. Someone let me know if I'm missing something here.

    --
    set softtabstop=4 shiftwidth=4 expandtab nocp worlddomination
    1. Re:This issue with PS hides a huge Java issue by Anonymous Coward · · Score: 1, Interesting

      That's true for the most part, and also a "feature" not a "bug" of java (seriously). With all the talk of virtualization today (Xen, VMWare Server released, Sun pushing containers, IBM with Power virtualization), it's important to note that Java has been virtualizing hardware for years. Many of the apps that folks are placing in virtualized OS instances can be better and more efficiently done in Java. It doesn't *replace* hardware virtualization but is a better alternative in many cases.

  13. It's also why Linux is so good at multi user by Colin+Smith · · Score: 3, Interesting

    The first person to use a system might load 128MB worth of libraries and applications. The second and all subsequent users may only use 15-30MB worth of RAM for each additional user. E.g., a 1GB RAM system could handle 30 concurrent users rather than 8.
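
    The figures above can be checked with a little arithmetic. This sketch assumes the simplest possible model (the first user pays the full cost, every later user pays only the incremental cost, and nothing else uses RAM); max_users is a hypothetical helper name.

```python
def max_users(total_mb, first_user_mb, extra_user_mb):
    """Users that fit when only the first one pays for the shared pages."""
    if total_mb < first_user_mb:
        return 0
    return 1 + (total_mb - first_user_mb) // extra_user_mb

TOTAL_MB = 1024  # a 1 GB system

# No sharing: every user needs the full 128 MB.
without_sharing = TOTAL_MB // 128

# Sharing: 128 MB for the first user, 15 to 30 MB for each one after.
with_sharing_worst = max_users(TOTAL_MB, 128, 30)
with_sharing_best = max_users(TOTAL_MB, 128, 15)

if __name__ == "__main__":
    print(without_sharing, with_sharing_worst, with_sharing_best)  # 8 30 60
```

    So the "30 rather than 8" claim corresponds to the pessimistic end of the range; at 15 MB per additional user the same model gives 60.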

    --
    Deleted
    1. Re:It's also why Linux is so good at multi user by Doctor+Memory · · Score: 2, Interesting

      /me flashes back to the day, when I was one of sixty-four CS students editing, compiling and testing on a single VAX running VMS. With one (1) megabyte of memory. That's right, dear friends, roughly a quarter of the cache of your average disk drive today. Makes me wonder how much memory my box would use if I killed X and ran everything from the console...

      --
      Just junk food for thought...
  14. Nostalgia is ... by newandyh-r · · Score: 3, Interesting

    When you can remember running 60 users on a mainframe with about 1MB RAM and a processor no faster than a 386.

  15. Re:Before you start bitch about Firefox memory lea by Anonymous Coward · · Score: 1, Interesting

    For real fun with Firefox, download a RELEASE COPY of the source code. I want to stress this - not a beta, not a nightly, grab an actual release copy. Compile it with debugging enabled, and run it.

    Then start counting assertion errors in the RELEASE COPY of Firefox. It can be a fun drinking game, for every assertion error, drink.

    It makes you wonder how Firefox 1.5 was ever released, since merely STARTING the damned thing causes like three assertion errors before the first window is even displayed.

  16. Why did "vmmap" disappear from OS X? by SgtUnix · · Score: 1, Interesting

    I'm running OS X 10.4.4 and I know for sure that in the past there has been a utility similar to "pmap" that was called vmmap.

    The path was /usr/bin/vmmap

    But it's somehow disappeared from OS X. Anybody know why?

  17. Re:My own favorite is 'top'. by jbert · · Score: 3, Interesting

    The closest I've come to dealing with it was writing exmap.

    This is a (moderately ugly) gtk+ tool which uses a loadable kernel module to work out which pages are used by more than one process. If a page is used by N processes, each process is credited with PAGE_SIZE/N bytes.
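
    The PAGE_SIZE/N accounting rule is easy to illustrate on its own. The sketch below is not exmap's code; it assumes the hard part (finding out which processes map each physical page) has already been done and is handed in as a dictionary. The 4096-byte page size and the process names are illustrative.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # bytes; assumed here, the real value is system-dependent

def proportional_usage(page_users, page_size=PAGE_SIZE):
    """Credit each process with page_size / N for a page shared N ways.

    page_users maps a page identifier to the set of processes using it.
    """
    usage = defaultdict(float)
    for users in page_users.values():
        if not users:
            continue  # an unused page is charged to nobody
        share = page_size / len(users)
        for proc in users:
            usage[proc] += share
    return dict(usage)

if __name__ == "__main__":
    # Made-up example: one private page, two pages shared by two
    # processes, and one page shared by four.
    pages = {
        0: {"a"},
        1: {"a", "b"},
        2: {"a", "b"},
        3: {"a", "b", "c", "d"},
    }
    print(proportional_usage(pages))
```

    A useful property of this rule is that the per-process figures add up to the physical memory actually in use (four pages, 16384 bytes, in the example), which is not true of summing RSS.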

    I believe it "solves" the problem you describe above. The biggest problem is that it provides a little too much information, so perhaps I should simplify it a bit.

    (Known problems with current 0.8 version: some of the tests fail intermittently and some systems with pre-linked elf binaries can cause errors. Should fix up both with the next release).

  18. Agreed - excellent article by a16 · · Score: 2, Interesting

    I just wanted to add my confirmation that the Apache article is an excellent tip.

    I had been experiencing issues reaching the max clients on a busy Apache server serving around 6 Mbit/s of images at peak times, and had been forced to increase the maximum child process setting to a very large number to cope with the peak daily periods.

    Having just made the changes recommended in that article, i.e., changing the keep-alive timeout to around 2 seconds rather than the default of 15, we've gone from an average of 100+ child processes to a constant 20-30.

    I'd advise anyone experiencing problems hitting their max client setting (the example he gives is a slashdotting, in my case it's serving loads of individual images) to try this setting out.

  19. Re:More tips by Alioth · · Score: 2, Interesting

    The VMM on Windows in particular is a bit nasty when it comes to how it decides to push pages out to swap. The Windows VMM only looks in the CPU's TLB to look for candidate pages to trim from a process's working set, and since this is only 64 page table entries - and recently used ones at that - it's not hard to get Windows into the situation where a process can have a gigantic working set, virtually all unused - but the VMM can't swap it out to let a process that actually needs that memory get at it (and you get a swapping storm).

  20. Re:More tips by timeOday · · Score: 2, Interesting
    I agree, OOM Killer is just slightly better than a spontaneous reboot. (Or maybe worse, since it's less obvious what's going on.)

    The problem is, there's just no good way to handle low memory conditions.