Slashdot Mirror


C Coding Tip - Self-Manage Memory Alllocation

An anonymous reader inputs: "The C programming language defines two standard memory management functions: malloc() and free(). C programmers frequently use those functions to allocate buffers at run time to pass data between functions. In many situations, however, you cannot predetermine the actual sizes required for the buffers, which may cause several fundamental problems for constructing complex C programs. This article advocates a self-managing, abstract data buffer. It outlines a pseudo-C implementation of the abstract buffer and details the advantages of adopting this mechanism."

11 of 142 comments (clear)

  1. Re:Um by Anonymous Coward · · Score: 0, Insightful

    garbage collection isnt the endall amazing solution that you may think it is.

    garbage collection stops system response while it's cleaning up. At least it does this with Java.

    If you were running a system that required realtime response, like, lets say, an air traffic control tower, garbage collection could delay certain actions from performing in time and cause a disaster.

    c was designed for systems programming, and not having realtime assurance is A Bad Thing.

  2. Re:Um by scrytch · · Score: 5, Insightful

    > garbage collection stops system response while it's cleaning up

    And malloc is of course free, right? ("well no wally, they're opposites" ... yeah yeah you get my drift)

    Good gc's operate incrementally. Good gc's let you turn gc on and off at will and disable it altogether for designated arenas. Good gc's can run in a separate thread on another CPU, whereas malloc/free cannot.

    The reason java's gc goes wiggy is not because the gc is bad (it's just not very tunable except on solaris), it's because it allocates new objects all over the place (and is happily helped at it by the standard libraries). If you go hog wild with resource consumption, yes you're going to pay for it later.

    For the 99.99% of programs that do NOT need hard realtime, you're better off with gc. Cripes, it's like saying homes shouldn't have thermostats because a home thermostat isn't suitable for a reactor sensor.

    --
    I've finally had it: until slashdot gets article moderation, I am not coming back.
  3. Re:Um by be-fan · · Score: 5, Insightful

    If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level langauge that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.
    ----------
    You just described Scheme/CL/Dylan.

    --
    A deep unwavering belief is a sure sign you're missing something...
  4. Re:Hmmm by schmaltz · · Score: 2, Insightful

    reasonable strategy: allocate a huge buffer ... do repetitive flushes and complete reloads of data ...

    Um, this is reasonable how? If you're coding on a 3GHz dream machine w/4GB 400mhz RAM, there will be somebody out there who, quite reasonably, will want to run your code on a 64MB Pentium 133... Testing only on unencumbered machines makes for delusional developers. ;)

    I'm split: either yer trollin' /., or you gotta tell us what OSS projects you've contributed to, with the above philosophy!

    --
    Big Daddy, Johnny, Burp, Aunt Zelda, Scott, Slurp, Big Momma ... where's Siggy?
  5. Re:Um by cookd · · Score: 5, Insightful
    It's even better than that. In many cases, GC can be almost as fast as malloc/free. Looking at the way things are going, I am pretty sure that in the near future GC-based code will be faster, not slower, than equivalent malloc/free code:

    • A compacting GC heap can make allocation into a REALLY fast operation (unless it triggers a collection), and there is no time spent deallocating. On normal heaps, malloc/new can take quite a lot of work (thread synchronization, best-fit free list searching, etc.). If you amortize the time the GC spends in collection over the allocations, the average allocation isn't that much slower than the corresponding malloc/free. Best of all, the gap is shrinking. Soon, GC may be FASTER overall than malloc/free in many real-world situations. It obviously depends on memory usage patterns and collection strategies, but it is starting to happen.
    • If you're using GC, your program doesn't have to do all of the bookkeeping anymore. The amount of time spent in resource tracking in big programs is fairly significant -- where the same object is shared by many parts of the program, figuring out when the item can be released is nontrivial. Under a GC, this is handled by the collector. In some cases, the GC can handle the bookkeeping more efficiently than would have been feasible with manual bookkeeping.
    • As CPUs get faster and storage (memory and disk) gets (relatively) slower, it becomes more and more attractive to spend extra CPU cycles to try to make better use of cache and memory. An L1 cache miss costs around 4-10 cycles; an L2 miss can cost 100-400 cycles; a page fault costs millions. The CPU time spent in garbage collection can become insignificant when compared with storage access time. Do you take the cache line size into consideration when you call malloc? I sure don't. But compacting GC implementations are starting to take things like that into consideration when they collect, and they rearrange the memory of the process to maximize cache hits and minimize memory waste. They can help your program make more efficient use of the cache, and perhaps reduce the working set. Just a 5% reduction in page faults would more than offset all but the most CPU-intensive garbage collectors.
    • GCs usually collect on a separate thread. That means that with a properly designed collector, while your program is blocked on IO or waiting for user input, the GC might be cleaning up the heap on a low priority thread. With luck, your main thread might NEVER actually be interrupted for a collection -- all of the collection can be done while your work threads were waiting for something else. Your memory management tax was paid in the background, while the machine was otherwise idle. On the other hand, with malloc and free, the memory management tax is paid up front by the currently executing thread. While this isn't always the case (i.e. CPU-bound processes), most apps are NOT CPU-bound, so this is a likely scenario in the future (if not the present).


    While I don't think GC is quite to the point where it is free or beneficial to the performance of the average application, it is a lot less harmful than most people think. Given that it simplifies the code and eliminates a lot of bugs (usually more than it introduces), it is definitely worthwhile in almost all new application code (kernel-mode code isn't quite there yet, but it's coming), with only a small performance penalty. And I suspect that it won't be too long before it starts to be more of a speed booster, not a perf hit.

    I think this is just another step in the process of handing another menial task over to the CPU. We moved from binary to assembly, assembly to low-level languages, low-level languages to higher-level languages, etc. At each step, the new method had a performance penalty at first, then as the new method matured, it turned out to actually be faster than the old method it replaced, while dramatically increasing programmer productivity (i.e. modern optimizers can usually do a better job than an assembly language programmer; often C++ code is faster than the equivalent C code since the compiler has more information to work with and the programmer can make use of more effective techniques like templates).
    --
    Time flies like an arrow. Fruit flies like a banana.
  6. Re:realloc by Nevyn · · Score: 2, Insightful
    But linked lists are worse on performance than realloc.

    Care to back that up with some facts.

    Also linked lists are very prone to leaking and very hard to figure out which one to free first.

    So you a) have the library code do that ... and b) have lots of tests. Of course you want to do both of those for something using a single block of memory too, and if you want it to be efficient it usualy does something clever to avoid copies ... and so is probably more likely to screw up and use/deallocate the wrong thing. As for leaking memory, that's pretty much built into the design with a single block of memory model, unless you want to keep calling realloc() to shrink the buffer (then you just blow the fragmentation of malloc to hell -- which if you're lucky will just make things slower).

    The "idea" of the article wasn't bad, it's just a very bad description of the proposed solution ... and a couple of years too late.

    --
    ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
  7. Re:Hmmm by Nevyn · · Score: 2, Insightful
    This is simple to do, and avoids a lot of errors. It's also not much of a headline.

    While I agree the IBM "research" article is terrible, the idea behind it isn't.

    Actually having donetests and benchmarks. I can safely say:

    • It's not the simplest solution.
    • It's certainly not anywhere near fast.
    --
    ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
  8. Re:Um by Mikkeles · · Score: 2, Insightful

    If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level langauge that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.

    In other words, take C++ and add the concept of components/packages. Take Python and add the features (such as generics) that are missing from C++. And then define an interface between components written in both langauges.

    I would suggest Ada + scheme/common lisp. I also think plain C + assembly would be very sensible for when one is dealing directly with bits, bytes, and addresses as it simplifies some of the overriding which would be needed in Ada.

    I don't think of C++ as strongly typed. typedefs don't enforce, e.g., different meanings of ints (so number of apples vs. number of oranges can't really be distinguished without creating multiple classes). Even enums aren't distinguishable from ints!
    Also, Ada already has the packaging you mentioned. It also does support interoperability with C, as do most scheme/lisps.
    --
    Great minds think alike; fools seldom differ.
  9. Re:Um by cookd · · Score: 2, Insightful

    I've heard that for years, it's yet to be true for the general case ... and most people doing manual allocation don't call malloc/free for every single grow/shrink operation. The caching slab allocator was put into solaris a long time ago now.

    Caching slab, lookaside lists, whatever, you're still calling some kind of allocator with some kind of cost. There are definitely applications where memory allocation/deallocation follows a pattern that allows that kind of optimization to work well. But here is the crux of my argument: 1. the general purpose allocators definitely do have a non-zero cost; 2. special purpose allocators can be significantly faster, but they aren't general purpose, so they don't work in all situations, and they often require significant extra work to make them work properly; 3. special-purpose allocators usually work by speculatively pre-allocating chunks from the general purpose allocator, and this wastes memory; 4. even when you are using special-purpose allocators as much as possible, you still have to spend a lot of time in memory management -- the special-purpose allocator isn't free, and every once in a while it still has to call out to the general purpose allocator, which is expensive. So I stand by the original premise: after all is said and done, the time spent in memory management for a GC-based program isn't much (if any) greater than the time spent in memory management for an equivalient non-GC-based program. When the total cost of the allocations/deallocations is compared with the cost of a similar set of GC-alloc + collections, the difference for most big applications isn't really significant.

    The keyword being all, yes you can sometimes just completely forget about designing how you use memory and everything "works". However sometimes those bits of memory have other resources associated with them.

    Yep. And now you don't have to bother with those either, since the GC can collect them during finalization of their associated bits of memory. The point here is that while you still might have to write a destructor of some sort, it is usually much simpler (no memory to worry about) and you don't have to figure out when that resource is no longer in use -- the GC will figure it out for you. If the resource must be freed deterministically, GC doesn't prevent you from doing that. But in the cases where it doesn't, GC still works great. In other words: don't refuse the solution just because it only makes 90% of your problems go away -- you're still better off, aren't you?

    You seem to be under the impression that running the GC isn't going to blow your cache ... why is that?

    If the garbage collector itself causes additional page faults, then yes, it isn't an advantage. But the argument was that when compared with page faults (something we've all been willing to accept, something that hasn't caused an unacceptable performance hit), GC is small potatoes. In addition, modern GC implementations take the amount of memory available on the machine into account, and they actively work to keep memory usage under the level that would start causing page faults. The garbage collector can spring into action when a request is made that might result in a page fault, and the page fault only actually happens if there really aren't any objects to release. The GC can dynamically determine which objects are unreachable, while the non-GC program must rely on a static analysis that often means objects live longer than is truly necessary (the GC can free something anytime after the last reference is gone, while a non-GC program has to keep the object until it comes to a point in the control flow that is unreachable until the object is no longer needed). While the dynamic analysis has a runtime cost (the static analysis has no runtime cost), the additional information available at runtime can enable much more efficient memory management, which in turn can reduce page faults, which can off

    --
    Time flies like an arrow. Fruit flies like a banana.
  10. So use C++ already by Animats · · Score: 3, Insightful
    Watching C programmers create home-brew objects is pathetic. If you need to encapsulate data structures, use C++.

    Yes, C++ has a host of problems and Strostrup and the C++ committee refuse to fix them. But the STL is a huge improvement on malloc/free. (They still can't get auto_ptr right, though.)

  11. Re:Um by Anonymous Coward · · Score: 1, Insightful

    If you think that was the point, then you completely missed out. I guess that is to be expected from somebody that went to UT.