C Coding Tip - Self-Manage Memory Allocation

Posted by Cliff on Wednesday January 7, 2004 @01:14PM from the free()-your-own-malloc() dept.

An anonymous reader inputs: "The C programming language defines two standard memory management functions: malloc() and free(). C programmers frequently use those functions to allocate buffers at run time to pass data between functions. In many situations, however, you cannot predetermine the actual sizes required for the buffers, which may cause several fundamental problems for constructing complex C programs. This article advocates a self-managing, abstract data buffer. It outlines a pseudo-C implementation of the abstract buffer and details the advantages of adopting this mechanism."
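As a rough illustration of what a self-managing buffer can look like, here is a minimal sketch of a growable byte buffer. The names `abuf`, `abuf_init`, `abuf_append`, and `abuf_free` are hypothetical and are not taken from the article; the doubling growth policy and the 16-byte starting capacity are also assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; names are illustrative only. */
typedef struct {
    unsigned char *data;  /* heap storage, NULL while nothing is allocated */
    size_t len;           /* bytes currently in use */
    size_t cap;           /* bytes currently allocated */
} abuf;

void abuf_init(abuf *b)
{
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}

/* Append n bytes from src, growing the storage geometrically when needed.
 * Returns 0 on success, -1 on failure (the buffer is left unchanged). */
int abuf_append(abuf *b, const void *src, size_t n)
{
    if (n > SIZE_MAX - b->len)
        return -1;                        /* len + n would overflow */

    size_t need = b->len + n;
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 16;  /* assumed starting capacity */
        while (cap < need) {
            if (cap > SIZE_MAX / 2) {     /* doubling would overflow */
                cap = need;
                break;
            }
            cap *= 2;
        }
        unsigned char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;                    /* old storage is still valid */
        b->data = p;
        b->cap = cap;
    }
    if (n > 0)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}

/* Release the storage and return the buffer to its empty state. */
void abuf_free(abuf *b)
{
    free(b->data);
    abuf_init(b);
}
```

A caller would `abuf_init` once, call `abuf_append` as data arrives without knowing the final size in advance, and `abuf_free` at the end, so the malloc/free bookkeeping lives in one place.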

4 of 142 comments

Re:Um by scrytch · 2004-01-07 14:04 · Score: 5, Insightful

> garbage collection stops system response while it's cleaning up

And malloc is of course free, right? ("well no wally, they're opposites" ... yeah yeah you get my drift)

Good gc's operate incrementally. Good gc's let you turn gc on and off at will and disable it altogether for designated arenas. Good gc's can run in a separate thread on another CPU, whereas malloc/free cannot.

The reason java's gc goes wiggy is not because the gc is bad (it's just not very tunable except on solaris), it's because it allocates new objects all over the place (and is happily helped at it by the standard libraries). If you go hog wild with resource consumption, yes you're going to pay for it later.

For the 99.99% of programs that do NOT need hard realtime, you're better off with gc. Cripes, it's like saying homes shouldn't have thermostats because a home thermostat isn't suitable for a reactor sensor.

--
I've finally had it: until slashdot gets article moderation, I am not coming back.

Re:Um by be-fan · 2004-01-07 14:08 · Score: 5, Insightful

> If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level language that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.

You just described Scheme/CL/Dylan.

--
A deep unwavering belief is a sure sign you're missing something...

Re:Um by cookd · 2004-01-07 17:51 · Score: 5, Insightful

It's even better than that. In many cases, GC can be almost as fast as malloc/free. Looking at the way things are going, I am pretty sure that in the near future GC-based code will be faster, not slower, than equivalent malloc/free code:
- A compacting GC heap can make allocation into a REALLY fast operation (unless it triggers a collection), and there is no time spent deallocating. On normal heaps, malloc/new can take quite a lot of work (thread synchronization, best-fit free list searching, etc.). If you amortize the time the GC spends in collection over the allocations, the average allocation isn't that much slower than the corresponding malloc/free. Best of all, the gap is shrinking. Soon, GC may be FASTER overall than malloc/free in many real-world situations. It obviously depends on memory usage patterns and collection strategies, but it is starting to happen.
- If you're using GC, your program doesn't have to do all of the bookkeeping anymore. The amount of time spent in resource tracking in big programs is fairly significant -- where the same object is shared by many parts of the program, figuring out when the item can be released is nontrivial. Under a GC, this is handled by the collector. In some cases, the GC can handle the bookkeeping more efficiently than would have been feasible with manual bookkeeping.
- As CPUs get faster and storage (memory and disk) gets (relatively) slower, it becomes more and more attractive to spend extra CPU cycles to try to make better use of cache and memory. An L1 cache miss costs around 4-10 cycles; an L2 miss can cost 100-400 cycles; a page fault costs millions. The CPU time spent in garbage collection can become insignificant when compared with storage access time. Do you take the cache line size into consideration when you call malloc? I sure don't. But compacting GC implementations are starting to take things like that into consideration when they collect, and they rearrange the memory of the process to maximize cache hits and minimize memory waste. They can help your program make more efficient use of the cache, and perhaps reduce the working set. Just a 5% reduction in page faults would more than offset all but the most CPU-intensive garbage collectors.
- GCs usually collect on a separate thread. That means that with a properly designed collector, while your program is blocked on IO or waiting for user input, the GC might be cleaning up the heap on a low priority thread. With luck, your main thread might NEVER actually be interrupted for a collection -- all of the collection can be done while your work threads were waiting for something else. Your memory management tax was paid in the background, while the machine was otherwise idle. On the other hand, with malloc and free, the memory management tax is paid up front by the currently executing thread. While this isn't always the case (e.g. CPU-bound processes), most apps are NOT CPU-bound, so this is a likely scenario in the future (if not the present).
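To make the first point concrete: on a compacted heap the free space is one contiguous region, so the allocation fast path is a bounds check plus a pointer bump. The sketch below shows only that fast path over a fixed arena. It is not a garbage collector, the names `arena`, `arena_init`, `arena_alloc`, and `arena_reset` are hypothetical, and the 16-byte alignment is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define ARENA_ALIGN 16u  /* assumed alignment; must be a power of two */

/* Hypothetical bump-pointer arena. A real compacting collector would, when
 * the arena fills up, move live objects together and reset `top` to just
 * past them; this sketch simply reports failure instead. */
typedef struct {
    unsigned char *base;  /* start of the region (caller supplies aligned memory) */
    size_t size;          /* total bytes in the region */
    size_t top;           /* offset of the first free byte */
} arena;

void arena_init(arena *a, void *mem, size_t size)
{
    a->base = mem;
    a->size = size;
    a->top = 0;
}

/* Allocation fast path: round the request up, check bounds, bump the offset.
 * No free-list search and no per-block header. */
void *arena_alloc(arena *a, size_t n)
{
    if (n > SIZE_MAX - (ARENA_ALIGN - 1))
        return NULL;                          /* rounding would overflow */
    size_t rounded = (n + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (rounded > a->size - a->top)
        return NULL;                          /* a GC would collect here */
    void *p = a->base + a->top;
    a->top += rounded;
    return p;
}

/* Stand-in for "everything is garbage": discard all objects at once. */
void arena_reset(arena *a)
{
    a->top = 0;
}
```

Compare this with a general-purpose malloc, which typically has to pick a suitable free block and may take a lock to do so. The cost that the bump allocator skips at allocation time is paid later, during collection.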
While I don't think GC is quite to the point where it is free or beneficial to the performance of the average application, it is a lot less harmful than most people think. Given that it simplifies the code and eliminates a lot of bugs (usually more than it introduces), it is definitely worthwhile in almost all new application code (kernel-mode code isn't quite there yet, but it's coming), with only a small performance penalty. And I suspect that it won't be too long before it starts to be more of a speed booster, not a perf hit.

I think this is just another step in the process of handing another menial task over to the CPU. We moved from binary to assembly, assembly to low-level languages, low-level languages to higher-level languages, etc. At each step, the new method had a performance penalty at first, then as the new method matured, it turned out to actually be faster than the old method it replaced, while dramatically increasing programmer productivity (e.g. modern optimizers can usually do a better job than an assembly language programmer; often C++ code is faster than the equivalent C code since the compiler has more information to work with and the programmer can make use of more effective techniques like templates).
--
Time flies like an arrow. Fruit flies like a banana.

Vstr by Nevyn · 2004-01-07 18:11 · Score: 5, Informative

The article basically proposes a very bad implementation of Vstr. Most of the advice was extremely simplified at best, but more likely just uninformed: an "efficient" abstract buffer that mixes shorts and pointers (words almost fail me); solving the problem of "what do you do with the data when it's all in the buffer" with "let's just copy it back out again" (hey, what's a couple of extra copies between friends?); representing in-memory object sizes with "long int". *sigh*
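On the "long int" complaint: C already has an unsigned type for in-memory object sizes, `size_t`, which is what `sizeof` yields and what `malloc()` takes. `long` is signed, and on some 64-bit ABIs (64-bit Windows, for example) it is only 32 bits wide. A minimal sketch of computing an allocation size in `size_t` with an overflow check follows; the helper name `checked_size` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute count * elem_size in size_t.
 * Stores the product in *out and returns 0, or returns -1 (leaving *out
 * untouched) if the product does not fit in size_t. */
int checked_size(size_t count, size_t elem_size, size_t *out)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return -1;               /* multiplication would wrap around */
    *out = count * elem_size;
    return 0;
}
```

The result can be passed straight to `malloc()` with no signed/unsigned conversion in between.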

If you are interested in the article, go read this explanation of why you want it for security and this explanation of why you want it for speed.

Vstr is LGPL, has actual benchmark data behind the block sizes it picks, has an extensive test suite ... and has documentation for the many functions that come with the library (including a fully compliant printf-like function). Of course, I don't have a PhD ... but after reading this, you might well count that as a plus too.

--
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B