C Coding Tip - Self-Manage Memory Alllocation
An anonymous reader inputs: "The C programming language defines two standard memory management functions: malloc() and free(). C programmers frequently use those functions to allocate buffers at run time to pass data between functions. In many situations, however, you cannot predetermine the actual sizes required for the buffers, which may cause several fundamental problems for constructing complex C programs. This article advocates a self-managing, abstract data buffer. It outlines a pseudo-C implementation of the abstract buffer and details the advantages of adopting this mechanism."
garbage collection isnt the endall amazing solution that you may think it is.
garbage collection stops system response while it's cleaning up. At least it does this with Java.
If you were running a system that required realtime response, like, lets say, an air traffic control tower, garbage collection could delay certain actions from performing in time and cause a disaster.
c was designed for systems programming, and not having realtime assurance is A Bad Thing.
> garbage collection stops system response while it's cleaning up
... yeah yeah you get my drift)
And malloc is of course free, right? ("well no wally, they're opposites"
Good gc's operate incrementally. Good gc's let you turn gc on and off at will and disable it altogether for designated arenas. Good gc's can run in a separate thread on another CPU, whereas malloc/free cannot.
The reason java's gc goes wiggy is not because the gc is bad (it's just not very tunable except on solaris), it's because it allocates new objects all over the place (and is happily helped at it by the standard libraries). If you go hog wild with resource consumption, yes you're going to pay for it later.
For the 99.99% of programs that do NOT need hard realtime, you're better off with gc. Cripes, it's like saying homes shouldn't have thermostats because a home thermostat isn't suitable for a reactor sensor.
I've finally had it: until slashdot gets article moderation, I am not coming back.
If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level langauge that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.
----------
You just described Scheme/CL/Dylan.
A deep unwavering belief is a sure sign you're missing something...
reasonable strategy: allocate a huge buffer ... do repetitive flushes and complete reloads of data ...
;)
/., or you gotta tell us what OSS projects you've contributed to, with the above philosophy!
Um, this is reasonable how? If you're coding on a 3GHz dream machine w/4GB 400mhz RAM, there will be somebody out there who, quite reasonably, will want to run your code on a 64MB Pentium 133... Testing only on unencumbered machines makes for delusional developers.
I'm split: either yer trollin'
Big Daddy, Johnny, Burp, Aunt Zelda, Scott, Slurp, Big Momma
While I don't think GC is quite to the point where it is free or beneficial to the performance of the average application, it is a lot less harmful than most people think. Given that it simplifies the code and eliminates a lot of bugs (usually more than it introduces), it is definitely worthwhile in almost all new application code (kernel-mode code isn't quite there yet, but it's coming), with only a small performance penalty. And I suspect that it won't be too long before it starts to be more of a speed booster, not a perf hit.
I think this is just another step in the process of handing another menial task over to the CPU. We moved from binary to assembly, assembly to low-level languages, low-level languages to higher-level languages, etc. At each step, the new method had a performance penalty at first, then as the new method matured, it turned out to actually be faster than the old method it replaced, while dramatically increasing programmer productivity (i.e. modern optimizers can usually do a better job than an assembly language programmer; often C++ code is faster than the equivalent C code since the compiler has more information to work with and the programmer can make use of more effective techniques like templates).
Time flies like an arrow. Fruit flies like a banana.
Care to back that up with some facts.
So you a) have the library code do that ... and b) have lots of tests. Of course you want to do both of those for something using a single block of memory too, and if you want it to be efficient it usualy does something clever to avoid copies ... and so is probably more likely to screw up and use/deallocate the wrong thing. As for leaking memory, that's pretty much built into the design with a single block of memory model, unless you want to keep calling realloc() to shrink the buffer (then you just blow the fragmentation of malloc to hell -- which if you're lucky will just make things slower).
The "idea" of the article wasn't bad, it's just a very bad description of the proposed solution ... and a couple of years too late.
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
While I agree the IBM "research" article is terrible, the idea behind it isn't.
Actually having donetests and benchmarks. I can safely say:
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
I would suggest Ada + scheme/common lisp. I also think plain C + assembly would be very sensible for when one is dealing directly with bits, bytes, and addresses as it simplifies some of the overriding which would be needed in Ada.
I don't think of C++ as strongly typed. typedefs don't enforce, e.g., different meanings of ints (so number of apples vs. number of oranges can't really be distinguished without creating multiple classes). Even enums aren't distinguishable from ints!
Also, Ada already has the packaging you mentioned. It also does support interoperability with C, as do most scheme/lisps.
Great minds think alike; fools seldom differ.
I've heard that for years, it's yet to be true for the general case ... and most people doing manual allocation don't call malloc/free for every single grow/shrink operation. The caching slab allocator was put into solaris a long time ago now.
... why is that?
Caching slab, lookaside lists, whatever, you're still calling some kind of allocator with some kind of cost. There are definitely applications where memory allocation/deallocation follows a pattern that allows that kind of optimization to work well. But here is the crux of my argument: 1. the general purpose allocators definitely do have a non-zero cost; 2. special purpose allocators can be significantly faster, but they aren't general purpose, so they don't work in all situations, and they often require significant extra work to make them work properly; 3. special-purpose allocators usually work by speculatively pre-allocating chunks from the general purpose allocator, and this wastes memory; 4. even when you are using special-purpose allocators as much as possible, you still have to spend a lot of time in memory management -- the special-purpose allocator isn't free, and every once in a while it still has to call out to the general purpose allocator, which is expensive. So I stand by the original premise: after all is said and done, the time spent in memory management for a GC-based program isn't much (if any) greater than the time spent in memory management for an equivalient non-GC-based program. When the total cost of the allocations/deallocations is compared with the cost of a similar set of GC-alloc + collections, the difference for most big applications isn't really significant.
The keyword being all, yes you can sometimes just completely forget about designing how you use memory and everything "works". However sometimes those bits of memory have other resources associated with them.
Yep. And now you don't have to bother with those either, since the GC can collect them during finalization of their associated bits of memory. The point here is that while you still might have to write a destructor of some sort, it is usually much simpler (no memory to worry about) and you don't have to figure out when that resource is no longer in use -- the GC will figure it out for you. If the resource must be freed deterministically, GC doesn't prevent you from doing that. But in the cases where it doesn't, GC still works great. In other words: don't refuse the solution just because it only makes 90% of your problems go away -- you're still better off, aren't you?
You seem to be under the impression that running the GC isn't going to blow your cache
If the garbage collector itself causes additional page faults, then yes, it isn't an advantage. But the argument was that when compared with page faults (something we've all been willing to accept, something that hasn't caused an unacceptable performance hit), GC is small potatoes. In addition, modern GC implementations take the amount of memory available on the machine into account, and they actively work to keep memory usage under the level that would start causing page faults. The garbage collector can spring into action when a request is made that might result in a page fault, and the page fault only actually happens if there really aren't any objects to release. The GC can dynamically determine which objects are unreachable, while the non-GC program must rely on a static analysis that often means objects live longer than is truly necessary (the GC can free something anytime after the last reference is gone, while a non-GC program has to keep the object until it comes to a point in the control flow that is unreachable until the object is no longer needed). While the dynamic analysis has a runtime cost (the static analysis has no runtime cost), the additional information available at runtime can enable much more efficient memory management, which in turn can reduce page faults, which can off
Time flies like an arrow. Fruit flies like a banana.
Yes, C++ has a host of problems and Strostrup and the C++ committee refuse to fix them. But the STL is a huge improvement on malloc/free. (They still can't get auto_ptr right, though.)
If you think that was the point, then you completely missed out. I guess that is to be expected from somebody that went to UT.