C Coding Tip - Self-Manage Memory Alllocation

Um by be-fan · 2004-01-07 13:17 · Score: 3, Interesting

--
A deep unwavering belief is a sure sign you're missing something...

Re:Um by Screaming+Lunatic · 2004-01-07 14:04 · Score: 4, Interesting

Which means you don't have predictable destruction. Which means you don't have destructors. Which means you can't use idioms like resource-allocation-is-initialization.
So in the presence of exceptions, you won't leak memory on the heap. But you will leak mutexes, file handles, etc. You need another idiom to handle those cases.
In the .NET world, C# introduced synchronization blocks to handle the leaking mutex problem. But it is a pain in Managed C++ and VB.NET.
Garbage collection is not the be all and end all.
If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level langauge that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.
In other words, take C++ and add the concept of components/packages. Take Python and add the features (such as generics) that are missing from C++. And then define an interface between components written in both langauges.
Currently Boost.Python and SWIG exist. But I wish that they would just work automagically, everytime I typed make at the command line or build in VC++.
Re:Um by scrytch · 2004-01-07 14:04 · Score: 5, Insightful

> garbage collection stops system response while it's cleaning up

And malloc is of course free, right? ("well no wally, they're opposites" ... yeah yeah you get my drift)

Good gc's operate incrementally. Good gc's let you turn gc on and off at will and disable it altogether for designated arenas. Good gc's can run in a separate thread on another CPU, whereas malloc/free cannot.

The reason java's gc goes wiggy is not because the gc is bad (it's just not very tunable except on solaris), it's because it allocates new objects all over the place (and is happily helped at it by the standard libraries). If you go hog wild with resource consumption, yes you're going to pay for it later.

For the 99.99% of programs that do NOT need hard realtime, you're better off with gc. Cripes, it's like saying homes shouldn't have thermostats because a home thermostat isn't suitable for a reactor sensor.

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:Um by be-fan · 2004-01-07 14:08 · Score: 5, Insightful

If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level langauge that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.
----------
You just described Scheme/CL/Dylan.

--
A deep unwavering belief is a sure sign you're missing something...
Re:Um by Tom7 · 2004-01-07 16:17 · Score: 4, Informative

Of course, there are also hard real-time garbage collectors (ie Cheng's), though I don't think you'll find them in general-purpose production compilers. However, you will find good garbage collectors in a number of real production compilers (say, in mlton). It's definitely worth benchmarking.
Re:Um by E_elven · 2004-01-07 17:45 · Score: 2, Funny

> Which means you don't have predictable destruction. Which means you don't have destructors.

My GC does something like this:

1) Allocate memory, reference count 1 ...
N-1) When ref count reaches 0, we call destroy() on the object ...
N) Free the memory

--
Marxist evolution is just N generations away!
Re:Um by cookd · 2004-01-07 17:51 · Score: 5, Insightful
It's even better than that. In many cases, GC can be almost as fast as malloc/free. Looking at the way things are going, I am pretty sure that in the near future GC-based code will be faster, not slower, than equivalent malloc/free code:
- A compacting GC heap can make allocation into a REALLY fast operation (unless it triggers a collection), and there is no time spent deallocating. On normal heaps, malloc/new can take quite a lot of work (thread synchronization, best-fit free list searching, etc.). If you amortize the time the GC spends in collection over the allocations, the average allocation isn't that much slower than the corresponding malloc/free. Best of all, the gap is shrinking. Soon, GC may be FASTER overall than malloc/free in many real-world situations. It obviously depends on memory usage patterns and collection strategies, but it is starting to happen.
- If you're using GC, your program doesn't have to do all of the bookkeeping anymore. The amount of time spent in resource tracking in big programs is fairly significant -- where the same object is shared by many parts of the program, figuring out when the item can be released is nontrivial. Under a GC, this is handled by the collector. In some cases, the GC can handle the bookkeeping more efficiently than would have been feasible with manual bookkeeping.
- As CPUs get faster and storage (memory and disk) gets (relatively) slower, it becomes more and more attractive to spend extra CPU cycles to try to make better use of cache and memory. An L1 cache miss costs around 4-10 cycles; an L2 miss can cost 100-400 cycles; a page fault costs millions. The CPU time spent in garbage collection can become insignificant when compared with storage access time. Do you take the cache line size into consideration when you call malloc? I sure don't. But compacting GC implementations are starting to take things like that into consideration when they collect, and they rearrange the memory of the process to maximize cache hits and minimize memory waste. They can help your program make more efficient use of the cache, and perhaps reduce the working set. Just a 5% reduction in page faults would more than offset all but the most CPU-intensive garbage collectors.
- GCs usually collect on a separate thread. That means that with a properly designed collector, while your program is blocked on IO or waiting for user input, the GC might be cleaning up the heap on a low priority thread. With luck, your main thread might NEVER actually be interrupted for a collection -- all of the collection can be done while your work threads were waiting for something else. Your memory management tax was paid in the background, while the machine was otherwise idle. On the other hand, with malloc and free, the memory management tax is paid up front by the currently executing thread. While this isn't always the case (i.e. CPU-bound processes), most apps are NOT CPU-bound, so this is a likely scenario in the future (if not the present).
While I don't think GC is quite to the point where it is free or beneficial to the performance of the average application, it is a lot less harmful than most people think. Given that it simplifies the code and eliminates a lot of bugs (usually more than it introduces), it is definitely worthwhile in almost all new application code (kernel-mode code isn't quite there yet, but it's coming), with only a small performance penalty. And I suspect that it won't be too long before it starts to be more of a speed booster, not a perf hit.

I think this is just another step in the process of handing another menial task over to the CPU. We moved from binary to assembly, assembly to low-level languages, low-level languages to higher-level languages, etc. At each step, the new method had a performance penalty at first, then as the new method matured, it turned out to actually be faster than the old method it replaced, while dramatically increasing programmer productivity (i.e. modern optimizers can usually do a better job than an assembly language programmer; often C++ code is faster than the equivalent C code since the compiler has more information to work with and the programmer can make use of more effective techniques like templates).
--
Time flies like an arrow. Fruit flies like a banana.
Re:Um by egomaniac · 2004-01-07 18:33 · Score: 2, Interesting

Which means you don't have predictable destruction. Which means you don't have destructors. Which means you can't use idioms like resource-allocation-is-initialization.

So in the presence of exceptions, you won't leak memory on the heap. But you will leak mutexes, file handles, etc. You need another idiom to handle those cases.

Java is probably the most widely-used garbage-collected language in existence. I think I speak for all Java programmers when I say "WTF are you talking about?"

It is true that you can't guarantee timely execution of finalizers. Because of that, Java is no different than any other language in that you should close files when you're done with them. However, if you forget to close a file here and there, it's generally no big deal because they will be closed during finalization. It only becomes a problem when you're leaking tons of file handles very quickly.

Now, this article is about C, so let's compare the two.

Java: failing to close a file == usually no problem whatsoever

C: failing to close a file == permanently leaked handle

I'd take the garbage-collected case any day of the week, personally.

As far as the other case you mention, mutexes, goes, Java has two means of providing mutual exclusion. The "synchronized" keyword, which is almost always the way things are done, works properly in the face of exceptions, with no extra work by the user. The wait and notify methods, which are the more advanced case, need to be balanced, so you can end up with deadlocks if you fail to do a notify for some reason (such as an exception).

Well, that's what the "finally" clause is for. Yes, you need to be careful to do things the right way, but that's true of any language.

I freely admit that Java is not the be-all and end-all of languages. There may be ways of handling these situations that guarantee these problems can never occur. But you make it sound as if garbage collection is a step backwards from malloc/free, and it most certainly is not. Everywhere you can get into trouble with a garbage collected language, you can get into ten times as much trouble with a manual-free language.

--
ZFS: because love is never having to say fsck
Re:Um by Anonymous Coward · 2004-01-08 00:00 · Score: 2, Informative

Often C++ code is faster than the equivalent C code since the compiler has more information to work with and the programmer can make use of more effective techniques like templates.

Forget C++! High-level modern multi-paradigm languages like Common Lisp and OCaml, which can do most things C++ can do and a lot of things it can't, often produce code as fast as the average C implementation of any given algorithm, despite their relying heavily on GC. And while it's generally possible to tweak the C version to be faster, bear in mind that the C already takes longer to write and debug, even without allowing for the time-consuming profile/tweak/rebuild cycle that real optimisation requires.

Personally I've already relegated C to the position that assembly used to hold - I drop into C to write very tight loops, to perform low-level bit twiddling, and to interface with other languages. I wouldn't really consider using it to write an entire application, and the time I save by not having to manage resources myself is one of the major factors that rule it out.
Re:Um by Mikkeles · 2004-01-08 02:22 · Score: 2, Insightful

If I ruled the world, I would create a multi-paradigm (object-oriented, generic, functional, and modular support) strongly-typed low-level language that let you program at a high-level. A second high-level langauge that was loosely-typed, garbage collected, and could be interpreted or natively compiled. Then I would define a standard to interface the two languages.

In other words, take C++ and add the concept of components/packages. Take Python and add the features (such as generics) that are missing from C++. And then define an interface between components written in both langauges.

I would suggest Ada + scheme/common lisp. I also think plain C + assembly would be very sensible for when one is dealing directly with bits, bytes, and addresses as it simplifies some of the overriding which would be needed in Ada.

I don't think of C++ as strongly typed. typedefs don't enforce, e.g., different meanings of ints (so number of apples vs. number of oranges can't really be distinguished without creating multiple classes). Even enums aren't distinguishable from ints!
Also, Ada already has the packaging you mentioned. It also does support interoperability with C, as do most scheme/lisps.

--
Great minds think alike; fools seldom differ.
Re:Um by Curien · 2004-01-08 04:01 · Score: 4, Interesting

Java is probably the most widely-used garbage-collected language in existence. I think I speak for all Java programmers when I say "WTF are you talking about?"

He's talking about the unpredictability of resource release using GC. If unpredictability isn't a problem, fine. If you need to synchronize your resources carefully (which is what a mutex is /all about/), then you've got a problem.

Now, this article is about C, so let's compare the two.

The post you're responding to wasn't about C: it was about a weakness of GC compared to, say, RAII (which is the idiom C++, among others, uses). But just for fun, let's go on to see how little you know about C.

Java: failing to close a file == usually no problem whatsoever

Unless you need to open the file again before the garbage collector decides to reclaim the handle.

C: failing to close a file == permanently leaked handle

Bzzt... wrong. All file handles are released upon program termination. I like how you used '==' to try impress people with your programming skillz, though.

As far as the other case you mention, mutexes, goes, Java has two means of providing mutual exclusion. The "synchronized" keyword

Wait, wait... you're saying... hold on a second, now... that Java uses a different idiom to handle mutexes? That's exactly what the parent post said it would have to do... because GC isn't as useful as RAII when it comes to general resource allocation (not just memory).

But you make it sound as if garbage collection is a step backwards from malloc/free

He made no such comparison. He compared it (unfavorably) to RAII.

--
It's always a long day... 86400 doesn't fit into a short.
Re:Um by Nevyn · 2004-01-08 09:25 · Score: 2, Interesting

If you amortize the time the GC spends in collection over the allocations, the average allocation isn't that much slower than the corresponding malloc/free. Best of all, the gap is shrinking. Soon, GC may be FASTER overall than malloc/free in many real-world situations. It obviously depends on memory usage patterns and collection strategies, but it is starting to happen.

I've heard that for years, it's yet to be true for the general case ... and most people doing manual allocation don't call malloc/free for every single grow/shrink operation. The caching slab allocator was put into solaris a long time ago now.

If you're using GC, your program doesn't have to do all of the bookkeeping anymore.

The keyword being all, yes you can sometimes just completely forget about designing how you use memory and everything "works". However sometimes those bits of memory have other resources associated with them.

An L1 cache miss costs around 4-10 cycles; an L2 miss can cost 100-400 cycles; a page fault costs millions. The CPU time spent in garbage collection can become insignificant when compared with storage access time.

You seem to be under the impression that running the GC isn't going to blow your cache ... why is that?
But compacting GC implementations are starting to take things like that into consideration when they collect, and they rearrange the memory of the process to maximize cache hits and minimize memory waste.

And the GC knows when to move memory, and when it's just wasting CPU/cache ... how? It's fairly simple to have double pointers (both win32 and MacOS9 use HANDLE types a lot IIRC). The big problem is working out when you can/should move them.

GCs usually collect on a separate thread. That means that with a properly designed collector, while your program is blocked on IO or waiting for user input, the GC might be cleaning up the heap on a low priority thread. With luck, your main thread might NEVER actually be interrupted for a collection

If you actually want threads ... which many of us don't. It also seems to think that cross CPU cache invalidates are free (accessing the same data from more than one CPU -- this can have a massive cycle cost), and that the GC does some kind of magic locking which will never affect the application.

It's also amusing that, by definition, it means that as your application does more work (and hence needs more CPU) it is guaranteed to not get it ... because you will then suddenly start fighting with the GC thread.

While I don't think GC is quite to the point where it is free or beneficial to the performance of the average application, it is a lot less harmful than most people think.

It's my experience that the people who think GC is expensive generally know what they are talking about, and are in the minority. Most people either don't need to care (GC is good enough for them -- fair enough, I've written things where I used perl and didn't need to know how it allocated), or just don't care.

There are other ways of managing memory than just giving up and saying well the GC will save me. Vstr, which is a decent implementation of what the IBM article was trying to say removes almost all the allocation/deallocation headache when dealing with byte data ... but it's still controllable and predictable.

--
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
Re:Um by moocat2 · 2004-01-08 13:56 · Score: 2, Interesting

Yes, the article is about C, but the poster was talking about predictable destruction, so he was not referring to C when dissing GC.

My guess is that he or she was referring to C++ using the RAII (resource acquisition is initialization) paradigm. Every resource is wrapped in a object and all objects are put on the stack. Voila, no resource leaks.

Also, I think your example is far from complete. While failing to close a file handle may be "no problem whatsoever" for the application itself, but can be a problem in the larger context. Some platforms will not allow a file to be deleted when it is open. If a programmer forgets to close a file handle, the user may not be able to delete the file until some time in the future. Far from an ideal situation.

And consider a different example. Think about trying to create a different synch primitive such as a reader-writer lock. Under a GC system, the programmer is still required to release the lock. If he doesn't, the system could essentially hang.

As for the finally clause, we all know that guaranteeing that there is no path that allows leaks is just plain damn hard to do. If finally was such a great answer, why even both with GC, just free data in the finally clause.
Re:Um by cookd · 2004-01-08 16:00 · Score: 2, Insightful

I've heard that for years, it's yet to be true for the general case ... and most people doing manual allocation don't call malloc/free for every single grow/shrink operation. The caching slab allocator was put into solaris a long time ago now.

Caching slab, lookaside lists, whatever, you're still calling some kind of allocator with some kind of cost. There are definitely applications where memory allocation/deallocation follows a pattern that allows that kind of optimization to work well. But here is the crux of my argument: 1. the general purpose allocators definitely do have a non-zero cost; 2. special purpose allocators can be significantly faster, but they aren't general purpose, so they don't work in all situations, and they often require significant extra work to make them work properly; 3. special-purpose allocators usually work by speculatively pre-allocating chunks from the general purpose allocator, and this wastes memory; 4. even when you are using special-purpose allocators as much as possible, you still have to spend a lot of time in memory management -- the special-purpose allocator isn't free, and every once in a while it still has to call out to the general purpose allocator, which is expensive. So I stand by the original premise: after all is said and done, the time spent in memory management for a GC-based program isn't much (if any) greater than the time spent in memory management for an equivalient non-GC-based program. When the total cost of the allocations/deallocations is compared with the cost of a similar set of GC-alloc + collections, the difference for most big applications isn't really significant.

The keyword being all, yes you can sometimes just completely forget about designing how you use memory and everything "works". However sometimes those bits of memory have other resources associated with them.

Yep. And now you don't have to bother with those either, since the GC can collect them during finalization of their associated bits of memory. The point here is that while you still might have to write a destructor of some sort, it is usually much simpler (no memory to worry about) and you don't have to figure out when that resource is no longer in use -- the GC will figure it out for you. If the resource must be freed deterministically, GC doesn't prevent you from doing that. But in the cases where it doesn't, GC still works great. In other words: don't refuse the solution just because it only makes 90% of your problems go away -- you're still better off, aren't you?

You seem to be under the impression that running the GC isn't going to blow your cache ... why is that?

If the garbage collector itself causes additional page faults, then yes, it isn't an advantage. But the argument was that when compared with page faults (something we've all been willing to accept, something that hasn't caused an unacceptable performance hit), GC is small potatoes. In addition, modern GC implementations take the amount of memory available on the machine into account, and they actively work to keep memory usage under the level that would start causing page faults. The garbage collector can spring into action when a request is made that might result in a page fault, and the page fault only actually happens if there really aren't any objects to release. The GC can dynamically determine which objects are unreachable, while the non-GC program must rely on a static analysis that often means objects live longer than is truly necessary (the GC can free something anytime after the last reference is gone, while a non-GC program has to keep the object until it comes to a point in the control flow that is unreachable until the object is no longer needed). While the dynamic analysis has a runtime cost (the static analysis has no runtime cost), the additional information available at runtime can enable much more efficient memory management, which in turn can reduce page faults, which can off

--
Time flies like an arrow. Fruit flies like a banana.
Re:Um by Whip-hero · 2004-01-09 04:33 · Score: 2, Informative

So in the presence of exceptions, you won't leak memory on the heap. But you will leak mutexes, file handles, etc. You need another idiom to handle those cases.

In the .NET world, C# introduced synchronization blocks to handle the leaking mutex problem. But it is a pain in Managed C++ and VB.NET.

File/mutex/whatever leakage in the presence of exceptions? Two words:

unwind-protect

--
--WH--

Gee, isn't that handy by c · 2004-01-07 13:34 · Score: 4, Interesting

Anyone who's done C coding for more than, oh, a day would have already figured that out. It's not a coincidence that every programming language that doesn't have "smart" arrays built into the language ends up with some sort of buffer class (Java's ByteStream class, C++'s stream IO buffers, etc).

The fundamental problem is that this sort of thing needs to be done at the C library level. And if it's not done in a flexible fashion, you end up with a library call that rarely gets used. Anyone used hsearch() lately?

If only clib streams (FILE* and friends) were extensible, this article would never have had to be written.

c.

--
Log in or piss off.

Please by YellowElectricRat · 2004-01-07 14:04 · Score: 4, Funny

This article, I believe, has already been published in the well known programmers' journal "No shit Sherlock - monthly"

Memory Allocation by rmohr02 · 2004-01-07 14:16 · Score: 4, Funny

Just like slashdot allocated extra space for the third "l" in "alllocation".

Hmmm by JMZero · 2004-01-07 14:29 · Score: 4, Interesting

Many small programs are no longer memory, or even performance, constrained. As such, a reasonable strategy for a lot of desktop software is to allocate a huge buffer at startup, and do repetitive flushes and complete reloads of data (always using the same pre-allocated buffers).

This is simple to do, and avoids a lot of errors. It's also not much of a headline.

--
Let's not stir that bag of worms...

Re:Hmmm by schmaltz · 2004-01-07 16:39 · Score: 2, Insightful

reasonable strategy: allocate a huge buffer ... do repetitive flushes and complete reloads of data ...

Um, this is reasonable how? If you're coding on a 3GHz dream machine w/4GB 400mhz RAM, there will be somebody out there who, quite reasonably, will want to run your code on a 64MB Pentium 133... Testing only on unencumbered machines makes for delusional developers. ;)

I'm split: either yer trollin' /., or you gotta tell us what OSS projects you've contributed to, with the above philosophy!

--
Big Daddy, Johnny, Burp, Aunt Zelda, Scott, Slurp, Big Momma ... where's Siggy?
Re:Hmmm by Nevyn · 2004-01-07 19:25 · Score: 2, Insightful
This is simple to do, and avoids a lot of errors. It's also not much of a headline.

While I agree the IBM "research" article is terrible, the idea behind it isn't.

Actually having donetests and benchmarks. I can safely say:
- It's not the simplest solution.
- It's certainly not anywhere near fast.
--
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B

Wow! by arkanes · 2004-01-07 14:59 · Score: 4, Funny

Man, I never thought of that. An abstract memory buffer. What a concept! I don't need to define the lengths of everything at compile time then!

Now, I'll need a nice short catchy name for it... oh! I know! I'll call it a heap!

Useful by dtfinch · 2004-01-07 15:20 · Score: 2, Interesting

But this is like teaching calculus students remedial math. The "Level: Intermediate" at the top of the article should have given that away.

Re:realloc by Permission+Denied · 2004-01-07 16:20 · Score: 4, Interesting

Regarding realloc:

When you call realloc, you're very likely to cause the data to be copied from the old buffer to the new buffer. This is very high overhead. The article discusses how to do similar things, but without this unecessary copying (eg, low overhead). It's actually not that interesting of an article as what it describes is hardly new and I believe any competent programmer could come up with that solution when faced with the particular circumstances that inspired it.

Realloc works by seeing if there is free memory after the end of the allocated block, and changing the block's size if so. Realloc can do this because it knows about the internals of the malloc/free implementation. If there is allocated memory right after the block in question, a new block must be allocated, as you cannot "move" the later block in a language like C where any memory location can be a pointer. You could try this kind of stuff in other languages (or in some bastardized C where you do not have direct access to memory, but go through more indirection, the next logical abstraction after the article), but when you start automatically finding/checking/updating memory pointers, you get into GC.

You may be able to overcome some overhead on realloc if you move the problem down into the kernel. The kernel could play page table games so there is little or no actual copying involved, just updating of page tables. This would be fairly easy to implement, but I don't think anyone's done it because (a) flushing the relevant TLB entries could hurt performance more than the copying, and (b) the system call overhead might be more overhead than the copying. Realloc is generally only used for small buffers (due to programmers knowing about the copying overhead) and this trick would only have gains for large buffers spanning multiple pages. For small buffers, the library-level realloc could avoid the system call and do the copying itself, avoiding system call overhead and TLB entry flushes.

This scheme I describe could make for an interesting paper (especially determining for what size of buffer and what type of program it has gains), but I doubt it would make much difference in real system performance as programmers avoid realloc for large buffers, and there are very few cases where one needs direct linear access to a large range of memory rather than being better-served by organizing that memory into some data structure.

A memory leak IN THE SPECIFICATION?!? by MarkusQ · 2004-01-07 16:58 · Score: 4, Funny

From the article:

The pLostBlock points to the linked list's last memory block.

pLostBlock?!? This almost sounds as if it's designed to leak!

-- MarkusQ

P.S. Seriously, I think this is a fine idea, if not particularly earth shaking. But the typo was too ironic not to point out.

Vstr by Nevyn · 2004-01-07 18:11 · Score: 5, Informative

The article basically proposes a very bad implementation of Vstr, most of the advise was extremly simplified at best but more likely just uninformed: an "efficient" abstract buffer that mixes shorts and pointers -- words almost fail me, how to solve the problem of "what do you do with the data when it's all in the buffer" -- "let's just copy it back out again (hey whats a couple of extra copies between friends). Representing in memory object sizes with "long int" *sigh*.

If you are interested in the article, go read this explanation of why you want it for security and this explanation of why you want it for speed .

Vstr is LGPL, has actual benchmark data behind the block sizes it picks, has an extensive test suite ... and has documentation for the many functions that come with the library (including a fully compliant printf like function). Of course, I don't have a PhD ... but after reading this, you might well count that as a plus too

--
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B

Re:realloc by Nevyn · 2004-01-07 19:14 · Score: 2, Insightful

But linked lists are worse on performance than realloc.

Care to back that up with some facts.

Also linked lists are very prone to leaking and very hard to figure out which one to free first.

So you a) have the library code do that ... and b) have lots of tests. Of course you want to do both of those for something using a single block of memory too, and if you want it to be efficient it usualy does something clever to avoid copies ... and so is probably more likely to screw up and use/deallocate the wrong thing. As for leaking memory, that's pretty much built into the design with a single block of memory model, unless you want to keep calling realloc() to shrink the buffer (then you just blow the fragmentation of malloc to hell -- which if you're lucky will just make things slower).

The "idea" of the article wasn't bad, it's just a very bad description of the proposed solution ... and a couple of years too late.

--
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B

Not News by SkewlD00d · 2004-01-08 01:28 · Score: 3, Informative

Embedded systems do this, they have a pool of Buffers and a BufferManager that allows you to do effectively your own memory mangement (and in some cases, static memory management). malloc() and free() are usually really slow, so if you can save 99% of those calls by reusing memory blocks, you can really speed up your programs.

--
The biggest trick the devil pulled was letting lawyers become politicians so they can write the laws.

Re:What a crap story... by Anonymous Coward · 2004-01-08 15:30 · Score: 2, Funny

How about this: Solve all of your memory management problems by switching to visual basic! All memory management is done automagically. No need to even think about it! Just hook up your data bound controls and write your logic code. No more memory worries :)

Re:realloc by Nevyn · 2004-01-08 17:24 · Score: 2, Informative

Linked lists require links -- this puts pressure on the L1 cache.

I said facts, not theories. Yes, I know all about cache and his friend random access. Maybe you'll take a noticable cache hit when moving between nodes ... and yes, certainly doing memcpy() over a single large block X will be faster than doing it X/20 times for 20 byte nodes (20 was the figure given in the IBM "research" article). But that doesn't take into account the time taken to call realloc() each time to expand the large block ... or the amount of space wasted if you have a large amount of data followed by mostly small amounts of data ... or the sharing that is possible in a node based system (thus almost eliminating the cost of copying).

And I did provide the simple benchmarks (which showed that with sane node sizes the vector approach is better than realloc() + memcpy()), and code for somewhat more complication cases that try to model simple network IO applications (which show the single block of memory is only competitive when you use more memory and add complication to the algorithum to reduce copying ... or use lots more memory).

But feel free to prove me wrong, I'd certainly study anything you came up with. And yes, I'm taking the csv code into account ... on normal input the node based code was within 50% IIRC, and on "difficult" code it was over 100% faster (again IIRC) ... and your code was a lot more hand optomized.

--
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B

So use C++ already by Animats · 2004-01-08 20:14 · Score: 3, Insightful

Watching C programmers create home-brew objects is pathetic. If you need to encapsulate data structures, use C++.

Yes, C++ has a host of problems and Strostrup and the C++ committee refuse to fix them. But the STL is a huge improvement on malloc/free. (They still can't get auto_ptr right, though.)

Slashdot Mirror

C Coding Tip - Self-Manage Memory Alllocation

31 of 142 comments (clear)