The New Garbage Man
We've all heard of "garbage collection" in reference to cleaning up after yourself after a memory allocation. Two graduate students at the Illinois Institute of Technology are proposing a new method of garbage collection and memory allocation called DMM. What they are trying to do is "map basic software algorithms such as malloc(), free(), and realloc() into hardware." Sounds as if it has some promise, but can it ever be as flexible as some of the memory managers used today?
If they were to do this, it would probably increase performance, but only marginally. At least in Linux programs, when I've run profiles, most of the cycles were taken up by console I/O: printf and the like.
It would also require a modification to the kernel - so that the brk() system call would simply call the hardware variants.
I think that it probably *would* remove flexibility, but at the same time I think that's a good thing. Part of the problem with memory leaks and stuff is the flexibility...
Then again, I might just be talking out of a bodily orifice.
If you can't figure out how to mail me, don't.
For linux tips: http://www.linuxtipsblog.com
Cripes, you people are slow today. Did you notice that NOT ONE of the early-posters correctly anticipated their position?
Weak files that can't survive a crash should be abandoned and deleted. You heard it here first!
What's the point of trying to speed up memory allocations? Why not just make the hardware faster?
I'm not giving everyone the right to go out and write inefficient code. It just seems that this research will not get very far. By the time they have something working, memory will be faster than their hardware solution. Sure, you could then use their hardware to make things even faster, but only if they get the cost down very low.
Bjarne looked at some specialized machines. They just weren't cost justified. I would have to think that is the case here.
I'd rather see research into fabricating memory that ran at core processor speeds. That would speed up every memory access, not just the malloc()s and free()s.
--
then it comes to be that the soothing light at the end of your tunnel is just a freight train coming your way
After working on a project that was rewriting a heap allocator (I just wrote the test program, but I did read up on memory allocators quite a bit), I realized exactly how complicated memory allocators can be and, more importantly, how little the community knows about the optimal way to do memory allocation so that both the time spent and fragmentation are minimized. Given this uncertainty it is probably best to keep the allocator in software so that it can be modified.
As I was frying up some Crepes, Blini, Okonomiyaki, Palachinta, Latkes, Pfannkuchen, and Flapjacks for the Annual "Pale Hand of Discomfort" Ninja powwow, I paused to reflect on "The New Garbage Man."
Indeed, "The New Garbage Man" is truly one of the world's most excellent events between the time it occurred and shortly thereafter.
As a NINJA of the secretmost Corduroy Chrysanthemum rank, I command all subordinate Ninjas (yes, I mean YOU) to work for "The New Garbage Man" using any and all methods available (and you KNOW what I mean).
After all, if "The New Garbage Man" has its way, we will soon all be free. Our capitalist, pancake-loving Ninja way of life will be enhanced immeasurably. We will finally have enough hot grits to pour down each others' pants, much less our very own? Our bound and petrified Natalie Portman meditation shrines shall be more glorious than ever!
Remember, fellow Ninjas, grip your shurikens between second and fourth fingers and keep your swords no less than a centimeter out of their sheaths.
We must be prepared to assist "The New Garbage Man."
Why is having this 'software flexibility' important? I mean, most of the programmers I know just use the OS's memory management stuff (and, if they really want to, they can even do their own memory management). So it's not like the application programmers really lose any of their flexibility anyway. It's kind of like 3D hardware: some of the really crazy demoscene demos might not be possible with hardware acceleration, but anyone just using a standard API being rendered in hardware would notice an enormous improvement in speed.
But it seems like, unlike games and demos, most programmers are using "APIs" for their memory management. So I don't think switching to hardware would do anything but speed things up. It could also allow for system types that we haven't seen before, with much, much more use of dynamic memory.
[ c h a d   o k e r e ]
ReadThe ReflectionEngine, a cyberpunk style n
Transparent garbage collection has always been a good thing...it's just been a slow good thing. Hopefully, hardware support for garbage collection would simplify matters in the way that hardware threads could simplify the problems of multithreaded code.
(Of course, both problems become easier if one programs in an appropriately intelligent functional language, not C -- a language that facilitates integrating such things in a transparent way -- but that's most definitely another rant...)
When I hear questions like "but will it be flexible?" I can't help thinking that it sounds a little like whining from folks comfortable with their current programming model. Considering the number of problems that memory management gets programmers into, you'd think that any move towards a fast, automatic system (think George W. -- Java on cocaine) would be welcomed...
The whole idea behind RISC is that the hardware provides FAST primitives that the upper levels can use to create more complex structures. Putting malloc, free, etc. into hardware would just make things more complicated and maybe faster in the VERY short run, but would surely not be as good as just making a DAMN fast memory subsystem or an awesome processor.
Thoughts on tech, Software Engineering, and stuff
"The Java virtual machine does not assume any particular implementation technology, host hardware, or host operating system. It is not inherently interpreted, but can just as well be implemented by compiling its instruction set to that of a silicon CPU. It may also be implemented in microcode or directly in silicon." from Chapter 1, the Introduction.
Since the JVM is responsible for garbage collection, this implies that the concept of implementing memory management in the hardware is at least 4 years old.
I find it hard to believe no one's thought of this idea before for C, C++ (and so on) either.
I wouldn't knock it till they try it or, at least, someone publishes some calculations that prove or disprove the possibility of major speed-ups.
"Sometimes the light's all shinin' on me, other times I can barely see."--R. Hunter
Although a better method would be nice, why did they have to go and use FrontPage 2000? The Bugged1
LET THE HATE WITHIN YOU TAKE CONTROL!
UNLEASH THE POWER OF THE DARKSIDE ON THIS POST!
FEEL THE POWER OF NEGATIVE MODERATION!
YOU WILL GET A RUSH OF POWER!
FEEL THE HATRED, GIVE ME YOUR POINTS!
POINTKILLA!
Wow, and you have such wonderful grammar yourself, I can see how bad grammar can be annoying to you. And FrontPage, well, a lot of Windows users use that (don't ask me why...); maybe they just don't care about art? FrontPage comes with Office, and FrontPage Express comes with Windows.
As for the grammar, maybe they're all Chinese or something.
JWZ's thoughts on garbage collections.
How is this different from what Lisp machines (Symbolics and LMI) did 15-20 years ago?
FrontPage's lack of proper software garbage collection is the perfect environment to test hardware garbage collection in.
-Joe
this seems as though it would be another one of those innovations that causes problems. why? it seems as though it will make it so that programmers no longer have to code correctly, and we already have enough of this garbage (pun) floating around. it's time to stop making it so easy for programmers not to have to code properly; myself included. i WANT to have to worry about malloc() calls etc. 'nuff said
............ no.
Your german is terrible.
My German is good!!!
Your English is not good!!
Me = good!!
You = not good!!!
Cuba = good!!
American = not good!!!
Slashdot = good!!!!
microsoft.com = not good!!!
My German = good!!!
Your English = not good!!!
Communism = good!!!!
Capitalism = not good!!!
NATALIE PORTMAN = VERY GOOD!!!
BILL GATES = NOT SO GOOD!!!
How good is my German??? I would like feedback!!!
Thanks!!!
I don't think this would make much difference speed-wise of course, but it could be helpful against memory leaks.
There's no reason for a sig here.
Garbage collection actually takes very few resources. The advantages of having hardware algorithms perform this action do not seem worth it. First, memory prices are extremely high, and adding this technology will only increase them. Second, there is only a slight gain in performance. Third, memory is tricky: it can become corrupted rather easily, which could cause complications. Remember that it is the OS that is responsible for managing resources.
A lot of the posters here don't seem to get it. Two things to keep in mind:
.apparently the man can do no wrong, if you ask him...
1) Garbage collection is good for two reasons:
i) You simplify your interfaces. W/o garbage collection, your interfaces need to be designed to keep track of what has or hasn't been allocated. GC makes programming simpler. If you don't use GC, you essentially are "reference counting", which is one of the worst performing GC algorithms, in fact.
ii) You'll have fewer memory leaks. Memory leaks are difficult, often impossible, to detect in large applications. W/o GC, the amount of memory leaked is unbounded in the program size. With GC, you can formally prove that memory leaks will remain bounded.
So with all these good things going for GC, why is it not commonly used?
1) It is often slower.
2) It's not a standard part of C/C++. (But it is in Perl, Java and just about every other "modern" language).
The hardware knows when a reference/pointer is used. Software often doesn't and has to rely on techniques like reference counting or mark&sweep. This would be a *lot* easier if a few simple hardware hacks were thrown in (not too many transistors). Remember, b/c of Moore's Law, CPU makers have more transistors than they know what to do with. It's kind of like "test and set bit"... implementing this in hardware is waaay easier than implementing locks in software.
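To make the "test and set bit" comparison concrete, here is a minimal sketch of a lock built on a single atomic test-and-set operation, using the standard std::atomic_flag. The SpinLock class and the spinlock_demo() helper are hypothetical names made up for this illustration; they are not from the DMM project.

```cpp
#include <atomic>

// Minimal spin lock built on one hardware-backed primitive: test-and-set.
class SpinLock {
public:
    // test_and_set() atomically sets the flag and returns its previous value.
    // If the previous value was "clear", the lock was free and is now ours.
    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }

    // Busy-wait until the flag is observed clear and set by us.
    void lock() {
        while (!try_lock()) {
            // spin
        }
    }

    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical single-threaded walk-through of the lock's states.
bool spinlock_demo() {
    SpinLock lock;
    bool first = lock.try_lock();   // lock is free, so this succeeds
    bool second = lock.try_lock();  // lock is already held, so this fails
    lock.unlock();
    bool third = lock.try_lock();   // free again, so this succeeds
    lock.unlock();
    return first && !second && third;
}
```

Without an atomic instruction underneath, the same guarantee needs a much more involved software protocol, which seems to be the point of the analogy.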
As for #2, well talk to Bjarne about that..
-Amit Dubey
Go ahead, call it flamebait, but it's true. I can only assume that the apparent obsession with garbage collection by some people is a result of their inability to manage memory on their own, which is a consequence of their incompetence. Memory managers and garbage collection add overhead. You don't need this overhead if you release the memory yourself when you don't need it. If you don't release the memory when you don't need it, then you're writing bad software.
"Two graduate students at the Illinois Institute of Technology are proposing a new method of garbage collection and memory allocation called DMM." NO, they are proposing building new hardware to optimize DMM (dynamic memory management). News for geeks means news that is technically accurate, even when talking about fairly silly research projects.
If this is implemented then only C languages would benefit from it: OSes like Linux, possibly Windows. Non-C languages (VB, Pascal, etc.) wouldn't benefit because they couldn't use it effectively.
Instead of moving a few C functions to hardware I'm with most people here who think that we should just create a faster memory subsystem.
-- iCEBaLM
People complain about FrontPage and grammar? That's hardly a valid reason. Having met two of these individuals (Chang and Srisa-an), I can say that I do not believe either one to be a native English speaker. Both are, however, very knowledgeable about computer architecture and hardware design.
I'm not saying that this is a good/bad idea, or that it will/won't improve performance. I don't know. I do know that the two men I've met are two of the very few people in IIT's CS department who might come up with something usable, regardless of their lack of English and HTML skills.
The English I saw browsing the pages was really not bad, and even if the page was done in FrontPage it's still better than the main IIT site (take a look if you don't believe me). Many successful open-source software projects involve people who do not use flawless English, but that does not affect the quality of the work.
I guess the PC just had to be "PC", eh?
OK, flame me, but you *nix programmers really ought to free your own damn memory when you're done with it instead of just leaving the OS to clean up your mess for you when you terminate. I've had to port a number of *nix programs to DOS/Windows/MacOS and get pissed no end at the laziness of the unix-centric mind. You have a free() function. USE IT!
Obliterate all forms of electronic devices and go back to paper and pencil. I am evil and judgemental, and I make no pretense toward fairness. This is my OPINION, and by that you may just learn a little bit more about me than you really want to know.
1) This *IS* a hardware solution - read the article before replying.
2) This doesn't counter RISC philosophy (ie many RISC chips have test&set bit instructions)
3) GC *IS* good. GC is for incompetent programmers just like "high-level" languages such as C were for incompetent programmers in the 1970s and 80s.
Why would you need hardware-based malloc() and free() functions with the ongoing development of faster processors? In a year we will have faster processors, which will eliminate the need for faster memory access. The faster the CPU, the faster the memory.
Wrong. Faster CPU= Faster CPU. The memory will create a bottleneck. Memory needs big improvement.
They are going to insert malloc/free into hardware, that doesn't mean they will build garbage collection into the hardware (which is a completely different thing). However, as the project description says, it could improve garbage collection performance, but that is another issue.
DMM (Dynamic Memory Management) is a well known term, not something new as the slashdot article tries to make it sound like.
Building DMM into hardware is not a new idea. However, I'm not really sure if it has been successfully implemented in any "real" CPUs.
People would love to write programs for such systems using a high-level language like Java or C++. The main issue is that real-time systems must be able to deterministically predict how long any operation will take. Problem is, with current implementations of runtimes, the garbage collector could kick in at any time and take some arbitrary amount of time to finish, totally messing things up.
Let's say you've got a software-controlled robot arm, putting together a car or something (cheesy example). A start command is executed, and the arm begins moving. Then, right before the stop command is supposed to be executed, in comes your garbage collector. It does its thing and finishes, and the stop command executes - 50 milliseconds too late, putting the arm out of position. Oops - I hope you weren't planning on selling that car.
But if you can deterministically predict when and how long all of the operations will take, voila, you can just take them into account when scheduling what happens and everything is all hunky-dory. The hardware part of it is a bit of a distraction I think - the deterministic part is what's cool.
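One software way to get the kind of predictable timing described above is a fixed-size block pool, where every allocate and release is a constant number of pointer operations. This is only a sketch under that assumption: FixedPool and pool_demo() are hypothetical names, and it is not the design the DMM project proposes.

```cpp
#include <cstddef>
#include <new>

// Fixed-size block pool. allocate() and release() never search or loop,
// so their worst-case cost is known in advance.
template <std::size_t BlockSize, std::size_t BlockCount>
class FixedPool {
    struct Node { Node* next; };  // free blocks double as free-list links
    static_assert(BlockSize >= sizeof(Node), "block must hold a free-list link");
    static_assert(BlockSize % alignof(Node) == 0, "blocks must stay aligned");

public:
    FixedPool() {
        // Thread every block onto the free list once, up front.
        for (std::size_t i = 0; i < BlockCount; ++i)
            release(storage_ + i * BlockSize);
    }

    // Pops the head of the free list; returns nullptr when the pool is empty.
    void* allocate() {
        if (!free_list_) return nullptr;
        Node* n = free_list_;
        free_list_ = n->next;
        return n;
    }

    // Pushes a block (previously returned by allocate()) back on the free list.
    void release(void* p) { free_list_ = new (p) Node{free_list_}; }

private:
    alignas(std::max_align_t) unsigned char storage_[BlockSize * BlockCount];
    Node* free_list_ = nullptr;
};

// Hypothetical walk-through: exhaust a two-block pool, then reuse a block.
bool pool_demo() {
    FixedPool<32, 2> pool;
    void* a = pool.allocate();
    void* b = pool.allocate();
    void* c = pool.allocate();  // pool is exhausted here
    pool.release(a);
    void* d = pool.allocate();  // gets a's block back
    return a && b && a != b && c == nullptr && d == a;
}
```

The trade-off is flexibility: every block has the same size, so this only fits workloads where that restriction is acceptable.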
Maybe interesting in this context, a language called 'C--' was developed last year to serve as a portable assembly language with garbage collection. It was intended mostly for compilers of functional programming languages (many fp compilers generate regular C code). Here's a link (yes, at microsoft!)
On the flip side, Windows should release the memory of a process after termination. Memory leaks should not persist past the termination of a process.
And your god, BG, should take that advice too, I've yet to see ANY program except for netscape with more memory leaks than Windows.
uh. you obviously have no idea what you're talking about.
Sigh. Have you ever written a garbage collector? Think about it- the people who cared enough about garbage collection to implement it in Lisp, Java, Python, etc. did. I've written one too, and I promise, you MUST have an intimate knowledge of memory to get one working. It's not incompetent programmers who write these things.
So why did those non-incompetent programmers who know all about managing memory on their own bother with writing these tools for the weak-minded? Mainly, it's that good programmers understand the principles of program design. One important design principle is that when you want to know what your program is doing at the top level, you shouldn't have to care what is really going on at the lower levels. That principle is the reason we have operating systems and compilers, for example. All of those things allow the programmer not to have to worry about what goes on under the hood, not because the programmer is stupid, but because the programmer's time is better spent solving the problem than dealing over and over again with low-level details that have nothing to do with the program at hand as opposed to any other program in the world.
Garbage-collected languages do the same thing- you trade a little speed (the idea that garbage collectors must be slower than hand allocation and deallocation actually turns out not to be true, incidentally, but in practice it often is) for the ability not to have to think about a ridiculously machine-level idea (id est, 'in which cell of memory am I going to write down this particular chunk of data?') so your code can just concisely reflect what your program actually does- quicker to write, easier to read, better all the way around. A smidgen slower, but who cares? You can certainly write programs of acceptable speed in garbage-collected languages, so it's no big deal unless you're writing something really speed critical, like tight graphics rendering loops or something, in which case us softheaded garbage-collection-needing incompetent programmers just use inline assembly. =)
-jacob
You really need to shut up. Just because someone forgets to call free() doesn't mean the machine is going to decrease in speed, or crash, or do anything noticeable to the human eye.
Don't they see the future is in software? Did Transmeta teach us nothing?
What is the point of getting a slight speed increase when you lose all modularity and freedom?
Polymorphic Software is the key, it is on Alpha Centauri!
A perfect match.
Depends if (a) your basic profession is programming rather than needing to write programs to solve problems and (b) you're writing highly fluid programs with very complicated data structures (needed to solve real problems) so that when suddenly the entire basis of the program changes it's incredibly difficult to figure out which memory freeing times have changed and where the safest place to deallocate them is. (And when I talk about complicated data structures, I mean complicated, not the namby-pamby things you get in programs dealing with pure computer problems.) If so then not having to do something that's as error prone as ensuring correct new/delete behaviour is a bonus that lets you solve your problems quicker. Saying that not wanting to waste mental effort on something irrelevant to the task in hand makes me incompetent is a bit like saying that, because I've only passed a conventional driving test, I'm incompetent because I can't drive my car on two wheels, leap over trucks and do other car stunts. Sure, such technical skill is achievable and is needed by, e.g., stuntmen, but since I only drive to get to and from work, it's not a relevant skill. Bottom line: anything that gets me the answers I want from my programs with less programming effort is a good thing.
None of those machines have caught on. The reason is that generally, you seem to do better if you invest the same kinds of resources into just making your processor faster.
Simply implementing existing malloc/free algorithms in hardware is unlikely to help. Those are complex algorithms with complex data structures, and they are probably memory limited. Where hardware could help is by changing the representation of pointers themselves: adding tag bits that are invisible to other instructions (allowing pointers and non-pointers to be distinguished, as well as space for mark bits), and possibly representing pointers as base+offset or some other kind of descriptor that makes it easier for the allocator to move memory around without the C/C++ program knowing about. As a side benefit, this kind of approach could also make C/C++ programs much safer and help run languages like Java faster.
But altogether, after having seen several attempts at this, I'm not hopeful. Unless Intel or Motorola do this in a mainstream, mass market processor, it will be very expensive and not perform as well. If you want those features, you end up getting better performance at a lower price by simply buying a mainstream processor without the features and simulating what you need.
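As a rough illustration of "simulating what you need" on a mainstream processor, tag bits can be approximated in software by stealing the low bit of a machine word. The sketch below assumes every pointed-to object is at least 2-byte aligned (so bit 0 of a real pointer is always 0) and that integers stored this way are non-negative and fit in one bit less than a word. Word, make_int, make_ptr and the other helpers are hypothetical names for this example only.

```cpp
#include <cstdint>

// One machine word that holds either a small integer or a pointer.
// Bit 0 is the tag: 1 means integer, 0 means pointer.
struct Word {
    std::uintptr_t bits;
};

// Store a small non-negative integer: shift it up and set the tag bit.
inline Word make_int(std::intptr_t v) {
    return Word{(static_cast<std::uintptr_t>(v) << 1) | 1u};
}

// Store a pointer unchanged; alignment guarantees its low bit is 0.
inline Word make_ptr(void* p) {
    return Word{reinterpret_cast<std::uintptr_t>(p)};
}

inline bool is_pointer(Word w) { return (w.bits & 1u) == 0; }

inline std::intptr_t as_int(Word w) {
    return static_cast<std::intptr_t>(w.bits >> 1);
}

inline void* as_ptr(Word w) { return reinterpret_cast<void*>(w.bits); }

// Hypothetical check: a collector could tell the two kinds of word apart.
bool tag_demo() {
    long cell = 0;
    Word a = make_int(21);
    Word b = make_ptr(&cell);
    return !is_pointer(a) && as_int(a) == 21 &&
           is_pointer(b) && as_ptr(b) == &cell;
}
```

The cost is that every integer operation has to shift and mask, which is exactly the overhead that hardware tag bits invisible to other instructions would remove.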
If they're planning on doing hardware-level memory management, they're probably going to be improving the control algorithms for the TAG and/or CAM memories.
Your basic cache/main-memory hit, miss, refresh, and stale address mapping functions, but with a more flexible and comprehensive interface.
--The more you know, the less you know.
Argh! Moving stuff to hardware, when we have long struggled to get rid of esoteric hardware instructions?!? The whole idea behind RISC is to remove this kind of stupidness.
History is full of examples where this kind of hardware "feature" made things faster originally, but has since become a bottleneck instead of an improvement. "rep movsb" made sense on the 8086, but on a Pentium Pro it's just another thing which makes the chip more complicated, bigger, and more power-consuming.
And as for garbage collection, those of you who say that "GC is slow" are just plain wrong. There is a lot of research on garbage collection, and nowadays a good GC may be faster than manual malloc/freeing. It does not lock up your programs either - there are incremental GC's which you won't notice. They are even used in real-time environments today.
A major hindrance for good GC's in commonplace programs is that people usually write in C(++), and you can't have a good GC in a programming language which allows pointers. Just to give an example: I once saw a program which used a doubly-linked list. But instead of storing "next" and "prev" pointers, it only used one value - the XOR of prev and next! It was possible to traverse, since you always knew the address of either the previous or the next entry, so getting the other was just an XOR operation. But please tell me, how the h* is a GC algorithm supposed to know this? Even if you use the loosest approach, "treat every single word in memory as a possible pointer", it will fail - since there are no real pointers at all!
So the result is this: There are lots of good GC's around, but they can't be used in C - so people don't know that they exist. More or less.
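For readers who have not seen the XOR trick mentioned above, here is a small sketch of it. Each node stores one word, the address of the previous node XORed with the address of the next, so no field ever contains a real pointer for a collector to find. XorNode, link_nodes and sum_from are hypothetical names for this illustration, not the program the poster saw.

```cpp
#include <cstdint>

struct XorNode {
    int value;
    std::uintptr_t link;  // (address of prev) XOR (address of next)
};

inline std::uintptr_t addr(const XorNode* n) {
    return reinterpret_cast<std::uintptr_t>(n);
}

// Links an array of nodes into one list; the ends use a null neighbour.
void link_nodes(XorNode* nodes, int count) {
    for (int i = 0; i < count; ++i) {
        const XorNode* prev = (i > 0) ? &nodes[i - 1] : nullptr;
        const XorNode* next = (i + 1 < count) ? &nodes[i + 1] : nullptr;
        nodes[i].link = addr(prev) ^ addr(next);
    }
}

// Walks the list from either end and sums the values. Knowing where we
// came from, one XOR recovers where to go next.
int sum_from(const XorNode* end) {
    int total = 0;
    const XorNode* prev = nullptr;
    const XorNode* cur = end;
    while (cur) {
        total += cur->value;
        const XorNode* next =
            reinterpret_cast<const XorNode*>(cur->link ^ addr(prev));
        prev = cur;
        cur = next;
    }
    return total;
}

// Hypothetical check: the same list is traversable from head and from tail.
bool xor_list_demo() {
    XorNode nodes[3] = {{1, 0}, {2, 0}, {3, 0}};
    link_nodes(nodes, 3);
    return sum_from(&nodes[0]) == 6 && sum_from(&nodes[2]) == 6;
}
```

Scanning these nodes for pointer-like words finds nothing useful, which is why even a conservative collector cannot trace such a list.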
He mentioned he was porting to windows. In my experience, the windows kernel is much worse at handling things like that than the unix kernel is. Sometimes you have to reboot to free the memory.
No. Memory is limited by latency. It doesn't matter if the CPU is 100 GHz; if it has to wait on the memory, the whole system will slow down.
It's like if you're in a car, you can go 150, the speed limit is 150, but the guy in front of you is going 50. The speed of your car doesn't matter, and the speed limit doesn't matter either. You're going 50.
Kindly hold back your flames till you know what you're talking about. It makes you look stupid.
How is my German not good??? Be specific!! And I will make it good!!!
I am new to German. But I am learning!!
Thanks.
If you had:
struct LinkedList {
    LinkedList *next;
};
LinkedList *a, *b;
a = new LinkedList;
b = new LinkedList;
a->next = b;
b->next = a;
and then never used a or b again, a reference-counting GC wouldn't be able to collect them: both have a non-zero reference count!
Considering that circular linked lists *do* happen, this is a BAD thing.
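The same failure can be reproduced with the reference-counted std::shared_ptr from the C++ standard library. In this sketch, RcNode and count_after_handles_dropped are hypothetical names; a std::weak_ptr is used only to observe the count without owning the node, and the helper breaks the cycle by hand at the end so the demonstration itself does not leak.

```cpp
#include <memory>

struct RcNode {
    std::shared_ptr<RcNode> next;  // owning, reference-counted link
};

// Builds two nodes, optionally links them in a cycle, drops both named
// handles, and reports the reference count left on the first node.
long count_after_handles_dropped(bool make_cycle) {
    std::weak_ptr<RcNode> watch;  // observes the first node without owning it
    {
        auto a = std::make_shared<RcNode>();
        auto b = std::make_shared<RcNode>();
        a->next = b;
        if (make_cycle) b->next = a;
        watch = a;
    }  // a and b go out of scope here
    long remaining = watch.use_count();
    // Clean-up for the demo only: break the cycle so the nodes are released.
    if (auto first = watch.lock()) first->next.reset();
    return remaining;
}
```

With make_cycle set to false the count drops to 0 and both nodes are destroyed as soon as the handles go away. With the cycle, the first node is still held by the second, so its count stays at 1 even though nothing in the program can reach it by name.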
Specialized GC hardware is an old idea--is this something new, or just a rehashing of something that's been done?
I disagree with you that writing a GC has to be difficult. I've written several, and all of them have been a page or less of code. Here's the GC from my minimal Lisp interpreter, which I call Stutter:
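void garbage_collect(void)
{
    CELL *cell;
    int i, count = 0;

    /* Mark phase: flag everything reachable from the roots. */
    mark(binding_list);
    for (i = 0; i < protect_used; i++)
        mark(protect_table[i]);

    /* Sweep phase: push every unmarked cell onto the free list. */
    for (cell = heap, i = 0; i < heap_size; cell++, i++) {
        if (!cell_mark(cell)) {
            cell_car(cell) = free_list;
            free_list = cell;
            count++;
        }
        cell_mark(cell) = 0;  /* clear the mark for the next collection */
    }
}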
Surely, this is not rocket science.
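The snippet above leaves out mark() and the heap definition, so here is a self-contained sketch of the same mark-and-sweep idea. It is not the Stutter source: Cell, kHeapSize, collect() and mark_sweep_demo() are names assumed for the example, and unlike the original it rebuilds the free list from scratch on every collection.

```cpp
#include <cstddef>

struct Cell {
    Cell* car = nullptr;
    Cell* cdr = nullptr;
    bool marked = false;
};

const std::size_t kHeapSize = 8;  // hypothetical tiny heap
static Cell heap[kHeapSize];
static Cell* free_list = nullptr;

// Mark phase: flag every cell reachable from c. Recurses on car and
// loops on cdr; already-marked cells stop the walk, so cycles terminate.
void mark(Cell* c) {
    while (c && !c->marked) {
        c->marked = true;
        mark(c->car);
        c = c->cdr;
    }
}

// Sweep phase: every unmarked cell goes onto the free list (linked
// through car), and all marks are cleared for the next collection.
std::size_t collect(Cell* root) {
    mark(root);
    std::size_t freed = 0;
    free_list = nullptr;
    for (std::size_t i = 0; i < kHeapSize; ++i) {
        Cell* c = &heap[i];
        if (!c->marked) {
            c->car = free_list;
            c->cdr = nullptr;
            free_list = c;
            ++freed;
        }
        c->marked = false;
    }
    return freed;
}

// Hypothetical scenario: cells 0..2 form a chain from the root, while
// cells 3 and 4 point at each other but are unreachable.
std::size_t mark_sweep_demo() {
    heap[0].cdr = &heap[1];
    heap[1].cdr = &heap[2];
    heap[3].cdr = &heap[4];
    heap[4].cdr = &heap[3];
    return collect(&heap[0]);
}
```

In this setup the collector reclaims cells 3 through 7, including the unreachable two-cell cycle that a pure reference-counting scheme would keep alive.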
-- GWF
The Computational Beauty of Nature
I like that!
Grammar is for anal-retentive scholars and smartypants!! I am not a NERD, I don't like grammar! As long as communication is not hindered, grammar doesn't matter!!
Cocoa (OpenStep) solves this pretty easily. Anytime an element is added to a data structure, it is retained. When the structure is released, everything inside it is released as well. If that drops the reference count of the object to 0, it is freed. Cocoa also has an autorelease pool that is emptied between cycles of the event loop, so you can keep memory around for temporary operations.
I understand that this is working at a higher level than a garbage collector would work at, but it works very well.
uh. not really. In fact, you've got it all wrong. Better CPU's *do* increase SRDIMM memory, especially 100Mhz dimm/srdimm/sdram. The same ram in a slower computer would be much slower, due to the frequency that the cpu voltage is able to travel at (in between the memory, cache, and cpu.) It's not that hard to understand; think about it before you post false information.
It's not a matter of "better or worse", but of "different". If you're porting to Windows 3.1 or Windows 9x in 16-bit mode, then yes, you can corrupt the global heap by failing to release memory after a process terminates. That's because all global memory in those models is shared among all processes, and so the system can't clean up without possibly destabilizing some other process. Bizarre as it sounds, that's not a bug, it's a feature. It's one of the ways that Windows 3.1 and 9x preserved backwards compatibility with DOS, in which all processes had relatively direct access to the physical memory.
If you're porting to Windows 9x in 32-bit mode, or if you're porting to Windows NT, then, no, processes clean up after themselves when they shut down. That's because Win32 uses an explicit shared memory model, with shared memory retained against the file system. Thus, if two processes share memory, and the memory that one of them holds is released by process termination, memory held by the other process doesn't get corrupted.
First of all, thanks for pointing out that hack for implementing a doubly-linked list with the XOR of prev and next. I love interesting tricks like that.
In Garbage Collection, Jones and Lins point out that even if a programmer stays away from code like the hack in your example, an optimizing compiler can still make life difficult for a garbage collector. They give the following example (pg. 247):
GC for procedural code is optional. A good idea, IMHO, but still optional. If worst comes to worst, you can always "bubble up" responsibility for deallocating the memory to the block which encompasses the entire lifetime of the memory. For example, if you had:
void * mem_p;
void foo(void) {
    mem_p = malloc(somesize);
    /* use mem_p */
}
void bar(void) {
    /* does mem_p need to be allocated? */
    /* use mem_p */
    /* does mem_p need to be freed? */
}
void bang(void) {
    foo();
    bar();
}
We can refactor this to:
void foo (void * mem_p) {
    ...
}
void bar (void * mem_p) {
    ...
    /* we don't need to free mem_p here */
    ...
}
void bang (void) {
    void * mem_p = malloc(somesize);
    foo(mem_p);
    bar(mem_p);
    free(mem_p); /* bang() owns mem_p, so it releases it */
}
We can do this because the lifetime of the body of bang() completely encompasses the lifetime of mem_p, which neither foo() nor bar() does.
Unfortunately, OO programming makes GC significantly less optional. And exceptions make GC no longer optional. Consider the following C++ code:
{
    someclass * ptr1 = new someclass();
    someclass * ptr2 = new someclass();
}
Can you spot the bug? If the program runs out of memory trying to allocate the second object, new throws an exception. But since the exception isn't immediately caught, the stack is simply popped, and the memory pointed to by ptr1 is leaked.
The solution, as any C++ programmer will tell you, is to use smart classes and smart pointers, which implement GC. I.e., if the language doesn't have GC, you have to add it.
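A sketch of the smart-pointer fix, using std::unique_ptr from the standard library. Real out-of-memory conditions are hard to trigger on demand, so this example assumes a stand-in: a hypothetical Widget class whose constructor throws when the fail_next flag is set. Widget, live_widgets, fail_next and widgets_alive_after_failure are all made-up names for the illustration.

```cpp
#include <memory>
#include <stdexcept>

static int live_widgets = 0;    // hypothetical counter of constructed objects
static bool fail_next = false;  // when true, the next constructor throws

struct Widget {
    Widget() {
        if (fail_next) throw std::runtime_error("simulated allocation failure");
        ++live_widgets;
    }
    ~Widget() { --live_widgets; }
};

// Constructs one object, fails on the second, and reports how many
// objects are still alive after the exception has propagated.
int widgets_alive_after_failure() {
    int before = live_widgets;
    try {
        auto p1 = std::make_unique<Widget>();
        fail_next = true;
        auto p2 = std::make_unique<Widget>();  // throws before p2 exists
        (void)p2;
    } catch (const std::runtime_error&) {
        // Stack unwinding has already run p1's destructor by this point.
    }
    fail_next = false;
    return live_widgets - before;
}
```

With raw pointers in place of p1 and p2, the first Widget would stay allocated after the throw; here the unique_ptr destructor releases it during unwinding.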
There are other reasons to use GC: by using GC you can more effectively divorce the implementation of a class from its interface. In a GC system you don't have to export information about the memory management aspects of the implementation. This is especially important if you have two different classes implementing the same interface, but with different memory management requirements.
Which is faster, GC or manual allocation, often depends upon both the program in question, and the GC algorithm implemented. There is some data indicating that copying garbage collection is the fastest in complex situations- what you lose copying the data you win back in better cache and virtual memory utilization.
Implementing malloc() and functions like that doesn't seem to be such a good idea.
Reference counting is the more basic solution to the same problem.
> If you don't use GC, you essentially are
> "reference counting", which is one of the worst
> performing GC algorithms, in fact.
Maybe so, but wouldn't it be a lot smarter to implement a couple of instructions that would speed up reference counting considerably, instead of implementing a couple of high-level functions?
Only to a certain point. Once the cpu speed overreaches the latency of the memory, the memory becomes the road block.
It introduces "wait states", which is basically a clock cycle where the cpu isn't doing anything but waiting for the memory to respond. These add up.
I work for a large software company. One of my major jobs is training new developers in "best practices". What you've just said is one of the classic things that people in academic circles believe...but which is utterly and unforgivably wrong. I spend a lot of time teaching them that they do care very much about how the system works, and that they care a lot about how the memory allocator in their system works.
You see, most modern commercial software is well-designed. But efficient design will only take you so far. One of the key performance bottlenecks on a modern computer is simple and straightforward: page faulting. The basic joke we tell is that doubling the speed of your processor gets you to your next page fault twice as fast...and it takes as long with a newer machine as it did before. This means that the cheapest way to get a performance improvement is to avoid page faults -- and you do that by avoiding heap allocation at all costs. If you need to do a malloc, do it in two stages: allocate a small buffer on the stack, test the space needed, and use that buffer if it's big enough. It's on the stack, so it won't get page faulted out. (But always check that it is big enough...there's this thing called a buffer overflow condition that you risk there.) Only fall back on the heap when you have to. Allocate several contiguous pages of memory to handle list nodes, and use them. Only go outside that block when you must. Etc.
And this is for any program. Do you allow the user to undo things he's done? Then you maintain an undo stack. Allocate nodes for it efficiently. Are you doing searches in dynamic lists? Allocate your tree efficiently. It's not enough to know that you need to use a B-tree or an AVL-tree to keep your data around; you must also make sure that you keep stuff compact. Do you write in C++? The x86 code to support C++-style exceptions (or Java-style exceptions) is fantastically expensive, since the x86 doesn't support dynamic context unwinding in hardware. Maybe you're better off doing nothing in your constructor, and using an explicit initialization step afterwards. That way, you don't need to handle the throwing of exceptions.
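For the "allocate nodes efficiently" advice, here is a rough sketch in C of a fixed-size node pool carved out of one contiguous block. The names (`NodePool`, `UndoNode`, `pool_alloc` and so on) and the `action` payload are hypothetical; this is one common way to keep list nodes compact, not necessarily what the poster's shop does.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct UndoNode {
    int action;             /* hypothetical payload */
    struct UndoNode *next;  /* doubles as the free-list link */
} UndoNode;

typedef struct NodePool {
    UndoNode *block;      /* one contiguous allocation for all nodes */
    UndoNode *free_list;  /* nodes currently available */
    size_t capacity;
} NodePool;

/* Allocates the block once and threads every node onto the free list.
 * Returns 0 on success, -1 if the allocation fails. */
int pool_init(NodePool *p, size_t capacity)
{
    size_t i;

    p->block = malloc(capacity * sizeof *p->block);
    if (p->block == NULL)
        return -1;
    p->capacity = capacity;
    p->free_list = NULL;
    /* Push in reverse so the first pool_alloc returns block[0]. */
    for (i = capacity; i > 0; i--) {
        p->block[i - 1].next = p->free_list;
        p->free_list = &p->block[i - 1];
    }
    return 0;
}

/* Pops a node in constant time. Returns NULL when the pool is
 * exhausted, which is the caller's cue to go outside the block. */
UndoNode *pool_alloc(NodePool *p)
{
    UndoNode *n = p->free_list;

    if (n != NULL)
        p->free_list = n->next;
    return n;
}

/* Returns a node to the pool; no call into the general allocator. */
void pool_free(NodePool *p, UndoNode *n)
{
    n->next = p->free_list;
    p->free_list = n;
}

void pool_destroy(NodePool *p)
{
    free(p->block);
    p->block = NULL;
    p->free_list = NULL;
    p->capacity = 0;
}
```

Because consecutive nodes sit next to each other in memory, walking the undo stack touches far fewer pages than it would with one malloc per node.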
Don't get me wrong: good design is the most important key to successful coding. A well designed piece of software can easily run ten times faster than a badly designed piece of software that does the same thing. But implementation is also critical, and that often means knowing not just how the algorithm works, but also how much operations cost on the target system.
You know *YOU* want to see it too, so don't be a bloody hypocrite.
Question: is this OpenStep or Objective-C? At any rate, you still aren't saved from circular references.
Let's say you had a function LinkedList *CircularLinkedList() that returns a circular linked list.
c = CircularLinkedList();
OK. The count of the first element in the list is incremented. No problem. You do some processing, then you do
free(c);
the first element in the linked list has its reference count decreased. Its count is now 1. And it will remain 1 even though the guy who is referring to him is unreferenceable!!!
This is a general failure of reference-counting techniques. There are other shortcomings, including extra space to keep track of ref. counts, and keeping track of the counts themselves. Think about it: every time a pointer is assigned, passed to a function, or returned from a function, you need to increment or decrement some counter *somewhere* in memory. And this can't be parallelized like mark&sweep can and run when your main process is waiting - it takes a hit on the main process itself.
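The cycle problem can be reproduced in a few lines of C. Everything below (`Node`, `node_release`, `nodes_left_after_release`, the `live_nodes` counter) is a made-up toy reference counter for illustration, not OpenStep's or anyone else's actual implementation.

```c
#include <stdlib.h>

typedef struct Node {
    int refcount;
    struct Node *next;
} Node;

static int live_nodes = 0;  /* instrumentation for this demo only */

static Node *node_new(void)
{
    Node *n = malloc(sizeof *n);

    if (n != NULL) {
        n->refcount = 1;  /* the creator holds the first reference */
        n->next = NULL;
        live_nodes++;
    }
    return n;
}

static void node_release(Node *n)
{
    if (n == NULL || --n->refcount > 0)
        return;
    Node *next = n->next;
    live_nodes--;
    free(n);
    node_release(next);  /* a dying node drops its reference to next */
}

static void node_set_next(Node *n, Node *next)
{
    Node *old = n->next;

    if (next != NULL)
        next->refcount++;
    n->next = next;
    node_release(old);
}

/* Builds the cycle a -> b -> a, drops both external references, and
 * returns how many nodes were NOT reclaimed (-1 on allocation failure).
 * With break_cycle set, the cycle is cut by hand first. */
int nodes_left_after_release(int break_cycle)
{
    int before = live_nodes;
    Node *a = node_new();
    Node *b = node_new();
    int left;

    if (a == NULL || b == NULL) {
        node_release(a);
        node_release(b);
        return -1;
    }
    node_set_next(a, b);  /* b's count is now 2 */
    node_set_next(b, a);  /* a's count is now 2 */

    if (break_cycle)
        node_set_next(b, NULL);  /* a's count drops back to 1 */

    node_release(b);
    node_release(a);
    left = live_nodes - before;

    if (left > 0) {
        /* Demo-only cleanup through the stale pointers; a real
         * program would have no reference left to do this with. */
        free(a);
        free(b);
        live_nodes -= 2;
    }
    return left;
}
```

Calling `nodes_left_after_release(0)` reports two nodes stuck with a count of 1 each, while `nodes_left_after_release(1)` reports none, which is why reference-counted code ends up breaking cycles by hand.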
It may work "OK", but better methods have long since been developed. I hope that I've explained this well, but if you still think reference counting is good, then obviously I haven't.... because this isn't a contentious point.
I mean complicated, not the namby-pamby things you get in programs dealing with pure computer problems. If so then not having to do something that's as error prone as ensuring correct new/delete behaviour is a bonus that lets you solve your problems quicker.
:)
I think this line of thinking is plain ol wrong. I understand what you're saying, but I think you're giving up tons of performance, especially in cases of, as you said, REALLY complex data. For instance, if you're using something that takes care of all the memory management for you and allows you to concentrate solely on the problem, something like, say, java, you might be able to express your data conversion routine quickly, but oh-my-god is it ever going to suck to actually parse a few gigs of data.
At that point, having efficient memory management AND a tight loop could save you hours or even days of mining time.
Bottom line: anything that gets me the answers I want from my programs with less programming effort is a good thing.
Be sure to put that at the front of your user manual, I'm sure the people waiting on your code to do what THEY want will appreciate your insight.
And, damn is it ever funny that that post got moderated up.
-- blue
i browse at -1 because they're funnier than you are.
for all you people whining about learning from RISC architecture, and complex hardware, realize something. this is two grad students and their advisor working on a project to get their masters degree. how many people are there out there with a masters degree in cs? how many of these people's projects actually turned into something used by a major hardware company? probably not many. but their research can still turn up a lot of useful information. that's why it's research. for example, how many people have read linus' comments about the academic community and micro kernels? he basically slams on people who advocate micro kernel technology, but he turns around and says that most of the optimizations that can be used to make micro kernels faster can also be used to make monolithic kernels (ie linux) faster as well. so maybe you think that the study these guys are doing is useless, but what they learn in this project may turn out to be very useful information. and they may do other things that could be very useful. (for example, i may be doing some work with one of these guys on the implementations of threading and garbage collection in kaffe. maybe that's not exactly related, but i'm sure the studies he wants us to read on garbage collection came from his work on this project)
oh and as far as comments about their html or english, neither of those is particularly relevant to their research. i've had both of these guys as TA's for cs classes here at iit, and i had dr. chang as a teacher. all three of them are not native english speakers and have very heavy accents. but while they are difficult to understand, they are all extremely knowledgeable about hardware and hardware design, and are some of the brightest people i have met in college.
If I don't put anything here, will anyone recognize me anymore?
One of the biggest problems with garbage collection is that it can't really be controlled. If you're writing Quake3, you do not want garbage collection. You don't want the program to start cleaning up memory right when you're trying to generate the next frame...
BUT, if they do this and make it so you can, say, force the garbage collection to do its thing at a given time, maybe that could be do-able... still pretty crappy for high-speed apps though that need consistent speed...
Esperandi
That's a cool trick! Although it probably doesn't make the program especially readable. Does anyone else have any good tricks like this? Preferably relating to garbage collection but offtopic will do
perl -e 'fork||print for split//,"hahahaha"'
Eh? "How much is my German?"
Too much, I think. I've got nothing against posts in German, just not when it's random, offtopic, offensive drivel.
And turn caps lock off, or make friends with tolower() before posting.
perl -e 'fork||print for split//,"hahahaha"'
(Score:-1, Offtopic)
What a *brave* moderator we have here! Thick gloves, I suppose, but no thick condom? Nah, I guess he was just bluffing, he really meant "Troll".
hehe
-
From the site (DMM):
>The memory intensive nature of object-oriented languages such as C++ and Java
From a reply:
>why not make the hardware faster?
Fellow programmers,
am I the only one who still remembers Assembler and the intensive search for memory preserving methods ?
Am I the only one who tries to make things fast, without thinking about the processor ?
In the 70s we struggled for every byte we could spare (yes, and we created the Y2K problem).
"Star programmers" like me even changed the code during runtime, by moving new instruction over old ones.
Yes, it was very hard to read, but it was top efficient.
Fellow programmers,
are you all too young or what happened to the common sense ?
If I have to solve a problem for my daily life with my machines, I FIRST check if a SHELL SCRIPT can do it.
Not, because I'm too lazy to use C, but because it might be faster.
If you run The Shell, there are built-in commands and external commands, some having the very same name.
I.e.: a "test" runs faster than a "command test".
BUT, YOUR:
case "$1" in
hello) COMMAND
esac
RUNS *** NOT *** FASTER
If "test" is built into your shell (and it is, folks) and you write:
if test "$1" = hello
then
COMMAND
fi
Am I the only one who knows that ?
Java needs a lot of memory.
It's hip.
But what is Java REALLY ?
Nothing but an interpreted language that just happens to have support from MS and Netscape (and due to this, now in our Linux kernels).
In the 80s we used (and I still do) REXX.
It's also an interpreted language, and can also be compiled to object code.
It runs on MVS, VM, Unix, DOS, MS Windows.....
At that time, there were just no browsers (besides Mosaic).
It can do everything Java can,
the only reason why you guys and girls use Java is that Big Companies ship their interpreters with their browsers
and it looks as if Java runs on its own, like magic.
The only thing that runs on its own is Assembler.
Now I don't want to say you should use Assembler,
but I think I need to remind us all,
that hardware becomes faster and faster,
and that because of this, programmers get lazy and code stuff that runs on state-of-the-art hardware,
but would run 4 times as fast, if these programmers would first think about RESOURCES.
A good program will run on a 486 at a reasonable speed and on a PIII like "real time".
I want programmers to THINK about how they code.
It IS possible to write applications that do not need extensive memory and a fast CPU (or two),
IF the programmer would first THINK about how to write something and optimize the code himself, not only with a standard optimizer (if he uses that at all).
Read "The Unix Programming Environment" by
Brian W. Kernighan and Rob Pike.
After that, your programs will run 4 times faster.
Replies greatly appreciated,
fine day y'all, george./
When all of academia thought RISC, microkernels, functional languages, and Java were the greatest things since canned beer. And guess what? They weren't. All academic research must be taken with a huge grain of salt. This includes GC.
For example, what can GC do if the programmer has a circular chain of objects which all reference each other but aren't referenced by anything else? Alternatively, what if a programmer creates a whole bunch of objects in the beginning of a program (for example in the first line of the main() function). If the programmer doesn't use these objects anymore after line 10 of main() will the GC clean them up at line 11? It seems like the objects are still in scope so the GC couldn't delete them. However they are never used again so they sit around wasting memory. Using malloc and free you could malloc a ton of objects in line 1 of main() and free them in line 11 of main().
Are there GC algorithms which can be proved to handle pathological cases such as this? You might say that this is an irrelevant question because nothing can save you from stupid/malicious programmers. I would agree. However, GC is often touted as protecting programs from stupid programmers or programming mistakes. Yet if you can't prove that GC always works then it seems like GC just makes memory management bugs less common and much more subtle. After many years of C/C++ programming, I'd rather have many simple memory leaks which I can track down with various tools as opposed to rare but impossible to understand bugs due to problems with the GC.
I'm not trying to put down GC. It's just that when I've used various GC languages such as Tcl or Lisp I've encountered things that seem a lot like memory leaks. Is this because of buggy implementations of GC, bad programs which prevent the GC from working or because of fundamental limitations of GC?
By the way, I've heard that in languages like Java you need to set objects to NULL after you are done using them to be assured that the GC will clean them up. I haven't actually used Java, but it seems like that would defeat the whole purpose of garbage collection. After all, if I always have to remember to set objects to NULL then I might as well just always remember to call delete! Please tell me that the person I heard this from is wrong and Java isn't that lame.
Thanks.
This is good stuff,
Whatever they can accomplish with hardware would take that load off the processor. There is a ceiling to processor power. We may be far from it now but when we do reach that point in time when processors just can't get any faster this will be invaluable technology. It is also good to look at graphics cards. 3D graphics consumes an inordinate amount of processing power. Because such a large percentage of power is taken, it is wise to have a graphics card. Software memory management may go the way of software 3D rendering and free our processors for other tasks. DVD is bogging your CPU? Get a hardware Renderer. Print management? Encryption? Compression protocols? All can be done in hardware. Think: If your modem card had a compression sub processor, you could use some pretty extreme zero loss compression techniques without adding to latency. If powerful hardware compression were standardized you could effectively reduce bandwidth requirements across the entire network. As long as all these CPU power robbing functions remain software we can expect to need faster and faster processors. As I have said, there is a limit to that.
Don't knee jerk because hardware may cause your favorite C++ functions to become obsolete. Just keep in mind what 3D cards have done and you will understand the direction these amazingly forward thinking boys are taking. Not that faster processors are a bad thing. I just like to get the most out of what I do have.
If voting were effective, it would be illegal by now.
...as a metaphor for software development, however Homer as Java programmer now makes SO MUCH SENSE, I feel bad about my poor perception.
Contrary to the popular belief, there indeed is no God.
Somewhat off-topic. I like to set pointers to NULL after their objects have been freed or are no longer valid. This helps prevent the nasty situation of code referring to an object that is now a chunk of memory on the free list, or worse, allocated to some other use.
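A small sketch of that habit in C. The macro name `FREE_AND_NULL` and the demo function are hypothetical; note that the macro evaluates its argument twice, so it should only be handed a plain pointer variable.

```c
#include <stdlib.h>

/* Frees the block and resets the caller's pointer in one step.
 * The argument is evaluated twice: pass a simple variable only. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer is NULL after being freed,
 * or -1 if the allocation itself failed. */
int pointer_is_null_after_free(void)
{
    int *counts = malloc(4 * sizeof *counts);

    if (counts == NULL)
        return -1;
    counts[0] = 42;

    FREE_AND_NULL(counts);
    FREE_AND_NULL(counts);  /* harmless: free(NULL) is a no-op */

    /* Any later use now has a NULL it can test for, instead of a
     * pointer into memory that is back on the free list. */
    return counts == NULL;
}
```

This only protects the one variable that was reset; any other copies of the pointer still dangle.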
Mea navis aericumbens anguillis abundat
Go ahead, call it flamebait, but it's true. I can only assume that the apparent obsession with spell checkers by some people is a result of their inability to spell properly on their own, which is a consequence of their incompetence. Spelling and grammar checkers add overhead. You don't need this overhead if you write proper english to begin with. If you have spelling or grammar errors, then your writing bad english.
(if you can't find the mistake in the original post, then by the same logic, I guess that makes you incompetent.)
my point? People will make mistakes. If you say you don't, then you are a lying asshead (thanks to another AC for that term.) There are certain tools to help them out a little. In software, garbage collection helps protect against mistakes in memory management. Not to mention that if I can just forget about MM to begin with, that lets me write better software faster.
(I realize the example above doesn't quite fit: the you're/your distinction is a semantic error that a grammar checker may not necessarily detect, depending on the sophistication, but I don't know much about them, whereas like someone else noted, GC is a bit more predictable. At least, a lot more useful for the kinds of errors that people usually make. You get my point.)
you really need to shut up. Just because someone forgets to call free() doesnt mean the machine is going to decrease in speed, or crash, or do anything noticable to the human eye.
Bzzzt. Wrong. The bloated memory caused by leaks forces the kernel to page more often and will have a huge performance impact if enough memory is leaked.
There seem to be a number of misconceptions as to what this is all about or how it might be useful in practice. I will offer some opinions on some of the issues raised in other posts.
The first one, as already mentioned by a number of people, is that hardware implementation of malloc and free has nothing to do with GC. The most difficult part of GC is to determine which part of the memory is garbage (this is not as easy as it may seem) without using lots of memory (after all, garbage collectors are typically only activated when free memory blocks are running low), and for those garbage collectors running as an independent thread, avoid possible race conditions with the foreground processes. Other issues a garbage collector faces include how to reduce the impact on overall system performance, and how to decrease memory fragmentation.
A garbage collector is not something easy to design and implement. Making a good garbage collector especially requires the almost-impossible combination of profound understanding of the memory usage behavior of typical programs, logical reasoning abilities, and coding techniques. In addition to the garbage collector itself, you also need support from the language and compiler side, and you have to integrate all of these into a clean design. That is about as hard as things like this can be.
(Of course, you can also write a conservative pointer finding GC like the libgc used by many functional language compilers --- but that is far from the state of the art in this business.)
The proposed hardware support has nothing to do with the points we mentioned above, therefore has nothing to do with GC. Then, is it possible to build some hardware support for garbage collection? Maybe, but I am not an expert in this field. Whatever the solution turns out to be, it will never be as simple as hardware implementation of malloc and free.
Second, this also has nothing to do with real time systems. Many people seem to think that being "real time" means you have to code everything in assembly language and make things run as fast as possible, but that is simply not true. Being (hard) real time means operations must respond and complete within bounded time; as long as that bound is satisfied, there is no need for the task to be completed "very fast". The trick of building real time systems is in making sure that nothing will delay the operation for a long time, and that requires careful analysis and implementation.
If you remain unconvinced, think of a typical hard real time application: CD-R burning control programs. They must feed a continuous stream of data (say, 600KB/s for 4X burning) into the burner, or the disk will turn into a coaster. Is it necessary to code it in assembly? Absolutely not, because pumping data at 600KB/s is very little work for current architectures with 600MHz processors and 100MHz memory busses. However, does that mean you do not need to pay special attention to make it real time? No, because although the hardware is several orders of magnitude more powerful than it needs to be, there are still possibilities that something will make the program stop working for a second or two and screw the party. It is the responsibility of real time system designers to make sure that it does not happen.
FYI, Win2k has much more aggressive memory management. This is one reason why some apps are breaking in W2k.
There are many issues you should worry about when programming if you want fast code, issues which may be in the implementation details. That's real life.
Maybe a garbage collector could be efficient but current implementations are terribly slow - a "smidgen slower" is an understatement. Specifically, look at JVMs. They use one GC thread, and this is bad when, for example, there is a huge spike in memory allocation: this thread gets behind and memory is exhausted. This is primarily why Java just does not currently scale. C/C++ with no garbage collector will outperform Java any day, esp. when there are spikes in allocations.
I claim that there is no need for a garbage collector. Finding memory leaks quickly is not a problem _if_ you have the right tools. Purify is good but it is too slow. There are tools I have developed which can show with pinpoint accuracy where the leaks are and have no performance penalty.
So if you are able to find leaks quickly, with no performance penalty, then why would you want a garbage collector? I am still unconvinced.
2 years and no mod points. Join reddit. Because openness is good.
And, damn is it ever funny that that post got moderated up.
Moderators, stay off the crack!
Sure, and some approaches to hardware-accelerated reference counting, garbage collection, and a lot more are described in a book with the title, guess... right! "Garbage Collection" by R. Jones and R. Lins, ISBN 0471941484. Highly recommended for deeper understanding. There is also a mailing list on garbage collection at http://www.iecc.com/gclist/
I'd agree if you're doing something like data-mining where your program is essentially static and the volume of data is enormous. But qualification (b) was important for me: your program requirements change drastically as you try and solve your problem (`Maybe the right answer is to move the Kalman filter into a post-processing step rather than in the middle of the algorithm...') on data sets where a typical program run might take two hours but with an absolutely horrendous mapping between physical & geometrical constraints and their expressions in the code. And by the way, I'm the only user of my code: I'm employed to get good algorithms and apply them to real problems and tell them the results; if I was going to be releasing my code to external people then by definition it'd be finished, static code that could be rewritten now that figuring out memory lifetimes is effectively trivial.
Well, correct me if I'm wrong, but I'd say that JWZ has a *lot* of first-hand experience with the *need* for garbage collection. Every version of Mosaic, Netscrape or Mozilla I've ever seen has had horrendous memory-leak problems.
-jcr
"I once saw a program which used a doubly-linked list. But instead of storing "next" and "prev" pointers, it only used one value - the XOR of prev and next!"
I don't see how you can extract either prev or next from the (prev ^ next), unless you have one of them stored elsewhere, in which case you haven't actually saved any storage.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
No, the Windoze crew were copying everything they could from NeXT, including the "recycle" icon.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
If you always chase the list, no problem. If you ever need to point into the list from outside, forget it.
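To make the "always chase the list" point concrete, here is a sketch of an XOR-linked list in C. The names `XNode`, `xlist_link` and `xlist_walk` are invented for this example, and it assumes (as holds on mainstream platforms) that a null pointer converts to the integer 0 and that pointers round-trip through `uintptr_t`.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct XNode {
    int value;
    uintptr_t link;  /* (uintptr_t)prev ^ (uintptr_t)next */
} XNode;

/* Links the n nodes of an array into an XOR list, in array order. */
void xlist_link(XNode *nodes, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        XNode *prev = (i > 0) ? &nodes[i - 1] : NULL;
        XNode *next = (i + 1 < n) ? &nodes[i + 1] : NULL;
        nodes[i].link = (uintptr_t)prev ^ (uintptr_t)next;
    }
}

/* Walks from either end of the list, copying up to max values into
 * out. Returns the number of values written. */
size_t xlist_walk(XNode *end, int *out, size_t max)
{
    XNode *prev = NULL;  /* nothing comes before an end node */
    XNode *cur = end;
    size_t count = 0;

    while (cur != NULL && count < max) {
        /* The walk already knows prev, so XOR-ing it out of the
         * link leaves next: (prev ^ next) ^ prev == next. */
        XNode *next = (XNode *)(cur->link ^ (uintptr_t)prev);

        out[count++] = cur->value;
        prev = cur;
        cur = next;
    }
    return count;
}
```

The neighbour you just came from is the "one of them stored elsewhere", and it lives in a local variable of the walk rather than in every node. A lone pointer to a middle node has no such neighbour, which is why pointing into the list from outside does not work.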
--
Infuriate left and right
> I can only assume that the apparent obsession with spell checkers by some people is a result of their inability to spell properly on their own, which is a consequence of their incompetence.
Bad analogy. Your spell checker only runs when you invoke the spell checker command. In a "true" GC environment the GC is always running.
Uh, so what are those red squiggly underlines I get as I type in Word? No doubt someone will attempt to dismiss this example by slamming MS. But MS didn't actually invent as-you-type spellchecking. It was around for years on some Mac WPs before it showed up in Word.
And besides that, you completely missed the point of the analogy by getting wrapped up in technical details.
Others have pointed out that reference-counting GC's don't work with cyclical data structures (without special logic to handle those), so I won't repeat them.
One thing about ref-count GC's that's frequently overlooked is the performance impact. Consider that every time you manipulate a pointer (copy it, set it to null, etc.) you have to de-reference it and manipulate the object's refcount. Pointer copies are no longer register-to-register copies; they now involve going out to cache, main memory, or in the really pathological cases, swap space on disk.
Summary: ref-count GC's are a bad thing. There are lots of other GC algorithms which provide better results.
(Off-topic, but reason number 452 why linguists shouldn't be allowed to design programming languages without adult supervision: according to the camel book, Perl still uses a ref-count GC.)
Yes, you can always pull off some smart trick in *theory*, but does that really work in practice? Determining automatically which block of memory can be freed at which time will require global flow analysis of the program, and in my opinion that is simply not feasible for imperative languages like C. Even if you could somehow do that (which means you should receive the Alan M. Turing award, say, in the next ten years or so), current separate-compilation standards (object file formats) will not be able to support this kind of operation. The bottom line? This will not work in practice. Maybe not even in theory.
Computer scientists all over the world are now having great fun solving a relatively easy problem: to perform global flow analysis and find anything interesting from programs written in a pure lazy functional language (like Haskell). Even seemingly trivial properties like "is the use of arrays serialized in this program?" will take too much computational power to determine so that the algorithm turns out to be useless in practice (such a property would allow destructive updates for arrays, a big performance win).
Do not underestimate the difficulty of program analysis. You may think it is easy, but hell, you make mistakes all the time, so that does not count.
Well, more or less. As part of a programming languages class I took in school, I actually had to write a garbage collector for a Scheme-like language. While one could implement references in a Java/Scheme/LISP-like language in the above fashion, that's really slow; it adds another layer of indirection that isn't really necessary. If the run-time system provides information about the types of objects in memory (which is required for Java and Scheme anyway -- instanceof and symbol?), then the GC can reach in and directly change the pointer values within the program's stack or whatever.
From the point of view of the applications programmer, though, the two implementations are equivalent.
As far as I'm concerned, the primary difference between pointers and references is that with a reference to an object of type Foo, you are guaranteed that the object referred to is a valid instance of type Foo. This is not the case with pointers. In C/C++ terms, int i; float *pF; pF = (float *)(&i); The object at *pF is not a valid instance of a float.
(Yes, you can in fact break this guarantee in C++ using a series of really crufty typecasts. C++'s references are broken, in a number of respects.)
Of course, as MassacrE points out, references are almost always implemented internally in terms of pointers, because that's what the underlying architecture tends to support.
MacOS 9 is pretty darn leaky too.
...and the end of malloc software as well.
Guilty as charged- I am "in academic circles," and I haven't ever had to work in an industry setting on really big projects where speed was very important. But I maintain that what I said was still accurate, while conceding your point: it may be true that with current tools, programmers do sometimes need to think about machine-level details (like whether their code is likely to cause page faults) in their high level designs, but it's also still true that they shouldn't have to. I have mentioned elsewhere in this thread the idea (popular among the faculty here at Rice) that you ought to first write your program in a very high-level language (at Rice, it's MzScheme), figure out where it's important for your code to go fast, and then rewrite those pieces in a low-level language where you can control machine-level details. The claim is that you get a speedy enough program, because you've optimized the hell out of the critical sections, and the stuff that it doesn't matter whether you optimize or not, which is most of your code, was written much more rapidly than would be possible with a lower-level language.
-jacob
No.............. I hate everyone.
Warning: Please reply carefully. Otherwise, you just feed the troll ;)
This is all old news. Bill Bishop, a Ph.D student at the University of Waterloo in Canada has been working on something like this for more than three years. He had malloc, new, free and delete in hardware implemented on a high end Xilinx FPGA a number of years ago... He was my T.A. when I was a 2nd year student.
NT has had numerous issues about system processes and apps leaking memory while running. This is different, though.
If you manage to run an NT box for more than 14 days -- and some people do -- then you might experience a different kind of "leak": The build-up of strange, inexplicable crust-like matter that slowly sticks into the cracks of the system -- a kind of electronic, cholesterol-like kipple. You feel it, but you can't pin it down. Reboot the machine and it starts up fresh and strangely more agile.
Some have pointed out that NT has problems with memory fragmentation. I have not seen any solid evidence of this. Whatever it is, Linux ain't got it.
Why?
I'm learning to speak Welsh here. Welsh on Slashdot? Thank goodness!
South or North Wales?
Speak more slowly.
I have to learn to speak Welsh, because I'm not very good.
...Upgrade now to Schrodingers Dog...
This article is by now so old that no one will read this comment, but what the hell. Karma whore, etc.
Some of the articles that have been posted seem to miss the point. Several people suggested that this design goes against the principles of RISC. I am puzzled. The RISC philosophy is about maximizing efficiency by reducing CPU complexity. But this is memory management research, i.e. it proposes a new MMU design, not a new CPU. It is like suggesting that 3D accelerator boards are contrary to the principle of RISC design because they involve complex hardware. There's no contradiction in having a simplified CPU and complex off-chip hardware to back it up.
Others have suggested that there's no point to this work because a hardware implementation of malloc() and free() would run only marginally faster than their software counterparts. I suggest reading some of the publications on their Web site, particularly their Introduction to DMMX. They aren't merely trying to implement malloc() and free() in hardware, and the solution they describe would allocate and sweep the heap in constant time. If the scenario described in this paper is feasible, it could be pretty interesting stuff.