It's a pretty hard problem; people are still actively researching this and the best results are only so-so.
The best way to find code would be to e-mail the authors of the papers you've found. They probably have implementations, and academics are usually willing to share under something like a BSD license or the GPL.
Well, that chick is a bad enough actress to be real...
(Or just a bad actress!)
Does your computer "define who you are"??
> Step back and think a moment about what you just said. Garbage collection makes no
> guarantees about when GC will actually occur. That means every time kalloc was called,
> there would be the potential for it to sleep while the GC searched the reference tree.
> This is absolutely no good for the kernel.
Actually, there are several real-time garbage collectors around. It's not at all impossible to have a real-time one with low overhead (lower than certain mallocs), and still win in terms of ease of use, correctness, and heap compaction. Don't forget that malloc implementations also need to do things like coalescing that can cause an alloc or free to "sleep" (that is, take a while) too!
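To make the coalescing point concrete, here is a minimal sketch of a free that has to merge neighbouring free blocks. The toy_heap layout and the names toy_free and merge_with_next are made up for illustration; this is not how any particular malloc is implemented (real ones typically use boundary tags or bins rather than an array). It only shows that the work done by a single free depends on the state of the heap around it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy heap (hypothetical layout): blocks kept in address order in a fixed array. */
#define MAX_BLOCKS 16

struct toy_heap {
    size_t size[MAX_BLOCKS];    /* size of each block in bytes */
    int    is_free[MAX_BLOCKS]; /* 1 if the block is free, 0 if in use */
    int    count;               /* number of blocks currently in the heap */
};

/* Merge block i+1 into block i and shift the remaining entries down. */
static void merge_with_next(struct toy_heap *h, int i)
{
    assert(i + 1 < h->count);
    h->size[i] += h->size[i + 1];
    for (int j = i + 1; j + 1 < h->count; j++) {
        h->size[j] = h->size[j + 1];
        h->is_free[j] = h->is_free[j + 1];
    }
    h->count--;
}

/* Free block i, then coalesce it with any free neighbours.
 * Returns the index of the resulting (possibly merged) free block. */
int toy_free(struct toy_heap *h, int i)
{
    assert(i >= 0 && i < h->count && !h->is_free[i]);
    h->is_free[i] = 1;
    if (i + 1 < h->count && h->is_free[i + 1])
        merge_with_next(h, i);          /* absorb the block after us */
    if (i > 0 && h->is_free[i - 1]) {
        merge_with_next(h, i - 1);      /* let the block before absorb us */
        i--;
    }
    return i;
}
```

Freeing a block that sits between two free neighbours does two merges, while freeing one with busy neighbours does none, so even a plain free is not constant-time in a sketch like this.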
Anyway, I wasn't *really* suggesting retrofitting Linux with a garbage collector. The kernel developers already do a pretty good job of memory management. I'm just saying that for a new kernel, the idea of GC is not so preposterous. Not all garbage collectors act like Emacs Lisp or the JVM.
> The zlib_inflate and zlib_deflate routines do not call malloc()/free()
This is interesting. If so, then how can the kernel be subject to this double-free bug? (Some people were claiming that.)
> Compression/Decompression is inherently a costly operation. It fortunately though is not
> one that requires a great deal of allocation/deallocation. Therefore, there is no
> justification for adding the overhead of a complex memory tracking system.
I don't think this follows. You mean overhead in terms of complexity or in terms of run-time? If it doesn't do much allocation, then (supposing the garbage collector imposes a significant performance hit) it will not be impacted much by a garbage collector. If you are talking about code complexity, well, it would certainly be overkill to implement a garbage collector just for that one library, but I'm not suggesting that. I'm suggesting that we use high level languages where possible, particularly ones with garbage collectors -- in this case it SIMPLIFIES the code. In fact, here it would remove a potentially exploitable bug.
> Besides, as many posters pointed out, there is a mechanism in LIBC to allow memory to be
> tracked dynamically. The performance trade off is just not acceptable though. So called
> high level languages are not the solution to every problem.
It's not fair to compare the performance of debugging LIBC calls to the performance of an optimized safe high-level language. But either way, I don't think any "tradeoff" that sacrifices correctness for speed is really a "tradeoff", it's just a mistake.
Er, not all metal is magnetic. If they're made from copper, for instance, they won't be.
Anyway, if they are actually still connected to the PCB, I doubt a magnet would be strong enough to straighten them...
Relative to earth, my friend, like we always measure when we're on earth!
hehe.
Oh my god! We're travelling at subsonic speeds!!
Well, as much as I like an architecture built for high level functional languages, I definitely think it would be hard to convince Intel to switch the IA64 to LISP64 and convince Linus to totally rewrite the kernel.
What I'm actually proposing is pretty easy: Applications don't have a high degree of interaction with low-level code, and can be easily written in high-level languages.
Wow, no kidding.
;)
But like CD-R media, I'm wary of low quality stuff. Does anybody know whether I should worry about this? I want the 100 year lifetime.
An AC flames (got a lot of these today),
> Do you make this stuff up as you go along?
> Anything used via a char* or a void* pointer is not aliased. But please continue talking out
> of your ass. Sorry to interrupt your bullshit session.
What the HELL are you talking about?
Here is what pointer aliasing is:
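char *c = (char *) malloc(1000);
char *d = c;    /* d is now a second copy (alias) of the same address */
...
free(c);
c = NULL;       /* clears c only; d still holds the old address */
...
free(d);        /* double free: the same block is released again */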
Note that regardless of the assignment c = NULL, the allocated memory is still freed twice, because d still holds the original address. (Of course, the situation is not always this simple. Pointers are values, so they are copied all the time: when you call a function with a pointer argument, when you store them in data structures, etc.) Note also that I am using char *. So no, I don't make this stuff up as I go along. If you still think I'm wrong, how about explaining?
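The function-argument case can be sketched the same way. This is only an illustration under some assumptions: struct block, release, and releases_through_aliases are hypothetical names, and the block counts release requests instead of really calling free() twice (which would be undefined behaviour), so it is safe to run.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical instrumented block: counts release requests instead of
 * actually calling free() twice, which would be undefined behaviour. */
struct block {
    int  release_count;
    char data[1000];
};

/* Receives a COPY of the caller's pointer. Clearing it here has no
 * effect on the caller's variable or on any other alias. */
static void release(struct block *p)
{
    p->release_count++;   /* stands in for free(p) */
    p = NULL;             /* only the local copy is cleared */
    (void)p;
}

/* Returns how many times the single block was "freed" via its aliases. */
int releases_through_aliases(void)
{
    struct block *c = malloc(sizeof *c);
    assert(c != NULL);
    c->release_count = 0;

    struct block *d = c;  /* second copy of the same address */

    release(c);
    c = NULL;             /* defensive, but d is untouched */

    if (d != NULL)        /* the NULL check passes: d still looks valid */
        release(d);

    int n = d->release_count;
    free(d);              /* the one real free */
    return n;
}
```

Calling releases_through_aliases() reports two releases of one block: neither the NULL assignment inside release nor the one in the caller touches the other copy, so the NULL check on d cannot catch anything.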
An AC
> Go to the functional programming contest site and look at the timings of the winning
> entries and the language they are written in. The last contest was "compressing" a
> hypothetical HTML-like markup language. The C++ entries ran in a few seconds, whereas the
> O'Camel entries took several minutes to complete.
I competed in this contest, and I'm quite familiar with it and the performance of the various entries. (In fact, our non-C entry placed 9th overall.) The performance of non-C languages was very good in this contest. Anyway, the best one was written in Haskell, and that's because it had the best algorithm. Did you totally overlook this? I don't understand where you're getting these numbers from.
Of course, most of the C and C++ entries didn't even work.
Why are you posting anonymously?
I use control-b to open my bookmarks window, which is easier to scroll with the keyboard. Have you tried that? The keyboard is often much faster...
An AC flames,
> Show me one such high level "safe" language with
> the performance and low-memory characteristics of
> C. Lisp? O'Caml? Scheme? Snobol? Please.
I'd say both SML and O'Caml fit that, yes. Also Popcorn and Cyclone are C-like and statically typed, but their compilers are less mature.
> That functional programming contest where
> O'Camel wins every year even though it is by far
> the slowest entry is a complete crock of shit.
> The winning O'Camel entry ran 1000 times slower
> than the equivalent correct C++ entries.
This is totally wrong. Where are you getting this from? I defy you to show me O'Caml code that is even 10 times slower than the equivalent C code. O'Caml is typically no worse than twice as slow as C, often around 20% slower, and sometimes even faster (high level languages have some advantages over C for optimizations). Check out Doug Bagley's benchmarks for some actual measurements. (http://www.bagley.org/~doug/shootout/craps.shtml)
(Sounds like you need to do some more learning!)
> Ok, explain to me how one is going to implement a
> Garbage Collector in the kernel?
Why do you think this is hard? There are plenty of user-level single-threaded garbage collectors, even real-time ones. I don't think it would be hard to do, and would even have some benefits, like heap compaction, that are important for long-running programs like an OS kernel.
I'm not suggesting that people go and do that; it's a pretty radical idea. But I don't think it is a ridiculous one.
> If libz was written in anything other than C, it
> could not be used in nearly as many applications
> as it is currently.
OK, so perhaps if libz is necessary in the kernel, then we have no choice but to write it in C. I am kind of surprised that libz is linked in there, actually. How does it call malloc()/free() if it's in the kernel?
I am arguing more about application software than systems software. So if this is truly "systems" software, I've got a much harder argument. But I think it's likely we could get away with a high level language library for applications and a small C program for decompression only for the boot loader.
Another (?) AC flames,
> Yeah, and Java VMs don't also have coding errors.
> You must be the only Java user in the world not
> have their JVM app crash on them for no reason.
> The Hotspot JVM is written in C++ and is also
> subject to these same types of issues.
The fact that you immediately think I must be talking about Java shows that you need to do some more learning. There are many high-level, safe languages! (And Java is about 30 years behind the best of them...) For instance, check out O'Caml for a natively-compiled, yet safe language.
Either way though, writing more code in a safe language reduces the size of your trusted code base, and this makes it easier to audit your code.
An AC flames,
> Are you still beating this dead horse topic to
> death? Garbage collected languages could not
> even be in a position to be used for what zlib
> does due to the speed requirements. Get over it
> - you need C for raw speed - SSL, zlib, Linux
> kernel, a browser. Garbage collected apps are
> not suitable for this task.
> Tell you what, Tom7, please post your super
> efficient Java zlib replacement to Slashdot
> complete with benchmarks comparing it to the C
> version. Also show how your java version will
> use four times as much in-process memory as the
> C version.
ehehe. Did I hit a sore spot? ;)
Java is not the only alternative to C, fortunately. (Although, it is probably the easiest to learn for a C programmer.)
Performance is dead. People want programs that work much more than they want programs that are fast.
There may come a day when I need to rewrite zlib in SML. If I do, I'll post benchmarks so we can compare it to the C version. Based on my past experience porting similar code, I think it will be highly competitive. (And of course... security hole free!)
What I do is to copy my archive from my old media to new whenever a new format comes out. I plan on being able to read CD-Rs for a while, but when they start to go out of favor, I'll copy all of that stuff on to DVD*RW or whatever is in fashion. Then when a new higher-capacity storage medium comes out, I'll just copy again...
Copying isn't the same as stealing.
Where do you get DVD-R media for less than $5? I'm interested in getting a drive for data archiving, but the media seems expensive. $5 would not be too bad.
Unfortunately, most code isn't reviewed (or if it is, it's not reviewed carefully). I think it's just a myth that openness implies more review. (One might even make the argument that openness causes laziness!)
In fact, the very first piece of Linux code I looked at carefully (the MD5_crypt code in PAM) had some very obvious mistakes in it. Anyone actually auditing it should have noticed them. And this is a highly security-critical piece of code!
I'm not saying that open source doesn't have its benefits (it certainly does), but simply making something open doesn't make the code better. People have to actually review it, and they seldom do.
Like most recent security holes in Linux software, this one would be unexploitable in a modern safe language. (In fact it would be *impossible* to make this error in a garbage-collected language!)
The typical response I hear to this kind of comment is that "high level languages are inefficient". (I don't believe this is true, but most other people here do.) But whatever, let's pretend they are.
Now, what kind of crazy world do we live in where we value performance more than correctness (security)?? We are seeing more and more security holes as we try to write bigger and bigger packages in C. Why do we accept this? Who here really cares more about the performance of zlib than the time it takes for them to patch all of their statically-linked software, and their risk of being rooted until they do? I sure don't.
Forget about all this "coding practices" stuff. It simply takes too much effort to produce bug-free code in C. The OpenBSD people, kings of code review, just had an exploitable bug in sshd! While we need to use C for some tasks (i.e., most parts of the kernel), I think we are seriously underpowered to do this for most applications (as evidenced by the high number of simple errors made, and sometimes caught).
If we simply wrote our software in high level languages, we would automatically rule out the largest classes of security holes, which would give us a lot more time to work on more important things, like high level architecture review and optimizations. I think we'd end up with a better system. So what's keeping us?
For more discussion, see our big argument in the story about the OpenSSH root hole. http://slashdot.org/comments.pl?sid=29123&cid=3124957
I dunno, most double frees come from freeing DIFFERENT copies of a pointer. Setting one to NULL won't help in this case...
(A much better solution is to use a garbage collector. ;))
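For anyone wondering why a collector sidesteps the problem, here is a rough sketch of a tiny mark-and-sweep collector. Everything in it (struct obj, struct gc, gc_new, gc_collect, the two-reference object model) is a made-up illustration, not any real collector's API, and it is neither real-time nor compacting. The point is only that the program never calls free itself: the sweep releases each unreachable object exactly once, no matter how many copies of a pointer existed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object model: each object has up to two outgoing references. */
struct obj {
    int marked;
    struct obj *ref[2];
    struct obj *next;     /* links every allocated object, for the sweep */
};

struct gc {
    struct obj *all;      /* list of all current allocations */
    int live;             /* number of objects currently allocated */
};

/* Allocate a zeroed object and register it with the collector. */
struct obj *gc_new(struct gc *g)
{
    struct obj *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->next = g->all;
    g->all = o;
    g->live++;
    return o;
}

/* Mark everything reachable from o. Recursive for brevity. */
static void mark(struct obj *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = 1;
    mark(o->ref[0]);
    mark(o->ref[1]);
}

/* Free every unmarked object exactly once and clear marks on survivors.
 * Returns the number of objects reclaimed. */
static int sweep(struct gc *g)
{
    int freed = 0;
    struct obj **link = &g->all;
    while (*link != NULL) {
        struct obj *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next;
        } else {
            *link = o->next;   /* unlink before freeing */
            free(o);
            g->live--;
            freed++;
        }
    }
    return freed;
}

/* Collect everything not reachable from root (root may be NULL). */
int gc_collect(struct gc *g, struct obj *root)
{
    mark(root);
    return sweep(g);
}
```

If a root object holds two references to the same child, dropping one of them frees nothing and dropping the second frees the child once on the next collection. There is no way to express "free it again" in this scheme.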
Well, the problem is undecidable in general (of course), but it's certainly possible to do some static analysis to find bugs.
Much easier would be to simply write code in a modern language that is immune to these kinds of bugs. There's nothing so low-level about zlib that it couldn't be written in a high level language.
Patches are available, true, but that doesn't mean that no damage is done. It takes a while for people to get patches installed.
(I'm still working on OpenSSH and OpenSSL installation problems myself!)
That's good, but compilation is awfully parallelizable: You could (almost) just assign a computer to compile each individual source file; the total time would be the time to compile the slowest file plus link time. You could accomplish this with a shell script and a network file system -- what's the benefit of doing it with a shared-memory system like NUMA?
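As a back-of-the-envelope sketch of that estimate, the two functions below compare a one-machine build with the one-machine-per-file case. The function names are hypothetical, and the model deliberately ignores the "almost": network file system overhead, header dependencies, and scheduling are all assumed to cost nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Serial build: every file compiled one after another, then one link step. */
double serial_build_time(const double *compile, size_t n, double link)
{
    assert(compile != NULL || n == 0);
    double total = link;
    for (size_t i = 0; i < n; i++)
        total += compile[i];
    return total;
}

/* One machine per source file: the slowest file bounds the compile phase,
 * and the link step still runs once at the end. */
double parallel_build_time(const double *compile, size_t n, double link)
{
    assert(compile != NULL || n == 0);
    double slowest = 0.0;
    for (size_t i = 0; i < n; i++)
        if (compile[i] > slowest)
            slowest = compile[i];
    return slowest + link;
}
```

With made-up per-file times of 4, 9, 2 and 5 seconds and a 3 second link, the serial build takes 23 seconds and the fully distributed one takes 12, without any shared memory between the machines.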
Yes, mine does. I think that's pretty common, since DVI output is seen as an "extra".
I think that the analog output is off normally (since there is a way to select outputs in the control panel), though I'm not sure if that prevents RF emissions or what.
I doubt anyone here actually knows about this, but we can all speculate together...
How safe is an LCD monitor with a digital (DVI) connection? The video card is probably not putting out RF emissions (because it's sending a digital signal), and there's no scanning CRT to track. What would be the easiest route to eavesdropping on that?