Are Buffer Overflow Sploits Intel's Fault?
Bruce Perens submitted a story he wrote for his website on overflows and who's fault they are. I'm pretty skeptical of almost every point raised in this story, but it's an interesting read. [Updated 21:13 by t] As Sea Monkey points out, Bruce has now taken down the article, with a brief note: "I've withdrawn this article after enough people convinced me that I didn't know what I was talking about. It happens sometimes. Thanks." What if everyone displayed such grace?
People could very well start using segmentation again without so many horrors. The problem before was that the segments had a size of 64kB, whereas the total RAM was measured in the megabytes. With the 386, the address space of the segments was the same as the address space of the entire machine. So using segments isn't terribly different from just using straight pages. You need 15MB on the stack? Not really a big problem. Therefore, there should be no worries about changing segments all the time, because all your data is in one segment.
I normally code in Java, but for Linux/UNIX coding you *have* to use C. Why? Because C doesn't require overhead (support files for VMs etc. like Java). C programs are standalone. C can be recompiled for any architecture. C is FAST. C is SMALL. C can use 100% of your CPU and memory without bothering about overhead. C can manipulate strings faster than any other language. Allocation of arrays and storage space in C is trivial. Looping through arrays is blazingly fast. C has curses-based screen manipulation. C can have inline assembly for those REALLY tight ops.
In short - Performance, Portability, Control, Simplicity.
Please don't moderate up his post despite sounding good. It's almost entirely wrong.
Doesn't matter how many times I recompile, bud. Sorry. That's kind of a moot point. Look at all the different distributions which use different compiler sets and different optimization parameters, and all of them end up getting bit by the same bug. A recompile, if you're lucky, will prevent some exploits from being effective, but that's because of bad exploit writing.
There's also nothing to prevent the MacOS from executing a buffer overflow. You have relocatable code, just like everyone else. Almost all successful exploits rely on offsets, not absolute addresses (because those can change on a reboot, not even a recompile).
Finally, internal datatypes and bare pointers are neither necessary nor sufficient for buffer overflows. For starters, there are languages and platforms without internal datatypes and with bare pointers that can't (assuming you don't trip a bug in the language itself) result in an exploitable buffer overflow because of other protection systems. Second, there is always the possibility to trip a bug in the compiler/interpreter which results in an executable buffer overflow in a language where it shouldn't be possible.
I will agree that it will, almost entirely, go away if we used pointer-free languages, but unfortunately, in many cases (i.e. an OS kernel) that isn't particularly attractive to do.
--
Ben Kosse
Remember Ed Curry!
Bruce Perens submitted a story he wrote for his website on overflows and who's fault they are.
"Who's" is a contraction of "who is". The possessive is spelled "whose".
I disagree with this. Making it the developer's responsibility to write bounds-checking code every time they deal with input is why we are in this mess today. Not a week goes by without another buffer overflow sob story on Bugtraq. Asking software developers never to make any mistakes, ever, is not a realistic solution, and assigning blame isn't going to make the problem go away.
There are a few possible solutions, none of them really easy.
Buffer overflows and other common security problems have been with us for over thirty years and still aren't in the "solved problems" bin. This is inexcusable. If people are going to rely on computers in their daily lives, the computers have to be reliable, and having the possibility of security compromise using 30-year-old techniques does not a reliable computer make.
-- Remember: Wherever you go, there you are!
Hrm, interesting. Do you have a *real* Deja link?
At least gcc's features are open --- If people are worried about gcc accepting non-standards-conforming code, why not hack it into -pedantic or -ansi themselves and release a patch?
And to help half-wits like us still do something useful with software, it might be nice to give us tools we don't hurt ourselves with too much. You know, scissors instead of a Samurai sword. A Toyota Camry instead of a Formula 1 race car. 110V household current instead of 50KV "professional" power.
The fact is, C/C++ is too powerful for me. Oh, I understand it just fine, and with enough time and effort, I can make something work semi-reliably in it. But what's the benefit I get for that self-flagellation? Should I spend all that extra time because finding the bugs "hurts so good"?
The fact is: writing code in C/C++ is a lot of unnecessary work. The only reason why people put up with it is because everybody else does it, so it's the path of least resistance if you want to use the "standard" compiler on your platform, use a language other people are likely to understand, and use other people's C/C++ libraries. But make no mistake about it: C/C++ is successful these days in spite of its cumbersome design, not because of it.
No, you don't need to put in bounds checking manually. Bounds checking is a property of the *compiler*, not the *language*. There are many C interpreters (and even many C compilers) which will implement bounds checking.
Hello? Funny? Didn't you mean 'insightful', moderator?
Great games
Yes, it was what I meant. I always thought "type safety" was kind of a continuum. There's complete type unsafety (perhaps not unlike machine language); there's complete type safety (of a language unknown to me), and then there's all that in between. In order to do inheritance and stuff like that in C, you have to use generic pointers, which means that your compiler cannot easily catch type errors (though it is still possible). So in that respect, C++ is "type safer" than C is, though I agree it is definitely not completely safe.
And in such systems, you pay a lot of overhead if you try to generate code at runtime, because it always involves a system call. That may be fine for 1960's style COBOL and C/C++ code, but it is not acceptable for 2000's style languages and programming.
Buffer exploits can be solved by more liberal use of segments. You can have one segment for VM from 0-2G, which is Ex/Re, and another for VM from 2-4G, which is Re/Wr. The problem is that OS'es like Windows and Linux are very primitive, and just use four segments, code and data flat selectors for each privilege level. If this scheme were used, buffer overflow exploits would be literally impossible, but it requires advanced OS'es which have not been developed yet.
It is also the fault of the language. For its first 10 years (when it did not have C), there was literally never a buffer overflow bug on any program in VMS. For example, the internet worm exploited only Unix hosts, and couldn't do anything to VMS hosts. However, in the last 10 years, VMS has adopted more C programs, and the number of buffer overflow exploits has risen from 0 to a similar amount on Unix. The string and array methodologies under C are incredibly fragile and primitive (as well as very low performance - you need to access byte at a time, which is VERY SLOW on modern architectures), and more advanced languages have much more high performance and more secure methodologies of dealing with data.
C++ is type safe unless you use constructs which make it non-typesafe. As long as you don't use those constructs, like casting, it's a typesafe language.
You have a very narrow view of "bugs". Remember the old system() bugs? I don't know if Java has a direct analogue to system(), but it would still be possible either way. I'm talking about logic errors, not "programmatic" (for lack of a better term) errors. Every language or standard library that allows you some control (e.g. all current programming languages that I'm aware of) is going to give you the possibility to grant crackers root.
This is misleading -- verging on FUD -- as GCC will happily build ANSI-compliant code (AFAIK). Furthermore, most of the extensions can be easily removed to port to other compilers. True, the Linux kernel makes heavy use of extensions and would be difficult to port to something other than GCC, but that's a *very* special case.
Here's some of the extensions I find useful, so you can judge their evilness for yourself:
Maybe *you* want your languages handed down from the mighty and revered standards committee. That's fine, but don't try to keep me from using neat, helpful features.
-- ;-)
Kuro5hin.org: where the good times never end.
Old archives (prior to 2000 it seems) are currently not existent on Deja. Anyway I don't see the problem with some GNU extensions and functions not being able to be disabled, maybe because I already have an understanding as to what's standard and what's not. Oh well.
fgets
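For anyone wondering what that one-word answer looks like in practice (presumably it was offered as the bounded alternative to gets), here is a minimal sketch. The helper name read_line is made up for illustration and is not from the thread. The point is that fgets is told the buffer size and never writes past it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (name invented for this sketch): reads one line
 * into buf, never writing more than size bytes, and strips the trailing
 * newline. Returns 1 on success, 0 on end-of-file or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline if one was read */
    return 1;
}
```

An over-long input line is simply cut at size - 1 characters, and the rest stays in the stream for the next call.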
'' Oh? I'm fairly sure machine code is /also/ "unsafe", and that's what your pretty source code ends up as. How do you prove that your oh-so-wonderful language is still safe when rendered into raw machine code? ''
I'm sure you're thinking that you've got me beat, but in fact this is a great question!
The old, less-satisfying answer is that compilers are less likely to have bugs than the programs they're given. This is probably true (I don't recall any exploits due to compiler bugs, though to be fair I do recall some Java VM exploits).
The new exciting answer is: We use a type-safe subset of the target machine's assembly code.
In our TILT compiler for instance, we take ML code and put it through a series of transformations. At each transformation or optimization we also translate the types (a static proof that the program can't crash) until we get to machine language. This catches a lot of compiler bugs, and helps propagate safety properties to the raw code.
The result is that we have machine language which can pretty easily be checked for type-safety. This allows us to do some other cool things, like ship the proof along with the raw machine code, to be executed on someone else's machine. They don't have to trust us (just the proof), and it doesn't suffer sandboxing costs like Java. Wow! Read about Proof Carrying Code.
The second answer isn't really usable today, except in theory. The first is absolutely practical, though. (even if it fails due to compiler bugs, we'd still cut down on a high percentage of errors -- and we'd only need to fix bugs in one place).
What part of what I'm saying is hard to believe? Though we can translate Java or other safe languages to C, and the resulting C program is "safe", we aren't able to verify that about the C program unless we have around the Java program as well. (or we avoid using some unsafe C features).
You're certainly right about the compiler doing work for you, and really that's at the root of my point. Though it's possible to write safe C code, why not let the compiler verify (and indeed, prove) that your code is safe? (Clearly C programmers still make the kinds of mistakes we think are easy to not make...)
Arggh.
If the Code Segment is not marked readable, you cannot read the code.
You can't jump into the stack. You can jump to a "far return address" that is stored in the stack. You can jump to an address in the Code Segment which just happens to be coincident with something in the Stack Segment.
With NO CONTEXT SWITCHES, there are 8000+ local code spaces, 8000+ global code spaces, 8000+ local data spaces, 8000+ global data spaces, 8000+ local stack spaces, 8000+ global stack spaces. All these code and data spaces. All independent. All in the same context. But the OS has to handle them. No OS wants to bother. They all effectively point code, data, and stack to the same memory with bounds set to all of memory.
There are four protection levels, but only Multics that I am aware of has ever used more than two levels.
The hardware design of the 80386 is targeted at a Multics-like OS. Current i386 code is using separate code data and stack spaces, with bounds checking on all references to memory. Unfortunately, code, data, and stack point to the same memory with the bounds being all of memory.
Speed. Loading a segment register is comparable to integer multiply, the instruction, not the addressing shortcut.
Unless Operating Systems, Language designers, and programmers are willing to give up the nice flat 32-bit address space and go back to the old 8086 DOS segments and offsets as a programming paradigm, things are not going to improve. The problem with the old DOS is that segments were used to break the 64k barrier in a flat address space rather than used as segments. With the 386 architecture, segments are no longer limited to 64k, and there is no correlation between segment number and physical address.
Actually type safety has everything to do with buffer overflows. The problem is that the types available are either unsafe or unchecked. :(
What is the type of a pointer to a 120 character buffer? If the effective type is a pointer to max 64k bytes starting here, then it is unsafe.
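To make the question concrete: C does have a type that remembers the 120, namely "pointer to array of 120 char", though hardly anyone uses it. The following is only a sketch under that assumption. The names NAME_LEN and set_name are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

#define NAME_LEN 120

/* The parameter type is "pointer to an array of exactly NAME_LEN chars",
 * so the size travels with the pointer instead of being lost. */
int set_name(char (*dst)[NAME_LEN], const char *src)
{
    size_t n = strlen(src);
    if (n >= sizeof *dst)      /* sizeof *dst is 120, not sizeof(char *) */
        return 0;              /* would not fit: refuse rather than overflow */
    memcpy(*dst, src, n + 1);  /* copy including the terminating NUL */
    return 1;
}
```

A caller writes set_name(&name, "...") with char name[NAME_LEN]. Passing the address of an array of any other size is a type mismatch that a compiler will typically diagnose, whereas a plain char * parameter accepts anything.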
Strongly typed languages are easier to write programs in that do not have certain kinds of bugs. They are harder to write programs in that are buggy and seem to work OK.
Will things get any better? No. There are too many buggy programs that sorta work well enough that will be stopped dead by anything that enforces safety. What I'd love to see is a good free production-quality Algol68 compiler. The 68 stands for the year 1968. Progress? nah.
The same is true for Modula-3, Oberon, Sather, Eiffel, and other languages. Yes, they don't have the support or user community that C has, but on technical grounds alone, they are fast, small, have no overhead, etc.
Very interesting post this one of yours, I could agree with you 100% in the first line, and gradually go down to 0% on the last line.
If I read you correctly, the burden to avoid buffer overflows should be put on the programmer, not on the hardware or on the machine. Then you start advocating that the compiler/library should take care of this! Isn't that like just laying the safety net a couple feet higher?
As a general rule, I think dumb tasks should be left for the machine, the noble ones for the programmer. Checking whether a given input validation is a potential door for an exploit is the programmer's responsibility.
Another thing that you seem to suggest is that quick and dirty test code should be the basis for production code, thus holes could migrate to the final product. I think the ideal solution is to discard test code altogether and start from scratch, but who does that, right? One other approach, which I actually use and believe many do, is to check for errors (at least return values) from the very beginning, so that those parts which will inevitably be cut and pasted to the production source are structured in a way that makes adding extra checks much easier (favouring using heap memory, encasing calls in try/catch blocks in C++, for example).
Just my $0,02
See subject..;)
Altho, I would have LOVED to see what he wrote!
Anyone mirrored?
English is not my first language, so cut me some slack -: Om du kan lasa det har sa kan du Svenska
A language can certainly stipulate that bounds checking is performed on an array (as ML and Java do), making it a property of the language. A compiler or interpreter is just an implementation of a language spec.
It's good that there's a way to run C code safely. But interpreted C code is certainly going to be orders of magnitude slower than compiled ML or Java code. Probably unacceptable.
Regardless of the bounds checking a compiler might insert automatically (GCC certainly doesn't), C remains an unsafe language (casts and pointer arithmetic being the main troublemakers). This is not just an impediment to security, but to debugging (ever tried to debug a malloc/free error? ouch).
This is quite interesting in my opinion. I've often wondered the same thing.
Although it may be disagreeable to side with faulty software manufacturers <co>microsoft<ugh> it is possible that if buffer overflows were cut off at the hardware level, we wouldn't have to worry about the idiots writing bad code.
You can write object oriented code in any language you like. You can write procedural code in any language you like. Object oriented languages simply facilitate the use of object orientation by providing you with tools that take advantage of it. You can write oo basic, and you can write procedural java (just make everything static and put it in your main class). The paradigm is not being forced on you. You simply ignore some of the features. But don't take my word for it, look at the tour of C++ in Stroustrup's The C++ Programming Language.
Polymorphism, encapsulation, and inheritance are simply properties that fall out of the OO paradigm and are implemented to take advantage of its implications. Ultimately the right tool for the job in any given case may or may not implement these.
--locust
Buffer overflows are the fault of the LANGUAGE. Important system utilities need to be written in bounds-checked languages. Some compilers, no matter the architecture, will write executable code on the stack: "trampolines". Unfortunately, this is common enough that the OS can't blindly turn off the executable bit on the stack pages. And non-executable stack pages don't stop all buffer overflow attacks, they just require a 2 part attack: A heap buffer to write the code, and a stack buffer to overwrite the return address. The heap buffer doesn't necessarily even need to be overflowed, the attacker just needs to be able to deduce the address. And one can't set heap-addresses to be nonexecutable, simply because there are MANY language environments which do create code at runtime, such as interpreters, JITs, etc etc etc.
Nicholas C Weaver
nweaver@cs.berkeley.edu
Test your net with Netalyzr
An AC posits:
"That demonstrates a flaw in the StackGuard compiler then, not a fundamental shortcoming of the C or C++ languages."
I say:
Can you clarify?
C and C++ are fundamentally unsafe languages, and it will always be possible to write C code with buffer overflows (and other kinds of bugs avoided in a safe language). It is possible to limit the scope of buffer overflows to disallow executing arbitrary code, and maybe stackguard techniques could do it.
I think these "holes" could also be called versatility. Trying to prevent these attacks through hardware means ruins versatility.
Also, complaining that it's the i386's fault doesn't accomplish much since everyone and their grandmother has 5 already.
I'm not so sure about having a compiler that fills these security holes, but I'm all for having a compiler warning saying, "Stack security compromised (line: xxx)". This doesn't seem as if it would be all that difficult to implement, either...
An innocent question: what is the safe version of sprintf ?
Conscience is the inner voice which warns us that someone may be looking.
-- H. L. Mencken
1998-04-07: AppleShare IP Mail Server Buffer Overflow Vulnerability
--ian
Not if you want to parse command lines, for example, or use many of the C library functions (e.g., printf). Of course, in your own, 100% new code, you can use STL and stuff like that, but when you're interfacing with existing code or trying to use (a few of) the language's features, you have to use the unsafe constructs at least a little. That's particularly true when it comes to user input, which is (naturally) where buffer overflows come from.
--
-jacob
Everything that you said about C applies to C++ (except for being forced to use it under UNIX!)
A lot of (bad?) programmers do a lot of unnecessary work with their strings because they don't know whether they need to be copied or not, so they always do it. C++ offers string classes that make this irrelevant. Reference counting leading to copy-on-write ensures much greater simplicity and often better performance. A decent string class implemented using a bridge pattern will only be 4 bytes (same as a char*) - it will just consist of a pointer to its implementation and so it can be passed around on the stack very quickly. They also have the benefit of being guaranteed '\0' termination. Much easier to handle embedded NULs too.
Other languages considered integer overflow to be an exception. Once C took over the world, there was no reason for new (post-VAX) architectures to support this. As a result, it's more inefficient to do safe arithmetic on modern architectures now, even though we have cycles and cycles enough to spare.
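As a rough illustration of what "safe arithmetic" costs the C programmer in the absence of a hardware trap, here is a minimal sketch of a checked addition in portable C. The helper name checked_add is invented for this example. It tests the operands before adding, because signed overflow itself is undefined behavior in C.

```c
#include <limits.h>

/* Hypothetical helper: adds a and b, storing the sum in *out.
 * Returns 1 on success, 0 if the mathematical result would not fit
 * in an int. The check happens before the addition, so the overflow
 * never actually occurs. */
int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

Every addition that matters becomes a call plus a branch, which is the kind of overhead the comment above is complaining about.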
not why, but *when*. If you are knocking up a one time program for your own use, use whatever gets the job done. If you are writing an in-house app for an urgent problem, take the risk that your loyal staff won't be trying to cause buffer overflows. If you are writing for a wider audience, *then* you need to be careful.
Surely the performance hit (and often, more importantly, developer time hit) isn't worth the effort.
A pizza of radius z and thickness a has a volume of pi z z a
First, it's not because of the CPU. Hell, the first well-known stack 'sploit was in the RTM worm, which worked for two flavors of Unix on a VAX cpu.
It's because MacOS uses one big-ass shared memory space to run everything in that it's safe from being taken over by buffer overflows. Well, gee, if it's all unprotected, why is it so safe? Because while you can still crash a program with a buffer overflow, you can't predict the stack address. And the critical part of a stack overflow exploit is to get the program counter pointing to the exploit code on the stack.
And even if you could, what would you do with it? There's no shell (at least not until OS X, but that's a completely different OS) to give commands, and no root privs to exploit (actually you are "root" at all times!)
Intel is relatively low on the fault scale here. A bigger problem is the number of people running Linux distros with the same binaries in them. If you compile your own code, the stack addresses will be less predictable (though not completely unpredictable), and you'll be in the same boat as MacOS: without a predictable stack address, there's no way to run the 'sploit code!
If we simply had more people compile their own binaries, the problem would be reduced.
But at heart, the fault is one of languages that let you stick things into memory without any sort of range checking. Get too much data or lose the null terminator from a C string and your stack is toast.
And most of these problems happen inside of a library routine. But you can't blame the library routine when it has no way to know the size of the destination buffer. The best it can do is know where the frame pointer is and to not write past it.
If C strings were more than just bare buffers with only a lone null to save you from oblivion, the library routines could be smart enough to save your ass. So I blame C and its strings as the primary problem causing buffer overflow exploits.
Use a language with internally checked datatypes and no bare pointers like Java or Perl, and this type of exploit will go away.
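A minimal sketch of what "more than just bare buffers" could look like in C itself, assuming a hypothetical struct strbuf that carries its capacity along with the data. None of these names come from a real library. With the size available, the append routine can refuse to write past the end instead of trusting the caller.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a buffer that carries its own capacity and length. */
struct strbuf {
    char  *data;  /* storage owned by the caller */
    size_t cap;   /* total bytes available in data */
    size_t len;   /* bytes in use, not counting the terminating NUL */
};

void strbuf_init(struct strbuf *sb, char *storage, size_t cap)
{
    sb->data = storage;
    sb->cap = cap;
    sb->len = 0;
    if (cap > 0)
        storage[0] = '\0';
}

/* Appends src. Returns 1 on success, or 0 (leaving sb unchanged)
 * if the result would not fit. */
int strbuf_append(struct strbuf *sb, const char *src)
{
    size_t n = strlen(src);
    if (sb->cap == 0 || n > sb->cap - 1 - sb->len)
        return 0;
    memcpy(sb->data + sb->len, src, n + 1);  /* includes the NUL */
    sb->len += n;
    return 1;
}
```

The caller still has to check the return value, so this narrows the problem rather than removing it.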
--
"Open source is good." - Steve Jobs
"Open source is evil." - Microsoft
Well, almost any C++ program needs to use some sort of OS API, which is usually C-style and forces you to muck around with arrays and other repulsive constructs :).
Also, the C++ standard library is often just too slow. I was writing some code that parsed text with the standard string class, and every time I wanted to pass as argument a part of a string (rather than an entire string), I had to create a whole new string object with substr()! My program became ridiculously slow (something like 10 seconds to parse 4000 lines). Finally I gave up and replaced everything with char*s.
Anyway, I still agree with your point in general. IIRC, Stroustrup once said something like "I find almost every use of the term 'C/C++' to be indicative of ignorance."
Update from B. Perens' website:
I've withdrawn this article after enough people convinced me that I didn't know what I was talking about. It happens sometimes.
Thanks
Bruce
Note to self: idea for new article->Blame programmers for buffer overflow security exploits.
It's already been done and it's called (strangely enough) SLINT. Unfortunately, it's not available to the public. Perhaps some polite inquiries could persuade them to release it? Or maybe not. Anyway, it's there.
Don't push it, friend. You don't want to incur the wrath of the Java Defenders' Flame Brigade, now would you? :)
---------///----------
All generalizations are false.
--
I like to watch.
Doing a bit of searching, I found a good article about this at buffer overflow. It details how no-exec stacks can be used to let data stay where it belongs, with the code that is actually executed always kept in memory.
It does make it a bit harder to program at the OS level but it would provide a good tradeoff in security and save all the buffer overflow attacks in programs which I think everyone would be thankful for!
Ok So maybe I'm not always right, but I DON'T hide behind AC. I take pride in my mistakes. If I didn't make them, I wouldn't learn. And I work for an NT hosting company, I have 50+ hour a week experience with BS microsoft puts into their software. I know what's there that's no good. I run linux at home and have a very secure network that has thwarted the numerous cracking attempts it's been met with... thank you very much.
Uhuh.
How many backdoors did you put in that RPM?
<grub> Reading
Actually, RMS encourages people to charge as much as possible for free software. The money acquired in this way, he reasons, can be used to develop more software. You can charge for a GPL'd program, you just cannot stop whoever buys it from distributing it, and must make the code available.
If C strings were more than just bare buffers with only a lone null to save you from oblivion, the library routines could be smart enough to save your ass. So I blame C and its strings as the primary problem causing buffer overflow exploits.
Use a language with internally checked datatypes and no bare pointers like Java or Perl, and this type of exploit will go away.
Programmers writing SUID programs should also be capable of using pointers without creating buffer overflow exploits.
Or do you think the problems with buffer overflows outweigh the potential gain from using pointers in the first place ??
I strongly prefer the additional power of constructs in C that are provided by pointers. I do not think that a higher level language is likely to be any safer. Sure, the language may conceptually be without overflows, but the increased size of the compilers/interpreters makes those much more difficult to check, and still prone to overflow.
But you need references/pointers to do polymorphism, so there is some limit.
Scuttlemonkey is a troll
I haven't ever written/read/used programs in FORTRAN, but I've done some programming in Eiffel, and programs look a little Eiffel-like...
That shouldn't be surprising, because it's intended to give C programmers some Eiffel-like features.
- char cBuffer[128];
need to be severely beaten over the head with their copy of The C++ Programming Language, Third Edition and forced to use Visual Basic for MS Office for the rest of their lives. Just my $.02
Help save the critically endangered Blue Iguana
If you actually _use_ the intel segment structure, you aim a read/write data segment at exactly the extent of code you need to modify, not the standard bit of CS:12345 being the same byte as DS:12345 being the same byte as SS:12345.
... until ...
With byte granularity, the segment can precisely delimit up to 1 Megabyte of code. With 4k granularity, the limit is 4G.
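The two figures follow from the 20-bit limit field in a 386 segment descriptor. Here is a small sketch of the arithmetic. The function name segment_size_bytes is made up for illustration, and it models only the limit and granularity bit, nothing else in the descriptor.

```c
#include <stdint.h>

/* Size in bytes covered by a segment whose descriptor holds the given
 * 20-bit limit field. With the granularity bit clear the limit counts
 * bytes; with it set the limit counts 4 KiB units. The limit is the
 * last valid offset, hence the +1. */
uint64_t segment_size_bytes(uint32_t limit_field, int granularity_4k)
{
    uint64_t limit = limit_field & 0xFFFFFu;   /* only 20 bits exist */
    if (granularity_4k)
        return (limit + 1) * 4096u;
    return limit + 1;
}
```

With the largest limit field, 0xFFFFF, this gives 1,048,576 bytes at byte granularity and 4,294,967,296 bytes at 4k granularity.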
For non-trivial security, I think Intel did it right, only none of the OSes follow through. Something like: it works better to control the keys than to control the locks.
There seems to be too much all or nothing, user or kernel. Anything legitimate has to be more complicated than that.
If you have 12 guests and room for 10 chairs on your property, you just put 2 chairs on your neighbor's property. Chances are your neighbor neither knows nor cares. Everybody is happy.
Sure, but if you're willing to have a bit more 'crap' in the kernel, you reduce the chance of being cracked. If creating an exploit for a kernel with the stack-exec patch is more of a task, there will be fewer exploits that do so. In applying that patch, you'd successfully be lowering your chances of being cracked.
I don't see that as a bad thing.
A poor workman always blames his tools. 'nuf said.
That's a bizarre definition of "type safe". How do you accomplish anything at a system level without pointers, etc.? Even in Java it is necessary to downcast things. C++ is just as safe if you use dynamic_cast.
Please explain how one can do anything in a strongly typed, inheritance-based language without casts? The concept of type safety does not include the impossibility that programs may crash when one ignores or misuses those very features that guarantee type safety.
The above message was moderated down in error. It should not be "-1, Troll" but rather "2, Funny"
Thank you.
If everybody showed such grace, Slashdot would get a maximum of three replies to every story.
:)
--
-jacob
They exist for ANSI C compatibility (who would ship a compiler that's not ANSI compliant?) and compatibility with old source.
There are a number of man pages which quite cutely remind you not to use certain functions (sometimes not for security reasons):
man strtok
...
BUGS
Never use this function.
...
It's not so much like going out with too many guys. That's the normal state of the web. It's more like letting some of these guys scope out your house for a broken door...
What is the point of calling something "type safe" then?
That's like saying a wu_ftpd is secure, except for the remote root exploits.
It's only slow if you don't bother to learn what code a C++ compiler generates, using lots of mechanisms without realizing it.
Yes. But this stems from C++ being a descendant of C, which is basically assembler on steroids. To program effectively in C, you should know your processor, your OS and your operating environment down to the tiniest guts. Which means you spend too much time on this, so either you lose your programming effectiveness, or you say "heck with that" and make the program ineffective, but meet the deadline and don't get fired.
So, the thing you need is a compiler that knows how to optimize (and hint about) your constructions. After all, the compiler is written once, so why not invest maximal system knowledge in it? And you, ideally, should just tell it "I want to do so&so". Obviously, that's not C++ or C, and neither is Java. Some scripting languages are closer to this, but hardly enough.
A language should make it easy to program efficiently and hard to program inefficiently. Current languages fail to do this; they are mostly just high-level assemblers. That's pretty bad - in 30 years people could invent something better.
-- Si hoc legere scis nimium eruditionis habes.
An AC flames:
"Do you not know C++? he specifically mentioned dynamic_cast which will throw an runtime type exception for unsafe casts (this is where java lifted the idea). Your criticisms stem from ignorance. "
I say:
I fully understand dynamic_cast. The presence of safe features does not make a language type safe -- only the complete lack of unsafe "features" can.
Java is type safe because it does not have an unsafe cast. (Nor pointer arithmetic, manual memory management, or non-checked array access).
C++ is not type-safe because even though you can do a lot of the same dynamic checks as in Java, there are features of the language which let you do undefined things.
Should we blame the language? Should we blame the programmers? Should we blame Intel? No! Blame Canada! :)
---- Email is reversed
An AC says:
"duh, this one is really easy to solve. wrap delete with an inline function that checks if the pointer is NULL, deletes, then sets it to null.
no more multi delete problems. (of course, you really should fix the bug that causes you to delete the pointer twice, but you can't really blame the language for your own ineptitude...) "
I boggle, and respond:
Wow, now THERE is a severely misinformed post.
#1. I'm contending that we use more advanced languages because they make life easier for both good and bad programmers, not merely compensate for "ineptitude".
#2. Your solution is 100% wrong.
C * a = new C();
C * b = a;
delete a;
delete b;
'b' is a different memory location, which isn't set to 0 when 'a' is deleted. Unless you're implicitly suggesting that we move to memory handles (which is going to give you MUCH worse performance than modern languages which just don't let you write this kind of program), your solution doesn't solve anything at all.
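For what it's worth, here is a minimal sketch of the same aliasing case handled by reference counting rather than a null-setting wrapper. It assumes std::shared_ptr, which arrived in C++11 and was not an option when this thread was written; the Widget type, its destroyed counter, and destroy_count_with_two_owners are made up for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts how many times its destructor runs.
struct Widget {
    static int destroyed;
    ~Widget() { ++destroyed; }
};
int Widget::destroyed = 0;

// Mirrors "C* a = new C(); C* b = a; delete a; delete b;" with two owners.
int destroy_count_with_two_owners() {
    Widget::destroyed = 0;
    std::shared_ptr<Widget> a = std::make_shared<Widget>();
    std::shared_ptr<Widget> b = a;   // second owner of the same object
    a.reset();                       // first "delete": b still owns the object
    assert(Widget::destroyed == 0);  // nothing has been destroyed yet
    b.reset();                       // last owner gone: destructor runs once
    return Widget::destroyed;
}
```

Each copy of the pointer pays for a count update, so this is roughly the "handles" overhead the post is worried about, not a free fix.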
Always use snprintf:
char buffer[256];
snprintf(buffer, sizeof buffer, "Hello %s, how are you?", name);
(or just use a language which has type safety and bounds checking).
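A small sketch of the same idea with the truncation check added, since snprintf silently cuts the output when the buffer is too small. The function name format_greeting and its parameters are hypothetical; only std::snprintf itself is standard.

```cpp
#include <cstddef>
#include <cstdio>

// Formats the greeting into `out`; returns false if it did not fit.
bool format_greeting(char* out, std::size_t out_size, const char* name) {
    // snprintf writes at most out_size bytes, including the terminating NUL,
    // and returns the length the complete string would have had.
    int n = std::snprintf(out, out_size, "Hello %s, how are you?", name);
    return n >= 0 && static_cast<std::size_t>(n) < out_size;
}
```

Passing sizeof on the real array, rather than a hand-typed constant, keeps the limit correct if the buffer is ever resized.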
It's practically impossible to avoid new and delete, which also make the language unsafe. (try deleting a pointer twice).
An AC says,
>
I say:
This is a pretty good point; dumb people will always be doing dumb things.
But I think that if everyone were using a type safe language, there would be fewer security holes. The reason is this: With a type safe language, no matter how dumb (or extremely clever) you are -- it's impossible to invoke "undefined behavior" and start executing code that wasn't part of the original program.
I think the most common thing would be unescaped user input with shell metacharacters included as part of a command. (how many perl mail scripts have we seen with remote exploits?) I actually believe type systems could probably keep people from doing that accidentally -- but even if they didn't, we still would never see a (now) very common class of exploits.
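As a rough illustration of how a type could stop unescaped input from reaching a command by accident, here is a sketch in the spirit of Perl's taint mode. The Tainted class and untaint_word are invented for this example, and the whitelist (letters, digits, '_' and '.') is an assumption, not a recommendation for any particular program.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical wrapper: raw user input that cannot be used as a std::string
// until it has passed validation.
class Tainted {
public:
    explicit Tainted(std::string raw) : raw_(std::move(raw)) {}

    // Copies the input into `out` only if every character is a letter, a
    // digit, '_' or '.'; shell metacharacters such as ';' or '|' are rejected.
    bool untaint_word(std::string& out) const {
        if (raw_.empty()) return false;
        for (unsigned char ch : raw_) {
            if (!std::isalnum(ch) && ch != '_' && ch != '.') return false;
        }
        out = raw_;
        return true;
    }

private:
    std::string raw_;  // deliberately has no public accessor
};
```

Code that builds a command line then takes std::string only, so forgetting the validation step becomes a compile error instead of a remote exploit.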
Shhhh! Don't mention OO Basic near my university's Comp Sci department... they'll do anything to inconvenience students!
In all seriousness, however, how DO you write OO Basic? I am a big fan of the OO paradigm, but I owe the creators of BASIC for instilling a love of coding in me at a young age. Has this actually been done?
grep -ri 'should work'
Arrays are not the same as pointers, but that doesn't have any bearing on the argument.
Arrays are just const pointers.
Regarding #1: Ha ha ha.
Regarding #2: Programmers are often at the mercy of the compiler author and OS code. Not all buffer-overflow exploits are the fault of the programmer.
Regarding #3: Wow, thanks for giving thousands of /. script kiddies a weekend project
Regarding #4: To the contrary, I think that it's valuable protection against unskilled (but malicious) users and lazy sysadmins.
Regarding #5: Most crackers lack either the knowledge or dedication to bother with anything like that. They'll have to wait until someone codes a tool for them. :)
Regarding #6: Well, I agree in principle. This story was both Overrated and Flamebait. At least it's not another crappy anime story, eh Taco? Don't get me wrong; I love a few anime series, but I prefer computer-oriented stories. (Yeah yeah, Lain, we know.) And while I could block the topic in my user preferences, I'm not always logged in. But now I'm just whining.
My own offtopic note: anyone else notice that Slashdot recently acquired the "slashdot.com" domain? Wow, Rob has really dumbed this place down since that Washington Post article.
---------///----------
All generalizations are false.
--
I like to watch.
Open range is faster and easier than dealing with well-tended fences. Unfortunately benchmarks measure what they can measure and the victor wins by shirking safeguards.
Buffer overflows have been solved long ago, and the solvers are dead. IBM 1401, Burroughs.
The worst is that the buffer overflow exploits, like the web defacements, are minor nuisances. The effect of storing the 11th element of a 10 element array can be rather damaging.
The language is the right place to attack the problem, but the language exists at many different levels. There is also a problem in that there may be several different ways to interpret a set of bits in memory. There is also a problem in that it is much easier to point to something than to precisely delimit the extents. Language is the key, and it needs to be a box not a point. Not easy, not backwards compatible. Much that is running today is broken but nobody realizes it. When the buffer overflows, et al. are handled, these programs will not run because they are in fact broken. Closed source or too lazy, you cannot fix the buffer overflow problem. Open source, just maybe.
Dead Beef proclaims:
"That's a bizarre definition of "type safe". How do you accomplish anything at a system level without pointers, etc.?"
This is the only definition of "type safe" I know of, and I read a lot of programming language journals and would have expected to come upon the "right" one if I had it wrong.
Java is type safe. Though some of its checking is dynamic (you can get run-time type exceptions, and unfortunately it must make run-time type checks), you can't make it start executing arbitrary code. That is, there aren't any undefined situations in Java.
"Please explain how one can do anything in a strongly typed, inheritance-based language without casts?"
(properly checked) casts can be safe, as they are in java. It's tragically clear, though, that you've never stepped outside the Object-oriented paradigm - ML, for instance, is a functional language which has static typing (no runtime casting or type errors) and real type safety (my definition: programs can't crash, no matter what).
There are in fact extensions to ML which allow system-level programming (they wrote a network stack, for instance). But most applications that compromise security are network apps, not system apps -- fingerd, sendmail, apache, ftpd don't need pointer arithmetic. They could all be rewritten in a modern language.
Sure you could blame the programmer, calling him a bad programmer, but seriously how many programs have buffer overruns and similar problems?
I bet most of the more advanced programs you are running have or have had such problems. I bet programs like Apache have or have had them (don't blame me if it doesn't, it was just an example =))!
Why? Well I wouldn't say because of bad programmers, rather because of a language that allows buffer overruns to happen easily. I mean even the best programmers make mistakes. Mistakes of this kind are easy to make.
Languages like Java have checking against overruns but the main argument why not to use Java seems to be speed.
Sure checking for Exploits of this kind makes the programs slower but it's worth it!
I heard rumors that Ericsson have started using Java despite it being slower than C, why?
First because it takes less time to write a program because lots of code is already written, which saves money.
Secondly because it is easier to debug, you are less likely to make errors (easier syntax, etc), which saves money.
Those are just 2 examples; in a big project they can save lots of money, which means that they can buy a faster computer that makes up for the fact that Java is slower than C.
Sure Java isn't a perfect language for everyone and every program but it's an alternative one should consider when starting to make a program.
There are two types of dirt: One dark kind that sticks to light objects and one light kind that sticks to dark objec
That's a pretty pathetic argument. Firstly, that's like saying guns aren't capable of killing people, as long as you don't use them. That they can kill people, regardless of whether they are used or not, is an inherent potential in them. So saying C++ is typesafe by telling people not to use any unsafe operations is similarly incorrect.
Secondly, it is possible to crash C++ programs even if you never use any unsafe constructs. For example,
int* foo;
std::cout << foo[0] << std::endl;
Oops, I segfaulted. Okay, so I used a pointer, let me give you another example:
std::vector<int> foo;
std::cout << foo[10] << std::endl;
Segfault, you lose.
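To be fair to the library, std::vector also has a checked accessor. A minimal sketch of the difference, where read_checked is a hypothetical helper and only std::vector::at and std::out_of_range are standard:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Reads v[i] with bounds checking; returns false instead of crashing when
// the index is out of range.
bool read_checked(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);  // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

The point above still stands, though: operator[] is the spelling most code reaches for, and it performs no check at all.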
So having separate address spaces for code and data might cause problems. self-modifying code still exists, etc.
But the cool thing is -- you can just make the page tables regarding the code address spaces point at the same pages as the data (and vice versa), and do this at page granularity. So you can make an 8k buffer that is read/write/execute (but accessed with different linear addresses when used as code or data) for your genetic programming, but keep the rest of your program safe.
VM tricks are so much fun. ^_^
The enemies of Democracy are
mikpos says,
"... there's complete type safety (of a language unknown to me)
Here are three examples of completely type safe languages: Java, ML, Haskell.
I'll agree that there's such a thing as incremental improvements in the amount of typing. Programming in C is a little easier than assembly, and C++ a little easier than C. This has a lot to do with the type system (also a lot to do with syntax).
But there's something really nice about programming in what you call "completely" type-safe: No matter what, you know that your bugs aren't those weird Heisenbugs you get when you accidentally write over malloc's bookkeeping, toast your return address, or free something twice. You don't get this in a "mostly" type-safe language.
What if everyone displayed such grace?
Or what if everyone bothered to do some research before writing and self-promoting some inane rant.
If you used the segment registers, the result was basically a highly non-linear address space. In a lot of ways, it was an 8 bit processor with 16 bit registers and hardware bank switching (for those of you that remember bank switching).
as a result, there were a few 'standard' memory models that programmers used:
- Small address space: All segment registers the same; don't touch them. This gave you a flat 16-bit (64K) address space, turning the machine into a glorified 8085/Z80 -- almost completely source code (assembler!) compatible. It also gave a slight speed advantage, since all pointers and integers were 16 bits wide.
- Intermediate address space: segment registers point to disjoint spaces. Not too much difference, but you get some breathing space since the code and data don't share the same (tiny!) 64K address space. Pointers are still 16 bits, but you now have to remember which segment you're talking to.
- 'Large' address space: all pointers are 32 bits wide (a segment value plus an offset within the segment). This gives you access to the full 1M address space. (The 640K limit was because the upper 384K was reserved for I/O space.)
The 80286 allowed people to break the 1M barrier without doing bank switching (EMS?), but it turned the segment register/pointer problem into a serious horror story. Unless you were seriously masochistic (or just plain desperate) you just made it look like an 8086 that ran a bit faster. SERIOUS performance hit. If you allow arrays >64K then just about every array access requires you to calculate and load the segment register. Address math sucks because if you have two 32-bit addresses A and B, A != B does not necessarily mean that they don't point to the same memory, and *X++ can require some serious work to do the expected thing.
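The "A != B" problem is easy to see with the 8086 real-mode rule, where the physical address is segment * 16 + offset, wrapped to 20 bits. The sketch below only models that arithmetic; the FarPtr struct and the function names are made up, and 286 protected mode (which looks segments up in descriptor tables) is not covered.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit "far" pointer: 16-bit segment, 16-bit offset.
struct FarPtr {
    std::uint16_t segment;
    std::uint16_t offset;
};

// 8086 real-mode translation: segment * 16 + offset, wrapped to 20 bits.
std::uint32_t linear_address(FarPtr p) {
    return ((static_cast<std::uint32_t>(p.segment) << 4) + p.offset) & 0xFFFFFu;
}

// Comparing the raw 32 bits, which is what a naive A != B does.
bool same_bits(FarPtr a, FarPtr b) {
    return a.segment == b.segment && a.offset == b.offset;
}

// Comparing what the pointers actually refer to.
bool same_memory(FarPtr a, FarPtr b) {
    return linear_address(a) == linear_address(b);
}
```

For example, 1234:0010 and 1235:0000 differ as bit patterns but both translate to 0x12350, which is why compilers had to normalize far pointers before comparing them.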
When they came out with the '386 you now had segments of 4GB each. This was at a time when a 2GB RAM module could have been camouflaged as a desk and would have required a 15 kW power supply.
Most programmers and OS designers just set all the segment registers the same (the '386 equivalent of the 'small memory model') and forgot about them (I called this traumatic amnesia).
So, yes: Intel has a Segment model that could be used to provide security, but few people are brave/stupid enough to risk the horror stories/ flashbacks that enabling it might entail.
Intel: Just short of intelligent.
Free Software: Like love, it grows best when given away.
An implementation with the stack growing toward higher addresses is somewhat less efficient. Here's why: a called function only gets the current value of the stack pointer (which is usually assigned to a frame pointer register). When the stack grows toward smaller addresses, it already points to the beginning of the arguments, and (in case the arguments are of different sizes) it is enough to figure out the size of an argument to find the stack location of the next. ...
<-stack grows/FP-points->| a |b| c
If 'a' can tell us its size, we can find the address of 'b', etc. Not so if we simply store arguments backwards into the forward-growing stack - to obtain the address we need to know the size of the argument the pointer to which we're trying to obtain:
| c |b| a |SP-> grows, points
(no way to find where 'a' starts without additional info) Therefore we have to pass the old value of SP (or the total size of all arguments) to the function on the top of the stack or in an additional register, to achieve this:
FP->| a |b| c |SP->
Doing a bit of searching I found a good article about this at buffer overflow. It details how no-exec stacks can be used to allow data to stay where it belongs and the actual stuff to be execed to always be kept in memory.
Why don't you do some more research and discover why the poster to whom you replied said that an unexecutable stack doesn't prevent buffer overflows from being possible.
Learn what a trampoline is (and why the poster mentioned it) and why the bandaid you propose doesn't stop the patient from bleeding.
I have discovered a truly marvelous sig, unfortunately the sig limit is too small to contain i
Yes, C++ is somewhat better than C because it does allow you to build abstractions that perform more checking and automatic resource management. But even if you do that C++ is fundamentally unsafe. Why?
C++ still uses the C pointer model and adds a similar reference model. And C++ still uses manual memory management for dynamic allocation. You cannot, in general, address those problems by creating safe abstractions. If you try, you end up severely limiting language semantics, and as soon as you face any outside library, you have to convert to raw pointers anyway.
And C++ still does not guarantee fault isolation among modules or any way of determining from the source code of a module whether that module is safe or not. That is, any piece of code you link with can cause arbitrary problems in any other piece of code, and you have no way of telling. Perhaps you think that's inevitable, but it is not. None of the other languages that I mentioned have that misfeature.
Arguing that one should not bother fixing those problems because there are lots of other ways in which people can make mistakes is wrong. The problems C/C++ creates for programmers are easily avoided, without performance penalty or other drawbacks. A day lost trying to chase some avoidable pointer bug in a C/C++ program is a day that could have been spent on testing and fixing some conceptual security bug.
I have been using C for 20 years and C++ since before its first public release (nearly 15 years?). I still use them a lot because that's what interfaces best with the software that's out there. At the time, they were reasonably good tradeoffs. But this is the year 2000, and tradeoffs that were good then are not good anymore.
Tom,
You're definitely in danger of becoming a one-trick pony here: in almost every post I've seen you make this summer, you've been raving about this ML stuff. I'm afraid your closeness to the project is limiting your objectivity on this issue.
Type-safe languages (and ML in particular) are not a programming panacea: they have neat features (and I particularly like ML, especially functors which are quite a great deal more elegant than C++ templates), but they have their drawbacks: you do get nice things like type-safety, bounds checking, (in ML/Haskell/et alia) functions as first-class values, generic programming done right......but your language features don't come for free.
ML is never going to be as fast as C (and I know that you can optimize more aggressively because of pervasive strong typing and lack of aliasing issues, and it's an interesting research area), since the lambda calculus is not the natural paradigm on a von Neumann machine.
Plus, performance aside, sometimes you want to use pointers, because that's the most expressive tool for what you're doing -- experienced C programmers often feel hamstrung when programming in stricter languages like Java. I like ML, I also like C, and Scheme, and Pascal, and Perl, and ..... ML doesn't necessarily work for everything.
Blatant jingoism alert: It's like safety scissors: not going to poke your eye out, but a clumsy tool for someone who really knows how to cut paper. ;)
Cheers,
Greg
Java's a great language. It was used in my first year computer science classes, which were an introduction to programming in general, and object-orientated programming specifically. If it had an ncurses-style library, I'd use it for my current project (a roguelike). But since stuff like that is against the whole idea of Java, I'm using C++ instead.
When I said "lower level", I meant "lower level compared to the languages I just listed". C's not low level, but people shouldn't go calling it high level either.
Yes, you /could/ educate the programmer, and they /could/ write a tight program in C/C++ (or even assembly).
The point is, for each programmer that does it right, there will be hundreds of non-educated programmers doing it the wrong way.
Why risk your business on your programmer knowing enough of their stuff. Why not just use a safer language?
Sure, C++ programmers are tough, don't need no bounds checking, don't need to comment/indent code, but what's the price to pay when someone didn't have enough sleep on the weekend and slips in an overflow vulnerability in there?
---
C is not fast. After conversion to the intermediate language, all high-level languages are indistinguishable, and that's just a tenth of the compile process. What you mean is the _compilers_ for C make fast code. C in no way lends itself to speed, with its generous use of pointers. Fortran may suck for OS programming, but it's faster than C for number crunching b/c the compilers are better and there are no weird constructs (like pointers).
As for inline assembly and blazing loops, I doubt that many people could hand code a bit of assembly that runs faster than the compiled version. Give the compiler a straightforward loop or something, and it often knows best. You could eventually get a better optimization by hand, but the trouble is almost never worth it.
C is small, though. But it's not like memory is a limiting factor anymore. The only time it matters is when the entire program exceeds the CPU cache, and memory fetches start hurting you.
>
You mean, run it through the interpreter and give it every possible input it might see? Buffer overflows don't happen every time a program is run; they happen (obviously) when a malicious user supplies bad data. C interpreters can help track down bugs which happen all the time, but I don't see how they help track down the bug in question here.
I like C too. I just think it's an inappropriate language for big programs or (especially) security-critical programs.
Hey dudes! I'm kinda new to this slashdot thingy here, but could some sweet, loving guy please explain the buffer overflow to me. It seems like a pretty cool idea, is it like sorta like going out with too many guys or something? help me please!
Oops...I did it again tour info, bio, and more!
So, if you don't use pointers, address-of, array subscripting, call-by-reference, or any library functions that do, yes, then you can write type-safe C++ programs. Too bad that you also can't do much in that subset of C++.
There is no problem with providing those unsafe constructs in a systems programming language. In fact, lots of languages do, just like C/C++. The problem with C/C++ is that the safe constructs and the unsafe constructs are indistinguishable, and that means that even the 99.9% of a program that can be written nicely with the safe constructs use the unsafe ones, and as a result are much more likely to crash.
Do non-intel/non-linux Unix boxes have the buffer overrun hole? If not, why not?
I always thought that the purpose of segmented arch was to simulate the 64K address space of older processors/and or speed up pointer operations on small-scale programs. But what do I know...
Maybe *you* want your languages handed down from the mighty and revered standards committee. That's fine, but don't try to keep me from using neat, helpful features. Whether or not the features may come in handy
Because it is actually fixed on the Intel architecture and broken on the VAX. It's really the OS's that are broken, limiting the Intel to being only a bad version of a VAX.
Oops. I was -going- to say that while the features may be handy, compiling with the -pedantic and -ansi switches should provide notification for -all- non-compliant code.
Heheh, at first I thought he was talking to me.
But, like I said, that is a rather pedantic definition of type safe. I don't see what pointer arithmetic, memory management, or lack of bounds checking have to do with type checking. They might encourage logic errors, but not type errors.
Although... gcc does let me pass an array of one static length to a function that expects a different length, but that is a type error only if you consider char[3] and char[4] to be different types. I just think of them as "arrays of char", the length itself has nothing to do with the type.
And it looks like Java does not let one assign a length to an array type, that's always an attribute assigned at run-time.
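For comparison, C++ does treat the length as part of the array type; it is the decay to a pointer in an ordinary parameter list that hides the mismatch. A small sketch, with length_of and takes_four as hypothetical names:

```cpp
#include <cstddef>
#include <type_traits>

// The language itself considers these two different types.
static_assert(!std::is_same<char[3], char[4]>::value,
              "char[3] and char[4] are distinct types");

// Reference-to-array parameter: no decay to char*, so N comes from the type.
template <std::size_t N>
std::size_t length_of(const char (&)[N]) {
    return N;
}

// Accepts only arrays of exactly 4 chars; passing a char[3] fails to compile.
void takes_four(char (&buf)[4]) {
    buf[3] = '\0';
}
```

Declaring the parameter as plain `char buf[4]` instead would quietly mean `char* buf`, which is why the compiler accepts an array of any length there.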
Is there a Linux or *BSD distribution that does away with C programs as much as possible? I would feel more secure having the services I want to be up (httpd, ftpd, dns, proxy, etc.) if they were written in a safer language.
I would really like a language which had and extended perl's notion of "taint"; this would probably double the security of the system by making it very difficult to do something bad.
-- Too lazy to get a lower UID.
The underlying problem is that C/C++/Objective-C do not have mechanisms to protect against these kinds of problems. In fact, it's impossible to write substantial programs in those languages that use only "safe" constructs. This is a peculiar and fundamental bug in the C-family language design.
There are excellent alternatives around. Modula-3, Oberon, Ada, Sather, and Eiffel all have efficient, free, open source implementations around, they all provide access to unsafe features when needed, and one of them should satisfy anybody's programming needs. Java is an excellent applications and server programming language, although it has a bit more overhead and no access to low-level features.
So, folks, get with the program and stop writing servers and other applications in C/C++.
This is a very old debate, and it's been raised on the kernel list several times. The problem is that it seems pretty clear that given a buffer overrun attack which can be exploitable without the stack-exec patch, it's possible to transform that attack into an exploit which will work with the stack-exec patch present.
It may require more work to create the exploit, but it's the sort of thing which only one person needs to do and then share with 100,000 of his best friends on some cracker web site. Hence, such a patch only provides the illusion of security, and it adds crap to the kernel. (There's all sorts of kludges you have to put in there to make sure that trampoline code doesn't break, etc., etc.)
Yet there are cases where C or C++ should be used:
Having said that, I think that all C++ programmers should seriously look into learning at least the fundamentals of Java (most of which, as C++ guys, you already know). I also think that if you're choosing a language in which to implement a new project, think twice before just jumping into C++. Yes, I'm a Java zealot.
And I hate to nit-pick, but C is not really low-level. I instead think of it as the only "medium-level" language, and this is part of why it's so useful. You can't write a device driver in VB, and you can't (reasonably) write an office suite in x86 assembly, but you can do both in C. Yes, Java does place more restrictions on you (although if you're an OOP nut like me, you don't really mind). This is why Java is the ideal language to teach OOP, because even if you think in structured or procedural programming terms, you are forced to implement them in OOP constructs. C++ allows you to go either way, and some people enjoy that freedom.
---------///----------
All generalizations are false.
--
I like to watch.
Tee hee. I've encountered the JDFB before.
Java compiled to machine code would be a very appropriate way to implement the network daemons that are so often the "root" (so to speak) of security problems in unix.
I'm not a big Java fan (it has its weaknesses) but here the important issue is type safety, which Java definitely has. It also has name recognition and a familiar paradigm.
The problem isn't Intel's fault because the arch has an execute bit in the segments. The original idea was you put your code in a separate code segment from your stack and data segments. The real problem is OS designers who for various reasons decide that the x86 arch's segmentation should be ignored and set the code segments equal in size to the data segments and stack segments. It then becomes a simple matter to just jump into the data or stack segment and begin executing code.
Of course since most of the OS's don't properly use the protection mechanisms Intel has provided, I guess it becomes Intel's fault if they don't extend the arch to support a feature and potentially break downward compatibility with other OS's using the current paging system.
In C++, this might look like:
char *p = unsafe::allocate<char>(100);
char c = unsafe::ref(p,10);
float f = unsafe::castref<float>(p,0);
Of course, C++ would also need to eliminate the unsafe constructs it has outside namespace "unsafe", and in some cases add safe constructs to replace them. Then, you could limit the use of unsafe constructs to only the few places where you actually need them. That greatly reduces the probability of making errors. Languages that do this exist: Modula-3, Oberon, and others. There are no such languages in widespread use yet that look like C or C++, unfortunately.
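To make the idea concrete, here is one way such a namespace could be spelled in real C++, using templates such as allocate<char>(100) for the operations that take a type. Everything in it is hypothetical: no such namespace exists in the standard library, and release and round_trip are additions made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace unsafe {  // hypothetical; nothing like this is in standard C++

// Raw array allocation; the trailing () zero-initializes the elements.
template <typename T>
T* allocate(std::size_t count) { return new T[count](); }

template <typename T>
void release(T* p) { delete[] p; }

// Unchecked element access: the caller promises index is in range.
template <typename T>
T& ref(T* p, std::size_t index) { return p[index]; }

// Reads the bytes starting at p + index as a To; memcpy keeps the
// reinterpretation well defined.
template <typename To, typename From>
To castref(const From* p, std::size_t index) {
    To out;
    std::memcpy(&out, p + index, sizeof(To));
    return out;
}

}  // namespace unsafe

// Every risky operation in this function is visibly prefixed with unsafe::.
float round_trip(float f) {
    char* p = unsafe::allocate<char>(100);
    std::memcpy(p, &f, sizeof f);
    char c = unsafe::ref(p, 10);
    assert(c == 0);  // untouched bytes are still zero
    float out = unsafe::castref<float>(p, 0);
    unsafe::release(p);
    return out;
}
```

This only labels the dangerous calls; without the second half of the proposal (removing raw pointers and casts from the rest of the language) nothing forces anyone to use it.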
Blame the language! C and C++ continue to be inappropriate for security-critical work.
Aside from speed-critical stuff like kernels and Quake 3, I don't see the need to write programs in C and C++ any more.
Let's start using modern languages with type safety. They're easier to write programs in (because debugging is easier) and not that slow.
I know that I'd gladly take the 2x speed hit on my security-critical apps (mail daemon, web server, ssh, etc.) to know that they cannot have this kind of bug in them, because they were written in a language like ML, Eiffel, Haskell, or even Java.
It comes to a point where you have to say, that's just really REALLY silly.. I mean, who's putting these stories through?
It's like saying FORD is responsible for car wrecks, or Delta is responsible for plane crashes (well.. okay.. maybe Delta is, but not Ford... so lay off).
Yeah, Intel makes chips.. code that runs on these chips is potentially exploitable? This article's solution is to make the chip more complicated instead of working towards better software.
It seems to me that if you consolidate code into libraries (as is the latest fashion statement) and stop re-inventing the wheel, then you can focus on actually FIXING the problems and plugging the holes.
Now some of the shi..err..stuff in there does make sense... like having a memory flag that says 'don't write here', but I get confused over how that gets controlled... if you have memory that is tagged with 'don't write' then how do you re-allocate the memory once the program is done? If the OS can turn off that flag, chances are some hacker is gonna figure out how to as well... they'll write some nifty exploit and the scr1pt kiddies will run with it... we'll be right back with the same problem.. software..
oh well.
Price, Quality, Time. Pick none. What, you thought you had a choice?
> What you are forgeting is that the halting problem is partially solvable. You can easily make a procedure that says "Yes, this program terminates on this input" for any program that does terminate.
:P
Or (hehe) how will you answer this question: "will this so called 'halting problem solution program' halt given a specific program?"
You see, you just described a program that 'recursively enumerates' the set of all programs that terminate in a given time (say the time you're willing to wait for them to terminate)... The problem discussed was *decidability* -- whether you know *for sure* that this program will terminate...
Yes, there are partial solutions to that too.. but the two are not even close.
struct Cast { int x[1]; float y; };
int float_bits_as_int(float f) {
Cast c;
c.y = f;
return c.x[1]; // deliberately reads one past the end of x, i.e. the bits of y
}
Here is another example:
int float_bits_as_int(float f) {
float *p = new float;
*p = f;
delete p;
int *ip = new int;
int v = *ip;
delete ip;
return v;
}
These "logic errors" are related to type errors: they allow the bits of an object of one type to be interpreted as the bits of an object of another type. A system that's type safe guarantees that that doesn't happen.
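For contrast, when reinterpreting bits really is the goal, C++ can do it explicitly and with defined behavior by copying the bytes. A minimal sketch, assuming a 32-bit float; float_bits is a hypothetical name.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Returns the object representation of f as an unsigned 32-bit integer.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copies bytes, no aliasing tricks
    return bits;
}
```

The difference from the examples above is that the conversion is visible at the call site rather than falling out of an out-of-range index or a reused allocation.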
In any case "type safety" doesn't just mean compile time type safety. Java has a lot of runtime type safety, where type errors are caught at runtime, not by the compiler. That's still fine for many purposes. C++ has neither.
I wouldn't blame the intel architecture, but..
There are architectures (Gould/SEL-32/xx is one) that allow for and, in some cases, insist on strict divisions of code and data pages. The code sections are read only, and will generate a fault if an attempt to write to it occurs from a non-system level.
The data sections are read/write, but you cannot branch there.
It makes it a bit difficult to write self-modifying code, but not impossible if you really need to.
---
Interested in the Colorado Lottery?
Interested in the Colorado Lottery or Powerball games?
check out http://colotto.com
If you store the return address on the stack BEFORE the buffer, then an overflow in the buffer would only corrupt the data and could not modify the return address, no?
It would require a change to the ABI, so it's unlikely that this modification happens anytime soon of course.
It is so obvious that I must be mistaken, could someone tell me where I'm wrong, or is it the ABI change which makes it impossible??
"But, like I said, that is a rather pedantic definition of type safe. I don't see what pointer arithmetic, memory management, or lack of bounds checking have to do with type checking. They might encourage logic errors, but not type errors. "
Well, the definition of "type safe" doesn't really matter as far as the argument goes, as long as we understand what each other means. Typing is indeed a separate issue from safety (though we usually get safety through typing); maybe it should just be called "safety".
But I caution you in case you ever pick up the ACM Transactions on Programming Languages -- to the PL community, "type safe" means that the language does not possess any unsafe features whatsoever.
Executing code off the stack is extremely helpful
Can you give some examples? Specifically, can you give some examples of where building the new executable code on the heap rather than the stack isn't just as helpful?
Slashdot - News for Herds. Stuff that Splatters.
Unfortunately, "BetterC" has the side-effect of making your program look like it was written in Fortran...
We live, as we dream -- alone....
Buffer overflows can be tackled at many levels, but to me only the language one makes sense. We've had >50 years of computing and we are still firmly entrenched in using languages where we have to do bounds checking manually! Efficiency shouldn't be an issue either.
The article puts quotes around the words "code" and "data" and that is the problem. In i386, CODE segments can either be read/execute or execute only. DATA segments can either be read/write or read only - not execute.
So what is an Intel-based, flat-mode program to do? It sets up two segments - one data, one code - pointing to the same memory. Goodbye hardware security.
Of course the VM doesn't protect against execution - that is the segmentation system's job. Linux (and anything else that assumes a 68k or VAX flat address space) just blows it off.
Simple solution - bring back separate code and data. Excuse me, I and D. Just like the PDP-11 UNIX grew up on.
For an example of an OS that does in fact use the hardware to enforce encapsulation, see EROS.
EROS is a pure capability-based OS. A read capability is distinct from a write capability, which is distinct from a start capability, which is distinct from a resume capability.
Buffer overflows are the fault of the implementers of the OS and the apps in question, and they can't push the responsibility off on the languages. Anyone who doesn't know by now that C doesn't have strings or arrays, shouldn't be writing any code that needs to be secure.
-jcr
The only title of honor that a tyrant can grant is "Enemy of the State."
You're right. I've been programming C++ for about 6 years solid and I find it hard. I realise how much effort it takes to write good C++ code when I program in Java... the whole time I'm worrying that I'm forgetting something because it's so much easier! Even so, I wouldn't ever work on a straight C job as I can't express myself in that language and it lacks many features of C++ that I find reassuring from a software engineering perspective (see, I'm brainwashed!).
Thanks for that.
For a language like C to have bounds checking, a compiler would have to insert code into the target object code file to check the bounds. Also, any arrays passed to a function as a pointer would also need to have the bounds passed in as well.
So that's two extra compare instructions per array index and an extra size_t per function call.
You can keep it.
By all means, make it an option. The C standard puts bounds breaking under a category called "Undefined behaviour", meaning that a compiler can do anything. Generate an error, core dump, make demons fly out of your nose.
But require bounds checking? No thanks.
Bill, likes C.
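To make the cost in the comment above concrete, here is a rough sketch of what a compiler-inserted check amounts to when written by hand. The function name get_checked and its return convention are made up for illustration; no real compiler is claimed to emit exactly this.

```c
#include <stddef.h>

/* Hypothetical checked accessor (not from any library).
 * The extra "len" argument is the bound that has to travel with the
 * pointer; the two comparisons are the per-index cost being discussed.
 * Returns 0 and stores a[i] in *out, or -1 if i is out of range. */
int get_checked(const int *a, ptrdiff_t len, ptrdiff_t i, int *out)
{
    if (i < 0 || i >= len)   /* lower and upper bound: two compares */
        return -1;
    *out = a[i];
    return 0;
}
```

With an unsigned index type such as size_t the lower-bound test disappears, so the check can be a single compare; the extra length argument remains either way.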
With everyone bashing C and C++ here, maybe they should *upgrade* to C#?
Haha, maybe not. I am hoping for the second coming of Java though.
"...we are moving toward a Web-centric stage and our dear PC will be one of
EverCode
No, it really is true. Define the problem rigidly and I will give you a reduction to the halting problem, easily.
It boils down to the fact that an address in C can come from anywhere -- from arithmetic, or the result of evaluating an expression, running another program, anything. Proving at compile time that this pointer points somewhere "safe" is tantamount to statically predicting that the program terminates. And Turing has given us a great proof that this is impossible in general.
By slashdot standards, withdrawing such a poorly thought out piece of crap may seem graceful, but in many other environments people are routinely expected to be graceful enough to refrain from publishing such tripe in the first place.
Slashdot - News for Herds. Stuff that Splatters.
I guess this makes me feel better. I know that half (or more) of the code I write would be laughed at by better programmers. It doesn't stop me from coding though. Like my Dad always says, "No matter how good you are, there's ALWAYS someone better!". So big deal, Bruce made a mistake. It makes me feel good about the fact that I can only do the best I can do and I can't waste my time (too much) worrying about how good someone else's code is.
Thanks Bruce...
--
Quantum Linux Laboratories - Accelerating Business with Linux
* Education
* Integration
* Support
*Condense fact from the vapor of nuance*
The SPARC chip has the concept of eXecute bit for pages.
In the SPARCv7/v8 ABI the eXecute bit must be set by default to be ABI compliant, but Solaris from 2.6 onwards provides the facility to turn off the eXecute bit for stack pages by setting noexec_user_stack=1 in the /etc/system file.
The SPARCv9 (64Bit) ABI has the eXecute bit for stack pages off by default.
Having said all that, it is still possible to exploit buffer overflows, but this does provide a basic level of protection.
They need to be deprecated more forcefully. All the unsafe functions should be pulled from the standard C library and moved to something like "deprecated_unsafe_library.h". All set-UID programs need to be purged of those functions. Now. Any manufacturer shipping a system with those functions in a security-critical program should be sued for gross negligence.
So, the question is: is there an efficient way to do this in a macro?
--
Ben Kosse
Remember Ed Curry!
Buffer overflow attacks form a substantial portion of all security attacks simply because buffer overflow vulnerabilities are so common and so easy to exploit.
However, buffer overflow vulnerabilities particularly dominate in the class of remote penetration attacks because a buffer overflow vulnerability presents the attacker with exactly what they need: the ability to inject and execute attack code.
The injected attack code runs with the privileges of the vulnerable program, and allows the attacker to bootstrap whatever other functionality is needed to control (or "own") the host computer.
M$: "We're #2!"
Yes - this is the libsafe I mentioned in my previous post. It causes some programs to crash or behave badly - for example, XV coredumps.
Here's a pretty good description of how to write a buffer overflow: ftp://ftp.technotronic.com/rfc/phrack49-14.txt Jim
As with every RPM or DEB you download, you of course can never be 100% certain of its integrity. You can't even be confident of the integrity of source unless you meticulously read each line. At some stage you simply have to trust ppl :).
Packages signed with keys are all very well, but how many ppl actually verify them? How many RPMS have you grabbed from rpmfind.net and installed?
Intel's 386 protected mode has "code" segments and "data" segments. Writes to code cause a segfault. Branches into data cause a segfault. But if the OS points the code and data segments at the same area of RAM, the 386 doesn't care.
<O
( \
XGNOME vs. KDE: the game!
Will I retire or break 10K?
The Tao of Buffers
Escapes most but plagues many.
Has Intel caused this?
What part of libc? How is any code in libc useful without a function call setup?
Wait, I think I get it. You tailor your overflow to write a return address AND overwrite the previous function's stack layout, so that when the libc code executes it uses the previous (now overwritten) stack. You've also written the stack so that upon a return from the function the registers are popped from the stack that you've tailored.
Plus or minus some code. I think I now get it.
It is still quite a bit harder than current shell code.
-- I am not a fanatic, I am a true believer.
Nonsafe:
Read or write reference to any array.
Casting is safe enough. The problem is that any indirect reference to any kind of array is unbounded and capable of accessing all of available memory.
If you don't like C, fine. But I do like C and I use C. Again, fine. Different code for different folks. Where you're wrong is making a blanket statement as to why people use C. I won't deny that you could find people who actually believe that reason. I use C for entirely different reasons. But I won't tell you what the reasons are, because I'm not flaming you about specific reasons; I'm flaming you for generalizing totally inappropriately.
now we need to go OSS in diesel cars
If you mean Bruce, I guess he did this not for self-promotion (I guess he has enough already), but because he thought so. So now he's convinced he was wrong. Great, he recognised his mistake and corrected it. To err is human; everybody can do stupid things. I did them many times. But it is much harder to come before the crowd and say: "Well, guys, sorry, I was stupid, now I see it, thanks for pointing it out". Believe me, not everybody has the common sense and brains to do this.
-- Si hoc legere scis nimium eruditionis habes.
Meanwhile, does anyone remember the iAPX 432? It was a flop for several reasons: Ada flopped, its performance was bad, it was a real departure from the architectures of the day. But it had some real innovations - every function ran in its own protected space, using message passing for communication, and your program could protect itself from itself. Is it time to revisit that sort of architecture?
Thanks
Bruce
Bruce Perens.
Crispin
-----
Immunix: Free Hardened Linux
Chief Scientist, WireX
A lot of C could be written in something better, but C (or C++) is pretty much mandatory for a lot of tasks.
Ask Linus.
Ask Larry Wall.
If you're involved in a LARGE project (millions of lines of code across dozens of large packages) it's pretty much a necessity to use something like C++. (Eiffel and Ada probably fit the bill here too, but I have no hands-on experience with either.)
P.S. Pascal isn't high level. Pascal is C, simplified for teaching, but much much worse.
As I said myself, those languages have smaller user communities, and that means they have fewer libraries and tools for them. But you can call C and C++ code from them and they are quite usable, in particular on Linux.
I don't recommend doing everything in them, but give them a try for some projects, in particular open source projects for Linux. That's the only way this chicken-and-egg problem of moving beyond C++ will get addressed.
Intel won't do anything to fix their aching old architecture, but they'll give us MMX.
MS needs to include a browser, but to hell if we need virus protection or some other OS-level protection against rogue processes.
love is just extroverted narcissism
As someone who sees attempts against his own system on IRC every day, and sees newbies announce to the world in general "Hi I'm running wu-ftpd on Redhat 6.0 and ARRGHH what the hell just happened? Who's this guy on my box!?", I decided to investigate the feasibility of providing safe versions of commonly run services.
Libsafe is quite good but can't catch everything and breaks quite a few programs if you set it up in ld.so.preload.
The StackGuard compiler is definitely more robust and seems to work well during the course of my tests.
I've prepared RPMS of BIND8.2.2pl5 and Wu-FTPD 2.6.1 with StackGuard 2.0 Stout for RH Linux 6.2.
As I've just prepared the RPMS this week on my Slackware 7.1 system, I don't know how well they perform, as they haven't received a great deal of testing (no bug reports so far, though).
The RPMS are available at http://indigo.ie/~fowler/ELSL/ with more daemons to come soon and hopefully DEBS. Try them out and mail me with any problems (my email address can be read from the results of rpm -qpi file.rpm).
Good luck and keep safe on the net
Gnubie_ Efnet #Linux
What about that huge chunk of interpreter, written in C or C++? Have you audited that, too?
In a higher-level language, the simplest code can have side effects that might provide a security hole, so to reassure yourself you're going to effectively have to audit the behaviour of the interpretation of your program, not just the program itself. In C, at least, you know when you're making a function call, and you can be reasonably confident everything your program does is done explicitly rather than being hidden.
I went to a rather informative lecture by a person whose business is selling security services and also works on OpenBSD. His view was that C programs, calling a minimal set of libraries (excluding GUI libraries, amongst others), are the only things that should ever be suid root.
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
Don't blame the hardware, don't blame the language, blame the programmer. Relying on the hardware to fix bad programming style is like a parachutist relying on a safety net.
Any input operation that overwrites memory it is not supposed to is bad programming style. Ideally the programmer does not know what hardware their code will run on; maybe it will be a flat memory machine with zero memory management hardware.
In C, scanf("%s",foo) is nice and handy for little programs, but it is not production-level code. Production-level code should always use length-limited routines instead. So it is a little harder; maybe the first implementation has to be audited to remove the screw-ups. This is like checking the return values of printf and scanf: nobody does it in test code, but it damn well better be done in the final code.
Programmers need to limit themselves to limited input routines or at the start of the project development build a little library of limited input routines.
What someone really needs to do is come up with a "no-overrun libc" that does not include any unlimited input functions and spits horrible messages whenever a standard input function is called with arguments allowing unlimited input.
You link and run development code against this library and fix any place where it screams.
I really am surprised big development houses don't do anything like this. But of course no one has time to do it right.
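For what it's worth, one of those "limited input routines" might look roughly like the sketch below. The name read_line_limited and its return convention are invented for this example; only fgets, getc and strlen are standard library calls.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf without ever writing
 * more than cap bytes. The trailing newline is stripped. If the line
 * is longer than the buffer, the excess is read and thrown away so
 * the next call starts on a fresh line.
 * Returns the number of characters stored, or -1 on EOF/error. */
long read_line_limited(FILE *in, char *buf, size_t cap)
{
    int n = cap > INT_MAX ? INT_MAX : (int)cap;  /* fgets takes an int */
    if (n < 2 || fgets(buf, n, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[--len] = '\0';            /* whole line fit: drop the newline */
    } else {
        int c;                        /* line was cut short: drain the rest */
        while ((c = getc(in)) != EOF && c != '\n')
            ;
    }
    return (long)len;
}
```

Called as read_line_limited(stdin, foo, sizeof foo), it does the job of the scanf call above without being able to run past the end of foo. A field width in the format string, e.g. scanf("%15s", foo) for a 16-byte foo, is the other standard way to put a limit on scanf itself.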
I remember having seen a -fbound_check option on a compiler on a Sun machine, but I'm not sure if it was the GNU C compiler or the Sun C compiler.
I was almost sure it was GCC, but I can't find anything related on my Linux box (and I've checked in the architecture-specific flags).
Does anybody know anything about it?
Ciao
----
FB
That being said, yes, I'm not a fan of C. My favourite languages are perl and Java. If it wasn't for people being so closed-minded to change, you'd see other languages being used a lot. In the current situation, there's the feeling that you must use C or C++ or you're doomed.
This isn't flamebait. Remember, not all
Very possibly... before the iAPX432 came out (I still own the original architecture manual), there was the Burroughs x700 mainframe architecture. It had 48-bit words, each with a 3-bit tag... one of the tag values was reserved for executable code. It also used an indirect base+displacement pointer architecture which allowed hardware bounds checking.
I did quite a lot of kernel hacking on that as we had the full source code. Unless you did something stupid with the process dispatcher, it was impossible to overwrite memory or hang the system.
Elliott Organick wrote an excellent book on the architecture. I think with some modifications this would make an excellent (and fast!) Java machine...
I know that.
For speed decreases, I am referring specifically to run-time bounds checking.
I work on an ML compiler, and have seen it beat C code performance-wise -- bounds checking on array accesses is the big thorn in our side for systems programming. (But I'm also saying that I'd gladly take the bounds checking and safety any day!)
good programmers think algorithmically
This is exactly my point - stupid imperative languages make programmers think algorithmically. You should be thinking about what you are trying to accomplish and let your compiler worry about how to do it!
sizeof() is compile-time. Contrary to what someone else said, it's not about pointers, it's about the result of the sizeof operator being a constant.
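A small sketch of the compile-time point, with made-up names (buffer, BUFFER_SIZE and the two functions are only for illustration). It leaves aside C99 variable-length arrays, where sizeof is evaluated at run time.

```c
#include <stddef.h>

static char buffer[64];

/* sizeof is resolved by the compiler, so it can be used where a
 * constant expression is required, such as an enum value. */
enum { BUFFER_SIZE = sizeof buffer };

/* Where the array itself is visible, sizeof gives the array's size. */
size_t size_seen_by_owner(void)
{
    return sizeof buffer;
}

/* Here only a pointer is visible, so the constant is the size of the
 * pointer; nothing about the caller's array is looked up at run time. */
size_t size_seen_by_callee(char *p)
{
    return sizeof p;
}
```

Calling size_seen_by_callee(buffer) returns the size of a char pointer on the platform, not 64, which is why a callee cannot use sizeof to discover how big the caller's buffer is.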
--
Ben Kosse
Remember Ed Curry!
Ick. That's just the sort of mundane task I want a compiler for. As a programmer, I already have too much to worry about -- bounds checking is one simple task that I'd just as soon have the compiler do.
In most cases, the bounds check can be hoisted out of loops, so there's almost no overhead. In a perfect world, I'd like to see a compiler that, when given a high enough warning level, warns that it can't hoist bounds checks.
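Written out by hand, the hoisting described above looks something like this. Both function names are invented for the example, and a real compiler would do the transformation internally rather than in source.

```c
#include <stddef.h>

/* Naive form: the bound is tested on every access.
 * Returns 0 and stores the sum in *out, or -1 on an out-of-range index. */
int sum_range_naive(const int *a, size_t len, size_t first, size_t count,
                    long *out)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (first >= len || i >= len - first)  /* check inside the loop */
            return -1;
        total += a[first + i];
    }
    *out = total;
    return 0;
}

/* Hoisted form: one test before the loop covers every access, so the
 * loop body carries no check at all. Same results as the naive form. */
int sum_range_hoisted(const int *a, size_t len, size_t first, size_t count,
                      long *out)
{
    if (count != 0 && (first >= len || count > len - first))
        return -1;
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += a[first + i];
    *out = total;
    return 0;
}
```

The hoist works here because the set of indices is known before the loop starts. When the index comes from data read inside the loop, the check generally cannot be moved out, which is the case the warning wished for above would flag.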
Hey, pal. My post might be interpreted as flamebait (I didn't say they were bad languages, I said they are inappropriate), but I certainly believe it. And I am certainly not ignorant!
Buffer overflows are NOT just gets and scanf. Those are impossible to secure (I think the newer libc/egcs actually warn you at runtime/compiletime not to use them at all). Plenty of other library functions are unsafe in C; even home-grown array operations. If you've read bugtraq in the last few days, you know that "printf" formatting bugs are all the rage recently.
If "new" programmers are good at C, how come we continue to see buffer overflows in wu_ftpd, netscape's jpeg code, etc?
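For anyone who hasn't followed the "printf" formatting bugs mentioned above: the hole is passing user-controlled text as the format string itself. A minimal sketch of the safe pattern follows; the wrapper name format_user_text is made up for illustration.

```c
#include <stdio.h>

/* The unsafe pattern is printf(user_input) or
 * snprintf(dst, cap, user_input): any % directives in the input are
 * interpreted by the formatter. Passing the input as an argument to a
 * fixed "%s" format makes it plain data.
 * Returns what snprintf returns: the length the text needs. */
int format_user_text(char *dst, size_t cap, const char *user_input)
{
    return snprintf(dst, cap, "%s", user_input);
}
```

With this wrapper an input such as "%x%x%n" is copied through character for character instead of being interpreted, and the cap argument keeps the write inside dst.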
Blame the developer!
Sure, some operating systems or languages or chips hold the coder's hand and make some dangerous things impossible or difficult to do.
It's still the programmer's fault for not knowing what the (void*) they're doing.
This is the same argument as "C++ is slow!" It's only slow if you don't bother to learn what code a C++ compiler generates, using lots of mechanisms without realizing it. C++ implements its mechanisms as tightly as it can, but every mechanism you use takes some time to operate.
Back to buffer overrun security: If you are gonna accept data from an untrusted source, why are you (1) putting it on the must-be-kept-inviolate stack, (2) not doing everything in your power to accept no more than n bytes that have been allocated?
If the compiler docs specifically say "data in auto variables will never be put into an executable address space," and it does, then it's time to fix the compiler or docs. Likewise if the docs belie the behavior of a chip, time to fix the chip or docs.
Don't blame a microprocessor for your mess. Don't blame a language for your mess.
You have only yourself to blame.
Try this hypothetical: what if, instead of doing public speeches, politicians took to publishing their opinions in articles on the web? That way, if anything they say produces a bad reaction, they can just edit it away, and no one will be able to figure out what the complaints were about. Very convenient, eh?
My take: If you publish an article, and then later recant, the thing to do is to add a link at the top pointing to your later thoughts on the subject.
If I recall correctly, Pascal came before C, as it was created by Wirth in the late 60s. It's a teaching language, sure, but it's not C.
This is just wrong --
A bounds checking language is able to DETECT when a read/write goes out of bounds. It can deal with that however it wants (core dumping, or more nicely, raising an exception). The reason that buffer overflows are so dangerous in C, though, is that you can overwrite the return address on the stack, and make the program start doing weird things (like executing shell code inside the mischievous user's input).
I'm not sure what "strict memory checking thingy" is. The patch being discussed most often prevents code on the stack (mischief) from being executed, but it wouldn't prevent, for instance, setting the return address somewhere else in the original code of the program. Bounds checking would.
The problem is that many programmers do not take the time to harden their programs against such attacks. They are often too focused on the program's real purpose to bother with security.
And this is exactly how it is supposed to be. Why should programmers waste time worrying about something that a language designer and a compiler are supposed to take care of?
Another problem is that the language shapes the way we think about programming. Good language makes it easier to do "the right thing" and makes it harder to do "wrong and potentially insecure" things...
Please, readers of slashdot, don't make up words when you write. I know that sometimes it takes too much effort to speak a whole extra syllable, hence we abbreviate words when speaking. In the case of 'exploit' you are only dropping one letter to make it into 'sploit'. Dropping the extra syllable is not important in reading. Dropping the letter doesn't justify the consequent grammar problems. It's one letter 'e' that is already close to 'x' on the keyboard. Since 'x' is below 's' on the keyboard, it takes no more effort to keep it as an 'ex'.
Tytso writes:
The grammar nazi is not quite sure what to make of this sentence. Why wouldn't the stack-exec patch do what it was supposed to do and prevent any buffer overrun attack? If you are arguing the age old "build a thicker door and the enemy will build a bigger ram to knock it down with" debate, then duh. Thanks for stating the obvious. The only fact that this sentence efficiently points out is that tytso likes unclear run-on sentences.
I'd much rather Debian developers be cats than sheeple. Besides, I like cats.
Apparently, there are a large number of people who are too challenged by languages that allow direct memory access to use them effectively. I might suggest that rather than 'blaming' a language for the existence of security vulnerabilities based upon overrun exploits, those people reconsider the source of the problem as the programmer who is too challenged by the task of writing good code. It would obviously be better if these types of programmers avoided such languages altogether, and instead just worked with higher level languages that don't allow direct, unprotected memory access, since such programming requires a level of understanding, effort, and diligence that makes it difficult for them to create defect-free software.
the Harvard architecture.
There's a separate data bus, and a separate instruction bus. I don't think it is strictly required to have a separate data memory area and a separate instruction memory area, but I think it's usually implemented that way. There are a number of microcontrollers that use this architecture, storing the program in a ROM and accessing a RAM chip for scratchpad area.
If tits were wings it'd be flying around.
An Anonymous Coward thinks,
"This is total nonsense. In the end every compiled language gets compiled into the same x86 code that a C or C++ program gets compiled into. Therefore the problem then becomes a compiler problem. The compiler could (without any extra work on the part of the programmer!) emit code that protects against buffer overflow attacks. Indeed, the StackGuard compiler does just that."
But I say:
The idea is that where the compiler can't statically prove an array access is safe (some modern compilers can), it inserts bound checks. There's no "problem" here, except for the frequently-maligned speed hit (StackGuard has speed penalties too, even when not using arrays at all!).
Read recent BugTraq articles about how StackGuard does not protect against certain kinds of overflow attacks (like printf formatting bugs, in particular).
The developer is the one who writes the software and they can do the bounds checking in their program.
Slashdot have their own problems as my signature shows!
----
----
Another Security Problem in Slashdot
Why?
Because a lot of people are forced to program in C...
I'm not a fan of C: the library has some horrid things (like routines without buffer-overrun checking), and the language is very low level. But when working with other people, sometimes C is a necessary evil.
What to do? You can get some higher level programming using BetterC, a C library that gives you Eiffel-like exception checking with a minimum efficiency penalty, and without leaving your favorite C compiler.
It's my Nirvana, I don't use debuggers anymore...
Lest you be confused by the +1 funny on my post, let me say that I am not joking.
2x slower is the most conservative estimate for the speed of modern safe languages against C code. (In practice I've seen much better. Does anyone trust benchmarks?) My point is, even if it is 2X slower, I'll gladly take it and sleep a little more soundly at night knowing that my linux box isn't being hacked due to 20 year-old issues. 99% of my box's CPU time is spent at Nice -19 trying to find big primes for the GIMPS project.
Modern languages (take Java if OO is your thing, but there are more interesting languages around) have SOLVED this problem with buffer checking (or static proofs that checking isn't needed). Without having to worry about this type of common security hole, programmers can spend more time on things we REALLY need: documentation, maintainable code, asymptotic speed increases, and the other possible security holes (e.g., not escaping shell metacharacters in user input).
See my thread on Functional Languages for what I think is a convincing argument about modern typed languages in general. I know my position is extreme, but that doesn't make it a joke.
http://slashdot.org/comments.pl?sid=00/07/01/23
a non-executable stack does nothing, you just return either into your data segment or into libc. this has all been hashed out before on various mailing lists. all of these patches only disable a particular method of exploitation, but the overflow still exists to be exploited in some other way!
this is security through obscurity, plain and simple.
Segmentation (what you're describing) would partially fix the problem. We had this with DOS... very hard to program for.
The flat (paged) memory model that most systems use these days can solve the problem in a similar way, just as the article describes. Other than the paging that the kernel sees, addresses are just addresses. This model is much more convenient to think about and use.
Like a system, any language can be as secure or insecure as you make it. One can write an extremely tight program in C++ while writing one in Perl or Java that leaves gaping security holes open.
This statement troubles me. C/C++ addicts who have little exposure to other languages have little knowledge of what they're missing.
_Many_ (if not most?) security attacks involve buffer overflows. You have to _work_ and _think_ to free yourself of buffer overflows in C/C++. In other languages, this protection comes for free.
Yes, it's possible to make a secure program in C/C++. But it's just a hell of a lot easier in bounds-checking languages.
So there.
It seems that a simple way to prevent writing to the return address on the stack would be to simply reverse the direction that buffers (strings) are stored on the stack. In this way if too much data is placed in a buffer, it would simply write past the top of the stack into la la land instead of back down the stack onto useful return addresses etc. Of course this would eventually overwrite the heap if too much data was supplied, but then you'd only be able to crash the program, not change its execution. It could also be detected if you checked for writing past the end of the stack segment.
There would still be the problem of passing pointers to buffers to lower functions on the stack, but anyone who is passing a buffer this way should also be passing its length.
BTW, is the way strings are stored (incrementing or decrementing) a property of the architecture, the OS or the compiler?
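On the point about passing the length along with the buffer, here is a sketch of what that convention looks like in practice. The name copy_bounded and its return values are assumptions for the example, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded copy: the callee is told how big dst is and
 * never writes past it. The result is always NUL-terminated when
 * cap > 0.
 * Returns 0 if src fit completely, -1 if it had to be truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    if (cap == 0)
        return -1;                    /* no room even for the terminator */

    size_t n = strlen(src);
    if (n >= cap) {
        memcpy(dst, src, cap - 1);    /* keep what fits */
        dst[cap - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, n + 1);          /* includes the terminator */
    return 0;
}
```

A caller with char name[16] would write copy_bounded(name, sizeof name, input) and treat -1 as "input too long", so nothing is written beyond the buffer whichever way the stack grows.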
"You saved 1968." - Ms. Valerie Pringle to the crew of Apollo 8