Analyzing Binaries For Security Problems
Matt writes "At the last talk at BlackHat in Las Vegas, Greg Hoglund demonstrated a product for sale by his new company that analyzes binaries for security vulnerabilities. He showed the analysis of several commercial products, the results of which were shockingly insecure. This product should help end the debate of closed source or open source applications being more or less secure."
Then again, it's not like virus scanners don't do the same thing.
Slashdot really needs an [Advertisements] section so I can disable it.
Isn't it kind of strange how they make such big claims but present no actual evidence?
Is that, provided you have the ability, then you don't have to sit around and wait for someone else to fix the problems in the programs you use...
Still, politics aside, perhaps with more applications like this freely available, perhaps more bugs will actually be fixed - rather than relying on security through obscurity - sitting tight and hoping no-one notices...
Leave me alone! - I can dream can't I ??
Huh?! So Windows users can finally revolt against Microsoft's security policies? I'd like to see some OS comparisons with this!
True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
You can get the slides of his presentation here:
http://www.blackhat.com/presentations/bh-usa-03/bh-us-03-hoglund.pdf
I'd like to know exactly how it does this, considering how much of a mess compiled/optimised c++ code can look at an assembler level. It's also unlikely to be any use on a semi-compiled runtime, such as those used by Visual Basic, .NET etc as the only 'code' is the runtime, the actual program is held in a data section.
I can't imagine this program to work very well - finding buffer overflows and other possible security vulnerabilities can be an immensely hard task when you actually _do_ have access to the source code. Also, the available compilers produce quite different assembly for the same code. This just all sounds a little bit too good to be true...
Homepage
$ file /usr/lib/jed/bin/w32/w32shell.exe
/usr/lib/jed/bin/w32/w32shell.exe: MS Windows PE 32-bit Intel 80386 console executable not relocatable
And voila!
Judging from the url, they don't have a lot of faith in open source software.
I just put my boss's Windows 2003 Server CD under a microscope to examine the binaries.. Started zooming in.. and then SNAP. The bitch cracked into 2. I'll put gentoo on the server now and just tell him that a security cracker broke his shit.
-B
If this can be used to detect, for example, buffer overflows, then doesn't it also help speed up a cracker's turnaround rate?
I mean instead of trying to find flaws instruction by instruction with some debugger, simply specify all exe and dll's in your %winroot% directory, press start, wait for the report and then manually inspect the highlighted areas.
So this analyses binaries and will find all issues where the code will halt and will exceed its resource requests, thus eliminating the need for testing...
I call Snake Oil.
If you don't know about the Halting Problem or the Busy Beaver Problem, you should really learn what computers can and cannot do.
I dare say these people have some basic pattern matching, but this is NOT a reason to stop testing.
An Eye for an Eye will make the whole world blind - Gandhi
any program you can crash by feeding invalid data has a (possibly exploitable) security problem, so there hardly is any program not vulnerable.
try to feed word or excel corrupted .doc or .xls files and see what happens, try to load corrupted images in photoshop or gimp.
probably that's no problem if you don't read files not produced by your own programs, are not connected to any network and are the only one to access your computer
:-)
Which isn't to say that this product is useless, it's entirely possible to have useful approximations or rules of thumb for checking programs out. Heck, that's how people mostly do it, and automating what people do is fair enough.
-WolfWithoutAClause
"Gravity is only a theory, not a fact!"
So actually you will end up with a report that cannot mention if you are safe or not, and no way to change the application if you think you are.
Snake oil. Very good against any kind of bugs, esp security bug whatever those may be.
This space is intentionally staring blankly at you
Let's look at the quote on the web page, shall we?
"The alternatives are to laboriously test software or meticulously review source code line by line. But these options are so time consuming and expensive that few companies will do it." (emphasis added)
So how exactly, as the article submitter says, will this "help end the debate of closed source or open source applications being more or less secure"? The product page already says that few companies have the time or money to check source code, and how many others do? Sure, it's great to have the source, but when you install apache do you check every single line for buffer overflows? Of course not. You rely on others doing it, and you rely on others doing it correctly. That may well be a mistake: are you sure someone else will check every revision line by line?
So, frankly, this product contributes nothing to open or closed source arguments, it's simply a nice tool to automate some reviews.
(as an aside, it appears that bugscaninc have made their choice over open and closed source,
Server: Microsoft-IIS/5.0
X-Powered-By: ASP.NET
The webpage says "report is created for each program identifying the specific locations of potential security vulnerabilities"
All programmers know that high level languages create very large binary files. A small program that prints few lines written in Visual Basic, might take hundreds of kilobytes space. Hundreds of kilobytes might mean even millions of lines of assembly code.
Let's take an example. The bugscan reports that there are bugs on lines 24.234, 93.234, 134.834, 342.234, 534.444, 767.835 and 822.511 out of 1.023.890 lines. The BugScan might even report that those lines are from abcd.dll, efgh.dll, ijkl.dll and aaaa.dll. Do you now feel relieved? No, I didn't think so either. I mean that BugScan might be very useful on low level languages, but when there are ten layers of different libraries between your code and the machine code, I bet the usefulness is not that high.
"This product should help end the debate of closed source or open source applications being more or less secure"
how so? who's to say *this* tool is an official measure of security? it's *a* measure. and how would you actually do the comparison? that statement just doesn't make sense.
Looks like a lot of hot air.
The PDF presentation tells us things that we know already (buffer overflow, race conditions, whatever).
Two screenshots show debuggers and disassemblers. Another screenshot shows the "analysis results" of the "tool": "wsprintf: This function is insecure, use another function." Even this info is useless, because wsprintf is insecure only if it is used the wrong way, and I bet the "tool" doesn't check that. Besides, everyone uses std::string these days (or at least should do so).
It's also worth noting that just about every university in the world has one or more groups working on topics like "automatic code verification", "code path analysis" and other things. This stuff is nowhere near rocket science, but a lot has to happen before it becomes usable by mainstream developers.
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
The moderators, under influence, had this as interesting just before I went to post this.
Seriously, put Debian on your server. You'll thank me many many times.
It's better to be the foot on the boot than the face on the pavement. ~~ tkx Kadin2048
but complex ones? I imagine what this software does is it scans the binary for things like instances of strcpy calls instead of explicit strncpy calls. Given that the software is likely not executed, how would it be able to catch more complex bugs? How can it find all instances of user interaction which could modify a variable where that variable is used as a parameter in strncpy for example?
Dollars to do[ugh]nuts says that even with a program that gets a clean bill of health, there are still countless bugs undiscovered.
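The guess above, that the tool mostly looks for calls such as strcpy, can be sketched in a few lines of C. This is only an illustration of naive pattern matching over a binary image, not a description of how BugScan works; the function names and the list of "risky" names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of a byte pattern inside a raw buffer. */
static size_t count_pattern(const unsigned char *buf, size_t len,
                            const char *pattern)
{
    size_t plen = strlen(pattern);
    size_t hits = 0;

    if (plen == 0 || plen > len)
        return 0;
    for (size_t i = 0; i + plen <= len; i++) {
        if (memcmp(buf + i, pattern, plen) == 0)
            hits++;
    }
    return hits;
}

/* Hypothetical list of "suspicious" names; not taken from BugScan. */
static const char *const risky_names[] = {
    "strcpy", "strcat", "sprintf", "gets"
};

/* Total number of times any risky name appears in the image. */
size_t count_risky_imports(const unsigned char *image, size_t len)
{
    size_t total = 0;

    for (size_t i = 0; i < sizeof risky_names / sizeof risky_names[0]; i++)
        total += count_pattern(image, len, risky_names[i]);
    return total;
}
```

A scan like this also counts "fgets" as a hit for "gets" and "wsprintfA" as a hit for "sprintf", and it says nothing about whether the call is reachable with attacker-controlled data, which is exactly the limitation the comment is asking about.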
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
Compressed executables, a la UPX.
A friend asked me to help her install an operating system on her brand spanking new PC. I have installed many operating systems - Debian, Slackware, Mandrake and Red Hat among them - and thought I knew a bit about the process. Boy was I in for a surprise!
The OS she wanted me to install was Windows XP Home Edition. I have never bothered with Microsoft software in the past, not since Bill Gates got all pissy-arsey about people making copies of "his" BASIC interpreter at the Homebrew Computer Club. Grow up guys! You liken your ideas to your babies, but babies eventually grow up, leave home and learn to survive without you! Well, Gates was basically saying that if people didn't pay for their software, programmers would go out of business because nobody would want to create software unless they got paid for it. Right ..... 'cause that's exactly what Richard Stallman and Linus Torvalds got famous for doing!
So I have never bothered with MS stuff, never having felt the need. But I figured, it could not be too difficult to install it, could it?
Windows XP comes on just one CD. First installation attempt sort of worked, but it was a bit flakey and it was a bit slow. And the desktop is just downright annoying - both in terms of colour scheme and general UI. It's a bit like KDE, but not quite. Only one desktop, for crying out loud! And it's slow and crash-prone. Just like Mandrake where you get a really bloaty stock kernel {drivers for god knows what compiled into it just in case anybody ever needs them}. So I figured, first thing we should do was maybe recompile the kernel. Never recompiled a kernel in Windows, never even run the damn thing. Never even likely to now.
Could we find the Kernel Configurator? Could we hell! And the command prompt was useless. It seems to be based on the old DOS command line. And it doesn't understand make menuconfig.
The kernel configurator was not the only thing we could not find. There didn't seem to be any Packages either. You know, stuff like KWord, KSpread and Kate. MySQL, Apache and a scripting language like PHP, Perl or Python. And some simple games. Just the basics. There is something called Internet Explorer, which is a bit like a cut-down Konqueror, but it's nasty to use.
So I'm guessing that the missing configurator probably is part of a Kernel Source Devel package which is not installed by default. In fact, almost no packages seem to be installed by default. And there are no .rpm, .deb or .tar.gz files on the CD. I've analysed it thoroughly and I found no sign.
In the end, I installed Slackware 9 and configured it to look as much like the Live CD as I could manage, but obviously not running everything as root. I can only suppose those missing packages are on another CD which we weren't sent for some reason or another. I mean, she has paid good money for the software, so she is entitled to get it! And the source code. Especially the source code! After all, if we can't check out that source, we have no way to be sure what we're running. It could be sending every keystroke to Microsoft, for all we know!
Anyway, my friend is well chuffed with Slack so I suggested to take the XP CD back to the shop and get a refund. But of course, that might be difficult seeing as she doesn't seem to have the full set. We'll keep you posted as this story develops.
Security problems are often interoperation issues. You can make sure a program is bug free, but this will not guarantee that your program is not going to fail if the rest of the pieces are not functioning properly. To analyze the interconnections, Open Source is required.
What does this have to do with open source vs. closed? Sure, in theory, every single person who downloads an open source program will review the code themselves to make sure there are no buffer overruns. If they find any, they will of course report them back to the maintainer, who will then fix the bug.
In practice, this doesn't really happen.
As an open source developer, I can assure you that very few people are interested in reviewing other people's code for free. I'm sure the bigger projects, like Apache and Linux, manage to get a good amount of code review -- but then, big closed source projects usually do ample code review, too. As for little open source projects, like the ones I run, you're lucky if people even take a peek at the source. Really, no one is interested. I do not believe that open source projects are any more (or any less) likely to have security issues than typical closed source ones (Microsoft aside).
As long as people are using C, there will always be buffer overruns. C is just that kind of language -- it makes it so amazingly difficult to do simple things (like allocate space for a character string) that programmers naturally take shortcuts (giving the string a static length) without taking the proper precautions (bounds checking). We can't make programmers not be lazy, so the only real solution is to move on to a better language.
I realise that this particular software may not actually decompile or disassemble anything, but this presents a very good reason for making reverse engineering of any software legal in any country: if I'm not allowed to make my own private analysis of a piece of proprietary software out there, how am I to know what it's going to do to my computer? How can I know that it isn't going to take liberties and do damage (such as installing backdoors) on my systems?
To be fair, many software packages I see for Windows machines these days do take advantage of this fact, such as by giving users adverts, invading their privacy, and withholding information from them about what their computer is doing. (One example is Freeserve, a UK ISP: some of their dialling software refuses to tell you what numbers your computer is dialling out to. This can be got round, but it's the principle of the thing...). For the past few years, I've refused to run any software on my desktop machine where source code is not made available, for that reason. If they are not prepared to reveal to me what they're going to do to my computer, then I'm not prepared to run their software.
Here's another question: if I have a copy of this software on a machine in a country where reverse engineering is allowed, but then I shell in to that machine (via ssh, vnc, or some other means which will allow me to control that machine remotely) from a country where reverse engineering is not allowed, and then carry out the reverse engineering over that link, is that illegal?
# bugscan bugscan
Segmentation Fault
Hehe
Slashdot Sig. version 0.1alpha. Use at your own risk.
Uhm well. Nice yea. But where's the article?
Does anybody have any idea how many binaries are protected nowadays, with encryption, obfuscation and/or compression?
If a program uses any kind of serial entry, CD check or other kind of 'protection' scheme, you can be sure the makers have run an obfuscation program like 'PEcrypt' on it.
Even then, I don't see this program unpacking unprotected executables that have been packed with UPX or one of the other dozens of PE compressors.
Simply put, this program will have VERY limited uses for normal consumers. The only one who could use it would be the firm who made the program in the first place, before obfuscation/protection/compression, but why would they? They have the source code. A source-code checking program would be MUCH more effective.
That's a good start.
I suspect that this product will flag a lot of false positives. After reading the white paper, I believe that the following code would be considered "insecure."
#include <stdlib.h>
#include <string.h>

char *duplicate(const char *input)
{
    size_t len;
    char *out;

    out = NULL;
    if (input != NULL) {
        len = strlen(input);
        out = malloc(len + 1);
        if (out != NULL) {
            strcpy(out, input);
        }
    }
    return out;
}
Note the use of the "evil" function strcpy().
Once before, while working at a client site, I was installing a 3rd party application. Well, in setting it up and looking for any security holes, I found a pretty large one. Apparently, the client application talks to a MSSQL server using a single account (which happens to have dbo access). Not only did it use a single account for everyone, but the username and password were stored as cleartext in the executable itself! Now granted, not likely that an end user would look there to find this information, but if someone did, and the client did happen to know someone breached the security, the only way to block the intrusion was to shut down the entire system. With the username and password hard coded into the executable, there was no way to change it without having the vendor make the change and send out a new executable.
Just goes to prove that MS programmers are a dime a dozen, but most of them are worth that too!
It is just a bunch of simple IDA pro plugins and it will give you a false sense of security.
Halvar has published his own open source version called BugScam on sourceforge
If you can't decrypt or decompress them, you can't run them anyway.
" He showed the analysis of several commercial products, the results of which were shockingly insecure"
Where will this end, if even their results are insecure? Can they be trusted?
Reminds me of my first microscopy class at U. The Zeiss phase-contrast oil-immersion scopes cost the equivalent of over $20000 at today's prices and they gave them to 18 and 19 year olds (almost all male) to use. The only things that ever got broken were the 2c cover glasses. It made me appreciate German engineering.
Panurge has posted for the last time. Thanks for the positive moderations.
Why do I get the feeling that when you watch TV you pass out because your brain becomes overworked and forgets to breathe?
1: Any halfway decent optimising compiler will inline things like strcpy() so it won't see them.
2: If it's not an external call to the OS and the binary shipped stripped of debug symbols how can it even tell what function gets called?
So it's going to raise flags on badly optimised code and it's going to spot some problems if developers bother to use it on debug builds. On random binaries it will fail and give a false sense of security.
they can't even likely tell what code is going to execute, so that severely restricts their options.
odds are they are just scanning for loops that copy until they find a null at the end of a string. (searching for resulting patterns from compiled strcpy as opposed to strncpy).
as most exploits are buffer overflows, this would theoretically catch all of them. it would also catch all sorts of potential buffer overflows that would never be possible given the level of user input (since it's not running the code, or disassembling, it can't know).
but this is why i made my own string object wrapper that stores the bound of a string (and a regexp to define allowed chars) - and then overload the cpy functions to prevent a string from ever copying a single byte more to itself than it should, and always makes sure it's nicely null terminated. but that's just responsible coding.
and it's easy enough to get a compiled binary from .Net
there's an option to 'finalize' .Net code on installation if you like. then the binary on the client machine is native code, compiled down to machine language on install (instead of execution), and optimized for their particular system (processor optimizations, api optimizations, etc).
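The bounded string wrapper mentioned in this comment might look roughly like the following in plain C. This is a sketch under assumptions: the poster's version is an object with overloaded copy functions and a regexp, whereas this one uses a fixed-capacity struct and a simple set of allowed characters, and it rejects bad input instead of truncating. All names here (bstr, bstr_init, bstr_assign, BSTR_CAP) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define BSTR_CAP 32  /* arbitrary capacity chosen for the sketch */

struct bstr {
    char        data[BSTR_CAP]; /* always NUL-terminated */
    size_t      len;            /* bytes in use, excluding the NUL */
    const char *allowed;        /* set of permitted characters */
};

/* Start with an empty string and remember the permitted characters. */
void bstr_init(struct bstr *s, const char *allowed)
{
    s->data[0] = '\0';
    s->len = 0;
    s->allowed = allowed;
}

/*
 * Returns 0 on success, -1 if src does not fit or contains a character
 * outside the allowed set. On failure the destination is unchanged.
 */
int bstr_assign(struct bstr *s, const char *src)
{
    size_t n = strlen(src);

    if (n >= BSTR_CAP)
        return -1;
    if (strspn(src, s->allowed) != n)
        return -1;
    memcpy(s->data, src, n + 1);  /* n + 1 copies the terminator too */
    s->len = n;
    return 0;
}
```

Rejecting instead of silently truncating is a design choice made for this sketch; a caller that prefers truncation would need to decide what a cut-off value means for the rest of the program.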
// "Can't clowns and pirates just -try- to get along?"
1. Many false positives, as apparently insecure constructs are totally secure given knowledge the programmer has about the source of inputs. E.g. a static buffer may appear prone to overflows, but maybe it's copying data with a known fixed size.
2. Many missing positives that depend on external factors: security settings, file visibility, encoding algorithms, etc.
My guess is that the false positive issue will make the approach unusable for any real software. If the developers can fine-tune that, the tool may be a good way of eliminating the most common kinds of security flaws in software today. But IMO crackers will simply find more subtle ways through the maze.
The key to software security may be good programming practice on the one hand, but that has to go hand in hand with simplicity and the elimination of unnecessary features, and transparency, so that security problems can be found by inspection and usage.
Ceci n'est pas une signature
From the PDF:
Use this logon to scan any binary free for blackhat attendees for the next 60 days...
http://www.hbgary.com/freeblackhat/
Now the people who sound like they know what they're talking about can actually try it out and prove it ;-)
The facts have a liberal bias. --The Daily Show
that sounds misleading. the white paper states that "for example, using strncpy " is a good security practice"
even though strncpy and strncat are actually used incorrectly MUCH MORE OFTEN than strcpy.
Let me explain. People that use strcpy tend to use malloc()ed memory because they know how it works, and that they have to supply a certain size before they copy in it.
However, almost nobody knows how strncpy works. (as for strncat, i don't recall seeing it correctly used)
i wouldn't call that "safe", i see most strncpy uses as "oh well there's probably an off-by-one there". (i'm not pushing for strcpy() use, it's horrible, i'm pushing for strlcpy() use, with which you know you understand the API, and you can detect truncation easily. google for the paper, and the stupid gnulibc objections)
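The strncpy complaint above can be shown with a short C sketch. The first helper demonstrates that strncpy leaves the destination unterminated when the source does not fit; the second is a copy routine modelled on the strlcpy interface, written out here because strlcpy is not available in every libc. The names is_terminated, strncpy_result_is_terminated and bounded_copy are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the first n bytes of buf contain a NUL terminator. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/*
 * strncpy() into a 4-byte buffer and report whether the result is a
 * valid C string. It is not whenever strlen(src) >= 4.
 */
int strncpy_result_is_terminated(const char *src)
{
    char buf[4];

    strncpy(buf, src, sizeof buf);
    return is_terminated(buf, sizeof buf);
}

/*
 * Copy src into dst (capacity dstsize), always NUL-terminating when
 * dstsize > 0. Returns strlen(src), so a return value >= dstsize
 * tells the caller that the copy was truncated.
 */
size_t bounded_copy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}
```

With a 4-byte destination, bounded_copy(dst, "abcdef", 4) stores "abc" and returns 6, so the truncation check is a single comparison against the buffer size.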
The PDF is from a 2002 conference and the link is thus months out of date and no longer allowing a free trial. ;-(
The facts have a liberal bias. --The Daily Show
The feet of man who uses hypotheticals may no longer be aground.
Never argue with a drunkard, a woman, or a fool.
Proof by analogy is fraud.
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Somewhere inside there is a pointer to a nul-terminated string buffer obtained with malloc(). Someone's got to write and maintain that.
By using most C++ STL/MFC classes you just single-threaded your app through the heap. At least - because you're assuming that C++ class is MT-safe. On a lot of platforms the "normal" C++ class library isn't really MT-safe. For example, on Solaris the IO streams libraries (used by cout and cerr) are decidedly MT-unsafe. Another is that in a multithreaded app even if cout/cerr is actually MT-safe, statements such as 'cout << "n: " << n << endl;' are not atomic.
So you wind up forced to use lower-level C-style operations that are "hard" so you can meet performance and/or data integrity requirements.
Users submit program binaries to the BugScan appliance via a web interface and a report is generated automatically. It's that simple.
Ahh, so because "not disclosing the source" isn't security-through-obscurity enough, they don't even allow you to see the binary of their own software?
Yeah, like I'd trust that.
So the plan is to automatically find possible security problems in assembler code, even though such a process is not computable. Hence the program either doesn't find every security problem or finds ones that aren't actually problems. Is this really a good way to judge software?
I mean, I can write a program that scans executables and tells ya which ones are good and which ones suck. It won't actually mean anything though.
As near as I could tell, for almost any executable you gave it, it reported there was a bug. The exception is that if you dropped its own executable on itself (even a renamed copy), it reported no bug. That seems pretty accurate to me.
Lots of people have been doing this for years.
;)
Check the University of Wisconsin's WiSA project. And, of course, the commercial solution
Standing on the shoulders of giants...
With the username and password hard coded into the executable, there was no way to change it without having the vendor make the change and send out a new executable.
if it was in cleartext couldn't you just edit the executable, so long as the new username/password was the same length you'd be set.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Sure it makes sense. If you analyse the open-source code and it comes up secure, and the closed-source comes up insecure, then you may have not quite proven, but you have at least bolstered, the assertion often made by the open-source lobby that open-source code is more secure.
Of course, it also could come up the other way, thus giving closed-source advocates more fuel.
www.wavefront-av.com
Greg Hoglund demonstrated a product for sale by his new company
Yes. But is it safe to run this 'product for sale' on your system? Has anybody scanned it to look for security problems?
And where do we download the source code for it?
A Good Intro to NetBS
BugScan requires users to log into the system in order to user its' services. Enter your username and password into the appropriate fields, and click the "Login" button. If you do not have an account, ask your Administrator to add one.
I asked my administrator but I couldn't answer myself.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
I get shot down every time I bring this up, but...the OpenBSD folks have a database of coding flaws that can cause vulnerabilities. Every time they find a new one, they go over all their code and fix all the other places that have it, whether they find an exploit at those spots or not. This is the process that makes them so secure. So...why not publish all this?
Whenever I bring this up, people say "just read the man pages for strncpy, you moron." Which is ironic, because if you read the paper I linked, you'll find that the OpenBSD team has found problems with strncpy, too...that's why they made strlcpy.
Even the excellent book Building Secure Software doesn't mention strlcpy.
German engineering is what won the war in Europe.
The German planes were beautiful, like works of art. Highly engineered, with beautiful consoles. Every model of plane had its own unique design.
The American planes were cheap, simply made, easily repaired and there were more of them.
rapists to make sure that there's enough room for these "criminals". Obviously, this is a more pressing situation! These people must be incarcerated at all costs! Even if a few people have to die and get raped.
However, you do have a point. Why is Airbus Industrie eating Boeing's lunch? Because their planes are cheap, and they have more common parts.
What I would really like to see is for them to run that program on that program. Now, that would be interesting! It would also help determine how much confidence I have in their software.
Sorry to nitpick...
Any sufficiently advanced technology is indistinguishable from a rigged demo
--Andy Finkel (J. Klass?)
Why not store them in encrypted format? Decrypt them on the fly so that the clear text is nowhere in the code. I like to use an RC4 encryption routine, where the key is stored as another string which is base64 encoded, so you decode64 the key, feed it and the connection string to your RC4 module, and get back a clear text connection string which is stored only in memory in a local variable that I reset to "" as soon as I establish the connection. There's a nice free COM object to handle both base64 and RC4 at http://www.paipai.net/texts/asp-cast128-1.1.htm. The overhead for this is about nil as far as I can tell. Even the code is pretty secure - reading it without knowing what objCast was wouldn't tell you what the password was, only how it was formed, I would have had to tell you what the com object was, and where the strings were loaded from. I don't use open source databases myself so I can't speak to that, but I think you can do this same routine for any database connection. The key and the encoded connection string I store in a tiny XML doc on the web server, the text of the xml doc is read at application start up and stored as text as an application variable thereafter. Allows easy mods to the passwords without touching the code, and without any overhead for constant file access. By the way, the old MS theory of "write a COM object to return your connection string and you're safe" is incorrect. If you do so, you can open the DLL with Notepad and see the password and userid in cleartext. The rest of the file is a jumble of meaninglessness, but the one thing you wanted to secure isn't. It's when I discovered this on another forum that I switched to the above paranoid method.
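For readers who have not seen it, the RC4 step in the scheme above is small enough to sketch in C. This is a generic RC4 routine, not the COM object the poster links to, and it leaves out the base64 decoding of the key. The function name rc4_crypt is made up. Note that RC4 is considered weak today, and that shipping the key next to the ciphertext hides the password from a casual look at the file but not from anyone who reads the code.

```c
#include <stddef.h>

/*
 * XOR the RC4 keystream for `key` over buf in place. Because the
 * cipher is a plain XOR stream, the same call encrypts and decrypts.
 */
void rc4_crypt(const unsigned char *key, size_t keylen,
               unsigned char *buf, size_t len)
{
    unsigned char S[256];
    unsigned char tmp;
    size_t i, j, n;

    if (keylen == 0)
        return;

    /* Key scheduling: start from the identity permutation and mix in the key. */
    for (i = 0; i < 256; i++)
        S[i] = (unsigned char)i;
    for (i = 0, j = 0; i < 256; i++) {
        j = (j + S[i] + key[i % keylen]) & 0xFF;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
    }

    /* Keystream generation, XORed directly into the buffer. */
    i = 0;
    j = 0;
    for (n = 0; n < len; n++) {
        i = (i + 1) & 0xFF;
        j = (j + S[i]) & 0xFF;
        tmp = S[i]; S[i] = S[j]; S[j] = tmp;
        buf[n] ^= S[(S[i] + S[j]) & 0xFF];
    }
}
```

Calling rc4_crypt twice with the same key on the same buffer restores the original bytes, which is how a stored connection string would be decrypted into a local variable just before use.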
The Golden Rule Of Programming:
Never check for an error condition you don't know how to handle.
I mean, what use is this? If you do not have the source, you may use this tool to check for potential security vulnerabilities. The result will leave you with a binary which you cannot change because you don't have the source, and with a list of potential vulnerabilities, which you can't validate without a great deal more of work which you would need to create working exploits. Failure to produce an exploit does not prove that there is no vulnerability, though.
And if you happen to have the source, what use is this tool? There are better tools to find this class of errors on source level.
Kristian
Isn't that what all OSes do? Take a binary and read it bit by bit and then analyse each bit and convert it into a format understandable by the CPU?
-- Leeeter than leet
This product should help end the debate of closed source or open source applications being more or less secure.
Open source is not inherently more secure, and people need to STFU about it. Many eyes only find more problems if many eyes are looking, and even then it doesn't always help enough.
Proprietary code includes Windows, but it also includes OpenVMS and Multics.
When someone might yell at me, it has to be OpenBSD.
So compiling a program is now a form of steganography??? That's a misguided and just plain wrong statement. Reading your comment makes me wonder if you've ever taken an assembly class, let alone a regular programming one.
Programs are compiled so that the machine can run it. Period. That's the only reason. Comments are stripped because they are pointless to the machine. The original code is further jumbled by optimization.
However, the fact that disassembly is difficult is a side effect, not a goal. Companies who produce closed source products may be happy about it, but it is nothing more than happy coincidence.
And if compilation is anything, it's encryption. Steganography has fuck-all to do with encryption. It deals with hiding. Encryption deals with making your data difficult to read. If you know the assembly code is there - guess which one it's closer to...
The stuff presented is hardly new. Halvar Flake
presented IDC scripts to analyze binaries in a
similar (if not better) fashion in 2000 ... does
it really take three years to rip an idea ?
Presentation Amsterdam 2000
Presentation Vegas Spring 2001
Presentation Vegas Summer 2001
Furthermore, there's a free sourceforge project
which has all of BugScan's features plus some
more:
http://sourceforge.net/projects/bugscam
So what's up with re-announcing old work in a
pretty new dress ? And if there's a free
alternative, why announce a commercial variant
on slashdot without mentioning the free one ?
If you give the people what they want to hear, regardless of it being true or false, not many would bother to even question whether it is true or not.
I don't see why this has to be a separate physical device, aside from being able to analyze programs without taking up your CPU time. Why not just sell it as a program?
Any binary content (read 'file') is just and only a number.
It's ridiculous to declare it illegal to analyze a given number because it can be viewed as a program (.dll), or to pretend that a number cannot be copied because it can be processed as a song (.mp3).
Nonsensical rules are void by definition.
What's in a sig?
However, using g_strdup_printf() in Glib (or asprintf() in GNU libc, but it isn't portable enough) will be a far easier option.
unless they're manually forcing pointers into their programs, it's unlikely that a buffer overflow is even possible
The problem with this is that any monkey with a LAN sniffer can see the userid and password as easily as if they were in the clear in the exe. Next.
slashdot troll = you make a compelling argument I do not like the implications of.
It doesn't sound any more useful than grepping through the source for calls to strcpy and sprintf... any clueful developers would have fixed those problems already.
"Freedom means freedom for everybody" -- Dick Cheney
Something I've been lobbying for for a while - it's time to deprecate "sprintf", etc. They should be removed from the standard include files. If you need them, you should have to explicitly include something like <unsafe/string.h>. This would break lots of programs on recompile, but after some work, things would be better.
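For comparison, C99's snprintf() already takes the destination size and returns the length it wanted to write, so truncation can be detected by the caller. A minimal sketch of that check; the helper name format_greeting is made up for illustration and is not from any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats "hello, <name>" into dst (dst_size bytes).
 * Returns 0 on success, -1 if the output was truncated or an error occurred. */
int format_greeting(char *dst, size_t dst_size, const char *name)
{
    assert(name != NULL);

    int needed = snprintf(dst, dst_size, "hello, %s", name);

    /* snprintf returns the length it would have written, not counting the
     * NUL. A negative value is an error; >= dst_size means truncation. */
    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;
    return 0;
}
```

Unlike sprintf(), the truncated case still leaves a NUL-terminated string in dst (when dst_size is nonzero), so the caller can decide whether to fail or carry on.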
Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
The programs you link to seem to work on source code, not on binaries. That's a major difference. With this new tool you can (theoretically anyway) check for security problems in programs that you don't have the source for.
Hmmm. Let's see how "unbreakable" this oracle thing really is.
If you HAVE the source code, use a source code analyzer like my flawfinder tool (or Viega's RATS tool). Source code analyzers can immediately identify where the problem is, and several are freely available. And as has been noted elsewhere, the problem with binary analyzers is that they may show where some possible problems are, but it's very difficult to actually FIX the binary without the source code. That doesn't mean this is a useless product; if nothing else, if you're planning to use a proprietary program, a tool like this one might help you begin to understand your risks.
- David A. Wheeler (see my Secure Programming HOWTO)
I have seen strncpy() abused so many times it makes me sick. Like you, I prefer strlcpy(), although I have no idea about the politics behind its adoption in GNU -- I just link in -lgen (the xpg4 lib) under Solaris and code away. I usually have -lgen linked anyhow for strecpy() and such. The Apache Runtime (apr.apache.org) has apr_cpystrn() which is fine, too.
I have actually seen code like this during a code review:
strncpy(a, b, strlen(b));
..where a is an array of chars and b is a "known good" ASCIZ char *.
What the hell is the point of that?
{
char *a = malloc(10);
strncpy(a, b, sizeof(a));
}
At least that won't overrun, but lord help the guy who tries to put five characters into b.
From the same programmer, I have even seen this:
{
static char a[10];
if (a)
strncpy(a, b, sizeof(a));
}
Okay, the strncpy won't overflow, but what's the point of checking if a is NULL or not?
I like to do this:
{
strncpy(a, b, sizeof(a));
a[sizeof(a) - 1] = (char)0;
}
I suppose we should point out for the neophytes that strncpy() doesn't write the trailing NUL if it fills the buffer. So the next time you read the string, you're screwed; you have no idea how long it is.
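For the neophytes, here is a small sketch of both behaviours: a bare strncpy() that fills the buffer and leaves no NUL, and the terminate-by-hand idiom from the snippet above wrapped in a function. The function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if a bare strncpy() left dst without a terminating NUL. */
int bare_strncpy_unterminated(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size);
    return memchr(dst, '\0', dst_size) == NULL;
}

/* Copies src into dst (dst_size bytes) and always NUL-terminates.
 * Returns 1 if src fit completely, 0 if it was truncated. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    assert(dst_size > 0);
    strncpy(dst, src, dst_size);
    /* strncpy writes no NUL when strlen(src) >= dst_size,
     * so the last byte is forced to NUL by hand. */
    dst[dst_size - 1] = '\0';
    return strlen(src) < dst_size;
}
```

With a 4-byte buffer and the source "abcdef", the first function reports an unterminated buffer, while the second leaves "abc" and reports truncation.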
Do daemons dream of electric sleep()?
If this program only searches for certain byte code sequences, or looks for link-time libraries, wouldn't it fail to find anything when the binary is compressed or otherwise protected?
Did anyone notice that this thing is sold as an "Appliance" and not as an application? That seems very... odd. There is no indication that this takes any special hardware. Anyone have any idea what the deal is with that?
David Whatley
I think it is not very useful.
It just outputs standard warning texts when it sees the use of some function, up to ridiculous things such as:
[1] (buffer) strlen:
Does not handle strings that are not \0-terminated (it could cause a
crash if unprotected).
or:
[1] (buffer) getc:
Check buffer boundaries if used in a loop.
Yeah, sure. If you need that kind of warning, you'd better not program in C.
Useful warnings are about gets or sprintf, but the gcc linker even has some of those.
To be of any value, an analyzer like this needs to actually analyze something. Like string functions that are outputting to a buffer in the stack.
Uh....firewall off your database from anything but the web server(s). There should be no connections between your sql server and the internet.
I think you'd really want to do at least static analysis per this tool as well as dynamic analysis of the executable to get much confidence from results on a binary with no source. (Rational Purify seems to do a good job of locating dynamic buffer overwrites, etc.).
The static analysis may catch some code paths that aren't typically executed, while the dynamic analysis can catch problems not evident from the source code. You still can't prove an executable is safe, but you can show where it is clearly unsafe. Putting both of these functions into a single product might make more sense.
Computers are not Turing Machines. They're finite automata which approximate Turing machines. It's entirely possible to determine whether or not a computer program will halt. Doing so can be very resource intensive, but it's still 100% possible.
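The finite-state argument can be made concrete: a deterministic machine with N possible states either halts within N steps or revisits a state, after which it loops forever. A toy sketch, assuming a made-up machine whose entire state is a single byte and where state 0 means "halted" (a real computer has astronomically more states, which is where the resource cost comes from):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_STATES 256
#define HALT_STATE 0   /* assumption for this sketch: state 0 means "halted" */

/* Transition function of the toy machine: next state from current state. */
typedef uint8_t (*step_fn)(uint8_t);

/* Returns true if the machine started in `start` eventually reaches
 * HALT_STATE, false if it enters a cycle that never halts. */
bool halts(step_fn step, uint8_t start)
{
    assert(step != NULL);

    bool seen[NUM_STATES] = { false };
    uint8_t state = start;

    while (state != HALT_STATE) {
        if (seen[state])
            return false;   /* same state again: deterministic, so it loops */
        seen[state] = true;
        state = step(state);
    }
    return true;
}

/* Two illustrative transition functions. */
uint8_t count_down(uint8_t s) { return (uint8_t)(s - 1); } /* always reaches 0 */
uint8_t add_two(uint8_t s)    { return (uint8_t)(s + 2); } /* odd states never reach 0 */
```

The loop runs at most NUM_STATES iterations before it either halts or sees a repeat, so the check itself always terminates.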
This is a response to Slick_Snake. Anybody who understands the difference between "translation" and "encryption" should skip it.
---
There are people who can start with a stream of ones and zeroes, change them to hexadecimal, look up the first instruction, find the number of additional data to be included, interpret that data, and repeat for the rest of the stream.
Even for HelloWorld, this would take a very long time and much knowledge about how binaries are written for a particular architecture.
There are people (programmers) who can take a set of tasks and write a program to do it. There are some people who can take the program and turn it into binary code the machines can understand. That process would take a very long time.
The first paragraph is about "decompiling"; the third is about "compiling". Both tasks would take a very long time if done "by hand", so some of the first programs were tools to remove much of the work to make more tools. Operating systems and compilers reduce the grunt work of talking to hardware and translating from human-understandable to processor-understandable. Unix, gcc, and VisualStudio reduce the amount of time spent teaching the machines to do something. They also reduce the knowledge required. (Would you like a VB programmer poking bits into memory to create graphics?)
But both tasks are only translating from one language (C, Pascal, ...) into another (assembly).
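The by-hand decoding loop described above (look up the instruction, find how many data bytes follow, interpret them, repeat) can be sketched in a few lines. The instruction set here is entirely made up for illustration; real architectures have far bigger tables, but the lookup-and-advance loop has the same shape:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Made-up instruction set: one opcode byte, then a fixed number of
 * operand bytes. The opcode value is the index into this table. */
struct opcode_info {
    const char *mnemonic;
    size_t operand_bytes;
};

static const struct opcode_info OPCODES[] = {
    { "halt", 0 },   /* 0x00 */
    { "get",  1 },   /* 0x01: load from an address */
    { "put",  1 },   /* 0x02: store to an address  */
    { "inc",  0 },   /* 0x03: add one              */
};
#define NUM_OPCODES (sizeof(OPCODES) / sizeof(OPCODES[0]))

/* Appends text to out, keeping it NUL-terminated. Returns -1 if full. */
static int emit(char *out, size_t out_size, size_t *used, const char *text)
{
    size_t n = strlen(text);
    if (n >= out_size - *used)
        return -1;
    memcpy(out + *used, text, n + 1);
    *used += n;
    return 0;
}

/* Writes one line per instruction into out. Returns the number of
 * instructions decoded, or -1 on an unknown opcode, a truncated
 * instruction, or an output buffer that is too small. */
int disassemble(const uint8_t *code, size_t len, char *out, size_t out_size)
{
    assert(code != NULL && out != NULL);
    if (out_size == 0)
        return -1;
    out[0] = '\0';

    size_t pos = 0, used = 0;
    int count = 0;
    while (pos < len) {
        uint8_t op = code[pos++];
        if (op >= NUM_OPCODES)
            return -1;
        const struct opcode_info *info = &OPCODES[op];
        if (len - pos < info->operand_bytes)
            return -1;
        if (emit(out, out_size, &used, info->mnemonic) != 0)
            return -1;
        for (size_t i = 0; i < info->operand_bytes; i++) {
            char operand[8];
            snprintf(operand, sizeof(operand), " 0x%02x", (unsigned)code[pos++]);
            if (emit(out, out_size, &used, operand) != 0)
                return -1;
        }
        if (emit(out, out_size, &used, "\n") != 0)
            return -1;
        count++;
    }
    return count;
}
```

Fed the bytes 01 10 03 02 10 00, this yields the four lines "get 0x10", "inc", "put 0x10" and "halt": a dictionary lookup, not a decryption.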
So you are admitting that you require a program to understand the binary and convert it back into something that you can read
Even if the other poster had the skill, he would probably write a program to do it. Computers were invented to do very repetitive tasks.
That sounds exactly like decryption to me.
Please look up translation
- If I translate this post to German, and you do not understand German, it is still not encrypted, just translated.
My point wasn't that it was impossible to read the binary files, only that it was very difficult, to the point where few would attempt to and fewer actually could.
That is why we build tools. Difficult != impossible. In this case, it is not "difficult", just incredibly repetitive.
- Is assembly language still taught in computer school? It is not difficult. Processors do not understand very many verbs (mostly various ways of saying "get" this and "put" it there), and most of the nouns are memory locations. (They also know how to add one to a number!)
Just because a form of encryption can be broken doesn't mean that it is not encryption.
This post is encrypted in English. Please do not break the encryption.
Granted I do agree that compiling was not created as a form of encryption, but as programmers have become increasingly dependent on higher level languages they have become less and less familiar with byte code.
Just because you do not understand German does not mean everything written in German is "encrypted", it just means you cannot read it.
- Most 2-year-olds in America would have difficulty reading this English. That means that they do not know the language yet, not that this is encrypted.
Encrypt - to convert from one system of communication into another; especially : to convert a message into code Decrypt - to discover the underlying meaning of
You are confusing definitions for code.
2: a coding system used for transmitting messages requiring brevity or secrecy
3: (computer science) the symbolic arrangement of data or instructions in a computer program or the set of such instructions
The act of translating programs into computer code is not the same as encrypting for secrecy.
---
Personal Information:
- I am a programmer.
- I do not understand German. I could attempt to translate by hand with a German-English dictionary. Or I could use a computer program to translate. Guess which I prefer.
- I am not good at reading or writing computer assembly language.
- I am a very good programmer.
I spend my life entertaining my brain.
However, both flawfinder and RATS (and ITS4) have the same basic problem - they only do basic lexical analysis, and none do an in-depth data and control flow analysis of the data sources. That's definitely a weakness, to which I say: I agree - so where's YOUR code? Please develop a more impressive source code analysis tool, and I'll be glad to reference it. Tools like smatch might help you implement one.
Sure, you shouldn't be coding in C if you don't know about how to protect against buffer overflows. But having a simple tool to help you find the spots you may have forgotten can help.
- David A. Wheeler (see my Secure Programming HOWTO)
Sometimes, I get people emailing me saying that my program has a security bug due to its use of strcpy, and that strcpy is unsafe. They don't bother to notice that my code is safe (in a setuid program) and yet faster than strncpy:
int main(int argc, char **argv)
{
char buffer[256];
if (!argv[1]) return 1;
argv[1][sizeof(buffer) - 1] = 0;
strcpy(buffer, argv[1]);
}
I hate having to deal with this in my program. I can't imagine what code reviews would be like from my bosses... >_ Myria 3
"Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
asdf
strings
http://www.club977.com/ - The 80's Channel!
Your source for commercial free 80's music!
With the username and password hard coded into the executable, there was no way to change it without having the vendor make the change and send out a new executable.
There are places where the networks are not touching, and there are places where they are-Boeing's Lori Gunter
Someone should run this bugscan against itself..
I am familiar with the US using Navajos to translate messages in World War 2. And it served as a form of encryption because the Japanese did not understand Navajo. As I pointed out earlier, this message is not understandable to someone who cannot read English. But it is not "encrypted" the way we use the word when talking about computers.
There are 2 forms of "encoding" for secrecy:
Codes: Replace each word (or concept) with another word (or concept.)
Ciphers: Replace each letter with another letter.
Codes are translated.
- If I replace each word (or concept) from one language using a dictionary, then I am translating. If the dictionary is German-English or Navajo-English, then the result will be understood by people who understand the new language. If I use a "secure communication" codebook, then the new message may seem like English, but to the (desired) receiver, it will have a different meaning. "The dogs got out again" can mean "Meet me at the corner", but someone without the language (dictionary, codebook) will think we are talking about canines.
Ciphers are encryption.
- Every letter gets replaced with a specific letter, such as ROT1 or ROT13, or using a chart such as cryptograms in the newspapers. These can be easily broken due to patterns in our language structure.
- Use a revolving key. The first letter can be offset by some amount. The second letter adds the first letter. The third letter adds the second letter. To be more secure, add offsets to every letter.
- XOR every set of a certain number of bits with a big number. Without knowing the big number, it is very difficult to break encryption.
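Two of the ciphers in the list above can be sketched briefly: ROT13 as a fixed letter substitution, and XOR with a key. The XOR version assumes the "big number" is a byte string that repeats over the data, which is one reading of the description rather than the only one:

```c
#include <assert.h>
#include <stddef.h>

/* ROT13: fixed substitution on ASCII letters; applying it twice
 * restores the original text. Other characters are left alone. */
void rot13(char *s)
{
    for (; *s; s++) {
        if (*s >= 'a' && *s <= 'z')
            *s = (char)('a' + (*s - 'a' + 13) % 26);
        else if (*s >= 'A' && *s <= 'Z')
            *s = (char)('A' + (*s - 'A' + 13) % 26);
    }
}

/* XOR each byte with a repeating key (assumed interpretation).
 * The same call both encrypts and decrypts. */
void xor_with_key(unsigned char *data, size_t len,
                  const unsigned char *key, size_t key_len)
{
    assert(key_len > 0);
    for (size_t i = 0; i < len; i++)
        data[i] ^= key[i % key_len];
}
```

Running either function twice with the same inputs gives back the original data, which is the property both schemes rely on for decryption.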
Off-topic:
I do not know why the revolving key is not used with the XOR method. The big flaw in most of today's methods is that the length of the key is known. Wouldn't security be improved if the key length changes for every pass, and the data is garbage for the first pass?
- I recently wrote an encryption routine that encrypts data based on a password. The data is XML, and the encryption routine will be open source (visible, but in a commercial product). The key generated from the password is variable length, and the key revolves based on the data. And since the key length is variable, I shuffle the data based on the key length. I believe (and I am certain that I will be notified if anybody cracks it) that the only way to crack it is brute force: try every password and see if the result is usable. Only the administrators will have access to the encrypted data, and only the owners will have access to the passwords, and the account is locked if the password is incorrect 3 times. I hope this will suffice; I may hide the encrypted data from the administrators to prevent them attempting brute force attacks.
Back to the topic:
The Navajo language trick was successful for maintaining secrecy because the Japanese could not guess the dictionary used, but it was still a "code" which was translated. Machine code is called code because it is a different language, but messages (programs) can be understood if you understand the language. Both can be translated with the proper dictionary (and grammar instructions.) Neither is "encryption" as we use the term today.
I spend my life entertaining my brain.
Java bytecode is an executable. It's executable by a Java machine. There are only a few Java machines out there; Sun created a chip that could run bytecode natively, and ARM has extensions that run most bytecode directly on the chip.
Here are the specs and my design. Please tell me if I missed anything. Any suggestions welcome.
Specs:
- Language = Java
- The encryption has 2 inputs:
1: data = long String of XML.
2: password = short String acceptable for input through a web form.
- Data is all of a user's logins for all websites for a single sign-on system.
- Data is currently stored after encryption where only administrators (and the valid user) can see it.
- (In development) Change so that administrators cannot see the data. Requires code to cleanup obsolete users.
- Requires password that is never stored in the system. There is no recovery method if the user forgets the password.
Encryption:
- Get password. All implementations should require SSL. (Why worry about the encryption method otherwise?)
- Key is generated by multiplying the password by the password with all the bits in reverse order, repeat using the result, and removing consecutive zeroes and the first one. One more bit is removed if the result has an even number of bits.
- Additional key lengthening could be added by using the MD5 or SHA hashes. My concern is that these return fixed length Strings. Probably use them before the previous step.
- Data is shuffled in portions where the length is based on the number of bits in the key. This is so first char != < and the last char is not >.
- Data is then encrypted using the key repeatedly. Since the key has an odd number of bits, it does not repeat on a byte boundary until 8 passes.
- (In development) I want the key to evolve on each pass, possibly adding bits to the length. Change the bits in the key based on the data. Probably need to remember the last bit from the data, since the current key bit and data bit are already used to get the current encrypted bit.
The major strength is the unknown key length. Since the data is in ASCII, you can guarantee the first bit of the data should be a zero. Even if you knew the first char of the data, you could not know when those 8 bits of the key are reused, especially if the key grows erratically.
If I prove that the first bit is always zero, then I will remove it, leaving 7 bits per char of data. Do any web servers allow passwords or domain names using chars above 127? I deal with many international clients, so I worry about losing data.
Our Encryption class allows for new algorithms to be added easily while maintaining backwards compatibility: data from older algorithms is migrated to the latest algorithm the next time the user logs in. There is no mass update, since the application does not know the passwords.
---
My concerns are:
1. Brute force attack - logging the number of missed attempts, and locking an account after a number of bad login attempts, should remove much of that threat.
2. Java is open source by definition. Someone with file access could change the code in the program to log all passwords or unencrypted data. The data must be stored in memory at some point to be usable. This is why hiding the encrypted data from the administrators is not a high priority, since they can circumvent the whole security system if they know how to work with Java.
- Changing to another language for the security system might help some, but even C executables can be hacked. And we would lose the single codebase for multiple OSes.
I spend my life entertaining my brain.