Secure Programmer: Keep an Eye on Inputs
An anonymous reader writes "This article discusses various ways data gets into your program, emphasizing how to deal appropriately with them; you might not even know about them all! It first discusses how to design your program to limit the ways data can get into your program, and how your design influences what is an input. It then discusses various input channels and what to do about them, including environment variables, files, file descriptors, the command line, the graphical user interface (GUI), network data, and miscellaneous inputs."
You'd be wise to add Cross Site Scripting attacks to your list of things to protect against.
I believe code reviews with a large enough group of people are extremely useful. Yes, it takes time, and you get some irritating comments from a few people about a stray space or a comma somewhere, but when multiple eyes look at the code, someone always catches something you didn't. A few hours of extra pain for programmers can prevent pain for millions in the form of blaster viruses, etc.
The article's worth reading, and really does justify its "Level: Intermediate" label. Unlike when I was learning to program, there are lots of sources of input beyond your deck of punch cards (:-), and the author does a good job of explaining many of them, such as the evil things that environment variables and file descriptors can be used for.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
There are no controls on Windows inputs. Any process can send any message to any other process. Talk about insecure.
You could probably majorly screw up a program by sending it random message numbers. It'd react as if you were sending random menu and other commands. Hmm, that sounds like a fun prank to play...
I still have more fans than freaks. WTF is wrong with you people?
It is a widely accepted engineering maxim that systems should be designed so that it is difficult to use them improperly. This is why (for example) a 110 volt plug will not fit in a 220 volt outlet. Developers who are concerned about the quality of the software they make would do well to follow this rule, and not just for security reasons. You should verify input data as early and as rigorously as possible wherever you can. Take advantage of things like XML validation and text box constraints to make it hard for users to enter bad data. And always follow the Fail-Fast principle: if something goes wrong, complain! Loudly! Don't let the user continue working if something has gone wrong. It's better to crash than to produce an erroneous result.
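As a rough illustration of "verify early, fail fast", here is a minimal Python sketch. The function name `parse_port` and the choice of a TCP port as the input are my own assumptions for the example, not something from the article. The point is only that bad text is rejected at the moment it enters the program, with a loud error, instead of being patched up or passed along.

```python
def parse_port(raw: str) -> int:
    """Convert untrusted text to a TCP port number, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    text = raw.strip()
    # Accept only a short run of plain ASCII digits. This rejects empty
    # strings, signs, embedded spaces, and non-ASCII digit characters.
    if not (text.isascii() and text.isdigit() and len(text) <= 5):
        raise ValueError(f"port must be a decimal number, got {raw!r}")
    port = int(text)
    # Range check happens right here, not somewhere deep in the program.
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range 1-65535: {port}")
    return port
```

A caller that gets a `ValueError` should stop and report it rather than substitute a default and carry on.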
Just a little advice from a developer who's made enough mistakes to know better.
And why should anyone be surprised, in this age of "I read a book on VB last week and now I'm a software engineer!"?
I am not surprised that simple things like this are rehashed over and over. This is more suited to the group of programmers who will sort data based on string comparisons instead of learning how to use a real algorithm to do it, or keep writing static forms instead of learning how to use a loop with a db backend, because they don't understand true programming concepts. In other words, about 80% of the current crop of overpaid, undereducated programmers that build corporate apps.
- Eric
Is news to others. Many "Programmers" out there write code that does not do any error checking or catching and the result is all the crapware that we see today. We were all warned in our programming classes about memory leaks and buffer overflows, but they are still very prevalent in today's software. Perhaps we should all look harder at our code before selling it off as a final product.
Another excellent article by David... oddly enough, I was reading his Program Library HOWTO (http://www.dwheeler.com/program-library/) just the other day to learn about dynamic loading libraries in Linux.
The Kernighan & Plauger book "Elements of Programming Style" dated 1979 talked extensively about the need to validate all inputs to subroutines and from the user. This is *not* new, it is just that few programmers have the discipline to follow the rules.
The issue is making *no* assumptions about anything. The programmer *thinks* the file will be written by another piece of code that a team member is writing. But that program has a bug. Or three years from now, other programs are creating the file and don't know about some verbal discussion about field data. It takes great diligence and paranoia, and management that allows you the time in the schedule to do this.
Somewhere along the line every application must trust something. At the very least, BIOS settings and environment variables that are owned by deeper layers of the OS must be trusted because they are inaccessible or indecipherable at the application layer. Reaching too far would break encapsulation and create brittle dependencies. An application can only check the variables and direct inputs that it has access to.
I don't argue against validating inputs. Certainly all of the direct inputs to an application should be assumed to be untrustworthy unless a secure checksum validates that the inputs are identical to some previously validated inputs. Checking inputs (or environment variables) of immediately adjacent processes is probably also warranted (as a redundant "brother's keeper" policy).
The real problem comes if the OS has faulty validation methods. (And I won't get into the necessity of trusting the hardware or bugs such as those that plagued the early Intel 586.00001 processors.) If I check the validity of a user, filename, or geographically localized data format (e.g., a date), then my application is dependent on the quality of the OS's validator (and a lack of intervening malware).
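For what it's worth, the "secure checksum" idea can be sketched in a few lines of Python using the standard `hashlib` and `hmac` modules. The helper names `sha256_hex` and `matches_known_good` are made up for this example. It also assumes the trusted digest is stored somewhere the attacker cannot write to; if the attacker can replace both the data and the digest, a plain hash proves nothing.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_known_good(data: bytes, trusted_digest_hex: str) -> bool:
    """True if data hashes to a digest recorded when it was last validated.

    Hypothetical helper: trusted_digest_hex must come from storage that
    the untrusted party cannot modify.
    """
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(data), trusted_digest_hex)
```

Input that matches can skip the full validation pass; anything that does not match goes back through validation as if it had never been seen before.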
Two wrongs don't make a right, but three lefts do.
No, buffer overflows come from using flawed 1970s technology. Modern computer languages are immune to the world's largest security problem (buffer overflows) because they do automatically what C programmers are supposed to do manually.
Eliminate the buffer overflow and malicious input becomes invalid data which can be dealt with in a controlled fashion rather than executable gibberish.
Jilles
Your examples don't take user input, but most of them do take input of a different sort. The point of the article was that input can come from unexpected sources like environment variables, and that an attacker can sometimes subvert these inputs. The cpu meter, bg, fg, ps, top, logout, and clock programs all take input, in the form of system and library calls. Some of them also read input from configuration files.
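To make the environment-variable point concrete, here is a small Python sketch of one common defence: build a fresh, minimal environment from an allowlist instead of passing the inherited one to child processes. The names `SAFE_PATH`, `ALLOWED_VARS`, and `minimal_env`, and the particular variables and limits chosen, are assumptions for illustration; a real program would pick its own list.

```python
import os

# Assumed values for the example; adjust for the real system.
SAFE_PATH = "/usr/bin:/bin"
ALLOWED_VARS = ("LANG", "TZ", "TERM")


def minimal_env(source=None):
    """Build a small, known environment from an untrusted one.

    Everything not on the allowlist (LD_PRELOAD, IFS, a hostile PATH, ...)
    is dropped rather than inspected.
    """
    src = os.environ if source is None else source
    env = {"PATH": SAFE_PATH}  # never inherit PATH from the caller
    for name in ALLOWED_VARS:
        value = src.get(name)
        # Keep only short values made of printable characters.
        if value is not None and len(value) <= 256 and value.isprintable():
            env[name] = value
    return env
```

The result can be handed to a child process, for example `subprocess.run(["/bin/ls"], env=minimal_env())`, so the child never sees variables the attacker controlled.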
The ocean parts and the meteors come down
Laid out in amber, baby.
Ok, every last subroutine validates every last input. Then what do you do? Suppose an input is invalid -- do you halt? Throw an exception? Patch the input and keep going? Keep going but make an entry in a log file?
It is excellent policy to be ultra paranoid about user input and to put "firewalls" between major program modules. But for every last subroutine to have its own error checks -- what if you have a top level subroutine that performs error checks and then passes validated results to helper subroutines? Do the helper subroutines need to repeat the checks?
I think there has to be some analysis of the data flows and designation of raw and filtered data flows, who does the filtering, and what assumptions or assertions can be made about filtered flows, and assignment of responsibility to do the checking.
In summary 1) defensive programming is not a substitute for good overall design, 2) there is a place for delegating responsibility for error checking and not chronically worrying about checked data.
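One way to express the raw-versus-filtered distinction in code is to give filtered data its own type, so the check happens once at the boundary and helpers can rely on it. The following Python sketch is only an illustration under my own assumptions: the `Username` class, its pattern, and the `home_directory` helper are invented for the example.

```python
import re
from dataclasses import dataclass

# Assumed policy for the example: lowercase letter, then up to 31
# lowercase letters, digits, or underscores.
_USERNAME_RE = re.compile(r"[a-z][a-z0-9_]{0,31}")


@dataclass(frozen=True)
class Username:
    """A username that has already passed validation (the filtered flow)."""

    value: str

    def __post_init__(self):
        # Runs on every construction, so an invalid Username cannot exist.
        if not isinstance(self.value, str) or not _USERNAME_RE.fullmatch(self.value):
            raise ValueError(f"invalid username: {self.value!r}")


def home_directory(user: Username) -> str:
    """Helper that trusts its argument because the type carries the check."""
    return "/home/" + user.value
```

Raw strings get converted with `Username(raw)` where they enter the program; everything downstream takes a `Username` and does not repeat the check.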
There are no controls on Windows inputs. Any process can send any message to any other process.
Well, not quite. There are ways of isolating programs, but it's very rarely useful. (In fact, I've never done it, but I know it's possible.)
But why bother with all that when you can just install a system-wide hook? It's quite easy to actually inject code into another process. Once you've got that you can muck with data or intercept system calls to your heart's delight.
What it comes down to is that if you don't want a user to be able to screw up a machine, don't let them install applications and don't give them write access to critical bits of the machine. Untrusted programs should be quarantined.
You could probably majorly screw up a program by sending it random message numbers. It'd react as if you were sending random menu and other commands.
If you can mess up a program doing that then you've got bigger problems. If the user can crash it then what difference does it make that an external program can as well? It can be crashed, deliberately if given the opportunity.
What world are you living in? Blaming poor technique on the tool used is moronic. There are ample examples of poorly written, poorly secured Java code that invalidate all of the premises in this rant. I've seen hard coded passwords baked into Java source that were visible through a 'strings' call. Someone forgets to obfuscate his or her classes, and the entire structure of the program is available through a decompiler. Sure, the JVM protects one from buffer overruns and the like, but don't for one minute think that programming in Java prevents stupid errors from exposing you to vulnerabilities.
Not to mention there are areas where Java is not the silver bullet you describe. If you need precise control over your memory allocation, Java is not the tool to use. If your application requires precise timing, Java is not the tool to use. Need control over the placement of allocated memory? Writing your own transport layer? Need hooks into the kernel?
The prime directive still holds true - use the correct tool for the job at hand. Follow the lemmings of "this tool is the only one you need" at your peril.
That's a good point. I have seen developers mistake JavaScript for sufficient input validation. The proper use of validation in JavaScript is simply to give a legitimate user a proper error message quickly, without actually needing to perform a transaction with the server that will fail. The server must still re-validate the input.
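A minimal sketch of what "the server must still re-validate" looks like, written in Python with no web framework. The handler name `handle_order`, the `quantity` field, its limits, and the `client_validated` flag are all hypothetical. The server ignores any claim from the browser that validation already happened and checks the field itself.

```python
def handle_order(form: dict) -> dict:
    """Server-side check of a submitted form (hypothetical handler).

    Any 'client_validated' flag in the form is deliberately ignored:
    an attacker can send whatever fields they like.
    """
    qty_text = str(form.get("quantity", "")).strip()
    # Same rule the browser script was supposed to enforce, enforced again.
    ok = (
        qty_text.isascii()
        and qty_text.isdigit()
        and len(qty_text) <= 3
        and 1 <= int(qty_text) <= 100
    )
    if not ok:
        return {
            "status": 400,
            "errors": {"quantity": "must be a whole number from 1 to 100"},
        }
    return {"status": 200, "quantity": int(qty_text)}
```

A request that bypasses the browser entirely, for example one sent with a command-line HTTP client, gets exactly the same treatment as one from the form.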
Write a kernel in Java. Write drivers in Java. Write init in Java. Then you can say that.