Secure Programmer: Keep an Eye on Inputs
An anonymous reader writes "This article discusses various ways data gets into your program, emphasizing how to deal appropriately with them; you might not even know about them all! It first discusses how to design your program to limit the ways data can get into your program, and how your design influences what is an input. It then discusses various input channels and what to do about them, including environment variables, files, file descriptors, the command line, the graphical user interface (GUI), network data, and miscellaneous inputs."
You'd be wise to add Cross Site Scripting attacks to your list of things to protect against.
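For anyone who wants something concrete, here is a minimal sketch of the usual defense: escape user-supplied text before it goes into a page. The function name `render_comment` is made up for illustration; `html.escape` is from the Python standard library.

```python
import html


def render_comment(user_text: str) -> str:
    """Return an HTML fragment with the user-supplied text escaped.

    render_comment is a hypothetical helper, not from the article.
    """
    # html.escape turns &, < and > into entities; quote=True also
    # escapes double and single quotes so the text is safe in attributes.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"
```

Escaping on output only covers text placed in HTML body or attribute context; URLs and inline script need their own handling.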
I believe code reviews with a large enough group of people to be extremely useful. Yeah, it takes time and you get some irritating comments from a few people about how there is a space between something or comma between something, but when multiple eyes look at it, someone always catches something you didn't. A few hours of extra pain on the side of programmers can prevent pain for millions in the form of blaster viruses, etc.
The article's worth reading, and really does justify its "Level: Intermediate" label. Unlike when I was learning to program, there are lots of sources of input beyond your deck of punch cards (:-), and the author does a good job of explaining many of them, such as evil things that environment variables and file descriptors can be used for.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
The Perl language has built-in "taint-checking" enabled via the -T command line switch which causes Perl to automatically keep track of all information that possibly came from a user input and not allow any of it to do anything harmful (basically end up on a command line or in a file name).
We don't see the world as it is, we see it as we are.
-- Anais Nin
There are no controls on Windows inputs. Any process can send any message to any other process. Talk about insecure.
You could probably majorly screw up a program by sending it random message numbers. It'd react as if you were sending random menu and other commands. Hmm, that sounds like a fun prank to play...
I still have more fans than freaks. WTF is wrong with you people?
It is a widely accepted engineering maxim that systems should be designed so that it is difficult to use them improperly. This is why (for example) a 110 volt plug will not fit in a 220 volt outlet. Developers who are concerned about the quality of the software they make would do well to follow this rule, and not just for security reasons. You should verify input data as early and as rigorously as possible wherever you can. Take advantage of things like XML validation and text box constraints to make it hard for users to enter bad data. And always follow the Fail-Fast principle: if something goes wrong, complain! Loudly! Don't let the user continue working if something has gone wrong. It's better to crash than to produce an erroneous result.
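A rough sketch of what "verify early, fail fast" can look like in practice. The function name `parse_quantity` and the limits are assumptions chosen for illustration, not anything from the article.

```python
import re


def parse_quantity(raw: str, minimum: int = 1, maximum: int = 1000) -> int:
    """Parse a quantity typed by a user, or raise ValueError immediately.

    The name and the default range are hypothetical.
    """
    # Accept only a short run of ASCII digits; anything else is rejected
    # before it can travel further into the program.
    if not re.fullmatch(r"[0-9]{1,6}", raw):
        raise ValueError(f"quantity must be 1-6 digits, got {raw!r}")
    value = int(raw)
    if not minimum <= value <= maximum:
        raise ValueError(f"quantity {value} outside {minimum}..{maximum}")
    return value
```

The caller gets either a value known to be in range or a loud exception; there is no "best guess" path that silently continues with bad data.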
Just a little advice from a developer who's made enough mistakes to know better.
And why should anyone be surprised? In this age of "I read a book on VB last week and now I'm a software engineer!" type environment?
I am not surprised that simple things like this are rehashed over and over. This is more suited to the programmer group of people who will sort data based on string comparisons, instead of learning how to use a real algorithm to do it, or keep writing static forms, instead of learning how to use a loop with a db backend - because they don't understand true programming concepts. In other words, about 80% of the current crop of overpaid, undereducated programmers that built corporate apps.
- Eric
Perl programmers interested in writing secure scripts should *definitely* know about the -T (taint checking flag).
From the FAQ:
As we've seen, one of the most frequent security problems in CGI scripts is inadvertently passing unchecked user variables to the shell. Perl provides a "taint" checking mechanism that prevents you from doing this. Any variable that is set using data from outside the program (including data from the environment, from standard input, and from the command line) is considered tainted and cannot be used to affect anything else outside your program. The taint can spread. If you use a tainted variable to set the value of another variable, the second variable also becomes tainted. Tainted variables cannot be used in eval(), system(), exec() or piped open() calls. If you try to do so, Perl exits with a warning message. Perl will also exit if you attempt to call an external program without explicitly setting the PATH environment variable.
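Python has no built-in equivalent of -T, but the idea in the FAQ can be loosely imitated. The sketch below is only an analogy: unlike Perl, nothing is tracked automatically, and only values explicitly wrapped in the made-up `Tainted` class are protected. The path `/var/log/app` is also hypothetical.

```python
import re


class Tainted:
    """Marks a string as coming from outside the program (hypothetical class)."""

    def __init__(self, value: str):
        self._value = value

    def untaint(self, pattern: str) -> str:
        # As in Perl, the only way out is through a pattern match
        # that the whole value must satisfy.
        match = re.fullmatch(pattern, self._value)
        if match is None:
            raise ValueError("input rejected by pattern")
        return match.group(0)


def log_path_for(user) -> str:
    # Refuse to let a still-tainted value end up in a file name.
    if isinstance(user, Tainted):
        raise TypeError("tainted value used in a file name")
    return f"/var/log/app/{user}.log"
```

Usage would be something like `log_path_for(Tainted(raw).untaint(r"[a-z]{1,16}"))`; passing the `Tainted` object directly raises instead of building the path.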
I'm a bloodsucking fiend! Look at my outfit!
Ya you can talk about inputs to programs and how misc. and unwanted data get in there but watch for buffer overruns because that's what can really kill your program.
There is or can be built a machine that can simulate any physical object. -Church-Turing principle
Is news to others. Many "Programmers" out there write code that does not do any error checking or catching and the result is all the crapware that we see today. We were all warned in our programming classes about memory leaks and buffer overflows, but they are still very prevalent in today's software. Perhaps we should all look harder at our code before selling it off as a final product.
The recommendations on dividing the program into unsecure and secure binaries to handle setuid access in GUIs can very properly be extrapolated to non-graphical programs. This is a very good strategy for allowing relatively wild programs access to important facilities and can involve many types of IPC including memory-mapped files (with proper protection) and sockets. To really secure a client program that needs access to criticals, put it in a chroot jail and have it communicate with an outside process through (e.g.) a socket. Separating programs into safe and unsafe sections and applying different security techniques to each is far more effective, imo, than trying to secure a single, large application. It can also provide many other benefits of encapsulation, etc. The security onus shifts to handling client requests in the secure section, which is usually much easier to do.
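A toy sketch of the secure half of such a split, assuming a made-up one-line protocol (`GET <key>`) and made-up settings. A real deployment would run the two halves as separate processes with different privileges; here only the request handling is shown.

```python
import socket
import threading

# Hypothetical data the trusted side is willing to reveal.
SETTINGS = {"hostname": "example-host", "timezone": "UTC"}
ALLOWED_KEYS = set(SETTINGS)


def handle_request(request: bytes) -> bytes:
    """Trusted side: validate one request line from the untrusted client."""
    try:
        # Exactly two ASCII tokens are accepted; anything else is malformed.
        verb, key = request.decode("ascii").split()
    except (UnicodeDecodeError, ValueError):
        return b"ERR malformed\n"
    if verb != "GET" or key not in ALLOWED_KEYS:
        return b"ERR denied\n"
    return f"OK {SETTINGS[key]}\n".encode("ascii")


def serve_once(conn: socket.socket) -> None:
    """Answer a single request on an already-connected socket."""
    with conn:
        # Small fixed read limit; a real server would also handle
        # partial reads and timeouts.
        request = conn.recv(256)
        conn.sendall(handle_request(request))
```

The untrusted client never touches `SETTINGS` directly; all it can do is send bytes, and the only code that has to be audited carefully is `handle_request`.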
Hacking articles at http://www.geocities.com/chroo
"Given that every single way to compromise security involves bad input, it's not surprising that it's in a security magazine."
What about program bugs that are not input related? For example, a program that breaks when an internal timer overflows, or that accesses a section of memory that has been deallocated. Such bugs can easily cause breaches in security as well as general system failure, all without any human intervention. It reminds me of the black out that Sterling mentions:
http://www.lysator.liu.se/etexts/hacker/crashing.
Hacking articles at http://www.geocities.com/chroo
Java
.NET
XML
Basic point, I suppose, is that if you insist on using a U*ix-family OS in such an environment then you must ensure that the U*ix environment is clean at the beginning, which may well be more a matter of the procedures and quality control of the platform and the application deployment than of the individual apps.
Oh, and btw, I thought the head of the thread was a kneejerk reaction, but - flamebait? Shame on whoever moderated it that way.
I'm a little source code, robust and stout.
Here is my input, here is my out.
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety."
See this paper. I remember reading the original document in (CACM IIRC) and was pleased to see it updated showing just how far "forward" we've come.
The proliferation of proprietary formats we are seeing that all do basically the same thing, like send sound files over the net, or view video clips, are encouraging mass downloads of programs from third party providers. These programs may well do what they said they would do, but with all this DMCA crap going on, it's getting harder and harder to see if they are doing a little extra that wasn't in the bargain, like doing zombie work on the side to assist in little capers the originating author needs to pull off.
What firewall or systems programming can stop a deliberately malicious program installed by an ignorant user? Say the program "demands" access to the internet for "verification/auto-update", then you have to set the firewall to allow this program access to the net. Now what happens? It's like giving car keys to a valet parking agent. You just have to trust he's only going to do what he says he will do. To add insult to injury, consider that you have generally signed away any recourse you have when you click that "I agree" button that confirms you have read and understood the EULA.
What irritates me so about these "plug-ins", "macros", and "scripts" is that they are indeed executable. Nothing says the malicious person coding these things is gonna follow the rules. He is free to code some really nasties in assembler if he so desires. The state of music file distribution I find really disturbing. We have an MP3 format which is generally well understood, yet it seems everybody jumping on the bandwagon wants to use proprietary formats which are not generally understood, leaving us all open to the risks resulting from ignorance.
As a public, we aren't helping much. We agree to any damn thing they print in the EULA. As a public, we should INSIST that if we are to be kept ignorant by law how something works, if that something does something malicious, then its maker should have full responsibility for the problems it generated.
Basically I am proposing a trade. If you want the protection of law to keep the public ignorant, then you waive indemnity.
We have a patent system and copyright system in place. Both were implemented on the concept that the work was to be in the open. Why aren't encrypted works also known as "trade secrets" and not afforded protection by copyright or patent? Basically, any work encrypted would be considered a "trade secret", not in the open, hence not eligible for protection by the patent or copyright system at all. But to make this happen, it's gonna take the will of a lot of people to pressure the legislators to enact this. Pressure as in "if you do not do this, start polishing your resume."
"Prove all things; hold fast that which is good." [KJV: I Thessalonians 5:21]
I wrote a similar article recently for SysAdmin magazine, although the focus is more about Perl.
The Kernighan & Plauger book "Elements of Programming Style" dated 1979 talked extensively about the need to validate all inputs to subroutines and from the user. This is *not* new, it is just that few programmers have the discipline to follow the rules.
The issue is making *no* assumptions about anything. The programmer *thinks* the file will be written by another piece of code that a team member is writing. But that program has a bug. Or three years from now, other programs are creating the file and don't know about some verbal discussion about field data. It takes great diligence and paranoia, and management that allows you the time in the schedule to do this.
The article is interesting, and they are right to point out the many dangers of relying on environment variables. Where I work (unidentified to protect the incompetent), programmers are not allowed access to the unix command line. Instead, all user exits are trapped, and programmers are forced to navigate through a homegrown menu system.
This menu system relies on an environment variable ${WHATCANIDO} to store a list of permissions available to that user. Of course, I changed my .profile to add my own extension to the permission list. I even nicely dated, initialed, and described my change. ;)
export WHATCANIDO=world_domination:$WHATCANIDO # 2000/10/31 tw Too easy
So now when I get frustrated with the absurdity of this arrangement, I just echo the environment variable to remind myself why I'm right and they're wrong.
> echo $WHATCANIDO
world_domination: [deleted]
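The story above is a good illustration of why permissions should not live in anything the user can edit. A hedged sketch of the alternative: build a small, known environment before running anything, and look permissions up from a table the user cannot write to. Every name here (`clean_environment`, `permissions_for`, the `PERMISSIONS` table, the safe `PATH`) is an assumption for illustration.

```python
import os
import re

SAFE_PATH = "/usr/bin:/bin"      # assumed safe search path
PASS_THROUGH = ("LANG", "TZ")    # variables copied only if they look sane
SANE_VALUE = re.compile(r"[A-Za-z0-9_.@+-]{1,32}")

# Hypothetical table; in practice this would come from a root-owned file
# or database, keyed on the real user id.
PERMISSIONS = {1000: {"edit", "build"}}


def clean_environment(env=None) -> dict:
    """Return a minimal environment instead of trusting the inherited one."""
    source = os.environ if env is None else env
    cleaned = {"PATH": SAFE_PATH}
    for name in PASS_THROUGH:
        value = source.get(name)
        # Deliberately conservative: values with slashes etc. are dropped.
        if value is not None and SANE_VALUE.fullmatch(value):
            cleaned[name] = value
    return cleaned


def permissions_for(uid: int) -> set:
    """Look permissions up by user id, never from an environment variable."""
    return set(PERMISSIONS.get(uid, set()))
```

The result of `clean_environment()` can be passed as the `env=` argument of `subprocess.run`, so a child process never sees variables like WHATCANIDO at all.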
Somewhere along the line every application must trust something. At the very least, BIOS settings and environment variables that are owned by deeper layers of the OS must be trusted because they are inaccessible or indecipherable at the application layer. Reaching too far would break encapsulation and create brittle dependencies. An application can only check the variables and direct inputs that it has access to.
I don't argue against validating inputs. Certainly all of the direct inputs to an application should be assumed to be untrustworthy unless a secure checksum validates that the inputs are identical to some previously validated inputs. Checking inputs (or environmental variables) of immediately adjacent processes is probably also warranted (as a redundant "brother's keeper" policy).
The real problem comes if the OS has faulty validation methods. (And I won't get into the necessity of trusting the hardware or bugs such as those that plagued the early Intel 586.00001 processors.) If I check the validity of a user, filename, or geographically localized data format (e.g., a date), then my application is dependent on the quality of the OS's validator (and a lack of intervening malware).
Two wrongs don't make a right, but three lefts do.
Almost everything in this article only applies because of hacker languages like C and C++, which Linux and FreeBSD use for virtually everything. It is so easy to forget to double-check bounds, input format, pointers, and all the other usual suspects. It's bizarre how programmers will use these error-prone languages for marginal performance gains just because their ego and haxor status are on the line. Sure, the kernel and drivers need to be in C. Sure, a Java VM needs to be in C. Sure, C++ is a good language for game engines. But almost nothing else should be written in C/C++.
Command-line type programs can be written in Java and statically compiled into small, low-memory, fairly fast programs. And the JVM overhead has almost no effect on the larger programs. But you have to work really hard to put a security problem into a Java app instead of working really hard not to. And you get garbage collection, an awesome API, security, faster compiling, dynamic classloading/linking, easier coding, etc. People think Java takes a lot of memory, is slow, and ugly. But that's almost entirely because of the Swing GUI, which is not actually all bad. Replace it with IBM's SWT and you'll see a dramatic difference.
Of course there are other languages besides Java that protect against security problems, but few that do so as completely and easily. If half the effort had been put into implementing the Java APIs in open source as was put into GTK/GTK+, then linux/bsd could do nothing for ten years and still be ahead of the rest.
Your examples don't take user input, but most of them do take input of a different sort. The point of the article was that input can come from unexpected sources like environment variables, and that an attacker can sometimes subvert these inputs. The cpu meter, bg, fg, ps, top, logout, and clock programs all take input, in the form of system and library calls. Some of them also read input from configuration files.
The ocean parts and the meteors come down
Laid out in amber, baby.
Ok, every last subroutine validates every last input. Then what do you do? Suppose an input is invalid -- do you halt? Throw an exception? Patch the input and keep going? Keep going but make an entry in a log file?
It is excellent policy to be ultra paranoid about user input and to put "firewalls" between major program modules. But for every last subroutine to have its own error checks -- what if you have a top level subroutine that performs error checks and then passes validated results to helper subroutines? Do the helper subroutines need to repeat the checks?
I think there has to be some analysis of the data flows and designation of raw and filtered data flows, who does the filtering, and what assumptions or assertions can be made about filtered flows, and assignment of responsibility to do the checking.
In summary 1) defensive programming is not a substitute for good overall design, 2) there is a place for delegating responsibility for error checking and not chronically worrying about checked data.
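One way to make the raw/filtered distinction explicit is to give checked data its own type, so the check happens once at the boundary and helpers can rely on it. A minimal sketch; the `Username` class, its pattern, and `home_directory` are all assumptions for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Username:
    """A username that has already passed validation (hypothetical type)."""

    value: str

    def __post_init__(self):
        # The only way to get a Username is to pass this check.
        if not re.fullmatch(r"[a-z][a-z0-9_]{0,15}", self.value):
            raise ValueError(f"invalid username: {self.value!r}")


def home_directory(user: Username) -> str:
    # Helper on the "filtered" side of the data flow: it relies on the
    # type instead of repeating the check.
    return "/home/" + user.value
```

Python does not enforce the annotation at runtime, so this relies on discipline or a static type checker, but it does document who is responsible for the filtering.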
Any process can send any message to any other process. Talk about insecure.
According to http://security.tombom.co.uk/shatter.html it is much worse than just that. Not only can anyone send such a message, but the messages can even force the receiver to execute arbitrary code.
Do you care about the security of your wireless mouse?
That's a good point. I have seen developers mistake javascript for sufficient input validation. The proper use of validation in javascript is to simply give a legitimate user a proper error message quickly without actually needing to perform a transaction with the server that will fail. The server must still re-validate the input.
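A sketch of what the server-side half might look like, assuming a made-up signup form with `username` and `age` fields. The server applies its own rules to whatever arrives, regardless of what the browser-side script claimed to have checked.

```python
import re


def validate_signup(form: dict) -> dict:
    """Return a dict of field -> error message; empty means accepted.

    The form fields and limits are hypothetical.
    """
    errors = {}
    username = form.get("username", "")
    if not isinstance(username, str) or not re.fullmatch(
        r"[A-Za-z0-9_]{3,20}", username
    ):
        errors["username"] = "3-20 letters, digits, or underscores"
    age = form.get("age", "")
    if (
        not isinstance(age, str)
        or not re.fullmatch(r"[0-9]{1,3}", age)
        or not 13 <= int(age) <= 120
    ):
        errors["age"] = "must be a whole number from 13 to 120"
    return errors
```

A request crafted by hand, bypassing the page entirely, goes through exactly the same checks as one submitted from a well-behaved browser.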