PHP Application Insecurity - PHP or Devs Fault?
somersault asks: "There have recently been a lot of people making jokes at the expense of PHP, but how many common security flaws in PHP are the fault of the language, and how many the fault of the developer? A recent Security Focus article (via the Register) has a brief discussion which suggests that PHP is no less secure than any other scripting language, and that it is the users of the language themselves who need to be educated. The other side of the story is that the developers of PHP should work on tightening up the language to make it more 'idiot proof' by default. Should the team developing PHP take a more active role in controlling the use of their language? What will it take to ensure that users of the language learn to use it securely, short of defacing every vulnerable website out there?"
and addslashes. quick, which one is SQL secure?
Saying that it's the programmers' fault for writing bad code is like saying being injured is the fault of a lumberjack for not knowing how to use a chainsaw which is dull and jerks a lot. It's much better to start with a tool that prevents such mishaps rather than being unsafe by default.
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
The problem is that so many neophyte programmers jump into PHP to create something visible and useful. Which they succeed in doing, more often than not, I guess. But without a proper background in security and proper practice, there's a ton of vulnerabilities that get created, accidentally, over and over again by every new PHP programmer.
The same can be said about any other language. Take for instance, C. Very easy to create working code that's vulnerable as hell. Is this the original author's fault? Of course not. I'm sorry that whoever chose to write a webapp in PHP is ignorant of basic security principles, but it's not up to the coders of PHP to protect us from ourselves.
Take 100 programmers selected randomly, and instruct them all to write a given application, but have 20 of them write the code in PHP, 20 write the code in Python, 20 write the code in Java, and 20 write the code in C++, and 20 write the code in Perl. Then analyze the resulting code.
http://outcampaign.org/
I mean, why can't we all just write our code in assembly language and get it over with?
The fact of the matter is, that a programming language is a productivity tool. It is supposed to enable the programmer to more simply express complex actions rather than having to deal with all of the low-level particulars.
PHP advertises itself thus: "PHP is a widely-used general-purpose scripting language that is especially suited for Web development and can be embedded into HTML." So, PHP claims to be "especially suited for Web development". Given that one of the primary concerns of web development should be security, I would expect that the language, and the core libraries that are packaged with it, would promote and encourage safe programming practices.
So, should the language be "idiot proof"? No, not necessarily, but it should certainly make secure programming hard not to do.
A good example of this approach is that taken by the OpenBSD project when it redesigned some of the low-level C library string manipulation functions to make them "more secure" in that they eliminated the programmer's ability to make certain, common, mistakes.
I don't look at this as a "stupid" versus "smart" issue. It's a "does my programming language help me do X or not?" issue.
So, stop blaming the programmers and find ways to make their already busy lives easier.
I kind of agree with where you are going, but I would add the following point:
SQL Escaping is evil.
Why?
Because no user input should ever be executed. EVEN if it is escaped. The problem is that the escaping can be invalid and buggy and thus, insecure.
People should use parametric SQL statements. No excuses. In this manner, no escaping is ever necessary.
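To make the difference concrete, here is a minimal sketch, not anyone's production code. It is written in Python with the standard-library sqlite3 module rather than PHP, purely so it runs without a database server; the `users` table and its rows are made up for illustration. It runs the same hostile input through a concatenated query and through a parameterized one.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

hostile = "nobody' OR '1'='1"  # classic injection attempt

# Unsafe: the input is spliced into the SQL text and changes the query's logic.
unsafe_sql = "SELECT name FROM users WHERE name = '" + hostile + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Parameterized: the input travels separately from the SQL text,
# so it can only ever be compared as a value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()

print(len(unsafe_rows))  # the injected OR clause matches every row
print(len(safe_rows))    # nobody is literally named "nobody' OR '1'='1"
```

The parameterized version never needs an escaping function at all, which is the point being made above.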
A separate issue is what to do about displaying user input. Here, things are more problematic, especially in the world of HTML. What would be nice is if we all got together and redesigned "the web" so that user input could be handled in a manner similar to parameters in SQL.
Obviously, there's a difference between data in tables and data in a formatted page. But I'm sure something could be done.
mysql_escape_string and mysql_real_escape_string should both work (assuming you're using MySQL, anyway), but the former is deprecated as of PHP 4.3.0 in favor of the latter; it also does not respect the current character set setting.
If you look at the documentation for addslashes, though, it will tell you nice things like "An example use of addslashes() is when you're entering data into a database", even though there are special characters that it does not escape that can be used for SQL injection.
My beef with PHP is that it's full of junky functions like mysql_escape_foo() in the core distribution, main namespace, which don't even have a hint of data verification in 'em. I hear there's a neat database abstraction layer in PEAR, it even has prepared statements. But I'll wager there are plenty of PHP developers who haven't even heard of PEAR. Somehow, though, Perl seems to have managed to put together a decent standard distribution without this sort of mess...
The World Wide Web is dying. Soon, we shall have only the Internet.
Trick question.
None of the above, use bind variables instead.
Advanced users are users too!
Erm... "Upload.php" is a logical thing to call a script which accepts an upload... it doesn't necessarily mean they all do the same thing. Ever noticed that there are an awful lot of C functions called "char** parse(char* s)" even though many programs have errors in their parse function?
PHP pretty much invites you to be insecure with MySQL. They ship with this tempting mysql_query() that takes as an argument... a single string. (well, and a connection ID). To get something in there, you need to do something like mysql_query("select * from foo where whatever = '$var'") -- and remember to have $var properly escaped. PHP does not give you a pretty library with prepared statements, parameter binding, and such. There's a nice DB and MDB2 package available on PEAR, but PHP doesn't ship with those. It ships with the compile option --with-mysql.
Perl ships with a fair amount of stuff. It ships with a package named DBI. You can do things like $rv = $sth->execute(@bind_values);. The documentation on it starts off with a convenient set of good examples which go like:
$sth = $dbh->prepare("SELECT foo, bar FROM table WHERE baz=?"); $sth->execute( $baz );
You can write code in PHP that's perfectly secure, you can do just about anything in PHP you could do in Perl (props for being Turing-complete, I guess), and yes, it ultimately is the developers' responsibility to secure their applications, not PHP's. That doesn't change the fact that PHP is an ugly mash-up of a language with Bad Choices just lying around in a scrap heap on the ground begging to be used. It's just about as organized as a scrap heap, too... (insert generic rant about naming conventions, parameter ordering, and such).
The problem here is that if you cannot depend on the framework for SOME stuff, why are you even using a framework? That's as if in Java or .NET you had to constantly worry about memory leaks (you actually do, to some light extent, but that's beside the point), and when someone complained about the framework not handling them, people would go "don't blame the framework, blame yourself!". The framework is supposed to handle these things.
.NET 1 and 1.1 had a very well known flaw of this kind. The datagrid, when a column was configured as invisible, would still render the HTML for the data in that column, but simply not display it. This allowed the data to be seen in the source, but not on the actual page. This led several developers to hide columns to have secret data in memory to work with on the server side, thinking the user had no access to it. Of course, a GOOD programmer would think of that and use a different method to hide the data securely. That doesn't change that it was an insecure and poor design choice in the .NET framework, and it was fixed in .NET 2.0. So yes, the framework was to blame. Same with PHP's issues. And they are severe. The community, however, makes them 10x worse than they should be.
The absolute worst thing ever in PHP is how, until recently, SQL injection could happen because there was incredibly poor prepared statement support. Good frameworks encourage the use of prepared statements to the extreme. It was possible to use them in PHP4, but certain extras had to be added, and it was rare to hear about them in tutorials, etc. (thus the blame was also greatly on the community). This, along with the far too common default setting of mapping POST variables directly to variables, were major things that I definitely think CAN be blamed on PHP and its community.
The answer is yes. Obviously, developers are ultimately responsible for writing secure code, but that doesn't mean we can't damn programming languages that fail to encourage good coding practices. I'm including libraries and official tutorials in this.
Fact of the matter is, real security comes from having many layers. Having a programming language that directs you to safe practices and actively prevents you from creating unsafe code is the first line of defense. Yes, the programmer needs to educate him or herself on how to write secure code, but given that people are not perfect, the language should have a safety net.
There's a reason that we've moved away from languages such as C, except when necessary.
And from what I've seen, PHP has really encouraged bad programming practices. Preferring escaping SQL strings instead of proper parameterized queries, register globals, etc.
I've audited quite a lot of PHP, written an article on PHP security from the hacker's perspective, and done quite a lot of PHP development, and I've never come across a security problem that you could blame the developers for!
It's always the developer assuming something about PHP or the PHP environment but getting it wrong; you can argue that the developer should know, but there are so many gotchas in PHP, you have to be an expert to be aware of them all. (I've listed some in a previous post on /., and won't repeat myself here.)
This isn't right for any language, but a language which web applications run on?! The most hostile environment to develop for is not the place for a language that makes it so easy to trip up!
The fault, for the vast majority of PHP security problems, is completely Zend's. Zend needs to give security priority over backwards compatibility, and get rid of all of their problems that developers repeatedly trip up on.
// MD_Update(&m,buf,j);
almost every php book or tutorial I've seen does incredibly dumb and insecure things... creating sql queries by concatenating strings without escaping input data, not using htmlentities, using global variables... that's an sql injection or xss problem waiting to happen.
PHP makes it easy to do dumb things and harder to do the correct thing. There's a low barrier to entry, so many php "programmers" don't know what they're doing. (this also applies to javascript...).
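For the output-escaping half of that complaint, a small sketch may help. It is in Python rather than PHP so it can be run as-is; `html.escape` from the standard library plays a role similar to PHP's htmlentities/htmlspecialchars here, and `render_comment` is a hypothetical helper invented for this example.

```python
import html

def render_comment(user_text: str) -> str:
    """Hypothetical helper: wrap user text in a paragraph, escaping HTML metacharacters."""
    # quote=True also escapes " and ', which matters inside attribute values.
    return "<p>" + html.escape(user_text, quote=True) + "</p>"

hostile = '<script>alert("xss")</script>'
rendered = render_comment(hostile)
print(rendered)
```

The browser then shows the markup as text instead of executing it. Escaping at output time, every time, is the habit the tutorials complained about above tend to skip.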
Do you even lift?
These aren't the 'roids you're looking for.
C gives you enough rope to hang yourself with.
PHP gives you lego bricks. Most PHP users, for some inexplicable reason, try to eat them and choke.
PHP does not give you a pretty library with prepared statements, parameter binding, and such. There's a nice DB and MDB2 package available on PEAR, but PHP doesn't ship with those. It ships with the compile option --with-mysql.
Perl ships with a fair amount of stuff. It ships with a package named DBI. You can do things like $rv = $sth->execute(@bind_values);. The documentation on it starts off with a convenient set of good examples which go like
$sth = $dbh->prepare("SELECT foo, bar FROM table WHERE baz=?"); $sth->execute( $baz );
PHP 5.2 ships with the PDO (PHP Data Objects) extension, which can run with your example code provided you load the extension in your php.ini (yes, I know it's a setting that is not enabled by default, but that argument doesn't hold water with PHP 5 anymore). PDO also supports the prepared statements and parameter bindings of which you speak, and along similar lines, you can also do transactions. You should be clear about which version of PHP you're referring to, as PHP 4.4 is no longer considered the main release and also has not been updated since August, while PHP 5 was last updated in November.
Though I can still agree that not all the choices made in the development were the best. AFAIK, every language has human developers and humans are not perfect, but we do do the best we can and have to continually aim to improve ourselves and the work we create.
I don't mind prepared statements when they are useful, but they don't always work properly. And actually there are many cases where, by using them, you lose power. Let's start with a simple example, the LIKE clause:
SELECT * FROM titles WHERE notes LIKE ?
For the unfamiliar, the LIKE clause allows partial searches over strings (char/varchar in the SQL world). The LIKE search-string syntax is something of a simplified regular expression, which means that characters that usually have one meaning gain another. For example, the percent sign becomes a wildcard (think DOS/bash filename matching with '*', or regexp with '.*'). To find all strings starting with 'word', we would just search for 'word%'. Great, but how does a prepared/bound statement know whether a given percent sign is to be escaped or not? It doesn't. So you end up doing your own parsing of user input. You are back to square one: you still need to parse user input, so what's the point of a bound/prepared statement?
Another example is using the power provided through a full-text index. Generally, string searching is slow. In the SQL world we build an index, a cache to speed up lookups. Strings have indexes, but that only speeds up searching for strings that start with something (like LIKE 'word%' in the example above). What if we want to search for something purely inside the string? Then we could do LIKE '%word%', but that's slow. On the other hand, we could speed this up by various smart caching and indexing of the contents of the string. This smart indexing we call 'full text'. For example, to see if a column contains some word or phrase we could just do
SELECT * FROM myData WHERE CONTAINS (column, ?)
all ok, right? NOPE, because it also could be :
SELECT * FROM myData WHERE CONTAINS (column, 'FORMSOF (INFLECTIONAL, ?)')
To explain slightly, the second example tries to find words that are not exact but very close. So for the word 'good', another word 'best' could be used as an alternative (with a lower relevancy ranking). Great power? Yes, but the first time the SQL expects the query in the form CONTAINS ( notes , ' "word" ') (notice the single and double quotes), while later it's CONTAINS(notes, 'FORMSOF (INFLECTIONAL, word)') (notice, no quotes allowed)...
and don't even get me started with the
SELECT * FROM myData WHERE column IN ( ? )
The IN clause is a speed-up over a series of OR statements. I could write WHERE column = 1 OR column = 2 OR column = 3, or I could just do it with WHERE column IN (1, 2, 3). And now the question for the binding gurus: how do I do it with prepared statements? Do I create a loop and both generate the SQL and fill a flat array with the right number of parameters, WHERE column IN ( ? , ? , ? , ? ), or do I just send arrays within arrays?
SECOND : parameter binding through naming :
Can't wait for when parameter binding can be done in a templated fashion, so that the order of the columns no longer matters. Currently, the way you fill a prepared statement with data depends on the order of the data. It all should be done with associative arrays.
$sth = $__db->prepare ( "select * from myData where cond1 = ? and cond2 = ? " ) ;
$res =& $__db->execute ( $sth , array ( $userInput1 , $userInput2 ) ) ;
it should be done more like
$sth = $__db->prepare ( "select * from myData where cond1 = ?userInput1 and cond2 = ?userInput2 " ) ;
$res =& $__db->execute ( $sth , array ( "userInput1" => $userInput1 , "userInput2" => $userInput2 ) ) ;
There is no special need to input more: if you want to use the first method, just pass a non-associative array and the library should know to handle param binding the old way. But for any larger query, with dozens of parameters, this will be a big boon in readability.
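Some database interfaces do already bind by name. As a hedged illustration only (Python's standard sqlite3 module instead of the PEAR-style calls above, with a throwaway `myData` table invented for the example), named placeholders take a dictionary, so the order in which values are supplied stops mattering:

```python
import sqlite3

# Throwaway table mirroring the column names used in the comment above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myData (cond1 TEXT, cond2 TEXT)")
conn.execute("INSERT INTO myData VALUES ('a', 'b')")
conn.execute("INSERT INTO myData VALUES ('a', 'c')")

# Keys are matched to :name placeholders, so dictionary order is irrelevant.
params = {"userInput2": "b", "userInput1": "a"}
rows = conn.execute(
    "SELECT cond1, cond2 FROM myData "
    "WHERE cond1 = :userInput1 AND cond2 = :userInput2",
    params,
).fetchall()
print(rows)
```

PDO, mentioned elsewhere in this thread, accepts the same `:name` placeholder style with an associative array.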
I used to work with the Zend team and they seem determined to pander to the lowest common denominator of hobbyists and not allow the language to grow up. Things like nested classes and strongly typed variables, which should have been implemented in the latest version, are strongly fought against. These things, as well as others, would help enforce good coding standards. But I have been told by the Zend developers themselves that they like to leave it up to the developer to code badly, and to me that makes the language just as much to blame. I think the industry has established by now what are good programming habits and methodologies and what aren't.
This is my sig. There are many like it but this one is mine.
When you're using like statements, you will have to pre-process things, yes. Most notably, escaping % and _ plus any other rules you want to implement (* to %, ? to _, explode on spaces with multiple LIKE statements to search on keywords, etc...).
Parameters are intended for user input. I certainly hope you aren't allowing users to type functions in directly...
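A rough sketch of that pre-processing step, assuming a backslash as the escape character and an explicit ESCAPE clause. It is Python with sqlite3 rather than PHP so it runs without a server, and the `escape_like` helper and the sample rows are invented for the example. The wildcard the application adds stays live, while any % or _ typed by the user is matched literally, and the whole pattern still goes through a bound parameter.

```python
import sqlite3

def escape_like(term: str, esc: str = "\\") -> str:
    """Hypothetical helper: neutralise LIKE wildcards in user input."""
    # Escape the escape character first, then the two wildcards.
    return (
        term.replace(esc, esc + esc)
            .replace("%", esc + "%")
            .replace("_", esc + "_")
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE titles (notes TEXT)")
conn.executemany(
    "INSERT INTO titles VALUES (?)",
    [("100% wool",), ("100 percent wool",), ("wool",)],
)

user_term = "100%"                      # the user means a literal percent sign
pattern = escape_like(user_term) + "%"  # the application adds the prefix wildcard
rows = conn.execute(
    "SELECT notes FROM titles WHERE notes LIKE ? ESCAPE '\\'", (pattern,)
).fetchall()
print(rows)
```

Without the escaping step the pattern would be `100%%`, which also matches "100 percent wool".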
As for IN, I build up the placeholders using something like...
$placeholders = array_fill(0, count($search_params), '?');
$placeholders = implode(', ', $placeholders);
$query = "SELECT last_name, first_name FROM patients WHERE disorder IN ($placeholders) ORDER BY last_name";
Then bind the parameters when running the query. (I use ADODB for PHP.)
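For anyone who wants to see the bind step end to end, here is the same idea as a runnable sketch. It uses Python and sqlite3 instead of ADODB for PHP, and the `patients` rows are made-up sample data. Only the placeholder list is generated as SQL text; the values themselves are still bound.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (last_name TEXT, first_name TEXT, disorder TEXT)"
)
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [
        ("Adams", "Ann", "asthma"),
        ("Brown", "Bob", "gout"),
        ("Clark", "Cy", "flu"),
    ],
)

search_params = ["asthma", "flu"]

# One "?" per value; no user data ever enters the SQL string itself.
placeholders = ", ".join("?" for _ in search_params)
query = (
    "SELECT last_name, first_name FROM patients "
    f"WHERE disorder IN ({placeholders}) ORDER BY last_name"
)
rows = conn.execute(query, search_params).fetchall()
print(rows)
```

An empty `search_params` list would produce `IN ()`, which many databases reject, so that case usually needs its own branch.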
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
For one of the servers I worked on, this was the syntax for full-text search. You would do CONTAINS ( column , param ). The argument param was a string that contained additional properties for the full-text search engine. One could add things like weights associated with words and phrases (hence the double quotes), or ask to search for word variations (a search for 'good' also matches 'best', since they are related). Of course, this was all happening in one string, that param, so you had to, yet again, format your own string.
I am not advocating against using parametered sql calls, actually they are great, but I fear that on some level they are not much better than the magic_quotes=on, I fear as if they were an escape for lazy developers : use always, and your code will be unhackable. That was the premise of magic_quotes, it made developers feel safe, as if magically their code was unbreakable.
Now, for stored procedure calls, especially with parameters that double as both input and output, parameter binding is the only way to go.
Cheers
The arguments:
PHP is secure as in it has the functionality to make secure sites.
PHP is insecure in that some of this is not implemented from the get go.
PHP is flexible as it does not force security on you - if for any reason you are running in an isolated environ or implementing something different attached to PHP.
By not being as strict in variable typing, etc., there are some things that can be done more directly in PHP than in other languages. Though it could cause hidden errors in good code as well.
There is stuff that can be fixed. Zend should get some of the hard housecleaning done (magic quotes, register globals, etc.) in a version # release (those who can't upgrade can stick with 4 or 5, etc.). Though you then need to get the ISPs to upgrade and all the legacy scripts...
ASP, Java, Perl and Ruby people would like to see more stuff in their languages than in PHP (and will FUD PHP to promote their cause, good or bad).
I chose PHP because:
- it is on most webhosts and distro installers
- a lot of great code and/or projects are readily available in PHP.
- the language does everything I require and then some
- the syntax is VERY easy to read and understand - this includes my own code as well as learning from others.
- it is platform agnostic (no lock-in)
- it is not limited by licensing (if open source, which is ok for me) or vendor-control code restraints
- it works with many platform agnostic DBs also
- even the security issues are well documented and understandable, and they teach you a lot more about web security than languages that just do it for you (or that you assume are secure).
So for me I know the drawbacks and I see the benefits, and the benefits are worth the extra effort.
In summary, I see that it has worthy merits and also "warning labels" (as this slashdot post illustrates). The devs will make up their own minds on using it; get over it.
"Enjoy what you're doing! If it becomes drudgery, you're doing it wrong!" - Jim Butterfield
The first web scripting language that I did any sort of serious development with was Cold Fusion 1.0, late last century. It was simply amazing how much quicker the development of database-driven websites became. (Prior to that, I was compiling custom DLLs to load into IIS - or whatever it was called way back then.)
I very quickly made a whole series of small web applications to access our internal data - something that I later found out was called an "intranet".
Then, one day, when I was testing a form, I heavy fingered the single quote ' and the enter key on an input box and got some surprising results! The SQL statement got completely destroyed by including the quote in the input box. I actually thought this was fun, and typed in additional SQL to see if I could change the query. It was easy! I made the query do all kinds of weird things. This got me toying with forms that actually did inserts and adding random stuff to the query string and I realized how trivially easy it was to completely subvert all the smallish applications I had written. Thank the Lords of Kobol it was an internal site!
At any rate, I learned, in my safe sandbox, that securing a web application is not trivial and is something you have to think about from the moment you sit down to code. I developed a bunch of functions to verify the existence of, escape, and validate every single piece of data that is ever passed from the UI to the database. You just have to do it, it's that simple.
Since those early days, I've done sites in Cold Fusion, ASP, JSP, PHP, Perl, WebCatalog and a couple of other oddballs, and I've always started by translating those functions to the new language, using the built-ins of the language when I could. You know what? All web scripting languages that are easy and powerful wind up being insecure in the hands of an inexperienced developer.
And let's be honest, if there was a secure and easy to use web scripting language, we'd all hate it because it tied our hands too much and made us do things a certain way. We, serious developers, love languages that let us do things the way we want to do them. Assembly developers feel confined by C, C++ developers feel confined by Java; HTML hand coders feel confined using Dreamweaver. So honestly, if they came out with SecurePHP, largely not backwards compatible for one thing, would anyone use it?
I know I'd WANT to, in theory, but would I? Would you?
Oh man your comment really bothers me. If you are relying on a PHP function to ensure user submitted data is trustworthy then you don't have PHP to blame if something goes Ka-blam due to a malicious user. I don't care what language you're coding in... if you trust user-submitted data without putting it through multiple rigorous tests, then you have nobody to blame but your naive self.
mysql_escape_string and mysql_real_escape_string should both work [...] but the former is deprecated as of PHP 4.3.0 in favor of the latter; it also does not respect the current character set setting.
Does anyone else not see a problem with this? Oh first we had addslashes but a lot of people complained, then we added mysql_escape_string but we decided it didn't work (for whatever reason) so now we have mysql_real_escape_string so people should be happy now. Oh and we have a magic_quotes variable you can set to automatically do this for you, but it might not be enabled on every instance of php.
And then we have:
PEAR::DB is a nice database abstraction (somewhat like Perl DBI), although it's been superseded by PEAR::MDB2. PHP 5 has native PDO, which is also like DBI or DB or MDB2, but each one has a slightly different syntax.
<sarcasm>Wow, these PHP developers really make it easy to do something simple like query a database! </sarcasm>
First I have a problem with lack of namespaces. Yes, you've heard it before but the above illustrates why it's a problem. If I instead had two libraries, mysql_escape and mysql_escape2 (bad names but bear with me), I could now have them use the same function names so I don't need to have mysql_escape_string and mysql_real_escape_string. To upgrade, I just change what library I include and I'm done. Having all these functions always accessible creates an inconsistent naming of functions.
I currently program in PHP as my real job.... I rarely use it in my personal web based projects preferring python or Perl (Possibly looking into Ruby at some point) because I've come to really dislike the language. However I also don't think it's as bad as some people make it out to be.
Security is primarily about education and not the language. I've been deploying public PHP applications for clients for years. In the early years problems were more abundant (register_globals, etc.), but in the later years (PHP5), the storm has calmed and common practices and patterns have been discussed, encouraged, and implemented so thoroughly that anyone making common mistakes these days simply hasn't educated themselves adequately.
And this isn't just the fault of the developer. Unfortunately there are too many resources and options available, all of which have differing and conflicting methods for accomplishing something. Letting an uneducated developer decide which option to pick, I would agree, is not desirable.
But let's be clear on something: I design, build, and deploy enterprise-grade PHP applications for multi-million dollar projects. If there's a security problem discovered, it is my or my team's fault that we didn't protect against it. It's my responsibility to be educated enough to diagnose and prevent security threats in an application. I cannot say to the client, "PHP is inherently insecure", and expect that reason to fly and absolve myself of all responsibility.
I clearly do not understand why this excuse is the predominant argument here. "PHP is inherently insecure" is simply not true. PHP certainly doesn't encourage proper programming practices from the beginning, but by the same token, I can't recall a programming manual that doubled as an education tool in design and security practices that, combined, allowed me to write bulletproof code from the very beginning.
For he today that sheds his blood with me shall be my brother.
PHP is an awful language, doesn't scale
How many times has this been said, and how many times do people need to point to examples like Wikipedia, YouTube (partially), Yahoo, Google, Facebook, and many more for proof of scalability?
And if you mean PHP doesn't scale architecturally, then you've demonstrated that you've never worked in an environment that did effectively scale PHP, or you simply failed at it. I'm going to guess both.
I'd argue PHP is actually worse than C, since C at least behaves consistently and doesn't depend on the settings in some .ini file to get reasonable behaviour. But even if PHP is "no worse than C", that's still incredibly bad for a language designed specifically for web development. C is dangerous because it's portable assembly. PHP has no excuse for being dangerous; it was designed specifically for a security-sensitive task in an era where exploits had already become commonplace. The idea of exploiting software was quite foreign in the early '70s when C was born.