Anatomy of a SQL Injection Attack
Trailrunner7 writes "SQL injection has become perhaps the most widely used technique for compromising Web applications, thanks to both its relative simplicity and high success rate. It's not often that outsiders get a look at the way these attacks work, but a well-known researcher is providing just that. Rafal Los showed a skeptical group of executives just how quickly he could compromise one of their sites using SQL injection, and in the process found that the site had already been hacked and was serving the Zeus Trojan to visitors."
Los's original blog post has more and better illustrations, too.
Persistence is just a bad idea, it hides the real performance issues of how databases work, and limits how you can easily manipulate the data. A better idea is just to always use bind variables. Problem solved.
I still have more fans than freaks. WTF is wrong with you people?
I for one am sick and tired of these types of attack. Whoever, in their right mind thought it was a good idea to expose SQL query inputs on the Web?
Ever heard of input sanity checking? It was very popular in the say, 60's, 70's and 80's. It means you reject fields you don't expect to be there, instead of arbitrarily passing them onto the backend database. These types of attacks illustrate what is wrong with web security: developer convenience trumps common sense everytime...
Next time we see Ballmer hopping along shouting developers, maybe could he please add the words 'SECURITY BY DESIGN', please, pretty please?
SQL injection attacks are asinine because they are so prevalent, easy for the hackers AND easy to fix. We should name and shame every site, and every web-application stack that allows these attacks to take place.
nuf said.
A more simple way is to use a parametrized statement. No extra libraries if you are using Java, .NET, or PHP5.
Comment removed based on user account deletion
> Still upsets me how many developers are anti Stored Procedures
;) ) .
Using stored procedures is harder than just creating the SQL query.
You need to know even more about the DB.
Even it's harder if you have to get the DB guy to do stuff for you.
Yes you can run your own DB in development, but in the production environment, you may depend someone else (DBA) to set up all those stored procedures so that stuff works.
In contrast on sane programming languages using explicit SQL queries with "bind variables/parameters" can actually be EASIER than using SQL and doing the quoting for each variable yourself.
Such stuff used to be hard to do properly on PHP+MySQL (example - there's boneheaded stuff like register_globals, magic_quotes and also mysql_escape_string() vs mysql_real_escape_string(), good thing there's no mysql_the_actual_real_escape_string_this_time_no_kidding()
And many hosting sites still are stuck with the "old PHP" ways.
If your code is running at the correct privilege level, SQL injections should be completely irrelevant.
If your user is connecting with the correct credentials, they should only be able to see public data and their own records, nothing else.
This is implemented by using views in the database, and only allowing users rights to views, not tables.
If your website user is connecting with credentials that allow a crafted SQL query to see priveleged data, you have set everything up wrong
If you have set everything up correctly, even a successful SQL injection will only return data the user can see
Note: I even admit in my profile I'm a bad web developer.
I have JFGI, but most of the stuff I've found leads me to articles I don't fully understand how to implement. I mostly code simple websites for my school and friends that have little db interaction, but I'd rather learn to do it right from the beginning, so if anyone has some links to good articles for beginners to understand how to properly secure their SQL code, I'd be happy for the help.
An important change for education.
Unless you are trying to put Chris O'Connor into your database, and his name must be spelled correctly...
A house divided against itself cannot stand.
What about "The code was written and deployed when PHP3 was new and noone will give me any money for fixing it and probably blame me if anything breaks."?
I have written a lot of web stuff from 1995-2000 (first in C using Tom Boutell's cgic, then in perl, then in PHP) and yes, quite a bit is still in use today. What should I do about all those old vulnerabilities lying around (keep in mind that I mostly quit web development in 2001)?
learn from Scotty. always double your estimates... Especially when they ask for an honest estimate.
I'm up to a multiple of 16 now.
Deleted
Quick answer: A lot.
Long answer:
You are mistaking escaping with sanitising. These are two very different things.
Sanitising should occur as soon as possible, before the values are used. It involves validating and optionally filtering _each_ field, so that you know the data you are getting is exactly what you are expecting it to be. This is a lot of work, which is why a lot of people skip it, hence the large number of vulnerabilities in the wild. I suggest looking into libraries like Zend_Form to help with this.
Escaping on the other hand, is done just before the variable is used. This is because different output formats have different escape sequences. E.G for SQL you would use named variables and let the engine handle the escaping for you, but for HTML you would use something like htmlspecialchars().
Both sanitising and escaping are required for a secure application.
Use addslashes on all variables and make sure you set the character set and you are done. Go to sleep.
You define it up front in the project and stick to it.
Look at the skill set of the team. If you don't have a strong database guy, your logic will probably be in the app layer.
If the database is shared between apps/services/etc., then more logic needs to be enforced there. Data integrity triggers to prevent bad data from getting into the database from any side. Access to tables going through stored procs.
If you have to debug, you work through it regardless of where it is. Just like testing anything with multiple layers (gui, app layer, remote web services, database code, etc.), you test each layer individually but using the same call as a collective test. Eventually, you will isolate the layer with the issue. Dig in deep and root out the problem.
Your post is quite good. In fact, I'm sitting in the corner, thinking that (1) enhances the user experience, but does not by any means decrease any chance of SQL injection. Anyone with Firebug can overcome client side validation. (4) Is the essential point where the data is actually persisted. You can perform as many contortions as you want with your data, but only what is stored there will be retrieved. So it makes sense to start with making sure (4) is correct, because any manipulation on the data has the ultimate objective of being persisted in the database.
Let's hear from a grown-up then what is the role of (2) and (3), please, please. I've not yet made mysellf a strong opinion on protecting these layers, but seems to me that they are much less important than (4). And from a security perspective, (1) is completely irrelevant.
Stored procedures are not the cure-all for everything. They are good if you have only a few ways of doing things, but it's ridiculous to write a different stored proc for every single column that you want to sort by. Its stupid to write a new stored proc for every possible way of varying the query. Yes they do guaruntee some kind of type checking and parsing compliance, but you can do that with a prepared statement as well. Dynamic SQL is a lot more flexible, especially when the number of stored procs would be combinatorial in number. You just have to be smart, and know what to do. Try converting your values to the types you want. Make your own parser if there is no other way, but for example, in the .NET world you can use ADO.NET with the typed parameters on text queries and it's every bit as safe and efficient as a stored proc. I'm not sure how well or not this translates to PHP and MySQL but I think the db module has most of the same stuff, if I recall correctly.
Speak for yourself.
Simply searching on google fo the tail end of the URL shows exactly which sites are vulnerable and the provider of the sites... Now the entire database of restaurants is open to attack. If the author was trying to teach their client a lesson or two (or 50)--well, good job...
Monitor bandwidth usage on IIS6 in real-time: http://www.waetech.com/services/iisbm/
Putting the logic in stored procedures allows ME, the DBA, the guy with the SQL know-how, to tune the gawd-awful query that you, the pointy-clicky .NET monkey, is using to bring my server to its knees. NINE left joins again subqueries, each with a GROUP BY, then another GROUP BY applied over the query as a whole? WTF are you thinking? Fixing your code requires a new build & deployment cycle. Fixing a stored proc, I can do that with a simple DROP/CREATE script.
Yes, I'm bitter. I'm surrounded by pointy-clicky types who insist on procedural thinking when writing queries. Set theory? What's that?
Took me 2 minutes with Google to find other sites that are apparently using the same crappy code with the same vulnerabilities. "inurl:" does wonders.
Do you have ESP?
At this rate, I'd rather roll my own. Individual calls to bindValue when we've got HASHES in our language? Inability to reuse parameter names (where x=:foo and y!=:foo)? Fuck no!
bindValues($_POST) and letting the prepared query decide what parameters it's actually going to be using, rather than dicking around with every possible combination of ways to search for something in my table? Hell yes!
Tell your people to hire real programmers (because real programmers know TSQL and set theory just as well as you do) and stay the hell away from my business logic, thanks. Worry about replication, administration, and index tuning like you're supposed to. (The "A" in DBA, ya know...)
In the rare case that I can't figure out how to make a specific query fly - I'll ask for your help - but rest assured that the code stays in my control.
This includes coders who insist Hibernate is the be all and end all of database interfaces. 'I don't need to know SQL well because I have just abstracted that whole icky database thingy.' Meanwhile you get both shit code (because they are all silver bullet use the latest technology and agile technobabble to write factory and strategy patterns everywhere when a simple 'if' statement would have worked well) and shit database performance. No I am not frustrated nor bitter at blind lemming-like stupidity. OK, well a little... (and yes, hibernate and other ORMs are good, when used in moderation).
-- I ignore anonymous replies to my comments and postings.
Actually that's a real problem when you have to maintain versioned systems.
You can't track versions of stored procedures as easy as you can do it within the code. At least I'm not aware of a fair method to do this.
Another way to phrase what parent is saying is that the database is a shared resource. It only takes *one* poorly written ad-hoc query to screw performance for everyone interfacing with the database.
Another difficulty is to database schema changes. Want to change to a partitoned table? Can't really do that when the DBA has no idea how all of the client apps are interfacing with the tables using dynamic SQL. I have seen too many situations where the end result was that the DBA made the change, and waited until applications started breaking to figure out how people were (incorrectly) using the schema.
No, the logic belongs to the app. The database is there to store data, how we are going to use it is app-dependent.
Be careful with generalizations. It's a matter of requirements not some universal law.
If it makes sense to put it in the app and you don't particularly care about performance or security (because you trust your users) then put it in the app. I would have a problem with letting an untrusted user execute direct queries against my database though - the app has to log in somehow to the database with a username and password, and it's never hard to extract the username and password, and so suddenly a malicious user has access to do pretty much anything they want. If the app can delete a single order then the user can delete all orders etc.
Some advantages of putting business logic in the database are:
1. code reuse - your windows app, linux app, web app, and mobile app can all re-use the same business logic and just have to focus on the presentation layer.
2. control - you get to control exactly what the app can do, audit it etc, because all access goes through your stored procedures
3. performance - as fast as your network is, you aren't going to beat having the code running in the database process itself
4. performance again - left to their own devices, users will construct horrible queries that will sap the very life out of your database. If they have to go through your stored procedures then they can't wreak as much havoc on your server.
But at the end of the day the requirements might be that only a few trusted users are going to use this app, all access is via the web, and they want it finished yesterday, and so suddenly none of the above advantages really matter so much...
I agree. Just like any regular program, input must be reduced to an EXPECTED set of values.
This is a good advice, but not when applied to this example. You do not want to restrict what users type for their name in any way (as a bunch of replies already point out, GP has missed a few obvious things... and then, what if I'm Russian or Japanese, and want to use my original alphabet?). All that needs to be done is properly escaping any unsafe chars (and for those, the set is known and well-defined); leave everything else be.