PostgreSQL 8.1.4 Released to Plug Injection Hole
alurkar writes to tell us that PostgreSQL released version 8.1.4 today in order to combat a security flaw allowing a SQL injection attack. From the article: "The vulnerability affects PostgreSQL servers exposed to untrusted input, such as input coming from Web forms, in conjunction with multi-byte encodings like (Shift-JIS (SJIS), 8-bit Unicode Transformation Format (UTF-8), 16-bit Unicode Transformation Format (UTF-16), and BIG5. In particular, Berkus says that applications using 'ad-hoc methods to "escape" strings going into the database, such as regexes, or PHP3's addslashes() and magic_quotes' are particularly unsafe. 'Since these bypass database-specific code for safe handling of strings, many such applications will need to be re-written to become secure.'"
whitelisting, not blacklisting, is a good idea. Stop trying to define a set of 'wrong' data. Define a set of good data.
Most of the PHP apps I've ever had the (mis)pleasure to peruse make liberal use of this type of "escaping" rather than calling the provided "escape_string" functions. That never made any sense to me, but the practice appears to be quite common.
Mismatches between different character encodings seem to have been responsible for vast swathes of security vulnerabilities over the past few years. The sooner everybody moves to programming languages and software that use Unicode natively, the more secure we will all be.
Unfortunately, the languages receiving the most attention for web development have abysmal Unicode support. PHP and Ruby haven't a clue, although the next version of PHP is supposed to be much better in this respect. Python developers can at least handle things fairly well, although it's still a bit of a pain in the neck.
This vulnerability is probably going to cause quite a few problems for people, as it's a client issue that will probably need whatever adapter you use to be updated. Here is the user guide to the vulnerability for PostgreSQL. psycopg should be fixed shortly.
Bogtha Bogtha Bogtha
heh, heh, heh... I'll plug your injection hole, baby!
Must....not....make....joke....about...injection hole...being plugged...
Damn, too late.
=\
That's why I prefer Postgre. Oh, wait...
Oracle and MySQL suffer from similar vulnerabilites when going UTF8 -> database charset. The "answer" in Oracle is to use UTF-16 on the backend and a select 8/16-bit encoding in the front end if you want to support multiple locales. I'm not sure what the implications are for MySQL.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
If someone wants to try and help me with this problem, let me know, but I had trouble getting mysql to insert (actually doing COPY since it's about 10x faster) anywhere near what I can get with PG8.
I can understand how SJIS and BIG5 are vulnerable.
But in a UTF-8 string, no single byte will match a single quote besides the single quote character (0x27).
It seems to me that simply inserting a backslash before every single quote and backslash in a given string will have the desired effect, and that UTF-8 is not particularly vulnerable to this problem. (quite by design- it was invented by none other than Ken Thompson)
Either that article is misleading somehow, or else the postgres developers are simply putting in some safeguards for common errors in things such as php scripts.
I've only recently begun playing with PostgreSQL coming from Oracle. I've also been primarily a Java (JDBC) guy for the last couple years. I'm not sure I completely understand where this vulnerability lies. Would a Java PreparedStatement be vulnerable to this? Would the Postgres implementation of JDBC use 'addslashes()' to bind variables in a prepared statement? Or is this a higher level function? (I have not come across it myself, but like I said I'm still pretty new to Postgres).
I guess I see "affects PostgreSQL servers exposed to untrusted input, such as input coming from Web forms" and wonder if they're talking about some further functionality where postgres acts like a web server. My understanding of PreparedStatements is that they are bound at a very low level in the db to allow for maximum speed through caching etc...
"By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus." --Russ Nelson
PostgreSQL defaults to SQL-ASCII encoding, which is unaffected by this particular attack. Only clients which connect using a multibyte encoding would be affected.
Actually, this really isn't a vulnerability in the database server itself -- the update just intentionally breaks certain badly written applications in order to protect them from themselves. If PHP's addslashes() ends up creating valid multibyte characters that produce unexpected behavior, that's really PHP's problem -- Postgres is just doing what it's told.
Multi-Layered validation is the only way to go.
Client validation is only useful for round-trip bandwidth reduction, it's nice to have, but not secure in any way. It can stop the occasional accidental bad input. (e.g. entering strings when numerical data is called for, pop up a message box telling you not to do that), it won't stop anyone really interested in corrupting your data.
The app server should be validating everything being posted to it. Is this string too long, too short, not a string, wrong encoding, etc...
The DB server should ALSO be validating everything coming from the app server. Don't trust your application server, it could have a bug, it could have been hacked, it might not be your app server, who knows. Strict stored procedures with no r/w access to tables is a really the only way to go. (To: My Co-Workers, Using select * queries and running as dbo and/or sa is usually a sign that you're not doing it right)
Yes, it's paranoid thinking, yes, it's more work and yes, there is a slight performance hit, but it is secure and it's damn hard to break.
People who don't use prepared queries doesn't deserve any better than having someone to fuck up your database!
Often such are combersome or impossible with dynamic query generation, such as Query-by-Example forms where the terms and sort options depend on user input.
Many "prepared" thingies also depend on positional parameters, which can get messy. Would you like to use and maintain a function with 19 positional parameters? That is what it can feel like.
Further, certain kinds of prepared statements seem to have the same flaw. The PS api may "check" the types, but still passes a string to the database such that if you can find a way past the syntactical type checkers, you can pull off the same thing.
I do wish that most database API's offered a "read-only" mode such that the query being sent to the database is designated read-only. That way the worse a hacker can do is grab table data but not change anything. If you have logins restricted only to necessary tables, then the damage exposure is further limited.
Table-ized A.I.
PostgreSQL ignored invalid UTF-8 sequences, meaning a ' character at the end of a incomplete sequence could cause only one ' to be seen by the parser when escaped.
See http://www.postgresql.org/docs/techdocs.50 for the details.
dtach - A tiny program that emulates the detach feat
... because counting out 500 question marks to figure out why the hell your parameters don't match up is MUCH more fun than being paged at 3AM because the entire production database was wiped out.
If I have been able to see further than others, it is because I bought a pair of binoculars.
"'By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus.' --Russ Nelson" --SaDan
Haida Manga
Well, I was thinking that it might be possible if you bunched enough of them into the same transaction, had sufficiently small updates, used a single prepared statement over and over, turned fsync off, and had HUGE amounts of memory, then, perhaps?
./81/bin/psql test
You don't need to do all that.
The command:
time ruby -e 'puts "BEGIN;"; for i in 1..35000 do puts "INSERT INTO a VALUES(#{i});" end; puts "COMMIT;"' |
Gives me about 7-8 seconds with write caching off, fsync on. Yes, I turned off write caching with hdparm. This is on my PC, with an el-cheapo IDE drive and a modest 1GB of RAM.
If you think about it, why should it take a long time? It's all one transaction. It's basically just writing the data to a file (the log). The table "a" is just an int field, obviously. And with postgresql's MVCC, an INSERT is very cheap.
Social scientists are inspired by theories; scientists are humbled by facts.
'He also notes that the addslashes function was deprecated in PHP 4.0 due to security risks, but a "distressing" number of PHP applications continue to use the function.'
How come the php documentation doesn't mention this?
In the Python DB-API, SQL strings look like:
You create a dictionary (hash table) with a key "baz", pass that dict to the database along with your query, and it fills in the blanks. Your job as the programmer is to make sure that dict has all the keys in it to complete the query; it doesn't matter which order you assign them or if you don't use them all.
In fact, a very common case is to create on dict with all the values you'll need to execute a whole list of queries, and just keep passing the same dict rather than redoing it each time:
It's about as easy as you can possibly make it and has no disadvantages that I've ever encountered. So, I'd take the position that it's better to protect the server and forget about old ideas like positional parameters. There are extremely programmer-friendly solutions to this problem if you know where to look.
Dewey, what part of this looks like authorities should be involved?
'"\'By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus.\' --Russ Nelson" --SaDan' --LearnToSpell