Slashdot Mirror


PostgreSQL 8.1.4 Released to Plug Injection Hole

alurkar writes to tell us that PostgreSQL released version 8.1.4 today in order to combat a security flaw allowing a SQL injection attack. From the article: "The vulnerability affects PostgreSQL servers exposed to untrusted input, such as input coming from Web forms, in conjunction with multi-byte encodings like (Shift-JIS (SJIS), 8-bit Unicode Transformation Format (UTF-8), 16-bit Unicode Transformation Format (UTF-16), and BIG5. In particular, Berkus says that applications using 'ad-hoc methods to "escape" strings going into the database, such as regexes, or PHP3's addslashes() and magic_quotes' are particularly unsafe. 'Since these bypass database-specific code for safe handling of strings, many such applications will need to be re-written to become secure.'"

162 comments

  1. This is why... by ArchAngelQ · · Score: 2, Interesting

    whitelisting, not blacklisting, is a good idea. Stop trying to define a set of 'wrong' data. Define a set of good data.

    1. Re:This is why... by jrockway · · Score: 5, Informative

      It especially bugs me because it's easier to Do Things Right. The DBI manpage for perl doesn't even mention the sloppy way that nearly everyone uses... but they do it anyway! In nearly every database application / script I look at, people do things like $dbh->execute("SELECT * FROM foo WHERE bar=$bar AND baz=$baz") after "escaping" $bar and $baz. No, no, no!

      It's much easier to prepare a query handle and then execute it as needed:

      $sth = $dbh->prepare("SELECT a,b,c FROM foo WHERE bar=? and baz=?")
      $sth->execute($bar, $baz);


      Not only is it more efficient (if you're going to use the same query twice), it's secure by default. Let the database programmers handle the Hard Stuff (parsing) so that you can concentrate on your application.

      Speaking of which, is there a way to do this in PHP? I've never seen a PHP script that did anything like this (which is probably why bugtraq is 99% php SQL injection holes).

      --
      My other car is first.
    2. Re:This is why... by onlyjoking · · Score: 2, Informative
      Speaking of which, is there a way to do this in PHP?

      PHP5's mysqli extension enables you to use prepared queries.

    3. Re:This is why... by Anonymous Coward · · Score: 0

      The postgres library for php does have a prepare statement.

      Preparation fails colossally in certain edge cases. (for instance perl's db handler at one point years ago didn't realize that dates of the form 2006-05-01 needed quotes. The database happily did the math and stored the beginning of 2000) In the end, it's merely pawning the problem off to someone else, in hopes that they did it right (which in the case of this postgresql upgrade, means your hopes failed you).

    4. Re:This is why... by jrockway · · Score: 4, Insightful

      > didn't realize that dates of the form 2006-05-01 needed quotes

      There's a difference between fundamentally flawed thinking and a bug. Someone found this bug, it was fixed in 30 seconds, and suddenly everyone using DBI had a more robust solution. Software's never going to be "perfect", but it's likely that a group of people trying to solve a problem in general is going to have a better product than something you came up with after lunch :)

      --
      My other car is first.
    5. Re:This is why... by Dwonis · · Score: 4, Informative
      Speaking of which, is there a way to do this in PHP? I've never seen a PHP script that did anything like this (which is probably why bugtraq is 99% php SQL injection holes).

      Most people probably aren't aware of it, but several years ago, I wrote a few short scripts for PHP 4 that specifically address this problem. Currently-supported database backends are MySQL and anything that DBX supports, but it wouldn't take much to adapt it to PostgreSQL.

      It basically lets you write code like this:

      require_once "mysqlext.php";
      $link = mysql_connect(...);
      $results = mysql_execute($link, "SELECT a,b,c FROM foo WHERE bar=? and baz=?", array($bar, $baz));

      It doesn't have the performance benefits that real prepared statements have, but I still find it handy for typical PHP4 database work.

      The code is released under the MIT license, so feel free to use it.

    6. Re:This is why... by Slashcrunch · · Score: 4, Informative

      For PHP, Zend_Db has a way of doing this which is very similar to the way you do it in Perl and Java. It's quite nice. There are other ways of doing this as well :) // get a Zend_Db_Adapter (basically a DB connection)
      $db = getConnection(); // the sql with a placeholder for a parameter called 'id'
      $sql = 'select * from Foo where id = :id'; // anyparameters are defined in the array. in this case, just 'id'
      $params = array('id' => $id); // send the query
      $result = $db->query($sql, $params);

    7. Re:This is why... by OnyxRaven · · Score: 2, Informative

      PEAR::DB supports almost the exact same method.

      $data = array('one',2);

      (short)
      $result = $db->query('select * from table where foo=? and bar=?',$data);

      (prepare)
      $stmt = $db->prepare('select * from table where foo=? and bar=?');
      $result = $db->execute($stmt,$data);

      Works with mysql, pgsql, mssql... etc etc. MDB2 is the new version of this library which uses much the same syntax. Uses database-specific escaping/quoting automatically.

      http://pear.php.net/manual/en/package.database.php

      --
      --onyx--
    8. Re:This is why... by a.d.trick · · Score: 2, Informative

      Yes, but properly escaping everything is at least as important. Whitelisting and blacklisting can't be used in a lot of situations (for example text fields) without causing problems. The easiest way to do that is to use an existing library that handles most of that for you. The more you have automated, the less room there is for human error. Unfortunatly, PHP coders tend to trive in reinventing the wheel.

    9. Re:This is why... by moro_666 · · Score: 1

      you didn't give the db a 'date', but a string. the api itself is flawed if the language can't differ strings and dates (because they are different types and dates can have special effects a'la runtime translation to other timezones etc.)

      the mysql `standard` until 5.x was to have crappy unescaped parameters all over the place, let's hope this will improve now. i'm running my own wrappers anyway which emulate the DBI or java.sql like syntax and db specific escaping, so i couldn't care less.

      --

      I'd tell you the chances of this story being a dupe, but you wouldn't like it.
    10. Re:This is why... by Anonymous Coward · · Score: 0

      Is there a way to use names instead?

      Can you have prepared statements that are something like "select name from t1 where id=%id%" and use an associative array so that you don't have to count the question marks? I could see myself inserting a new clause in the WHERE and then everything would be off by one. It seems like a giant leap back in computing to make it dependent on the position in the string.

    11. Re:This is why... by Badanov · · Score: 1
      You are talking about placeholders, which versions of DBD-Pg have fixed to require queries structured as in your "good" example.

      I ran into probrems upgrading to the newer DB-Pg module precisely because the new requirement is that placeholders for queries must be submitted as your new example shows, not as:

      $dbh->execute("SELECT * FROM foo WHERE bar=$bar AND baz=$baz"); or

      $dbh->execute("SELECT * FROM foo WHERE bar='$bar' AND baz='$baz' ");

      This can't be the same thing as the bug that was fixed, could it?

      --
      Dawn of the Dead
    12. Re:This is why... by DrSkwid · · Score: 1

      The PostgreSQL extended query protocol doesn't require escaping of any value because it won't parsed as part of an sql statement, making injection a non issue. The only sql that is parsed is supplied by the source code.

      People often only get as far as writing the simple query part of the protocol which takes a stream of potentially multiple commands mixed in with data supplied by potentially hostile third parties.

      I've witnessed it these past two weeks while sniffing postgresql clients while writing my own as an exercise in Limbo

      Postgresql also allows the use of named queries, who stay prepared for as long as the connection is open (unless deleted deliberately).

      I don't know how long the parse & prepare phase takes but I know it is not zero so there could be some speed improvement too.

      I'm not convinced libpq uses it to full advantage but I didn't do in depth research, just a nod to say that if you do serious postgresql programming you could well do with looking at writing your own protcol client.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    13. Re:This is why... by jsoderba · · Score: 1

      What's wrong with PEAR_DB?

    14. Re:This is why... by dkf · · Score: 1
      Is there a way to use names instead?
      Sure there is in SQL. "SELECT a,b,c FROM foo WHERE bar=:bar and baz=:baz;" is a piece of valid SQL that obtains the values to match against the bar and baz fields from some named context. What varies according to different drivers is what that named context is; apparently when using Zend_Db you bind using an associative array, and if you were using the Tcl bindings to SQLite you'd be using context variables. Check your documentation for how to do this with your code (and then wave goodbye to injection attacks on the database. :-)
      --
      "Little does he know, but there is no 'I' in 'Idiot'!"
    15. Re:This is why... by Colde · · Score: 1

      Actually with PEAR DB you can do _exactly_ that.

      PEAR DB however is not installed as standard, and is not installed on many webhosts. It's part of free PEAR library of good reusable code for PHP and is pretty easy to install if you are the admin of a server.

    16. Re:This is why... by kjart · · Score: 1
      Thanks for that. I must admit that as a relative perl noob I've done it the sloppy way before. The reason? I looked at the code of several peers and saw it done that way. I'm fairly sure that's how most sloppy practices spread - copying off of people who themselves copied off people, etc etc.

      This seems to illustrate the value of learning from a (good) book, etc rather than trying to learn things on the fly.

    17. Re:This is why... by rjstanford · · Score: 1

      It doesn't have the performance benefits that real prepared statements have, but I still find it handy for typical PHP4 database work.

      Hmm. How about this -- hash the string that's passed in. Use it as the key to a hashtable, the value of which is an integer that's incremented every time the function sees it. Once that reaches a certain threshold, prepare the statement normally but then store it in another hash table (keyed by the statement hash again) for reuse. You could eliminate the first step if you were pretty sure that you'd only have a limited number of queries and weren't worried about potential memory leaks.

      --
      You're special forces then? That's great! I just love your olympics!
    18. Re:This is why... by Anonymous Coward · · Score: 0

      PHP PDO does excatly this, other ways exist too ...

    19. Re:This is why... by caluml · · Score: 1
      I'm interested - why is it not good to:
      $foo = pg_escape_string("This is my 'blah$*& 'crazy string %^Q*&*&^Q");
      $res = pg_query("SELECT * FROM foo WHERE bar = '$foo' ");
      Or similar with inserts, etc.
    20. Re:This is why... by Anonymous Coward · · Score: 0

      Speaking of which, is there a way to do this in PHP?

      i use adodb's db abstraction class with php and it allows one to use bind variables. as an example...

      $db = &ADONewConnection('postgres');
      $db -> Connect($db_string,$db_owner,$db_pw,$db_name); //$db -> debug=true;

      $customer_name = $form->GetInputValue('customer_name');
      $customer_id = $db->getone("select nextval('t_customer_customer_id_seq')");

      $sql_insert = Execute($sql_insert, array($customer_id, $customer_name, $db->DBDate(time())));

      i agree - bind variables are a VCT (Very Cool Thing) TM

    21. Re:This is why... by Branko · · Score: 1

      I don't know how long the parse & prepare phase takes but I know it is not zero so there could be some speed improvement too.

      Preparation of SQL statement is potentially expensive and usually involves analysis (by DBMS optimizer) of many ways the statement can be executed (what indexes to use, and in what order, how to combine sub-queries etc.).

      This analysis can be more expensive then statement execution itself (especially for well designed SELECT statements).

      So if you execute your statement more then once, it is always a good idea to prepare it.

    22. Re:This is why... by jadavis · · Score: 1

      Is it possible to do that with JDBC? All I see in the PreparedStatement object is things like "setInt(int,int)".

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    23. Re:This is why... by DrSkwid · · Score: 1

      Of course, I was wrong about libpq. I've done them the service of actually reading about it again.

      PQprepare & friends

      Even php offers them.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    24. Re:This is why... by soliptic · · Score: 1
      $sth = $dbh->prepare("SELECT a,b,c FROM foo WHERE bar=? and baz=?") $sth->execute($bar, $baz);
      Humour a DB neophyte. How is this secure by default? I still don't see any checking of the contents of $bar or $baz, or if $baz contains "; DELETE *", don't you still suffer the usual problems?

      Or does the execute() routine automatically safety-fy the variables passed? In which case, how does it know what is and isn't safe in a particular context?

      I just don't get it :(

    25. Re:This is why... by jrockway · · Score: 1

      > I still don't see any checking of the contents of $bar or $baz, or if $baz contains "; DELETE *", don't you still suffer the usual problems?

      No, because there is no "quoting" going on. Instead of making a string that you'd type into the SQL shell, you're directly telling the database what the query is. If $bar is "'\\\'"\''\""/\//'""''\DROP database foo'\""''\\'', then the database will be told to search for that exact string in the database, not to do whatever that long thing means when you type it into the shell.

      The reason many programmers "prefer" the first way is because they do their test queries at the database console, and assume that the database is programmed in the same manner. It's not.

      The first rule of secure programming is "Don't quote." and if you do DBI the right way, you will never have to.

      Hmm, slashdot won't accept this post due to to many "junk characters". So I guess that gives me an opportunity to explain the second rule of secure programming -- "Don't parse."

      The problem with parsing is that you create ambigious situations. If you have a "key: valuekey2: value2" format, what happens when you want to have a value that's "value not a key: not another value"? Difficult situation, and solving it violates the first rule!

      Solution? Let something outside your config file separate the keys and values. For example, you can make a configuration directory and make each key a file. Then the filesystem separates your keys from your values. (This is what DJB does in qmail.)

      Hope this helps :)

      Gah! Still too many junk characters. No wonder there's never intelligent commentary on slashdot.

      --
      My other car is first.
    26. Re:This is why... by Directrix1 · · Score: 1

      just use adodb.sourceforge.net for php. You get syntax like the following:
      $res = $db->Execute('select name,age from people where sex=? and city=?',array($sex,$city));

      --
      Occam's razor is the blind faith in the natural selection of least resistance and in universal oversimplification. -- EF
    27. Re:This is why... by Dwonis · · Score: 1

      Probably nothing. I've just never used it.

  2. Widespread problems... by ByTor-2112 · · Score: 2, Informative

    Most of the PHP apps I've ever had the (mis)pleasure to peruse make liberal use of this type of "escaping" rather than calling the provided "escape_string" functions. That never made any sense to me, but the practice appears to be quite common.

    1. Re:Widespread problems... by DivineOmega · · Score: 1

      It's a lazy coding practice really.

    2. Re:Widespread problems... by Anonymous Coward · · Score: 0

      I was surprised to see the newsforge article tell me that addslashes() was deprecated, so I checked out www.php.net/addslashes and discovered no mention of this deprecation at all. Not even a mention of what it was deprecated BY. At least the HTML specs let you know when to stop using something, even if you have to buy $200 worth of "learn html in 30 days" books to figure out what you're supposed to use instead.

    3. Re:Widespread problems... by mortonda · · Score: 1

      PEAR::DB does this.

    4. Re:Widespread problems... by Jac_no_k · · Score: 2, Informative

      I recently switched from coding for single company to joining a consulting firm. I'm shocked at how sloppy the commands sent to the database are. It drives me nuts and makes me want to fix all the code... but since I'm low man on the totem pole, my concerns are replied generally with lame excuses like "emulate the coding style of the original author", "we don't get paid much, so it's okay to be sloppy", or "we have a deadline to meet". And no, I can't find a new gig.

      Some of the sites I've worked with are vulnerable to this type of injection attack. From my perspective, this is widespread, a bit scary, and should be nice little eye openner as sites get hacked. This may actually be a blessing as I could start pointing what happens with sloppy code and not being given enough time clean everything up.

    5. Re:Widespread problems... by ultranova · · Score: 2, Informative

      Loads of fun that worthless whore is.

      If a whore is "loads of fun", then she's not worthless, since she can propably get a good price once the word spreads. Just because you are trolling is no excuse to be illogical.

      Now let's see if someone mods me Insightfull or Informative...

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  3. Prepared Queries by Anonymous Coward · · Score: 0

    People who don't use prepared queries doesn't deserve any better than having someone to fuck up your database!

    1. Re:Prepared Queries by Tablizer · · Score: 3, Interesting

      People who don't use prepared queries doesn't deserve any better than having someone to fuck up your database!

      Often such are combersome or impossible with dynamic query generation, such as Query-by-Example forms where the terms and sort options depend on user input.

      Many "prepared" thingies also depend on positional parameters, which can get messy. Would you like to use and maintain a function with 19 positional parameters? That is what it can feel like.

      Further, certain kinds of prepared statements seem to have the same flaw. The PS api may "check" the types, but still passes a string to the database such that if you can find a way past the syntactical type checkers, you can pull off the same thing.

      I do wish that most database API's offered a "read-only" mode such that the query being sent to the database is designated read-only. That way the worse a hacker can do is grab table data but not change anything. If you have logins restricted only to necessary tables, then the damage exposure is further limited.

    2. Re:Prepared Queries by Anonymous Coward · · Score: 0

      I do wish that most database API's offered a "read-only" mode such that the query being sent to the database is designated read-only.

      Well, DUH. That's security basics really. If you don't want write access on a table then don't give it to the user! Postgres allows for such granularity.

    3. Re:Prepared Queries by Tablizer · · Score: 1

      I really meant at the API level (specific queries), not necessarily at the user level.

      The API would have to tell the database that a given query is to be read-only so that if a hacker sticks a DELETE in there, it will be rejected. But, there is no semi-standard mechanisms for such that I know of.

    4. Re:Prepared Queries by ComaVN · · Score: 1

      I think he means per-query permissions, so he doesn't have to use a separate user for select and insert/update queries.

      --
      Be wary of any facts that confirm your opinion.
    5. Re:Prepared Queries by gullevek · · Score: 1

      so, I even prepare those queries. Whats wrong with that. You are the programmer, you have to know what can come in and how to protect the system from wrong input.

      Weak excuses for not doing it right.

      --
      "Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
    6. Re:Prepared Queries by TheLink · · Score: 1

      Security is a low priority.

      Years ago I proposed to the HTML and browser people that there be a "no-active" html tag, that marks enclosed content as nonactive.

      Example:

      <safetyon lock="randomstring" allowed="keyword1,keyword2,keyword3" />
      potentially evil content from uncontrolled party - e.g. comments
      <safetyoff lock="randomstring"/>

      keywords could be "textonly" "basic-html" "java" etc.

      It seems everyone is more interested in "GO" tags. And nobody wants a "STOP" tag.

      It's like having a thousand accelerator pedals and no brake pedal. The only way to stop is to make sure none of the accelerator pedals are pressed ;).

      Maybe I should try again :).

      --
    7. Re:Prepared Queries by oncebitten · · Score: 1

      It's been a while since I've done database stuff (pretty much Oracle), but, how about:

      set transaction read only

      or

      selecting off a database view. we used to do this when we wanted to enforce read only. i know nowadays there's updatable views, but i imagine there's a way to use the old behavior (i'm just too lazy to look it up).

    8. Re:Prepared Queries by Anonymous Coward · · Score: 0

      Use two database handles; one for reading and one for writing. Coincides nicely with postgres clustering b.t.w. The reading handle is connected to a user who can see the same namespace as the writing handle, except the reading handle user can only read data. Cumbersome, but currently possible with postgres, oracle (using private aliases), and mysql (which has only one namespace per db anyway). But I agree that the dbh->begin_work() call could come equipped with some parameter indicative of my intended use of the transaction.

    9. Re:Prepared Queries by Anonymous Coward · · Score: 0

      that's a stupid idea.

    10. Re:Prepared Queries by jsoderba · · Score: 1

      CDATA sections already do that. But you can't rely on CDATA, because not all UAs support it.

    11. Re:Prepared Queries by Anonymous Coward · · Score: 0
      Even if they did, I could still use
      ]]>
      <script type="text/evil">
      bad things here
      </script>
      <![CDATA[
      in my payload. The only fix is to prevent untrusted content from containing STAGO, but that's so easy (just replace one character with a general entity) I don't have much sympathy for anyone who can't be bothered.
    12. Re:Prepared Queries by Anonymous Coward · · Score: 0

      Many "prepared" thingies also depend on positional parameters

      Others don't, Python's DBAPI2 for example specifies the ability to use both positional and named parameters, that or you use higher-level ORM layers that'll manage all that stuff for you.

    13. Re:Prepared Queries by indifferent+children · · Score: 1

      Postgres also supports "set transaction read only". Thanks for mentioning it, I never knew that it existed.

      --
      Censorship is telling a man he can't have a steak just because a baby can't chew it. --Mark Twain
    14. Re:Prepared Queries by poot_rootbeer · · Score: 1

      I do wish that most database API's offered a "read-only" mode such that the query being sent to the database is designated read-only.

      This is easy to do. Create a schema with SELECT-only privileges on the DB objects. Use that schema when connecting to the DB via your API for read-only actions.

      If there's other circumstances where you need other DML commands to work, create separate schemae for them. If your data integrity is that important that you need every possible defense against SQL injection attacks, the overhead of closing and reopening connections to the DB occasionally is a small price to pay.

    15. Re:Prepared Queries by ummit · · Score: 1
      I do wish that most database API's offered a "read-only" mode...

      Nice idea, although there's plenty of danger even in read-only-land. For example, the example at http://shiflett.org/archive/184 is of an attacker who tricks a login scheme into letting him log in without a password, and the query there is merely a read.

      I agree that prepared statements can be cumbersome, but they're clearly much more the Right Thing to be doing than constructing ad-hoc queries by hand, with random strings pasted between single quotes.

      Your mention of "Query-by-Example forms" is apropos. Anybody know how to do those Right?

    16. Re:Prepared Queries by counterplex · · Score: 1

      Typically I've not allowed people to use the value field (i.e. where they specify the criterion) to also specify the operator for that comparison. I keep a separate drop-down with non-mathematical and non-SQL names that I check against a whitelist of valid operators and then translate them into SQL.

      As an example, for an age field i have the following operators: equals, contains, less than, greater than, etc. The "value" portion of each is identical to what's displayed on the pull down. However there's an internal method that maps "less than" to . No other operators other than those on the whitelist are allowed.

      Once all that legwork is done I actually construct a full statement using helper methods that is full of your friends and mine: placeholders. Create a list of bind parameters automatically and then call prepare and execute (perl style for this example).

      This has the benefit of make it safer at least from my pov. Any suggestions on improvement are very welcome!

      Also a final point: I've never even thought of using non-prepared statements .. I'm surprised with all the security-testing companies making such heavy use of sql injection to butter their bread that people are _still_ using non-prepared statements. Very scary!

      --
      $x = ($x * 10) % 10 >= 5 ? 1 + int $x : int $x
    17. Re:Prepared Queries by Vengeful+weenie · · Score: 1
      Java's JDBC does support this on the connection, but more as an optimization than security. The only solid way to secure the data is to grant permissions properly in the DB. If you have two roles (one for update one for read), then create two DB users.

      Really the weakness comes more from the application tiers which don't support good authentication mechanisms. The constant use of a single DB login for efficiency, while giving up all of the powerful data protection tools is the real problem.

    18. Re:Prepared Queries by TheLink · · Score: 1

      The attacker could close the CDATA tag, whereas with my proposal, the attacker has to supply the correct "randomstring" in order to reenable normal behaviour ("default allow").

      Given a suitable "randomstring" the odds of an attacker successfully closing the safety tag would be exceedingly low, even with future likely tags and browser features.

      The current situation is: if someone adds a new tag or browser feature, it is less likely that you'd be blocking that by default, and that is _expected_ behaviour.

      Whereas with my proposal, it is actually a bug in the browser if it doesn't restrict the enclosed content to only the allowed stuff (e.g. safe-html (no links or images) or even text only).

      It doesn't have to be specifically the way I mentioned but there should be a way to disable "default permit".
      I'd actually prefer the tags to be something like: <safety lock="..."> and </safety lock="..."> , however someone said that's not proper HTML/XML - closing tags aren't supposed to have extra bits?

      --
  4. Character encodings yet again by Bogtha · · Score: 2, Informative

    Mismatches between different character encodings seem to have been responsible for vast swathes of security vulnerabilities over the past few years. The sooner everybody moves to programming languages and software that use Unicode natively, the more secure we will all be.

    Unfortunately, the languages receiving the most attention for web development have abysmal Unicode support. PHP and Ruby haven't a clue, although the next version of PHP is supposed to be much better in this respect. Python developers can at least handle things fairly well, although it's still a bit of a pain in the neck.

    This vulnerability is probably going to cause quite a few problems for people, as it's a client issue that will probably need whatever adapter you use to be updated. Here is the user guide to the vulnerability for PostgreSQL. psycopg should be fixed shortly.

    --
    Bogtha Bogtha Bogtha
    1. Re:Character encodings yet again by Anonymous Coward · · Score: 0

      Actually, the problem is 'concatenating' strings into SQL in the first place - some escaping mechanisms are better than others, but ALL of them are going at it the wrong way. Programs should use bound parameters, so that it doesn't matter WHAT the string is - it gets treated as a string, not 'evaluated' after being (hopefully) made "safe".

    2. Re:Character encodings yet again by Anonymous Coward · · Score: 0

      Wait a moment - wasn't Ruby authored by a Japanese person?
      I'd be absolutely stupefied if the language itself doesn't have good built-in multibyte character support. Of course, the database libraries may be a different matter, if you're relying on bad query parameterization functions.

      And anyone hand-crafting SQL out of user-touchable strings deserves what they get...

    3. Re:Character encodings yet again by edwdig · · Score: 4, Informative

      Unicode isn't a character encoding, it's a character set. According to this unicode faq, there are 13 different encodings for Unicode. Switching to Unicode doesn't help the problem of character encodings.

    4. Re:Character encodings yet again by Estanislao+Mart�nez · · Score: 1
      Wait a moment - wasn't Ruby authored by a Japanese person? I'd be absolutely stupefied if the language itself doesn't have good built-in multibyte character support.

      There's a difference between having "multibyte character support" and what the GP wants, which is clean Unicode support in the language. Ruby does have support for at least Japanese multibyte encodings, but that's different.

      The model in question is one that Java comes pretty close to: all of your strings in your language are represented internally as Unicode, with the details of the representation hidden from you. Whenever you do character I/O, then, you specify what external encoding to use, and the language's libraries handle all the conversions for you. You never need to do any explicit character set conversions.

      Ruby doesn't do any of this for you. IIRC, there's a command-line switch for making it internally represent strings in multibyte for using Japanese-specific encodings, and if you want to try Unicode, you've got to use UTF-8 internally, and jump some strange hoops...

      Finally, the fact that Ruby's author is Japanese should actually make you unsurprised that it doesn't have good Unicode support. A lot of Japanese people are allergic to Unicode, and in particular, to the Han Unification in Unicode.

    5. Re:Character encodings yet again by Estanislao+Mart�nez · · Score: 1
      Switching to Unicode doesn't help the problem of character encodings.

      But what the GP has in mind is not just switching to Unicode, it's using a language implementation that has good Unicode support. This means that strings are internally represented as Unicode, and the language's character I/O libraries handle all conversion between external encodings and the internal representation. This means that program code doesn't do any conversions; all it does is specify a desired encoding when opening a character I/O stream.

      This is one of the things that, e.g., Java, does almost perfectly right. (The one dumb flaw is the primitive 16-bit char type.)

  5. Plug Injection Hole by fudgefactor7 · · Score: 5, Funny

    heh, heh, heh... I'll plug your injection hole, baby!

    1. Re:Plug Injection Hole by Anonymous Coward · · Score: 0
  6. Josh Berkus by Russ+Nelson · · Score: 1

    By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus.

    --
    Don't piss off The Angry Economist
    1. Re:Josh Berkus by SaDan · · Score: 2, Funny

      "By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus." --Russ Nelson

    2. Re:Josh Berkus by LearnToSpell · · Score: 2, Funny

      "'By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus.' --Russ Nelson" --SaDan

    3. Re:Josh Berkus by poot_rootbeer · · Score: 2, Funny

      '"\'By the way, the dangling reference to a quote by one "Berkus" should be attributed to Josh Berkus.\' --Russ Nelson" --SaDan' --LearnToSpell

  7. Someone needs to read the PHP manual... by chibi.nowheregirl · · Score: 0, Informative

    As long as you set the right multibyte string encoding in PHP via the multibyte string functions (specifically, the mb_internal_encoding function), the parser will catch the invalid multibyte sequence and fix it.

    Move along, folks. No need to panic.

  8. Binaries for Suse by electroniceric · · Score: 1

    Has anyone else found that Suse is really, really slow in releasing updated Postgres binaries? Are they tied to SLES releases? Anyone know anything?

    I know I'll probably get a million flames telling me to compile from source, but I'm not really that fond of supporting my own compilation job on a production server.

    1. Re:Binaries for Suse by Anonymous Coward · · Score: 0

      Read the article:
      Source packages of PostgreSQL 8.1.4, 8.0.8, 7.4.13, and 7.3.15 are available now, and Berkus says that binary packages should be available within 48 hours for "most platforms" either through the PostgreSQL project or through vendor updates.

  9. The jokes, they write themselves! by Kha+Na+Set · · Score: 4, Funny

    Must....not....make....joke....about...injection hole...being plugged...

    Damn, too late.

    =\

    1. Re:The jokes, they write themselves! by fireman+sam · · Score: 1

      Don't worry, you haven't made a joke yet. (At least one that was funny)

      --
      it is only after a long journey that you know the strength of the horse.
    2. Re:The jokes, they write themselves! by Kha+Na+Set · · Score: 1

      Wasn't referring to me ... look a few posts up and you'll see a joke has already been made.

  10. Test of vulnerability... by Anonymous Coward · · Score: 0


    ok.. here goes.. unicode


    hmmm... nope.. didn't work.

  11. Why is everybody still using this toy DB? by Anonymous Coward · · Score: 2, Funny

    That's why I prefer Postgre. Oh, wait...

    1. Re:Why is everybody still using this toy DB? by Anonymous Coward · · Score: 0

      Postgres: We're better than MySQL!

  12. Use placeholders! by mortonda · · Score: 4, Informative
    This is why I gripe and complain anytime I see someone doing sql calls without using placeholder routines, such as perl's DBI or PEAR::DB for php. From the technical doc that someone posted above:
    If your code is doing escaping "by hand", for instance by doubling quotes and backslashes, you really need to fix it to use the library routines instead. If you're avoiding the need for escaping at all, by sending variable strings as out-of-line parameters, then you've saved yourself a whole lot of trouble and can stop worrying.
    Start using a proper placeholder syntax and variable substitution for parameters when it comes to untrusted data. It solves a lot of problems.
    1. Re:Use placeholders! by TheLink · · Score: 1

      But is pear:db standard PHP? I thought there was this thing called PDO?

      It annoys me that PHP is a newer language but the devs did not learn from the mistakes of the older languages.

      PHP seems to be a language that makes it HARD to do the right thing, and easy to do the "nearly-right" (AKA wrong) thing - addslashes, magic quotes.

      I have to deal with tons of PHP code written the wrong way (by someone else), because at that point in time there was no real good right way. Even now, it doesn't seem like the official way is to use PEAR.

      --
    2. Re:Use placeholders! by Anonymous Coward · · Score: 0

      PHP does a lot of stupid things by default, but they're sort of starting to fix it. PDO should have been built in from the start, but it wasn't, which is why PEAR::DB exists. PHP5/PDO don't have the widespread adoption/availability that PEAR::DB does.

    3. Re:Use placeholders! by jsoderba · · Score: 1

      PEAR::DB is intalled on a hundred times as many systems as PDO. If DB's not there, it's trivial to install, which can't be said of upgrading to PHP 5.1 with PDO.

    4. Re:Use placeholders! by TheLink · · Score: 1

      Which proves my point that PHP doesn't make it easy to do the _right_ thing.

      It's not really standard is it, if with 5.1 the "future is supposed to be PDO?

      So do I modify tons of old broken PHP4 code to use PEAR::DB and then when PHP5.1 or whatever is ready, remodify stuff to do PDO? Will PEAR::DB still work on PHP5.1? If it does, then I'd ignore PDO ;). Of course the amount of 2nd round work involve depends on how easy it is to translate PEAR:DB to PDO - might only have to modify the 1st round DB lib.

      --
    5. Re:Use placeholders! by TheLink · · Score: 2, Interesting

      Heh, as they fix all the stupid things, it starts to look more and more like Perl ;).

      Take away the popular but bad PHPisms like addslashes, magic quotes, cgi parameters automatically entering variable namespace, the combining of normal arrays with associative arrays/hashes (makes it messier to distinguish numeric keys from the indices), and you end up with something that is more Perl-ish than PHP-ish.

      PHP and MySQL, what a combination... hehe.

      --
    6. Re:Use placeholders! by mortonda · · Score: 1
      PHP seems to be a language that makes it HARD to do the right thing, and easy to do the "nearly-right" (AKA wrong) thing - addslashes, magic quotes.


      Well, no argument there. But if you are going to use php, at least use placeholders!
  13. This is not specific to Postgres by Ayanami+Rei · · Score: 2, Interesting

    Oracle and MySQL suffer from similar vulnerabilites when going UTF8 -> database charset. The "answer" in Oracle is to use UTF-16 on the backend and a select 8/16-bit encoding in the front end if you want to support multiple locales. I'm not sure what the implications are for MySQL.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:This is not specific to Postgres by Anonymous Coward · · Score: 0

      Ugh... there are not a lot of good reasons to use UTF-16 (frontend or back) unless you have an app that doesn't understand anything else. It's usually wasteful, has a lot of embedded null bytes, slaughters 7-bit ASCII compatibility...

    2. Re:This is not specific to Postgres by NutscrapeSucks · · Score: 1

      > there are not a lot of good reasons to use UTF-16

      Unless you are using Java or Win32, in which case strings are UTF-16 and it's not worth trying to change them.

      --
      Whenever I hear the word 'Innovation', I reach for my pistol.
  14. Re:Guess its time by jaredmauch · · Score: 4, Interesting
    Not faster, (for my application). I currently insert about 35k rows per second, mysql just can't handle that last I tried. For the inserts of the raw data that I have (about 250 mil rows a day, and i only save every 1:10k) these opteron 252's are just too slow with the one index. I'd like to be able to index more than a single column.

    If someone wants to try and help me with this problem, let me know, but I had trouble getting mysql to insert (actually doing COPY since it's about 10x faster) anywhere near what I can get with PG8.

  15. I dont see how UTF-8 is vulnerable by Srin+Tuar · · Score: 3, Interesting

    I can understand how SJIS and BIG5 are vulnerable.

    But in a UTF-8 string, no single byte will match a single quote besides the single quote character (0x27).

    It seems to me that simply inserting a backslash before every single quote and backslash in a given string will have the desired effect, and that UTF-8 is not particularly vulnerable to this problem. (quite by design- it was invented by none other than Ken Thompson)

    Either that article is misleading somehow, or else the postgres developers are simply putting in some safeguards for common errors in things such as php scripts.

    1. Re:I dont see how UTF-8 is vulnerable by Anonymous Coward · · Score: 0

      Correct me if I'm wrong, but UTF-8 uses the regular ASCII code (0-128), then use certain bytes with high bits to indicate a multi byte character. What would hapen if you had one of those multi-byte markers followed by a single quote (no one said the input had to be valid UTF-8).
      Your replace, if it's not unicode aware, or if the unicode character + 0x27 combination is invalid, might decide the ' needs escaing and escape it. The database might then read it and interpret differently: that the multibyte character now is (semi-)valid and ends with the inserted escape.
      Your protection has now ben defeated.

      The solution is to use prepared queries, not to endlessly escape.

    2. Re:I dont see how UTF-8 is vulnerable by quantum+bit · · Score: 1

      Well, in theory UTF-8 shouldn't be vulnerable since the backslash is 128 and all UTF-8 encoded characters have the high bit set. So the backslash can't be part of a valid multibyte sequence. Whether it's an issue or not depends how PostgreSQL handles invalid UTF-8 sequences.

    3. Re:I dont see how UTF-8 is vulnerable by quantum+bit · · Score: 1

      Bah, stupid pseudo-html. Backslash is < 128 (0x5c)

    4. Re:I dont see how UTF-8 is vulnerable by Anonymous Coward · · Score: 0

      then use certain bytes with high bits to indicate a multi byte character

      Thats the problem. REAL utf8 uses a high bit in every byte of their multibyte characters. Postgres, however, apparently assumes that if byte 1 has the high bit set, that byte 2 must be part of the same character and doesnt process byte 2 at all. At least thats my guess, with server encoding set to SQL_ASCII, it doesn't do any multibyte processing at all and just treats each byte alone.

      Furthermore, using prepared queries merely moves the magic somewhere else. Postgres's SQL for preparing queries is:
      PREPARE plan_name (text,int,int) as insert into sometable (foo,bar,baz) values ($1,$2,$3);

      Simple enough, right? OK, now let's execute that:
      EXECUTE plan_name ('It\'s A Trap!!!',4,5); ... oh shit, the quote still had to be escaped. Whether you choose to do it by a backslash (more visible) or the more acceptable '', the problem is still the same, somebody (whether its you and your addslashes, or the libpg escape function) has to take the user input and make it "nice". If addslashes() is broken, then the answer is to fix it, or at least document that its broken and provide an alternative. Calling people stupid or lazy for using the obvious documented method is not the solution.

    5. Re:I dont see how UTF-8 is vulnerable by ummit · · Score: 1
      Well, in theory UTF-8 shouldn't be vulnerable...

      Exactly right.

      Whether it's an issue or not depends how PostgreSQL handles invalid UTF-8 sequences.

      And the answer is: badly. If you had a UTF-8 initial-byte character followed by a 7-bit character, it was discarding both of them, and proceeding. (Evidently they've fixed this now.)

      Who knew multibyte character handling was going to be such a mess! Writing proper error recovery (resynchronizing, etc.) is as hard as in a friggin' compiler!

    6. Re:I dont see how UTF-8 is vulnerable by Anonymous Coward · · Score: 0

      > EXECUTE plan_name ('It\'s A Trap!!!',4,5); ... oh shit, the quote still had to be escaped

      No, you're just using the command-line utility and plain ESCAPE. Interfaces can use
      PQexecPrepare and PQexecParams which send a char * to the server - no escaping of quotes
      needed.

  16. Hey Leisure Suit Larry... by Anonymous Coward · · Score: 0

    I just don't want to know what you would do at Mr. Lube.

  17. NEEDED: 8.1.4 Torrent(s) & bigger filecollect by ivi · · Score: 1

    Only up to 8.1.3 were listed here as we composed this:

        http://www.postgresql.org/download/btlist

    Oh, and it would be gerat to have just ONE torrent to d'load, eg, per platform.

    Alternatively, create an All-In-One ISO (preferably CD-ROM set -and- a DVD ISO)

    (Help us to save you bandwidth...)

    "Remember: It isn't released until its torrents are released" :-/

  18. Would like some clarification. by JLeslie · · Score: 2, Interesting

    I've only recently begun playing with PostgreSQL coming from Oracle. I've also been primarily a Java (JDBC) guy for the last couple years. I'm not sure I completely understand where this vulnerability lies. Would a Java PreparedStatement be vulnerable to this? Would the Postgres implementation of JDBC use 'addslashes()' to bind variables in a prepared statement? Or is this a higher level function? (I have not come across it myself, but like I said I'm still pretty new to Postgres).

    I guess I see "affects PostgreSQL servers exposed to untrusted input, such as input coming from Web forms" and wonder if they're talking about some further functionality where postgres acts like a web server. My understanding of PreparedStatements is that they are bound at a very low level in the db to allow for maximum speed through caching etc...

    1. Re:Would like some clarification. by Anonymous Coward · · Score: 0

      That would be up to the JDBC driver implementation. JDBC drivers have to do a lot... you say "I would like to set this javax.sql.Date value" and the driver has to turn it into a protocol database-speak -- so too with strings and escaping and so forth. Example, some databases might like to escape special characters with backslashes, others with a mess of escaping single-quotes. But you don't have to worry about that in a PreparedStatement.

    2. Re:Would like some clarification. by Anonymous Coward · · Score: 0

      You're safe. From the faq:
      # If application always sends untrusted strings as out-of-line parameters, instead of embedding them into SQL commands, it is not vulnerable. This is only available in PostgreSQL 7.4 or later.

      > My understanding of PreparedStatements is that they are bound at a very low level in the db to allow for maximum speed through caching etc...

      I don't think the JDBC driver automatically uses server-side prepared statements. From this:

      "There are a number of ways to enable server side prepared statements depending on your application's needs. The general method is to set a threshold for a PreparedStatement. An internal counter keeps track of how many times the statement has been executed and when it reaches the threshold it will start to use server side prepared statements."

      Hope that helps.

      captcha: injects

    3. Re:Would like some clarification. by Anonymous Coward · · Score: 0

      The JDBC driver shouldn't be vulnerable as

      (a) it sends parameters passed to PreparedStatement out-of-line from the query (by default, anyway); and

      (b) query string handling is done as Java Strings (i.e. UTF16) up until just before the query is actually sent to the server, when it is translated to the connection's encoding (by default UTF8). So even if the application constructs queries by hand, it's doing so as a Java String, and it does not need to be aware of the server encoding in use to get the quoting of parameter values correct. (But you should be using PreparedStatement anyway!)

    4. Re:Would like some clarification. by jadavis · · Score: 1

      Would the Postgres implementation of JDBC use 'addslashes()' to bind variables in a prepared statement?

      No. Addslashes is a PHP function that many people use. It's not recommended for use with any database.

      JDBC would use the correct string escaping routines in the postgresql client library, PQescapeStringConn. That is perfectly safe.

      In fact, it appears the only real changes they made were to break bad code that produced invalid multibyte sequences. They also broke the use of the " \' " (backslash + single quote) sequence in unsafe situations that involve multibyte, client-only encodings.

      Read more at the release notes:
      http://www.postgresql.org/docs/8.1/interactive/rel ease.html#RELEASE-8-1-4

      Everything the postgresql people did seemed quite reasonable and only breaks code that was broken to begin with. The announcement was a little cryptic and short on the technical details, but the release notes fill in the blanks. The reason they probably worded the announcement that way was to call attention to poor use of addslashes in php, or other naive escaping tricks. Just use the built-in client library escaping functions, and everything is fine.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    5. Re:Would like some clarification. by dodobh · · Score: 1

      No, it won't affect you if you are using prepared statements. It may affect you if you are not using them.

      --
      I can throw myself at the ground, and miss.
    6. Re:Would like some clarification. by Anonymous Coward · · Score: 0
      JDBC would use the correct string escaping routines in the postgresql client library, PQescapeStringConn. That is perfectly safe.
      The JDBC driver doesn't actually use the C client library at all; it's pure Java.
    7. Re:Would like some clarification. by ummit · · Score: 0, Redundant
      Here's the issue in a nutshell.

      You're some "middleware" code, with the user above you and a SQL database beneath you. You've got this variable, $userinput, containing a string entered by the user. And you want to build a database query along the lines of

      select * from table where somecolumn = 'user input';
      where you want to plug the user-entered string in there to search for the user's input in the obvious way.

      How do you do this? The obvious (but naive) way is to construct your SQL query on the fly, using your language's form of string concatenation. (Here I'll use "+".)

      $query = "select * from table where somecolumn = '" + $userinput + "';";
      But what if $userinput happens to contain a single quote? That's the problem here.

      The obvious (but naive) thing to do is to go through $userinput checking for single quotes and escaping them in some way. You can either do this by hand, or call a function in your language that might seem to do it for you. But you're skating on thn ice here; you're playing games you shouldn't have to play; there might be cases you haven't thought of. One of them is obvious enough: if you're looking for ' characters and escaping them with \, that also means that, oh yeah, you also have to look for existing \ characters so you can escape them with \. Or you can call a function like PHP's "addslashes" which might seem to do this for you. But it turns out -- this is what the fuss is all about -- that there are other problems, if the diddling you're doing with ' and \ characters interacts badly with the diddling the database layer is doing with multibyte character encodings.

      So the much better thing to do -- from several standpoints! -- is not to do this funky ad-hoc concatenate-the-strings SQL query construction after all. Use a "prepared statement", which lets you tell the database engine where to plug in those strings you've already got lying around in variables. At first this seems more roundabout, at first it's a little harder to think about, but once you've got your head wrapped arond it, it feels better anyhow, not to mention the fact that it insulates you from attacks like this one.

      But yes, in answer to your question, if you're using prepared statements you should be safe, as long as your language's implementation of prepared statements isn't broken underneath.

      Finally, for those wondering why this is such a big deal, why unexpected ' characters don't just cause syntax errors and random failures, how it is that this can be turned from a bug into an actual exploit -- here's the deal. Suppose that the $userinput string contained the text

      foo' OR 'a' = 'a
      Now, when the naive string concatenation is done, the resulting query is
      select * from table where somecolumn = 'foo' OR 'a' = 'a';
      But this query is always true! The user entering that string contrived to change the behavior of the query. Obviously the user needs to know or guess what the queries you're generating look like and how you're generating them, and the user may or may not acactually be able to subvert your query into doing something truly incorrect or destructive, but the possibility is very much there.
    8. Re:Would like some clarification. by JLeslie · · Score: 1

      Yeah after re-reading the article I realized they appeared to be focusing on a PHP function, my bad... It seems strange to me that this could even be considered an injection hole on Postgres's end of things. Maybe someone can point out a case where parameterized PreparedStatement style SQL is NOT a good idea, or is it just a case where some languages have better 'off the shelf' support for them?

    9. Re:Would like some clarification. by jadavis · · Score: 1

      Yes, you're correct. I assume that the string escaping is done using an almost identical function though, right? Surely it would produce the same results as PQescapeStringConn.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    10. Re:Would like some clarification. by jadavis · · Score: 1

      It seems strange to me that this could even be considered an injection hole on Postgres's end of things.

      It isn't really a postgresql hole, it's just that if the client sent an invalid multibyte string than postgresql would accept it in the past. Now it throws an error, which is better behavior. It has the potential to break backwards compatibility though, so anytime that is the case the PostgreSQL team takes it very seriously.

      Maybe someone can point out a case where parameterized PreparedStatement style SQL is NOT a good idea

      I think it's always a good idea, did you have a problem in mind? The main thing with escaping is, always use the recommended native database escaping functions. You can use them directly, but in many cases it's much easier to use parameterized statements which will call the native escaping function for you in a portable way (which is a big help if you want to support multiple databases).

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    11. Re:Would like some clarification. by JLeslie · · Score: 1

      I think it's always a good idea, did you have a problem in mind?

      I agree and no I didn't. I think that is what lead to my initial confusion about the issue, I couldn't conceive of a case where parameterization shouldn't be used so I thought that perhaps this 'hole' was a problem that affected parametrized statements. (ie. everyone should be using parameterized statements, so the only hole a db can have is in it's native escaping functions...) I understand the issue much better now, and for what it's worth I think the Postgres team has been more than reasonable in fixing this.

      Someone in another thread mentioned 'query by example' forms as not being suitable for prepared statements, but even those I've done before using parameters (there was some string parsing before hand to figure out operators, but I managed to get the '?' marks in in the end).

      which will call the native escaping function for you in a portable way (which is a big help if you want to support multiple databases)

      That's one of the things that I've really come to appreciate with JDBC. Playing with Postgres has been very fun simply because their was almost no problem moving over from Oracle. This isn't strictly because of JDBC either. I think I must have come to Postgres at the right time. I installed it on my local windows box first (flawlessly) which apparently wasn't a normal option for quite awhile. The pgAmin tool is quite handy and seemingly full featured. The support for foreign keys, constraints, stored procs, and all that stuff that I've really come to appreciate is all there! I didn't mean to turn this into a PostgreSQL fan rant, but there was so many bad things I heard about it before trying it out myself. Thus far, it would seem most of those things are myths that have been fixed (or perhaps never existed in the first place).

    12. Re:Would like some clarification. by Anonymous Coward · · Score: 0

      Essentially, yes.

      The critical difference is that all the strings being dealt with are Java String objects, which have a well-known encoding (UTF16). Any necessary escaping is done at that level, before transformation to the server's encoding. So there's no scope for malicious input "splitting" a multibyte character since that input is going to be UTF16 by the time it is given to the JDBC driver. (yes, UTF16 has sortof-multibyte characters with surrogate pairs, but the ranges used for those pairs are not used for other characters and won't be mistaken for a quote by the escaping routines).

  19. On the bright side... by quantum+bit · · Score: 5, Interesting

    PostgreSQL defaults to SQL-ASCII encoding, which is unaffected by this particular attack. Only clients which connect using a multibyte encoding would be affected.

    Actually, this really isn't a vulnerability in the database server itself -- the update just intentionally breaks certain badly written applications in order to protect them from themselves. If PHP's addslashes() ends up creating valid multibyte characters that produce unexpected behavior, that's really PHP's problem -- Postgres is just doing what it's told.

    1. Re:On the bright side... by Anonymous Coward · · Score: 0

      this really isn't a vulnerability in the database server itself

      Then, where exactly the vulnerability is? Does the server, itself, missbehave with UTF-8, creating remote exploitable security hole?

      Please! There is no need to show PestgreSQL in too good a light.

    2. Re:On the bright side... by ultranova · · Score: 1

      Then, where exactly the vulnerability is? Does the server, itself, missbehave with UTF-8, creating remote exploitable security hole?

      From what I understood (but I'm not an expert, so I could well be wrong), if you quote a string that's inserted into an SQL statement with certain naive functions (of which PHP's addslashes() is one), and that string contained malformed multibyte characters, the resulting SQL statement will have parts of that string considered as part of the SQL command, not data.

      It's a case of malfunctioning PHP function, which does not properly handle malformed multibyte strings. The vulnerability is in PHP. Well, actually, the vulnerability is in the brains of the programmer who used addslashes() instead of pg_escape_string().

      The PostgreSQL server itself is not vulnerable, the vulnerability is in some clients that can be tricked into giving the server commands that the client authors didn't intend. The server then follows these unintended commands, since it has no way of knowing if they were intended by the application author or some malicious blackhat.

      Of course, the real WTF is in that the PostgreSQL server does not check the evil bit on queries before executing them...

      Please! There is no need to show PestgreSQL in too good a light.

      True, the reality is good enough as is ;).

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    3. Re:On the bright side... by seann · · Score: 1

      mysql_real_escape_string() had the same problem with multibyte characters and certain charsets.

      you don't need addslashes() for a bad day.

      --
      I'm a big retard who forgot to log out of Slashdot on Mike's computer! LOOK AT ME.
    4. Re:On the bright side... by CTachyon · · Score: 1
      mysql_real_escape_string() had the same problem with multibyte characters and certain charsets.

      How could that be? The point of mysql_real_escape_string() vs. mysql_escape_string() (or any other PHP quoting function) is that it uses the settings from the DB connection to know the correct encoding.

      Assuming it just thunks down to the MySQL client library like the docs imply, that means one or more MySQL programmers need to be taken behind the shed and beaten severely.

      --
      Range Voting: preference intensity matters
  20. Validate, Validate AND Validate by Joe+U · · Score: 4, Insightful

    Multi-Layered validation is the only way to go.

    Client validation is only useful for round-trip bandwidth reduction, it's nice to have, but not secure in any way. It can stop the occasional accidental bad input. (e.g. entering strings when numerical data is called for, pop up a message box telling you not to do that), it won't stop anyone really interested in corrupting your data.

    The app server should be validating everything being posted to it. Is this string too long, too short, not a string, wrong encoding, etc...

    The DB server should ALSO be validating everything coming from the app server. Don't trust your application server, it could have a bug, it could have been hacked, it might not be your app server, who knows. Strict stored procedures with no r/w access to tables is a really the only way to go. (To: My Co-Workers, Using select * queries and running as dbo and/or sa is usually a sign that you're not doing it right)

    Yes, it's paranoid thinking, yes, it's more work and yes, there is a slight performance hit, but it is secure and it's damn hard to break.

    1. Re:Validate, Validate AND Validate by tepples · · Score: 1

      Don't trust your application server, it could have a bug, it could have been hacked, it might not be your app server, who knows.

      And the database server hasn't been intruded? Your stored procedures could have a bug too.

    2. Re:Validate, Validate AND Validate by Joe+U · · Score: 1

      Why would an app server intrusion cause a db server intrusion? They shouldn't share any login account resources.

      Sure, the procedures could have bugs, the idea is to minimise the risks.

      Besides, if you lock out slelect, drop and delete (etc...) access to the app server, it makes it much harder to do damage on the DB server.

  21. Re:Guess its time by ZeekWatson · · Score: 0

    Pure BS. There is no way in hell postgres can come close to 35k inserts/s. To get 35k inserts/s you would need a high end cluster system ... running something other than postgres.

    Anyway this is /. mod parent UP!!!

  22. Re:I dont see how UTF-8 is vulnerableg by RuneB · · Score: 3, Informative

    PostgreSQL ignored invalid UTF-8 sequences, meaning a ' character at the end of a incomplete sequence could cause only one ' to be seen by the parser when escaped.
    See http://www.postgresql.org/docs/techdocs.50 for the details.

    --
    dtach - A tiny program that emulates the detach feat
  23. The Prepare Command by Qzukk · · Score: 2, Insightful

    ... because counting out 500 question marks to figure out why the hell your parameters don't match up is MUCH more fun than being paged at 3AM because the entire production database was wiped out.

    --
    If I have been able to see further than others, it is because I bought a pair of binoculars.
    1. Re:The Prepare Command by rjstanford · · Score: 1

      Dude, that's what toolkits are for.

      --
      You're special forces then? That's great! I just love your olympics!
  24. The power of 'GRANT' by KodeJockey · · Score: 0, Offtopic

    Here's a good example of a security flaw: people who extract the database to a flat file and leaves it their hard drive. 26 million veterans can't be wrong. No, seriously, a 10 minute seminar on user permissions should be required of anyone running a DB server. Like a driver's license.

    --
    i got ball this is my adress 108 20 37 av corona come n do it iam give u the sidekick so I can hit you wit it
  25. As an aside... by Ayanami+Rei · · Score: 1

    You can always use an 8-bit database charset and then use nchar or nclob columns when you want UTF-32 or UTF-16 support. So then your web-app or whatever has to be consciencious where non-ASCII is allowed so it gets converted/stored properly. Then it's only wasteful if the majority of the database content is localized/user submitted latin text. But uh, the case they were going on about concerned far-east locales where multi-byte is a must so I don't think you'd have much issue using a fixed-width encoding if that's your audience.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  26. Re:Guess its time by einhverfr · · Score: 1

    Pure BS. There is no way in hell postgres can come close to 35k inserts/s. To get 35k inserts/s you would need a high end cluster system ... running something other than postgres.

    Well, I was thinking that it might be possible if you bunched enough of them into the same transaction, had sufficiently small updates, used a single prepared statement over and over, turned fsync off, and had HUGE amounts of memory, then, perhaps?

    Not that I would want to run 35k/s inserts with fsync off...

    --

    LedgerSMB: Open source Accounting/ERP
  27. Re:Guess its time by Anonymous Coward · · Score: 0

    > Pure BS. There is no way in hell postgres can come close to 35k inserts/s. To get 35k inserts/s you would
    > need a high end cluster system ... running something other than postgres.

    Not necessarily - if it supports a high-performance non-logging load utility, and the poster was talking rows and not actual sql statements.

    I commonly hit 100,000 rows/second with DB2 on a simple four-way SMP with its load utility. That's with a seriously slow table type too, I could probably hit 2-4x that speed on a simple table.

  28. Re:Guess its time by Anonymous Coward · · Score: 0

    Perhaps:

    1- he's using some kind of loader
    2- he's running a netezza server

    Does mysql have a loader?

  29. So here's my question... by Anonymous Coward · · Score: 0

    What's SQL?

  30. Re:Guess its time by jadavis · · Score: 2, Interesting

    Well, I was thinking that it might be possible if you bunched enough of them into the same transaction, had sufficiently small updates, used a single prepared statement over and over, turned fsync off, and had HUGE amounts of memory, then, perhaps?

    You don't need to do all that.

    The command:
    time ruby -e 'puts "BEGIN;"; for i in 1..35000 do puts "INSERT INTO a VALUES(#{i});" end; puts "COMMIT;"' | ./81/bin/psql test

    Gives me about 7-8 seconds with write caching off, fsync on. Yes, I turned off write caching with hdparm. This is on my PC, with an el-cheapo IDE drive and a modest 1GB of RAM.

    If you think about it, why should it take a long time? It's all one transaction. It's basically just writing the data to a file (the log). The table "a" is just an int field, obviously. And with postgresql's MVCC, an INSERT is very cheap.

    --
    Social scientists are inspired by theories; scientists are humbled by facts.
  31. addslashes? by Abstract · · Score: 4, Interesting

    'He also notes that the addslashes function was deprecated in PHP 4.0 due to security risks, but a "distressing" number of PHP applications continue to use the function.'

    How come the php documentation doesn't mention this?

    1. Re:addslashes? by Krunch · · Score: 1
      It does, somewhat.
      Use of this function is recommended instead of addslashes().
      --
      No GNU has been Hurd during the making of this comment.
  32. Re:Guess its time by Circuit+Breaker · · Score: 1

    Dude, you're most probably using the wrong database. Financial data or billing info (or otherwise telecomm industry?) I presume.

    Consider kdb+ from Kx systems. Properly used, it can do ~1mil rows/second, with multiple live hot failovers. It's pricey (several hundreds of K$), especially compared to MySQL/PGsql, but might be worth your buck.

  33. How to make SQL injection impossible by hypersql · · Score: 1
    Many developers write code like this:
    execute("SELECT ... WHERE NAME='"+name+"' ...
    Obviously, this is unsafe. I even wrote such code myself (baaaad). The problem is, many developers don't know how unsafe it is. Most know that they should use PreparedStatement, but don't do it for one reason (mostly laziness) or the other (preparing statements is slow in Oracle, index not used for 'LIKE ?' in some databases).

    There is a way to solve SQL injection problems: Disallow text literals. Or even, disallow literals (including numbers) at all. This could be a setting in the database that is on by default, and only off for certain applications (ad hoc query tools) or users (admins). What do you think about that?

    I'm thinking about implementing this feature in the database I write (http://www.h2database.com/):

    SET ALLOW_LITERALS 0 (no literals allowed)
    SET ALLOW_LITERALS 1 (only numbers, text not)
    SET ALLOW_LITERALS 2 (everything allowed)
    This would be a persistent setting, and only an admin can change it. But, maybe this is the wrong place to ask for comments on this?

    (Of course there are other security risks, like using 'customer id' in URL or hidden fields in a web application. Or relying on Javascript data validation. But I don't know what to do about those problems.)

    1. Re:How to make SQL injection impossible by rjstanford · · Score: 1

      Nice idea. I wouldn't worry about getting too complicated though -- 0 and 2 would be fine. Allowing 1 would just encourage people to start using numeric encoding in places where it shouldn't be kept.

      --
      You're special forces then? That's great! I just love your olympics!
    2. Re:How to make SQL injection impossible by rjstanford · · Score: 1

      I took the opportuniy to check out H2 - looks nice, and we'll be checking it out more. I always wanted to write my own DBMS, but never got around to it (many years as a DBA/Developer who loved internals). But anyway. I did see that you had an interesting comment on your VARCHAR documentation: Unicode String. Use two single quotes ('') to create a quote.. Does that imply that this will happen even when using prepared statements? I'd like to think not, but...

      --
      You're special forces then? That's great! I just love your olympics!
    3. Re:How to make SQL injection impossible by hypersql · · Score: 1
      Use two single quotes ('') to create a quote. Does that imply that this will happen even when using prepared statements?
      No, just when using it in a query directly. Data that is set via PreparedStatement.setX is not parsed. I think all SQL database work like that. The two single quotes is standard SQL, but there are some databases (PostgreSQL? I forgot) who supports a backslash as an escape character, but this is non-standard.
    4. Re:How to make SQL injection impossible by hypersql · · Score: 1
      would just encourage people to start using numeric encoding in places where it shouldn't be kept.
      I saw lots of that already. Views with embedded constants (... WHERE FORM_TYPE = 5000001025) for example. Reminds me to add support for constants in the database engine. I wonder if other databases support constants, and if there is a standard for that.
  34. PDO / PEAR::DB by GingerDog · · Score: 1

    Hi,

    PDO and PEAR::DB both provide ways of doing this under PHP.

    See http://pear.php.net/ and http://www.php.net/pdo for examples.

    David.

    --
    The Ginger Dog
  35. Re:Guess its time by Just+Some+Guy · · Score: 1
    There is no way in hell postgres can come close to 35k inserts/s.

    You're wrong. My company has a huge legacy FoxPro database, but for performance reasons we copy most of its tables to PostgreSQL as an hourly cron job. One of those tables is just over 7 million rows and I can copy it into PostgreSQL is slightly less than 5 minutes. 7071473 rows in 287 seconds yields 24639 inserts/s, and this is on a production server answering other queries at the same time.

    PostgreSQL 8.1.3, FreeBSD 6.1-STABLE, single Xeon, 3GB of RAM, one SCSI-320 drive, very little performance tuning. If we can hit 25K on that box, I have no doubt he can sustain 35K on beefier hardware.

    --
    Dewey, what part of this looks like authorities should be involved?
  36. Re:Guess its time by jaredmauch · · Score: 1
    Actually, I can get ~70k/sec when I do not have an index on the table but that means I have to sequentially scan the table every time I do a select/search.

    Here's the keys to it:
    Increase src/include/pg_config_manual.h BLCKSZ to the max (32768)
    Use COPY not INSERT (faster than transactions for the inserts too)

    Postgres isn't threaded so even though it's a dual proc dual core 252, I am still cpu bound. I really need some sort of lazy index/index chunker that can run in parallel to utilize the multiple processors more efficently so I can have more than a single index on the table.

  37. Re:Guess its time by kpharmer · · Score: 1

    > Postgres isn't threaded so even though it's a dual proc dual core 252, I am still cpu bound. I really
    > need some sort of lazy index/index chunker that can run in parallel to utilize the multiple processors more
    > efficently so I can have more than a single index on the table.

    Actually, if this is a critical part of your database, you're probably using the wrong product: postgresql is a great product, but this is definitely its weak spot.

    For something like this you're far better off with db2 or oracle. My preference is db2 since it lacks a larry ellison and can be far cheaper than oracle. What you'd get with db2 for example is:
        - query parallelism: work split across all cpus
        - load parallelism: loads split across all cpus
        - partitioning: rather than using btree indexes that won't work when you're selecting more than 5-15% of the data, partitioning works linearly when you're selecting 2%, 20%, or 80% of the data.
        - query optimization: better query planning means your queries need less tweaking to run the way they should
        - fine-grained memory control: to allow you to get the most out of memory, which is very valuable on databases like this. Sort memory, tablespace caches, etc
        - automatic summarization: db2 can automatically update summary tables for you
        - query rewrite: db2 can automatically rewrite incoming queries to hit summary tables when it makes sense

    DB2 Express is free for small systems. I think the limit is 2 cpus & 4 gbytes of memory, with no limit on data size. Beyond those limits you can get workstation edition which is (going from memory here) $1500/CPU list price plus $250/named user. If you want it on the net then it's $7500/CPU for unlimited users. That's list, and you can probably get 50% off.

    The only disadvantage with db2 in this configuration is that you'd want to use MDC for partitioning, which is slower to load than a regular table. You'd probably hit just 25-35k rows/sec with mdc. Right now DB2 viper is in beta, that allow you to use MDC, range partitioning, or hash partitioning. All in combination. This means that if you really wanted to scale up you could spread the data across a dozen two-way boxes with each one partitioning its local data. This isn't cheap to do (licensing db2 this way is expensive), but if you wanted to get adhoc queries against a 100 million row table returning results in 1-4 seconds - this is the way to do it. ;-)

    Oracle is similar, just 2x as expensive, more knobs to twiddle, better skills & third-party availability, and highly unpleasant sales staff to deal with.

    Of course, Netezza might also hit the spot using postgresql. It wouldn't have all the benefits of db2 or oracle, but has some. And not sure if they have a lot cost entry-level product either.

  38. Why choose either of those? by Just+Some+Guy · · Score: 2, Interesting
    ... because counting out 500 question marks to figure out why the hell your parameters don't match up is MUCH more fun than being paged at 3AM because the entire production database was wiped out.

    In the Python DB-API, SQL strings look like:

    select foo from bar where baz=%(baz)s

    You create a dictionary (hash table) with a key "baz", pass that dict to the database along with your query, and it fills in the blanks. Your job as the programmer is to make sure that dict has all the keys in it to complete the query; it doesn't matter which order you assign them or if you don't use them all.

    In fact, a very common case is to create on dict with all the values you'll need to execute a whole list of queries, and just keep passing the same dict rather than redoing it each time:

    # I know this example is lame.
    sqlparams = { 'zipcode': '12345', 'lastname': 'smith' }
    cursor.execute('insert into customers (lastname) values (%(lastname)s)', sqlparams)
    cursor.execute('insert into locations (zipcode) values (%(zipcode)s)', sqlparams)
    cursor.execute('insert into shipments (lastname, zipcode) values (%(lastname)s, %(zipcode)s)', sqlparams)

    It's about as easy as you can possibly make it and has no disadvantages that I've ever encountered. So, I'd take the position that it's better to protect the server and forget about old ideas like positional parameters. There are extremely programmer-friendly solutions to this problem if you know where to look.

    --
    Dewey, what part of this looks like authorities should be involved?
  39. Re:Guess its time by Anonymous Coward · · Score: 0

    This post of mine is troll/flamebait-ish, yes, but still honestly meant:

    Now look at what you've done. Confusing the MySQL fanbois with performance figures for PostgreSQL not even closely resembling their world-view. How will this end...? The horror!

  40. Re:Guess its time by Anonymous Coward · · Score: 0

    Or do:
    insert into t (
    select d1
    union all
    select d2
    uncion all
    select d3 ...
    )
    so you insert lots of data in one statement!

  41. Re:Guess its time by Anonymous Coward · · Score: 0

    > If you think about it, why should it take a long time? It's all one transaction. It's basically just
    > writing the data to a file (the log). The table "a" is just an int field, obviously. And with
    > postgresql's MVCC, an INSERT is very cheap.

    So, you hit 5,000 rows / second with a 4 byte row. That's considerably slower than 35,000 rows / second with real-world data. Granted - you did this with no performance tuning on a desktop.

    But, in a real-world scenario you're more likely to have:
      - twenty columns
      - a row-width of 50-200 bytes
      - a primary key providing a unique constraint
      - possibly several indexes
      - possibly several check constraints

    and would probably be lucky to hit 500 rows / second.

  42. Re:Guess its time by Just+Some+Guy · · Score: 1
    Now look at what you've done. Confusing the MySQL fanbois with performance figures for PostgreSQL not even closely resembling their world-view. How will this end...? The horror!

    Imagine if I started throwing out numbers for selects with bitmap scans - their little heads would explode. :-)

    Seriously, "PostgreSQL is slow" is about as current as "Slowaris" and the "17 minutes to copy a file" Mac troll. None of them have been true for years, if they ever were.

    --
    Dewey, what part of this looks like authorities should be involved?
  43. Re:LOAD DATA INFILE by nukeindia.com · · Score: 1

    Probably he is talking about something like MySQL's LOAD DATA INFILE. In MySQL, inserting data into a table directly from a delimited text file, without using SQL statements, is just as much fast, if not faster.

  44. Re:The Prepare Command - Use PDO by gavinroy · · Score: 1

    $query = $pdoInstance->prepare('INSERT INTO myTable VALUES ( :foo, :bar:, :val );');
    $query->bindParam(':foo', $foo);
    $query->bindParam(':bar', $bar);
    $query->bindParam(':val', $val);
    $query->execute();

    I don't see how that could be any less clear.

  45. Re:The Prepare Command - Use PDO... More clear by gavinroy · · Score: 1

    Wow, I need to drink my coffee. More clear. More clear... To think I even previewed and reread it.

  46. PostgreSQL doesn't follow the ISO SQL Standard by Anonymous Coward · · Score: 0

    By allowing back-slashes in strings, it opened
    itself up to this sort of vulnerability. I am
    waiting for the back-slash mechanism to go away
    completely. The only way to escape a single
    quote should be to double-it-up. 'Peter O''Brien'

  47. Re:Guess its time by einhverfr · · Score: 1

    You still need to speed that up by about seven times to get 35k/s inserts. This is why I suggested turning fsync off would be the only way to make this happen.

    However, this would be limited to bulk imports of data that didn't have constraints. For example, bulk imports of numbers from experimental measurements into a temporary table so that they could later (at one's convenience) be loaded into proper tables.

    Yes, I can imagine real-world applications for such rapid small inserts, but in general, they are few and far between.

    --

    LedgerSMB: Open source Accounting/ERP
  48. Websense wins again! by Anonymous Coward · · Score: 0

    You have to love Websense and their dedication in making sure that sites actually match the categories their products list them in.

    Access to this web page is restricted at this time.
    Reason: The Websense category "Peer-to-Peer File Sharing" is filtered.
    URL: http://www.postgresql.org/download/

  49. Re:Guess its time by ahodgson · · Score: 1

    Right now DB2 viper is in beta, that allow you to use MDC, range partitioning, or hash partitioning. All in combination. This means that if you really wanted to scale up you could spread the data across a dozen two-way boxes with each one partitioning its local data. This isn't cheap to do (licensing db2 this way is expensive), but if you wanted to get adhoc queries against a 100 million row table returning results in 1-4 seconds - this is the way to do it. ;-)

    Greenplum is doing something similar using PostgreSQL as a base:

    http://www.greenplum.com/products/bizgresMpp.php

  50. Re:Guess its time by jadavis · · Score: 1

    You still need to speed that up by about seven times to get 35k/s inserts.

    I was trying to show that it's not out of the question. I don't have particularly good I/O on my machine, and I have a slow processor. The disk does not need to be written until COMMIT time, and it can be written sequentially (thanks to the write-ahead log). If there are several processors and a connection per processor, I could imagine that this type of benchmark could approach the write speed of the disk.

    INSERTs aren't expensive in PostgreSQL. They can be if there are a lot of constraints and indexes, but in general they aren't. That guy could have been telling the truth. The poster who accused him of BS probably assumed that he meant 35k transactions/sec. 35k transactions/sec sounds unrealistic to me, since PostgreSQL can only COMMIT once per disk revolution (assuming "normal" non-RAID hardware).

    --
    Social scientists are inspired by theories; scientists are humbled by facts.
  51. Re:Guess its time by einhverfr · · Score: 1

    I thought the WAL was written before the tuple. THis would mean that the disk would receive a new write instruction for every insert, and if there was any delay in processing it, you would have to wait for another revolution. Thus it would be worth seeing how enabling write caching would affect your performance here. That might tell you if a battery-backed cache might be good enough to obtain these sorts of numbers.

    --

    LedgerSMB: Open source Accounting/ERP
  52. Re:Guess its time by jadavis · · Score: 1

    I thought the WAL was written before the tuple.

    The way I understand it, the tuple is written first at the end of the file containing that portion of the table in question. But it is written asynchronously, meaning that it may stay in the OS Buffer Cache.

    When the COMMIT comes, PostgreSQL writes sequentially to the WAL, but this time it writes synchronously.

    That way, if there is a power failure, the data pages in the buffer cache holding the tuples may be lost, but the WAL contains enough information to reconstruct any committed transaction.

    THis would mean that the disk would receive a new write instruction for every insert

    No, often INSERTs will stay in the OS buffer cache until COMMIT time.

    --
    Social scientists are inspired by theories; scientists are humbled by facts.
  53. Re:Guess its time by einhverfr · · Score: 1

    You are correct-- the WAL entries are flushed at the end of the transaction.

    --

    LedgerSMB: Open source Accounting/ERP
  54. Re:Guess its time by mattypants · · Score: 1

    One of our production systems averages around 500/s, with peaks at over 3000/s, with roughly that sort of table structure... it's a dual Opteron, H/W RAID10 across two channels, 6 disks and 4GB of RAM and a separate SATA WAL drive. Thing is, it's not just an insert, it's a call to a stored procedure that does some data hygiene, then a select and then either an insert or update depending on the result of the select. Very fast, very happy!