PostgreSQL Slammed by PHP Creator
leifbk writes "'The Web is broken and it's all your fault' says Rasmus Lerdorf, the creator of PHP. He talks about not trusting user input, and the brokenness of IE, which is all fine. Then he makes a statement about MySQL vs PostgreSQL: 'If you can fit your problem into what MySQL can handle it's very fast,' Lerdorf said. 'You can gain quite a bit of performance.' For the items that MySQL doesn't handle as well as PostgreSQL, Lerdorf noted that some features can be emulated in PHP itself, and you still end up with a net performance boost. Naturally, the PostgreSQL community is rather unimpressed. One of the more amusing replies: 'I wasn't able to find anything in the article worth discussing. If you give up A, C, I, and D, of course you get better performance- just like you can get better performance from a wheel-less Yugo if you slide it down a luge track.'"
Honestly, just avoid this discussion by using flat files.
"If A equals success, then the formula is A=X+Y+Z. X is work. Y is play. Z is keep your mouth shut" - A Einstein.
"Rasmus Lerdorf, the creator of PHP ... said the current state of the Internet includes a litany of broken items, but with a little help from PHP there may well be some hope for the Web yet."
...
I wonder if he has ever considered using Perl
Hulk SMASH Celiac Disease
This guy is an idiot. PHP is a nice product though, if anyone can get past its inconsistent function naming schemes.
He also states:
He *just* learned that? Oh my, that's scary.
MySQL is made for speed, and will compromise to act like a database only where that does not break its own convenience. PostgreSQL is a database, and will compromise for speed only where that does not break the database.
From someone who is obviously surprised that to secure something you need to make a safe-house and then be strict about what gets in, it seems that he missed the point on the MySQL/PostgreSQL thing.
Maybe by the next conference he'll grow up and state the new revelation "You have to use a database like PostgreSQL and use a warehouse schema to allow faster reporting."
====
Nor was this a "slam". PostgreSQL is not made specifically for web use. If anything, Lerdorf merely publicly demonstrated his own immaturity.
Have you read my journal today?
You are basing this on a rather incomplete account of my actual talk. I went through a series of optimizations of a sample Web application, and one of many steps was to try MySQL instead of PostgreSQL for that particular application. By profiling it with Callgrind it was obvious that in this particular case MySQL was significantly faster. I don't think this is news to anybody that MySQL is quicker at connecting and issuing simple queries, and I am not sure why me showing some Callgrind profiles and stating that MySQL is particularly good at these things is frontpage slashdot material. Slow day?
And the "The Web is broken and it is all your fault" thing was just a bit of humour to wake people up for this 9am talk, but I guess it makes for good headlines.
I've used MySQL on several projects. At first because we didn't know any better, later because it was the thing we knew best, or because the project was already using it when I joined it. Inertia. We're using a 5.0.x now, on a setup where we replicate to six slaves; it's not small.
I knew that MySQL could do stupid things now and then, but at least it was our stupid thing. We have some experience with it, by now.
Recently though, some colleagues on another project had an issue with major data loss - an input script had put data into the database that wasn't really compatible with the data model.
Turns out that in a table with an auto-increment primary key named 'id', some of those ids occurred over 200 times. A primary key.
I don't care if there's options or ways to have it check that, even without "emulating it in PHP" (shudder) - anything that is even considering putting "SQL" in its name has to complain loudly when someone tries to insert such crap, and then abort. Not just silently accept it.
That's the eternal problem with MySQL - everywhere, the default action on wrong input is to silently continue, perhaps trying to read the mind of the programmer and turn the nonsensical value into some equally nonsensical default. Put a string into an int field? Let me guess what you meant... etc.
I've had it, I don't want MySQL anymore.
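To make the "complain loudly, then abort" behaviour concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in (it is not MySQL and says nothing about how any MySQL version or storage engine behaves), and the `items` table and `try_insert` helper are made up for illustration. The point is only that a duplicate primary key should surface as an error the caller cannot miss.

```python
import sqlite3

# Hypothetical table for illustration; SQLite is only a stand-in here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO items (id, name) VALUES (1, 'first')")
conn.commit()


def try_insert(conn, item_id, name):
    """Return True if the row was stored, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO items (id, name) VALUES (?, ?)", (item_id, name)
            )
        return True
    except sqlite3.IntegrityError as exc:
        # The constraint violation is reported, not silently swallowed.
        print(f"rejected id={item_id}: {exc}")
        return False


try_insert(conn, 2, "second")     # a new key is accepted
try_insert(conn, 1, "duplicate")  # id 1 already exists, so this is rejected

# The primary key still identifies exactly one row.
row_count = conn.execute("SELECT COUNT(*) FROM items WHERE id = 1").fetchone()[0]
```

An input script built on a helper like this would stop (or at least log) at the first bad row instead of loading the same id over and over.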
But with PHP and MySQL, you can hammer screws much faster! :D
The headline implies that Rasmus blames PostgreSQL for breaking the web which is not the case. The focus of his ire is web application programmers for putting too much trust in user input. I don't think anyone can truthfully argue with that.
His comment regarding PostgreSQL was:
"If you can fit your problem into what MySQL can handle it's very fast, you can gain quite a bit of performance."
As someone who uses both MySQL and PostgreSQL in production environments, I couldn't agree more. The key qualifier is "If you can fit your problem into what MySQL can handle". In order to argue that this statement is wrong you would have to argue that PostgreSQL is faster than MySQL in situations that are ideal for MySQL.
XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-U
I got sick of the syntax dialects of every SQL engine, so I started writing my applications using Hibernate and haven't looked back.
I learned HQL (Hibernate Query Language) and just use whatever database is handy at the time.
I usually start with MySQL 5, and then if I need more muscle (Read: the boss wants to spend money), I can switch the entire application to Oracle in about two hours.
You want ACID...? Use J2EE transactions and Hibernate, and never worry about which database you use again.
PHP/FI was the second version, and it wasn't written in Perl. Neither was the first version. The first version of PHP replaced some Perl code which may be where this myth comes from.
"just like you can get better performance from a wheel-less Yugo if you slide it down a luge track."
I am sick and tired of seeing these sweeping, baseless statements on Slashdot. The body of a Yugo is much too wide to sit flat on the ice of a luge track.
Editors, please start doing some fact checking before posting this stuff.
#DeleteChrome
Basically, they needed to aggregate data from about 56 million rows in a table, and required a self-join as well. I got the consulting contract because this was taking at least six days to complete.
Inputting the 56 million records took about an hour; this included creating three indices.
So far so good. At that point, to make it run faster, I wanted to pre-calculate and denormalize the data the self-join would give. I'd already included columns for this denormalized data in the table, so it was pretty much
A simple correlated subquery self-join in an update. Lo and behold, MySQL doesn't allow this at all:
Ok, so instead of a subquery we can do a join, but that means we have to throw away the max() operation. Without the max predicate we're doing 1-to-Many joins on b where there is more than one row matching our criteria, and so we're potentially doing multiple updates (all but one of which gets "thrown away") to a row.
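For readers who want to see the shape of the two approaches: the original statements did not survive in this thread, so what follows is only a sketch on an invented `readings` table, run through Python's sqlite3 module as a stand-in engine. The table, the column names and the data are all assumptions. It shows the correlated-subquery form, where MAX() yields one value per target row, and counts how many candidate rows a plain self-join without MAX() would produce instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: max_in_grp is the pre-calculated, denormalized column.
conn.execute(
    "CREATE TABLE readings ("
    " id INTEGER PRIMARY KEY, grp TEXT, val INTEGER, max_in_grp INTEGER)"
)
conn.executemany(
    "INSERT INTO readings (grp, val) VALUES (?, ?)",
    [("a", 1), ("a", 5), ("a", 3), ("b", 7), ("b", 2)],
)

# Correlated subquery self-join: MAX() collapses the matching rows to a
# single value, so each target row is assigned exactly one result.
conn.execute(
    "UPDATE readings SET max_in_grp = ("
    " SELECT MAX(b.val) FROM readings AS b WHERE b.grp = readings.grp)"
)
conn.commit()

result = conn.execute(
    "SELECT grp, val, max_in_grp FROM readings ORDER BY id"
).fetchall()

# A plain self-join without MAX() pairs every row with every row of its
# group, which is where the redundant 1-to-many updates would come from.
join_rows = conn.execute(
    "SELECT COUNT(*) FROM readings AS a JOIN readings AS b ON b.grp = a.grp"
).fetchone()[0]
row_total = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

With five rows in two groups, the join produces thirteen pairings for five target rows; at 56 million rows that kind of fan-out is the cost being described.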
Ok, so far so good.
First time around, I included the denormalized column in an index, and of course the update changed the column values. If I dropped and re-created the index, MySQL took about four hours to re-index (four times the time it took to make the index when it BCP'd it in). But if I repaired the index, rather than dropping it, well, it never actually completed, because after two days I killed it. What the hell?
Finally, to display the data, I needed to do some date manipulation, a lot of it repeated. In pg, I'd have written the code once, in a user defined function. In MySQL, that requires compiling a shared library, so instead I repeated these rather long calculations in a select. Tedious and error prone. (In MySQL's favor, the built-in date functions are a lot cleaner than T-SQL's.)
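As a rough illustration of "write the code once, in a user-defined function": the sketch below is not PostgreSQL's CREATE FUNCTION and not the poster's actual date logic. It swaps in Python's sqlite3 module, which lets an application register a function that SQL can then call by name. The `week_start` helper, the `orders` table and the dates are all hypothetical.

```python
import sqlite3
from datetime import date, timedelta


def week_start(iso_day):
    """Return the Monday of the week containing iso_day (YYYY-MM-DD)."""
    if iso_day is None:
        return None
    d = date.fromisoformat(iso_day)
    return (d - timedelta(days=d.weekday())).isoformat()


conn = sqlite3.connect(":memory:")
# Register the helper once; every query can now call it by name
# instead of repeating the date arithmetic inline.
conn.create_function("week_start", 1, week_start)

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, placed TEXT)")
conn.executemany(
    "INSERT INTO orders (placed) VALUES (?)",
    [("2006-05-01",), ("2006-05-03",), ("2006-05-10",)],
)

weekly = conn.execute(
    "SELECT week_start(placed) AS wk, COUNT(*) FROM orders"
    " GROUP BY week_start(placed) ORDER BY wk"
).fetchall()
```

The calculation lives in one place, so a fix to the date logic does not have to be copied into every select that uses it.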
Eventually I got a six-day or longer process down to three hours, but it wasn't pretty.
So long story short: a business goes with MySQL because it's "fast". At a certain point, it ceases to scale, and you have to perform "heroic measures", denormalizing and pre-calculating. The index repair is a mess. You can't easily encapsulate code in functions or, prior to 5.0, views. It's no longer fast, and your mission-critical business requires calling a consultant to optimize what was perfectly good code before the table size grew.
Opinions on the Twiddler2 hand-held keyboard?