Top 5 Reasons People Dismiss PostgreSQL
Jane Walker writes "In an effort to dispel some of the FUD surrounding this impressive product, this article puts forth several of the most commonplace reasons for a user to dismiss PostgreSQL." From the article: "While PostgreSQL's adoption rate continues to accelerate, some folks wonder why that rate isn't even steeper given its impressive array of features. One can speculate that many of the reasons for not considering its adoption tend to be based on either outdated or misinformed sources."
...coauthored an excellent book on PostgreSQL that was just published by Apress. The title makes it sound like it'd be a bit light, but it takes you all the way up to writing stored procedures, writing C programs that hit the database, using all the utilities, and so forth. I'm using PostgreSQL as a Jabber backend and the book has already proved useful.
Too bad they didn't talk about hitting PostgreSQL from Ruby... but since most folks are using ActiveRecord to do that, it's probably not a big deal. And if you use the Ruby/C client, it's quite snappy.
The Army reading list
PostgreSQL is not necessarily the easiest beast in the world to get going. A few years back, I converted a chat/gallery portal system Ethereal Realms (http://www.erealms.org/ from MySQL to PostgreSQL, since at the time it was felt that features like proper referential integrity and stored procedures would really pay off.
The code was shortened considerably, was more stable overall and the OpenBSD port compiles properly without threading issues. However, despite all of those advantages and the database server being on a bigger server with more memory performance suffered considerably.
Want a good starting place in settings? The default documentation does little if anything to really help you on the matter, its like trying to learn the English language solely through the use of a dictionary.
There are tutorials available, some out of date and while Usenets can certainly help, you'll get wildly varying answers because some of the configuration options are more black magic then science. Rather makes it hard to get started when you don't know exactly where to start or how increasing a value will really affect performance as a whole. You are expected to load test the database before implementation which is not always possible for small hobby sites, and it can test patience.
With MySQL you had a few configuration files designed for use in various environments and it would show you how certain settings had changed. This is not the case with PostgreSQL in fact 32 connections is the default which is painfully inadequate for most peoples needs when dealing with a site. I'd personally love to see an application that detected your memory and other settings and came out with sane settings, at least with such an option you'd have a place to start.
Queries were slow, but then that was supposed to be MySQL's strength. Solution? Run explain/analyze on everything and tweak the hell out of every single query being generated. If you don't necessarily understand how the query is analyzed and run by PostgreSQL then there must be something wrong with you!
Vacuum? That concept alone can throw people in for a loop, especially when designing a system which is meant to be run by people with no technical experience. So you have to code in a serious amount of intelligence into the application or rely on Auto-Vacuum (not available at the time) which can slow performance down even more.
Because of vacuum, I had to design a process for the site to lock out all users. This had never been required under MySQL and took a while for the users to get used to. In certain cases, if the lock-out failed, the vacuum would cause permanent locks in tables which would quickly pile up scripts on the webserver side leading to extreme high loads or just crashing the machine.
PostgreSQL has a LOT of features and a lot to offer in general. However, there is a major learning curve required to get it going right. I've had other sites implement the code and whenever they hit the version which required PostgreSQL most gave up on the idea of migrating or complained endlessly on how things seemed sluggish. That is NOT a major selling point when trying to get the unwashed masses to adopt your product.
The inventors of Ingres left the company formed around it when it was bought by Computer Associates and started developing the successor to Ingres, hence: Postgres. Make sense?
I wish I had a kryptonite cross, because then you could keep Dracula and Superman away.
Absolute bullshit. I've used PostgreSQL myself in a mission-critical production app for the past 3 years. (I've since left the company, but the app is still in use.) I have been consistently impressed by the quality and performance. There was a strong push to use mysql when the project started since the company already had in-house knowledge of it. Performance was one of the concernes. So I ran the benchmarks. Read performance of simple selects was inconclusive: mysql won some, and postgresql won some. However, postgresql consistently won write tests and scaled better as I added more client threads.
Nested parentheses in SQL can cause an engine crash. " like ... (SELECT A INNER JOIN B) INNER JOIN ..." But the crashing is tolerable. Hand-holding the query optimizer is not. Quite often, the optimizer gets the query plan wrong. Sending special commands to disable internal features is often the only resort.
Bullshit again. Never ever seen that or even heard about it. Again, at my last job postgresql was part of a mission-critical application, and I've used it for a couple of projects before that too. The *only* time I've seen postgresql go down was when we ran out of disk space. And even then, it was not a crash but a clean shutdown. Give me a specific example of a query that you say caused postgresql to crash. Otherwise I'll assume you are yet another troll.
While it's true that PostgreSQL is more database than most corporate weenies need, it falls down in moderate write environments. It's best used for systems that write data very infrequently, otherwise it fragments quickly. The only solution to table and database fragmentation is dump & reload.
Bullshit 3x. The app I was talking about was used for tracking work in a 3d production pipeline with a staff of ~300 artists. There was *a lot* of writes. (every checkin, every render, etc.) By the end of the project the database grew to over 10G. And postgresql didn't even blink.
Vacuum is asinine. Any command that needs to be run periodically under threat of complete and total data corruption should not be. That's right. Only PostgreSQL makes you vacuum or else your transaction ids overflow. This is modern? I'm shocked.
And your point is? All it requires is a single entry in crontab. And you can still run transactions while it's vacuuming. Really, what is your problem with it? It is no different than running a cronjob to do a backup, or a similar maintenance. And since 8.0 and up, postgresql does autovacuum, so you don't even have to worry about that.
So, in short, from my experience PostgreSQL Just Works (TM). And unlike oracle it doesn't cost an arm and a leg, and doesn't require an army of DBAs and sysadmins to maintain it.
___
If you think big enough, you'll never have to do it.
Isn't this fixed in recent versions of MySQL anyways?
In the last thing I installed which is whatever came from the FreeBSD ports collection, it was still a problem. 4.1 I believe.
As MySQL gets a lot of flack for this poor design philosophy*, they have come up with "strict mode" as a solution. That's a configuration option that is more strict about reporting errors. It's a step in the right direction, but that makes it wildly imcompatible with many MySQL applications.
So it's "fixed" in the sense that you can optionally break backwards compatibility to fix it. I haven't heard many reports about people actually trying to use strict mode.
* MySQL has backtracked on design philosophy before. Remember back when an ACID transaction was evil and slow and MySQL would never implement transactions or triggers or anything else?
Social scientists are inspired by theories; scientists are humbled by facts.
MySQL's arguments, like it's features, always seem to be mutually exclusive.
MySQL's benefits:
* More Applications
* Has strict mode if you need it
* Easier to use by default
* Speed
* ACID
The problems are:
* Strict mode is not enabled by default
* If strict mode is easier because of all the benefits of data consistency, then MySQL is not easy by default
* PostgreSQL has more applications than does "MySQL Strict Mode".
* ACID is only on the InnoDB table type, and the highest basic data access speed is on the MyISAM table type
The list goes on. You can argue any one point and say it works just fine, and that you don't care about the other ones.
But PostgreSQL is a whole solution. When they say you have ACID compliance, it doesn't matter what tables you operate on. When they have a reasonably consistent SQL dialect, it can't be configured to be drastically different on a whim, making half the applications (or more) not even work. In short, the features in PostgreSQL don't have long lists of dislaimers and incompatibilities with other features. They just work as advertised.
Social scientists are inspired by theories; scientists are humbled by facts.
Either you want transactional safety or you don't. If you don't use MySQL with MyISAM tables and have fast count(*), if you do use InnoDB (or better use PostgreSQL), but then there's no single count(*) that can be stored with the table since every transaction may see a different count(*).
The cool think about MySQL is that you do have the option (with Table-Engines). The cool thing about PostgreSQL are its advanced featured... One example: Hands up, who here knows what a partial index is?
In PostgreSQL you can setup indexes that only cover a part of the table (for example if you have an active flag on a table and most queries are on active entries, you can have a partial index only on rows that have active=true, and this can speed up things *significantly*); alas most folks even have not even heard about partial indexes, and that is why they do not appreciate PostgreSQL.
This is just one example.
My reply was apt. The parent said that there was little reason to switch, I gave some reasons. I stated a fact (that MySQL thinks Feb 31st is a date), which is not bashing. It looks like bashing because it makes MySQL look bad.
I don't have a problem with other databases. The only database that, to me, stands out as particularly bad is MySQL. That's because their marketing deceives many people.
Someone may see:
1. "MySQL supports strict SQL compliance mode"
2. "MySQL is easy to use from the default install"
3. "MySQL is screaming fast"
4. "MySQL has transactions"
5. "MySQL has more applications written for it than PostgreSQL"
But...
#1 conflicts with #2 and #5 because strict is off by default, and there are fewer "MySQL Strict Mode" apps than PostgreSQL apps.
#2 conflicts with #4 because the default install and CREATE TABLE create MyISAM tables which do not support transactions.
#3 conflicts with #4 because the "screaming fast" reputation is from MyISAM tables. InnoDB is somewhat similar in performance to PostgreSQL.
Everything about MySQL pigeonholes you into a subset of the features that MySQL AB advertises, and then makes it as difficult as possible to roam between those features. In MySQL, if you need to change the type of a table, for instance if you want transactions, you can't do that without disrupting running applications. You need to pause and resume all the applications accessing that table.
Compare to PostgreSQL, if you need to, for example, change the disk that a table is stored on (of course PostgreSQL doesn't have different table types), you can make all the necessary DDL modifications in a transaction-safe way, and the application will never know the difference.
I know MySQL has uses. I use it (sparingly). Slashdot uses it, for good reasons I'm sure. But "has it's uses" does not mean "immune from criticism".
Social scientists are inspired by theories; scientists are humbled by facts.
There are plenty on benchmarks of MySQL vs. Postgres. The Postgres people seem to dispute these but they show MySQL killing Postgres.
Every single MySQL/PostgreSQL benchmark I've seen has either been absurdly trivial or benchmarked with MyISAM which is an apples to gorillas comparison. And they're all old benchmarks as well. Unlike MySQL's apparent philosophy of "1) Do the minimum to be able to claim we have feature XYZ 2) Make it fast 3) Maybe think about making sure it works sanely", the PostgreSQL community's top-most concern is data quality; performance is always seconday to that. Because of that there had been a lot of room for improvement through the 7.3/7.4 timeframe.
That's no longer the case. People commonly see 30%+ speed improvements from 7.4 to 8.0, and 2x that from 7.4 to 8.1. And more improvements are on the way. 8.2 is looking to have on-disk sorts that are 4x faster than today, as an example.