Top 5 Reasons People Dismiss PostgreSQL
Jane Walker writes "In an effort to dispel some of the FUD surrounding this impressive product, this article puts forth several of the most commonplace reasons for a user to dismiss PostgreSQL." From the article: "While PostgreSQL's adoption rate continues to accelerate, some folks wonder why that rate isn't even steeper given its impressive array of features. One can speculate that many of the reasons for not considering its adoption tend to be based on either outdated or misinformed sources."
MySQL is pre-installed by most webhosts, and does the job for most tasks.
First post?
...coauthored an excellent book on PostgreSQL that was just published by Apress. The title makes it sound like it'd be a bit light, but it takes you all the way up to writing stored procedures, writing C programs that hit the database, using all the utilities, and so forth. I'm using PostgreSQL as a Jabber backend and the book has already proved useful.
Too bad they didn't talk about hitting PostgreSQL from Ruby... but since most folks are using ActiveRecord to do that, it's probably not a big deal. And if you use the Ruby/C client, it's quite snappy.
The Army reading list
"Postgre" is three times as long as "My".
Then again, the P in LAMP has always been about the scripting language, not the database.
MySQL and PHP have been quite the dynamic duo of the internet.
That, and PostgreSQL took longer to have a native Lose32 port.
The fact that you can bring Python right into PostgreSQL for good stored procedure justice seems to go unnoticed.
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
Indeed. And once most people are familure with MySQL and the various tools and language support, there tends to be little reason to switch. PostgreSQL is a better database product, but many (all?) of the features that it's cheering section continue to tell us all about whenever the issue comes up, are simply not ones that the majority of MySQL users want or need. Maybe PostgreSQL fans should target Oracle usres.
"Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
I think first and foremost is that is web developers who don't understand SQL, and so go about happily re-inventing its functionality in their web apps.
99% of the problems that web developers face have already been solved for them, but they think that SQL is just a data dump, and thus see no reason to use Postgres, because they think that MySQL does everything they need. In reality, their apps would be faster to write and easier to maintain if they used SQL features.
It's kind of like perl-syndrome, but on a larger scale.
The biggest reason I've found personally why people don't use postgres is because they've never heard of it. Everybody and their dog has heard of mysql, but I've never found somebody who knows about postgres who isn't actually using it. mysql, for whatever reason, just has better marketing.
g -else". If I want a toy database, I'll use sqlite. If I want a real database, I'll use postgres. There really isn't much room in between the two.
Why that is I'm not entirely sure, since ever since I discovered postgres, mysql has been relegated to the role of "use-only-when-a-stupid-web-app-can't-use-anythin
Dlugar
Computer Go: Writing Software to Play the Ancient Game of Go
I don't hear those reasons when people dismiss PostgreSQL. The ones I hear are:
Their database (MySQL, Access, Oracle, whatever) is working for them good enough to justify switching. Me? I'm using PostgreSQL, and I won't switch, even though Oracle now has a free version. Too much work to fix something that's not broken. And while I might never be able to use MySQL in my main project because it lacks some features I really need, it's good enough for lots of people.
please excuse my apathy
It has to do with mindshare and previous history. Way back in the day... 1997, postgres was difficult to setup for some people. It was not the default choice included in many setups at ISPs. If you wanted to have an interactive web application at an ISP, on a unix machine, the most common option was MySQL. (On a windows machine it would be an ODBC connection to an access database, or a MS SQL server) Once something has achieved a significant mindshare and some momentum it is difficult to overcome. (But not impossible, especially if you do a better job, just takes time)
Why do people dismiss PostgreSQL
It's because LAMP sounds so much cooler than LAPP!
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
I have pretty simple requirements for a database, I don't need triggers, stored procedures, or any of that stuff. What I need for web applications is a database that I can efficiently search, and that means fulltext indexes. Sure, there's plugins for Postgres that add fulltext indexes, but they require ungodly complex setup and tuning. With MySQL it's two keywords.
#1 because we're lazy ass sys admins who learned mysql first and don't want to bother learning another software package that does more or less the same shit . sad . true .
PostgreSQL is not necessarily the easiest beast in the world to get going. A few years back, I converted a chat/gallery portal system Ethereal Realms (http://www.erealms.org/ from MySQL to PostgreSQL, since at the time it was felt that features like proper referential integrity and stored procedures would really pay off.
The code was shortened considerably, was more stable overall and the OpenBSD port compiles properly without threading issues. However, despite all of those advantages and the database server being on a bigger server with more memory performance suffered considerably.
Want a good starting place in settings? The default documentation does little if anything to really help you on the matter, its like trying to learn the English language solely through the use of a dictionary.
There are tutorials available, some out of date and while Usenets can certainly help, you'll get wildly varying answers because some of the configuration options are more black magic then science. Rather makes it hard to get started when you don't know exactly where to start or how increasing a value will really affect performance as a whole. You are expected to load test the database before implementation which is not always possible for small hobby sites, and it can test patience.
With MySQL you had a few configuration files designed for use in various environments and it would show you how certain settings had changed. This is not the case with PostgreSQL in fact 32 connections is the default which is painfully inadequate for most peoples needs when dealing with a site. I'd personally love to see an application that detected your memory and other settings and came out with sane settings, at least with such an option you'd have a place to start.
Queries were slow, but then that was supposed to be MySQL's strength. Solution? Run explain/analyze on everything and tweak the hell out of every single query being generated. If you don't necessarily understand how the query is analyzed and run by PostgreSQL then there must be something wrong with you!
Vacuum? That concept alone can throw people in for a loop, especially when designing a system which is meant to be run by people with no technical experience. So you have to code in a serious amount of intelligence into the application or rely on Auto-Vacuum (not available at the time) which can slow performance down even more.
Because of vacuum, I had to design a process for the site to lock out all users. This had never been required under MySQL and took a while for the users to get used to. In certain cases, if the lock-out failed, the vacuum would cause permanent locks in tables which would quickly pile up scripts on the webserver side leading to extreme high loads or just crashing the machine.
PostgreSQL has a LOT of features and a lot to offer in general. However, there is a major learning curve required to get it going right. I've had other sites implement the code and whenever they hit the version which required PostgreSQL most gave up on the idea of migrating or complained endlessly on how things seemed sluggish. That is NOT a major selling point when trying to get the unwashed masses to adopt your product.
When I started programming a website, I knew I needed a database. I also knew absolutely nothing about php, sql, or even what they stood for. I was using a Perl based hacked link farm that used a flat-text database storage. Someone then pointed me to a php link farm that used MySQL. The installation text that came with the app was so easy to follow for a newbie, I had the link farm up and running in no time. I went to Books-A-Million a few weeks later, and found many books on PHP, MySQL, php/mysql - and nothing on PostgreSQL. When I finally did read up on RDBMS, found out that PostgreSQL did some functionality that MySQL didn't (at the time); I already had most of my site designed in php/mysql. I looked more into sub-queries in PostgreSQL, but the community structure was so scattered and non-newbie friendly, I decided to stick it out with MySQL (and havn't regretted it once). So my reasons for preference have nothing to do with wanting a windows version, different language, or other such assumption. Instead, my reasons are:
* as everyone says, the name is catchy: MySQL
* when I first was introduced to it, and to this day, Michael 'Monty' Widenius takes a personal interest in his work, and is a real down to earth guy ( had the pleasure of emailing him once) [you can probably still see him posting on the mysql dev lists these days..though I havn't followed it in a couple of years]
* Extensive script language support for web development
* Books for newbs and professionals (many books)
* I like their website more..always have
My shallow reasons..
My Thoughts, Kyndig
Do you really only have one user using the database at any given time? If you do only have one user, speed probably doesn't matter at all.
Benchmarks like that should be run with a couple of hundred active users all doing different things (mix of updates, insert, deletes, and selects) -- a real world use-case.
You will quickly find that scores change.
We had a MySQL versus PostgreSQL battle once. The MySQL people put together a benchmark showing MySQL was nearly 10 times faster. The benchmark was a single user going through the steps.
We then took the exact same thing and put Apache in front and benchmarked with 5 active users -- now they were about even. At 10 users PostgreSQL was about 5x faster. At 100 users MySQL was unable to finish the test. PostgreSQL was serving each of the 100 users at a rate comparable to how it dealt with 10 active users.
Benchmarks for benchmark sake is useless. Benchmarks that model your actual use case are quite valuable.
Rod Taylor
...because I don't know how to pronounce it.
Is it "Post Grace"? "Post Grey"? "Poss Grey"? "Poss Gres"? "Progress"? "Platypus"? "Post Raisin Bran"?
Whatever it is, it sounds vaguely French, which is just suspect to begin with. And I'm not dredging up the whole Iraq/UN thing either, although if I have to invoke Freedom Fries to make a point, I've got the mayonnaise ready.
Give me a RDBMS that I can pronounce, and I'll use you in my software.
MySQL. Easy. "My SQL". Doesn't get much easier than that, plus it sounds sorta friendly.
MS SQL - same thing, slightly different spelling. Maybe not as friendly, but you can put it in a Powerpoint to your boss and not sound like an idiot.
Oracle. Now you're talking. Even has a bit of mistique to it, a bit of enigma.
DB2. Not as sexy, but still undeniably pronouceable.
Sybase. Sock it to me.
What PostgreSQL - however the hell you say that - really needs is a new name. Forget features, forget marketing, forget RDBMS death match performance comparisons. Nobody cares. MySQL lacked tons of features for years, and we all used it then and continue to use it now. Why? You can pronounce it. Simple.
Many internet users are novices and do not understand how to utilize search engines.
Now, about a year ago, I had a client that wanted a web site back-end written. Now, I wasn't sure what the future of that site was going to be, whether I was going to be involved, etc. I also knew that it would be probably be run on inexpensive shared hosting solutions.
Guess what I chose? MySQL and PHP. The reason was because those are always available. It gives my client the flexibility to move it to any hosting solution. PostgreSQL simply is not everywhere. In my case, I run my own servers and can afford to have to understand it. But my client needs a hosting solution that does all the work for him (including back-ups). There's something to be said for using "the standard".
And you know what? I originally chose PostgreSQL because it was ACID compliant, but I have to say that MySQL sucks a lot less than it used to. It defaults to tables that support commit/rollback. It supports sub-transactions (which PostgreSQL v7 doesn't support, not sure about 8). It (FINALLY) supports sub-selects. If you're still turning up your nose at MySQL, it really isn't as bad as it used to be.
Sometimes it's best to just let stupid people be stupid.
Absolute bullshit. I've used PostgreSQL myself in a mission-critical production app for the past 3 years. (I've since left the company, but the app is still in use.) I have been consistently impressed by the quality and performance. There was a strong push to use mysql when the project started since the company already had in-house knowledge of it. Performance was one of the concernes. So I ran the benchmarks. Read performance of simple selects was inconclusive: mysql won some, and postgresql won some. However, postgresql consistently won write tests and scaled better as I added more client threads.
Nested parentheses in SQL can cause an engine crash. " like ... (SELECT A INNER JOIN B) INNER JOIN ..." But the crashing is tolerable. Hand-holding the query optimizer is not. Quite often, the optimizer gets the query plan wrong. Sending special commands to disable internal features is often the only resort.
Bullshit again. Never ever seen that or even heard about it. Again, at my last job postgresql was part of a mission-critical application, and I've used it for a couple of projects before that too. The *only* time I've seen postgresql go down was when we ran out of disk space. And even then, it was not a crash but a clean shutdown. Give me a specific example of a query that you say caused postgresql to crash. Otherwise I'll assume you are yet another troll.
While it's true that PostgreSQL is more database than most corporate weenies need, it falls down in moderate write environments. It's best used for systems that write data very infrequently, otherwise it fragments quickly. The only solution to table and database fragmentation is dump & reload.
Bullshit 3x. The app I was talking about was used for tracking work in a 3d production pipeline with a staff of ~300 artists. There was *a lot* of writes. (every checkin, every render, etc.) By the end of the project the database grew to over 10G. And postgresql didn't even blink.
Vacuum is asinine. Any command that needs to be run periodically under threat of complete and total data corruption should not be. That's right. Only PostgreSQL makes you vacuum or else your transaction ids overflow. This is modern? I'm shocked.
And your point is? All it requires is a single entry in crontab. And you can still run transactions while it's vacuuming. Really, what is your problem with it? It is no different than running a cronjob to do a backup, or a similar maintenance. And since 8.0 and up, postgresql does autovacuum, so you don't even have to worry about that.
So, in short, from my experience PostgreSQL Just Works (TM). And unlike oracle it doesn't cost an arm and a leg, and doesn't require an army of DBAs and sysadmins to maintain it.
___
If you think big enough, you'll never have to do it.
Reason number 6 is the damnned Postgres zealots that feel the need to bash everyone else's database rather than promote their own. I use MySQL and Postgres on a regular basis. I'm proficient in both. And to the dismay of Postgres users everywhere, there are times which *gasp* MySQL is better suited. "Oh, you are probably a lame programmer and use it for trivial web stuff". Not true! I look at a project and each databases strenths. It has nothing to do with the seriousness of an application. When I was writing VoIP billing software, we'd sometimes see 4-5 million CDR's (call detail records) in a single day. Our first iteration actually used Postgres and choked on that many records. We had to make some compromises with MySQL. We had an additional field for Unix epoch time because of MySQL's lacking (at the time) date and time math. There was a tradeoff. It was deemed that having billing invoices generate in 5 seconds (as opposed to 5 minutes) was more important than programmer time. Welcome to the real world. Another project I had was for writing worker punchcard system. Six months of records only topped out at 50,000 records and we decided Postgres' procedural languages would be a great help to us. Lose the zealots and attitude and maybe you'll have a greater user base.
If an officer ever threatens to taze you, say you have a pacemaker.
Either you want transactional safety or you don't. If you don't use MySQL with MyISAM tables and have fast count(*), if you do use InnoDB (or better use PostgreSQL), but then there's no single count(*) that can be stored with the table since every transaction may see a different count(*).
The cool think about MySQL is that you do have the option (with Table-Engines). The cool thing about PostgreSQL are its advanced featured... One example: Hands up, who here knows what a partial index is?
In PostgreSQL you can setup indexes that only cover a part of the table (for example if you have an active flag on a table and most queries are on active entries, you can have a partial index only on rows that have active=true, and this can speed up things *significantly*); alas most folks even have not even heard about partial indexes, and that is why they do not appreciate PostgreSQL.
This is just one example.
Vacuum kills performance. Some uses maybe OK with loosing 50% or more while VACUUM runs. In some uses it's unacceptable. In our case (a lot of inserts with majority of selects going for the newly inserted records) performance degrades within 6-8 hours after running VACUUM & friends. VACUUM takes ~20 minutes to complete which is completely unacceptable during the day and we can't delay it till night.
No, AUTOVACUUM is not an answer because it kicks in unexpectedly and makes random queries run unexpectedly slow at unexpected times. Usual VACUUM makes all queries run slow at predetermined time. Not a very appealing choice.
More reasons:My point is that the tool is not always the problem. Now if C++'s integer arithmetic had an issue, that is another story, but the programmer simply was not good.
;-)
No, the tool IS part of the problem in such cases. If the programmer was not that good at C then C was the wrong tool and thus part of the problem. The programmer should've taken the time to study up on C, or picked a different tool. There are times when the tool is not appropriate for the job--you probably shouldn't use C if you need to do heavy text processing and need to get the job done fast (use Perl instead), or if you are less experienced and want a language that supports sound object-oriented programming maybe try Python, etc.
MySQL was not designed as a robust relational database, and its creators didn't seem to be intent on making it so, or else they'd have designed it differently. It was designed as a very quick and quite dirty SQL frontend/ISAM backend system to support small, informal databases (or so it seems): Basically, its heritage is to be like the old Ashton-Tate dBase but using SQL to query the tables. Since then it has lost that focus and now we have large websites storing millions of records in mySQL.
MySQL is a great tool if used as intended, however it definitely IS a problem if your accounting system uses it for example. People started doing crap like that and complained about mySQL's lack of features, thus we have things tacked on like innoDB tables and such to add this robustness.
PostgreSQL was not always as super-robust as it is now, and in its present form its source code is probably almost unrecognisable from 10 years ago, however its architecture was more sound and thought out from the start, as its heritage was as an academic project. Its challenge was not to add features as was the case with mySQL--PgSQL was designed for extensibility. PostgreSQL had to catch up in performance and stability, which it has done in spades.
Personally, I always use UNIX timestamps (seconds since 1970). They can be directly added, sorted, and converted into any timezone, and its very portable. But thats just me. (Yes, UNIX timestamps do nothing before 1970, etc, etc).
It seems somehow wrong that your business logic has to perform low-level validation of basic datatypes, and it is cumbersome and error-prone to deal with unrecognisable representations. Only the geekiest of geeks could tell me whether 1984293617 falls on a Thursday without runing it through some kind of conversion program (simple as that may be for a geek). What about people who point-and-click their way through some report designer--they're gonna have to deal with some giant integer in a column entitled "something-date". The other problem is that it is not very precise for some applications that need sub-second timestamp values.
Personally, I like PostgreSQL because it accepts ISO standard formats, you don't need to do anything to convert timezones--you simply specify the time zone when you insert or query and it issmart enough to figure it out when you query fordatain Eastern time zone and it was inserted in Pacific timezone. Furthermore, it knows Feb 30 isn't valid, and knows when leap years occur, and can format the date in many different ways with simple built-in functions, can be accurate to the millisecond and won't crash and burn in 2038.
FYI, I believe the "seconds since UNIX epoch" representation of date/time values is a SIGNED integer, so they are in fact good for earlier dates than 1970 (they are good to some time in late 1901 in fact). That is still a pretty limited range and why early systems didn't use that representation inmany cases (couldn't store birthdates for a lot of people who were still alive in the early 1970s becasue they were born before 1901). It is still a problem in some applications ad that is why 32-bit "UNIX-style" time is discouraged.
I think it's a shame that people resort to such kludges without adequately lookig for more appropriate alternatives...but that's just me
I can't be bothered to learn something new when it seems everything supports MySQL.
I'm glad you don't work for me with that attitude. I'd rather work with someone who is interested in learning new things and will bring some creativity to the job. People of your mentality have to be careful they don't fall into the "false laziness" trap--using some tool or technique or techology because you are too lazy to learn something new, only to end up doing load of extra work to avoid the shortcomings of your inappropriate design choices. The result is scads of legacy code at higher layers of an application to handle things like datatype verification, basic referential integrity and so on.
All the various different executables to do different tasks rather then one shell like MySQL, a permission system which seemed from my limited usage more perverse then MySQL's
I've never found it to be a major struggle to use PostgreSQL, though being a more full-featured database it will naturally be a bit more complex to manage.
I'm puzzled about the "all the various executables" part too, since many of them were invocable from the psql shell anyways. Also, it sounds like you've not lookded at PostgreSQL for awhile because its permissions system has undergone a lot of work--certainly it can be complex but it is very flexible and powerful, and honestly it gets rid of most excuses you had to execute all your database operations under the database superuser (or some other single user account) in your backend code.
I have better things to do with my time, like write cool code that uses MySQL.
You might want to examine how you used your time...if you had spent a few hours or a couple days learning something new for a change (like PostgreSQL) then it might've saved weeks or hours of frustration trying to use mySQL for too-complex tasks.
MySQL might have grownup a lot in recent years, but at its heart it was meant for much more modest tasks, like storing guestbook entries, record collections, as a temporary datastore/embedded database, high-performance querying of relatively static ad/or non-critical data and so on.
Comment removed based on user account deletion