The link to the supposed IBM study returns an error. Which is too bad, because I'd love to critique the study - Tim Perdue's benchmarking work isn't the first to show that MySQL's table locking paradigm breaks down under high load. Remember that he's been studying alternatives because SourceForge, using MySQL, has had problems in this area. And remember that Slashdot has helped fund the integration of the Berkeley DB backend and MySQL for the same reason.
It's not all myth, folks. Table locking sucks, page-level locking (Sybase) is much better and row-level locking (Oracle, Interbase, Postgres) much better still in high concurrency systems.
Well, arsdigita.com is generally a very lightly-loaded site. It's normally only visited by aD staff, clients (and potential clients), and folks who use the company's open source toolkit. They probably didn't bother to configure it to accept large numbers of connections, and it is probably running on a shared server.
The tachnology scales fine, they operate some busy sites.
When the OpenACS site was slashdotted a few months ago ("Why not MySQL?") Ben, the author, was busy working on the system, which happened to be his personal server at the time (dual P400). Not only did the system survive being slashdotted, but he didn't even notice the load and had no idea that his piece had hit slashdot until someone e-mailed him the news.
(OpenACS is AOLserver+Postgres+aD's toolkit, i.e. the same technology but PG-based rather than Oracle-based).
People who trivialize the McD case simply haven't taken a close look at the facts - 3rd degree burns are no fun.
A barrista at my local caffeine dealer had an accident with one of their large coffeemakers last week and ended up with 2nd degree burns (large blisters) from her right hand to her elbow, really ugly, not to mention painful. Will she sue? Nope - the company's covering her medical expenses, etc.
The woman in the McD case was much more severely burned with a smaller quantity of much hotter coffee, and as the previous poster's pointed out would've originally settled for medical expenses if McD's hadn't been such pricks about it.
Great, now we have a SCIENCE FICTION NOVEL being cited as authorative source of rebutal to the full-time work of hundreds or thousands of climatologists from around the world on the subject of climate change?
What relevance does volcanic activity under Antartic ice have on the fact that warmer water in the ARTIC (roughtly 12,000 miles to the north) is causing the melting of the polar icecap?
There's no reason to be skeptical of such models (or claims) because they don't exist.
I challenge you to supply a reference to a single peer-reviewed scientific paper claiming that melting of the polar icecap will cause a rise of 10-100 meters in the depth of the sea.
It is thought that the rise in temperature of the planet over the next few decades will cause rise in sea level on the order of meters, not tens of meters. And this won't be due to the melting of the polar icecap alone not by any means. After all, the sea ice of the icecap displaces about 90% of its volume when it is floating.
As someone earlier in another thread pointed out, it is water trapped as ice on land (such as Greenland and Antartica) that will provide most of the contribution to any rise in sea level that occurs.
Proving a false premise to be false, as you've done, doesn't prove global warming to be a myth.
Those convinced that the tests were rigged might find this interesting:
With "Proprietary 1," we split the data and the index across two
drives, to boost performance. Memory parameters were tweaked as
well to allow more simultaneous users to connect.
Now... Postgres doesn't have any in-the-database way to split indices and data across two drives, a major shortcoming for those with huge databases in particular (and something that will change in the future).
Performance generally is improved by this strategy because the two platters can independently access the index and data, minimizing latency caused by head movement.
If they were trying to cheat, they wouldn't've done this split while at the same time leaving dat a and indices on the same platter while running Postgres.
Since Great Bridge's focus is to make Postgres competitive the commercial database world, it's rather funny to see these Slashdot threads so focused on MySQL, and Monty's taking the benchmark results almost personally.
They took pains to do some tuning on "Proprietary 1 8.1.5" (i.e. Oracle) but ran MySQL "out of the box". It's clear their major targets are Oracle and MS-SQL, not MySQL.
As to why they didn't use the latest "alpha" version of MySQL, I know for a fact that they started this benchmarking effort quite some time ago. Note that they ran Oracle 8.1.5, not 8.1.6. Also note that they ran Postgres 7.0, rather than the current 7.0.2.
And, for that matter, RH 6.1 rather than RH 6.2.
This reflects when the tests were run, not any intentional effort to exclude the latest and greatest MySQL (or Postgres or Oracle, for that matter). In fact, given the above data, most Slashdot readers should be able to pin down when they started benchmarking to at least the nearest week!
Right up front, I see this in his list of things that Postgres is poor at:
...running long multiple transactions where you get a conflict at the end (in this page/row locking is better)
Since Postgres implements row-level locking, this comment makes no sense to me. And of course MySQL isn't at all good for running long multiple transactions - it doesn't support transactions at all. What's his point?
We here at MySQL has always tried to design very fair test that no one can misinterpret or lie about.
Their comparison table of Postgres vs. MySQL features has had many errors from the beginning, and despite requests from the Postgres folk Monty never corrected them. Thus Postgres users like myself remain unconvinced that fairness is uppermost in their mind.
The way to set up the databases in a test is also very crucial for the performance of the database. The article doesn't
mention anything about this or even with which ODBC driver they used the different databases
I just peeked quickly at MySQL's benchmark page and see absolutely no mention as to how they set up Postgres or the various commercial databases. It seems strange that Monty makes this complaint given that his page lacks this data, too.
MySQL also have two ODBC drivers, one
slow with debugging and one fast. It should be very nice to know how they actually did use MySQL. To get any
performance from Oracle, on has also to tune this a lot; The ODBC driver for Oracle has also very bad performance; This
is a common known fact; No one runs a critical system with Oracle and ODBC.
A poor ODBC driver - besides being a strange thing to brag about - might account for poor performance in single-user tests, but it doesn't account for the fact that MySQL progressively gets worse under load. Any driver penalty ought to be constant per query. MySQL falls apart under load, this is no secret, and the Great Bridge benchmarks should surprise no one.
If the ODBC driver were the cause of MySQL's degradation under low, then the tests with (presumably) Oracle would presumably also show it degradating similarly under load. Which, of course, they don't. The "real" RDMBS systems scale reasonably well, while MySQL performance drops quickly as the number of simultaneous users increases.
As to why MySQL perhaps performs poorly in these tests under light load, another explanation might well be that even MySQL's benchmarks show Postgres beating MySQL on a JOIN test. Most real-world work with databases involve queries which JOIN tables, and the real-world benchmark suite used very likely penalizes a database engine which performs poorly on JOINs. As is the case with MySQL.
Of course, the "fair" MySQL tests on their benchmark tests are single-user tests. This doesn't represent the running environment most folks putting up web pages aspire to. When I put up a web site, I'm hoping for many simultaneous users, not just one. MySQL's "fair" tests hide the fact that MySQL falls apart under load by "fairly" not including tests with multiple users.
One thing that also is interesting is that they don't mention which PostgreSQL version they are using. It's very unlikely
that they did actually test PostgreSQL 7.0 as this has at least one very fatal bug in the index handling which made it
useless for benchmarks (at least when we did a test run on it).
They ran it using standard Postgres 7.0, which I and many other folks associated with OpenACS have been running for months with no problem.
I hope Monty had the common courtesy to submit a bug report on this "fatal bug" they ran into.
Threads also gives MySQL better
scalability than processes, that PostgreSQL uses, so we are very confident about the future.
Probably not particularly true under Linux, where the threads implementation isn't exactly stellar.
It will speed start-up overhead, but anyone in their right mind using an RDBMS in a web environment uses persistent, pooled connections anyway and start-up time is in practice not an issue. The AOLserver database API supports persisten pooled connections in C, Tcl, Java and Python (the latter two are in pre-release form) in a database-agnostic way, so it's particularly not an issue if you choose your webserver intellegently.
The net result is that the posted test is about as wrong as you can do a test, the important thing is just to get the
people that reads that page to understand that.
Given the crap that passes for comparative testing on MySQL's page, and the extremely misleading comments regarding the usefulness of transactions, foreign key constraints, etc, Monty should reflect upon the old saw "he who lives in a glass house should cast no stones".
Scales to multiple processes?
Excellent in that regard, that's why it performs so well with 100 users.
Scales well in a multi-processor environment?
Due to its architecture currently that's mostly an OS issue.
300GB databases?
Not its forte, probably want to buy Oracle today for such huge databases. PG does support large tables (>2GB) but IMO not very efficiently. Of course, for filesystems supporting >2GB files this caveat disappears.
Also, there's no tablespace or similar feature to use multiple filesystems in a single database. This shortcoming is being addressed.
keep transaction time reasonable for 1000's of users?
In Postgres writers never block readers, ala Oracle (absent explicit user-level locks, of course). This was implemented in 6.5, and is the major reason why performance under high concurrency is so vastly superior to earlier versions.
On the other hand, I doubt if there are any Postgres installations running thousands of simultaneous users.
Online backup?
A limited "yes", the pg_dump utility will create a consistent database if it will fit into a single dump file. pg_dump is one area that is being worked on and improved, and improved on-line backup facilities are planned.
Proven data integrity?
It can hose, but heck, Oracle partially hosed me last week so what do you expect?:) I've never had a situation arise where I've not been able to retrieve my data, but there are theoretical holes in the current implementation. Full write-ahead logging is being implemented right now, and not only will this even further increase reliability but should improve speed (if you put the log on another platter separate from the database that is).
triggers, stored procedures, JDBC etc?
Not sure about DRDA but you can always write your own and submit it to the project if not. ODBC and JDBC are supported, as are triggers and stored procedures (in a variety of languages, not only SQL).
Full RDBMS support
If you mean transaction semantics and referential integrity, yes.
SQL'99 core compatibility
Currently, SQL'92 is the goal. I'm sure SQL'99 will become the goal when the time's ripe. The referential integrity stuff is closer to SQL'99 than Oracle is (my one and only contribution to the project was to help implement RI).
Were you using PG 7.0, or PG 6.5? The optimizer's been improved in PG 7.0, and in my case one particularly annoying query that had a very poor plan chosen by PG 6.5 had a very good plan chosen for PG 7.0.
You can turn off various join strategies in PG 7.0 via "set" commands, not as flexible as Oracle's hints notation to the optimizer but it can help in situations like you describe. At least you get SOME control over the optimizer.
The review says "proprietary" because they're not allowed to release the names. It's true, I've read the contracts (I use Oracle as well as Postgres, depending on whether or not I'm picking up the tab). It's not their fault.
Clearly, if they were open source, their names could be revealed. They have to use SOME word to describe them and owe their users an explanation as to why they don't name names, don't they?
These contract terms really suck, and that's where you should be venting your anger, rather than accusing the benchmark authors of "FUD" just because they adhere to contract terms.
They explained this right in the article, too, maybe you should've read it instead of falling asleep.
A year ago - that's about the timeframe in which some VERY serious memory leaks were uncovered and fixed, at the prodding of some folks using it in a web environment.
When you've got memory leaks, your server will go down, it's just a matter of time. Once the leaks were fixed, the folks I know using AOLserver+Postgresql who were forced to reboot periodically found their problems disappeared.
We at OpenACS (http://www.openacs.org) have been running our Postgres installations for months at a time with no need to reboot.
It truly has improved dramatically in the last 15 months or so.
Am I a biased Postgres fanatic? Not exactly - when I evaluated 6.4 in January, 1999 I decided it was useless for web work, far too slow and crash-prone. When 6.5 was released it was so dramatically improved that I changed my mind.
And, the OpenACS web toolkit project intends to support both Postgres and Interbase (now that the latter's Open Source), so we're not Postgres-only bigots.
I'm a nationally published nature photographer in my spare time. Sometimes I do a little street shooting in my neighborhood while walking off for my morning coffee.
One morning, I found a beheaded Barbie about two blocks from my house.
The first partial cut of support for large types (not just blobs) is already available to developers. It's intended as a proof-of-concept for the particular approach being taken (the author wrote PL/pgSQL and did 90% of the work on foreign key support, among other things). But it is a good indicator that they're on track to have this out in V7.1, which will be available late this summer.
The comment about gaining "mastery of the code" must be taken in context. The current team of developers are not the academic folks who originally developed it at Berkeley, and came together relatively recently (over the past three or so years).
Obviously, being stone-cold to the code at the beginning it took awhile for them to learn and, as they put it, master it.
So the statement made when 6.5 was released was meant as a milestone accomplishment, if you will.
"We decided to take up and fix Postgres. In order to meet this goal we needed to collectively (not individually) master all the code. With 6.5, this milestone has been accomplished."
The original authors and others from UC Berkeley who worked on it in earlier years have moved on to other things. The current team took the code, learned it on their own, and as of 6.5 had made vast improvements and fixed many bugs. 7.0 - released today - fixes even more bugs and makes referential integrity and various other features available.
Regarding memory leaks that cause periodic crashing of the RDBMS, those were almost entirely fixed by 6.5, which was released a year ago.
The/. article reference you mention was NOT entitled "Why MySQL sucks". It was simply a statement as to why the OpenACS team choose Postgres instead. We understand that MySQL fits a narrow niche quite well. We understand what that niche is, and more importantly isn't, much more so than do the developers of MySQL, judging from their documentation (foreign key constraints are almost always only used for documentation, so enforcement really isn't needed? Feh!)
You might as well make an effort to quote the title correctly, even if you didn't understand what the piece was saying.
That site, BTW, is running on Postgres V7.0 Beta (virtually the same version as today's PG release). It was slashdotted. My, oh my, enough to bring strong servers to their knees 'til they beg for mercy. Not this time - the dual P400 running AOLserver, Postgres and OpenACS never had to take off the sweats, much less work hard. It was running several other sites at the time, as well as being used for development that day by its owner, who wasn't even aware of the slashdot traffic until someone pointed out that his piece had (unexpectedly) been mentioned here. He then poked around and realized that the machine was serving lots 'o pages.
Each page served out of that server involves several hits on a nearly 10,000 line datamodel stuffed into Postgres. There is ZERO caching of queries done on that server at the moment. Nada. Each page built dynamically out of the database.
Those who don't think modern Postgres is fast enough for this kind of task ought to step back and think again. Those who think MySQL provides anything like the data safety of a true RDBMS ought to read a basic book like Gray's "Transaction Programming".
The server with the photos on it which responded with "server busy, please come back later" is Philip's personal server, photo.net. It's usually extremely busy even without being slashdotted.
The error message is configurable in the web server (AOLserver). The fact that the error message was returned simply means that AOLserver was told to only accept N simulataneous threads, and to tell folks to go away when that threshold was reached.
Doesn't mean the thing crashed. It also doesn't mean that N couldn't be increased...
Tim's right, today Postgres is the best Open Source RDBMS out there (mySQL is not a true RDBMS and you can't even begin to talk about failover without atomicity), but Interbase will be "it" in a few months. Bitmead's wrong. Interbase implements more of SQL 92 than Postgres does, or will for at least several months. Outer joins, for instance, which have been under development for 16 months now and aren't even close to completion. Interbase is also faster, and once the threaded version is released later this year will probably scale better than Postgres. Both scale better than the non-RDBMS mySQL, with its yucky table-level locking and poor join performance - just read the posts in this thread to understand all the hacks mySQL users implement to get around such performance bottlenecks! There's really no excuse for such performance problems in a product that doesn't even support transactions.
Having said all that, I use Postgres for web work and within its limitations it works great. I haven't decided whether or not I'll switch to Interbase when it becomes Open Source - I've recently helped port a large web toolkit to Postgres and if outer joins ever do get implemented I'll be more than satisfied. And the extra speed of Interbase in the web environment I program in is somewhat offset by the poorer performance of the external driver used to access it vs. the internal drivers used to access Postgres or Oracle (we use AOLserver).
You worked with Stonebreaker? Cool. In those days, though, Postgres worked at a table-locking level, I believe. Maybe page locking but I don't think so. Multi-variate concurrency control or MVCC, which is what the new scheme is called in Postgres, was only implemented about 15 months ago. It is equivalent to Interbase's multi-generational concurrency control or whatever they call it.
The benefits of row-level locking vs. page-level locking? Same as the benefits of page-level vs. table-level, or table-level vs. locking an entire disk drive, etc.
Finer granularity means fewer backends are blocked for any given write operation that affects data occupying the same or less space than the granule.
Pure row-level locking would mean that only readers reading the particular rows being updated would block on a write operations. As opposed to the reading of any row within a table if you only implement table locking. Etc etc. Surely the benefit of this is obvious in a high-concurrency environment. For the single user running PSQL there is, of course, no benefit.
Postgres (and AFAIK Interbase) can go a step further. Since you worked with Stonebreaker, you probably know that writing a new version of a tuple in a table in Postgres doesn't overwrite the old tuple in the table. A new tuple is made which is associated with the current transaction. That means the old, previously-committed tuple is available for other readers, which means they don't have to be blocked until the transaction writing the table has committed or rolled back.
Writers don't block readers. Conceptually easy, harder than it sounds in practice, but much easier with a tuple-storage scheme such as is implemented by Postgres than it is with certain other schemes. Let me hasten to add that there are drawbacks to the Postgres tuple-storage scheme, as well! But the high degree of concurrency made possible by MVCC is a huge win in certain application spaces, such as high-traffic web sites.
Check your facts! Interbase is far more feature rich than PostgreSQL. Among other things Interbase has a procedural language which allows Stored Procedures and Triggers.
You check your facts.
PostgreSQL has a nice little procedural language called PL/pgSQL that works great, and lets you add procedures written in SQL, Perl, TCL and C if you feel the need.
Postgres also supports triggers.
Interbase is a great little database system. So are recent versions of postgres.
And there are partial ports of Philip Greenspun's web development toolkit for each, though the Postgres version's far more complete and up-to-date. It can be found here.
The link to the supposed IBM study returns an error. Which is too bad, because I'd love to critique the study - Tim Perdue's benchmarking work isn't the first to show that MySQL's table locking paradigm breaks down under high load. Remember that he's been studying alternatives because SourceForge, using MySQL, has had problems in this area. And remember that Slashdot has helped fund the integration of the Berkeley DB backend and MySQL for the same reason.
It's not all myth, folks. Table locking sucks, page-level locking (Sybase) is much better and row-level locking (Oracle, Interbase, Postgres) much better still in high concurrency systems.
Well, arsdigita.com is generally a very lightly-loaded site. It's normally only visited by aD staff, clients (and potential clients), and folks who use the company's open source toolkit. They probably didn't bother to configure it to accept large numbers of connections, and it is probably running on a shared server.
The tachnology scales fine, they operate some busy sites.
When the OpenACS site was slashdotted a few months ago ("Why not MySQL?") Ben, the author, was busy working on the system, which happened to be his personal server at the time (dual P400). Not only did the system survive being slashdotted, but he didn't even notice the load and had no idea that his piece had hit slashdot until someone e-mailed him the news.
(OpenACS is AOLserver+Postgres+aD's toolkit, i.e. the same technology but PG-based rather than Oracle-based).
No, Federal Law forbids discrimination by a very small set a criteria - not the Constitution.
Discrimination is mostly legal - I can discriminate against smokers, for instance. And idiots who think they know the Constitution but really don't.
As an employer I just can't discriminate on a small, enumerated list of criteria (sex, race, religion, etc).
People who trivialize the McD case simply haven't taken a close look at the facts - 3rd degree burns are no fun.
A barrista at my local caffeine dealer had an accident with one of their large coffeemakers last week and ended up with 2nd degree burns (large blisters) from her right hand to her elbow, really ugly, not to mention painful. Will she sue? Nope - the company's covering her medical expenses, etc.
The woman in the McD case was much more severely burned with a smaller quantity of much hotter coffee, and as the previous poster's pointed out would've originally settled for medical expenses if McD's hadn't been such pricks about it.
Great, now we have a SCIENCE FICTION NOVEL being cited as authorative source of rebutal to the full-time work of hundreds or thousands of climatologists from around the world on the subject of climate change?
What relevance does volcanic activity under Antartic ice have on the fact that warmer water in the ARTIC (roughtly 12,000 miles to the north) is causing the melting of the polar icecap?
There's no reason to be skeptical of such models (or claims) because they don't exist.
I challenge you to supply a reference to a single peer-reviewed scientific paper claiming that melting of the polar icecap will cause a rise of 10-100 meters in the depth of the sea.
It is thought that the rise in temperature of the planet over the next few decades will cause rise in sea level on the order of meters, not tens of meters. And this won't be due to the melting of the polar icecap alone not by any means. After all, the sea ice of the icecap displaces about 90% of its volume when it is floating.
As someone earlier in another thread pointed out, it is water trapped as ice on land (such as Greenland and Antartica) that will provide most of the contribution to any rise in sea level that occurs.
Proving a false premise to be false, as you've done, doesn't prove global warming to be a myth.
Climatologists correct for such measurement errors.
Not hardly enough to account for the benchmark results.
Performance generally is improved by this strategy because the two platters can independently access the index and data, minimizing latency caused by head movement.
If they were trying to cheat, they wouldn't've done this split while at the same time leaving dat a and indices on the same platter while running Postgres.
Since Great Bridge's focus is to make Postgres competitive the commercial database world, it's rather funny to see these Slashdot threads so focused on MySQL, and Monty's taking the benchmark results almost personally.
They took pains to do some tuning on "Proprietary 1 8.1.5" (i.e. Oracle) but ran MySQL "out of the box". It's clear their major targets are Oracle and MS-SQL, not MySQL.
As to why they didn't use the latest "alpha" version of MySQL, I know for a fact that they started this benchmarking effort quite some time ago. Note that they ran Oracle 8.1.5, not 8.1.6. Also note that they ran Postgres 7.0, rather than the current 7.0.2.
And, for that matter, RH 6.1 rather than RH 6.2.
This reflects when the tests were run, not any intentional effort to exclude the latest and greatest MySQL (or Postgres or Oracle, for that matter). In fact, given the above data, most Slashdot readers should be able to pin down when they started benchmarking to at least the nearest week!
If the ODBC driver were the cause of MySQL's degradation under low, then the tests with (presumably) Oracle would presumably also show it degradating similarly under load. Which, of course, they don't. The "real" RDMBS systems scale reasonably well, while MySQL performance drops quickly as the number of simultaneous users increases.
As to why MySQL perhaps performs poorly in these tests under light load, another explanation might well be that even MySQL's benchmarks show Postgres beating MySQL on a JOIN test. Most real-world work with databases involve queries which JOIN tables, and the real-world benchmark suite used very likely penalizes a database engine which performs poorly on JOINs. As is the case with MySQL.
Of course, the "fair" MySQL tests on their benchmark tests are single-user tests. This doesn't represent the running environment most folks putting up web pages aspire to. When I put up a web site, I'm hoping for many simultaneous users, not just one. MySQL's "fair" tests hide the fact that MySQL falls apart under load by "fairly" not including tests with multiple users.
They ran it using standard Postgres 7.0, which I and many other folks associated with OpenACS have been running for months with no problem.I hope Monty had the common courtesy to submit a bug report on this "fatal bug" they ran into.
Probably not particularly true under Linux, where the threads implementation isn't exactly stellar.It will speed start-up overhead, but anyone in their right mind using an RDBMS in a web environment uses persistent, pooled connections anyway and start-up time is in practice not an issue. The AOLserver database API supports persisten pooled connections in C, Tcl, Java and Python (the latter two are in pre-release form) in a database-agnostic way, so it's particularly not an issue if you choose your webserver intellegently.
Given the crap that passes for comparative testing on MySQL's page, and the extremely misleading comments regarding the usefulness of transactions, foreign key constraints, etc, Monty should reflect upon the old saw "he who lives in a glass house should cast no stones".Were you using PG 7.0, or PG 6.5? The optimizer's been improved in PG 7.0, and in my case one particularly annoying query that had a very poor plan chosen by PG 6.5 had a very good plan chosen for PG 7.0.
You can turn off various join strategies in PG 7.0 via "set" commands, not as flexible as Oracle's hints notation to the optimizer but it can help in situations like you describe. At least you get SOME control over the optimizer.
The review says "proprietary" because they're not allowed to release the names. It's true, I've read the contracts (I use Oracle as well as Postgres, depending on whether or not I'm picking up the tab). It's not their fault.
Clearly, if they were open source, their names could be revealed. They have to use SOME word to describe them and owe their users an explanation as to why they don't name names, don't they?
These contract terms really suck, and that's where you should be venting your anger, rather than accusing the benchmark authors of "FUD" just because they adhere to contract terms.
They explained this right in the article, too, maybe you should've read it instead of falling asleep.
A year ago - that's about the timeframe in which some VERY serious memory leaks were uncovered and fixed, at the prodding of some folks using it in a web environment.
When you've got memory leaks, your server will go down, it's just a matter of time. Once the leaks were fixed, the folks I know using AOLserver+Postgresql who were forced to reboot periodically found their problems disappeared.
We at OpenACS (http://www.openacs.org) have been running our Postgres installations for months at a time with no need to reboot.
It truly has improved dramatically in the last 15 months or so.
Am I a biased Postgres fanatic? Not exactly - when I evaluated 6.4 in January, 1999 I decided it was useless for web work, far too slow and crash-prone. When 6.5 was released it was so dramatically improved that I changed my mind.
And, the OpenACS web toolkit project intends to support both Postgres and Interbase (now that the latter's Open Source), so we're not Postgres-only bigots.
"It looks like they're even using it on www.aol.com: "
Like - fuck, duh, shithead.
Let's just say that their Open Source webserver is well-tested by the authors in a very high-traffic environment.
Why did you think they open-sourced their server, because they don't use it????
Actually, it's not, because the principles of transaction semantics, data integrity, etc are as central to standard SQL as scalar arithmetic is to C.
ACIDity might be said to be IMPLIED by the SQL standard, i.e. correct implementation of the semantics of insert, update etc requires it.
Strangely, the entire RDMBS world except for the MySQL folk and their fans understand this.
One morning, I found a beheaded Barbie about two blocks from my house.
Hate to think about Ken's fate...
The first partial cut of support for large types (not just blobs) is already available to developers. It's intended as a proof-of-concept for the particular approach being taken (the author wrote PL/pgSQL and did 90% of the work on foreign key support, among other things). But it is a good indicator that they're on track to have this out in V7.1, which will be available late this summer.
The comment about gaining "mastery of the code" must be taken in context. The current team of developers are not the academic folks who originally developed it at Berkeley, and came together relatively recently (over the past three or so years).
/. article reference you mention was NOT entitled "Why MySQL sucks". It was simply a statement as to why the OpenACS team choose Postgres instead. We understand that MySQL fits a narrow niche quite well. We understand what that niche is, and more importantly isn't, much more so than do the developers of MySQL, judging from their documentation (foreign key constraints are almost always only used for documentation, so enforcement really isn't needed? Feh!)
Obviously, being stone-cold to the code at the beginning it took awhile for them to learn and, as they put it, master it.
So the statement made when 6.5 was released was meant as a milestone accomplishment, if you will.
"We decided to take up and fix Postgres. In order to meet this goal we needed to collectively (not individually) master all the code. With 6.5, this milestone has been accomplished."
The original authors and others from UC Berkeley who worked on it in earlier years have moved on to other things. The current team took the code, learned it on their own, and as of 6.5 had made vast improvements and fixed many bugs. 7.0 - released today - fixes even more bugs and makes referential integrity and various other features available.
Regarding memory leaks that cause periodic crashing of the RDBMS, those were almost entirely fixed by 6.5, which was released a year ago.
The
You might as well make an effort to quote the title correctly, even if you didn't understand what the piece was saying.
That site, BTW, is running on Postgres V7.0 Beta (virtually the same version as today's PG release). It was slashdotted. My, oh my, enough to bring strong servers to their knees 'til they beg for mercy. Not this time - the dual P400 running AOLserver, Postgres and OpenACS never had to take off the sweats, much less work hard. It was running several other sites at the time, as well as being used for development that day by its owner, who wasn't even aware of the slashdot traffic until someone pointed out that his piece had (unexpectedly) been mentioned here. He then poked around and realized that the machine was serving lots 'o pages.
Each page served out of that server involves several hits on a nearly 10,000 line datamodel stuffed into Postgres. There is ZERO caching of queries done on that server at the moment. Nada. Each page built dynamically out of the database.
Those who don't think modern Postgres is fast enough for this kind of task ought to step back and think again. Those who think MySQL provides anything like the data safety of a true RDBMS ought to read a basic book like Gray's "Transaction Programming".
The server with the photos on it which responded with "server busy, please come back later" is Philip's personal server, photo.net. It's usually extremely busy even without being slashdotted.
The error message is configurable in the web server (AOLserver). The fact that the error message was returned simply means that AOLserver was told to only accept N simulataneous threads, and to tell folks to go away when that threshold was reached.
Doesn't mean the thing crashed. It also doesn't mean that N couldn't be increased...
Having said all that, I use Postgres for web work and within its limitations it works great. I haven't decided whether or not I'll switch to Interbase when it becomes Open Source - I've recently helped port a large web toolkit to Postgres and if outer joins ever do get implemented I'll be more than satisfied. And the extra speed of Interbase in the web environment I program in is somewhat offset by the poorer performance of the external driver used to access it vs. the internal drivers used to access Postgres or Oracle (we use AOLserver).
You worked with Stonebreaker? Cool. In those days, though, Postgres worked at a table-locking level, I believe. Maybe page locking but I don't think so. Multi-variate concurrency control or MVCC, which is what the new scheme is called in Postgres, was only implemented about 15 months ago. It is equivalent to Interbase's multi-generational concurrency control or whatever they call it.
The benefits of row-level locking vs. page-level locking? Same as the benefits of page-level vs. table-level, or table-level vs. locking an entire disk drive, etc.
Finer granularity means fewer backends are blocked for any given write operation that affects data occupying the same or less space than the granule.
Pure row-level locking would mean that only readers reading the particular rows being updated would block on a write operations. As opposed to the reading of any row within a table if you only implement table locking. Etc etc. Surely the benefit of this is obvious in a high-concurrency environment. For the single user running PSQL there is, of course, no benefit.
Postgres (and AFAIK Interbase) can go a step further. Since you worked with Stonebreaker, you probably know that writing a new version of a tuple in a table in Postgres doesn't overwrite the old tuple in the table. A new tuple is made which is associated with the current transaction. That means the old, previously-committed tuple is available for other readers, which means they don't have to be blocked until the transaction writing the table has committed or rolled back.
Writers don't block readers. Conceptually easy, harder than it sounds in practice, but much easier with a tuple-storage scheme such as is implemented by Postgres than it is with certain other schemes. Let me hasten to add that there are drawbacks to the Postgres tuple-storage scheme, as well! But the high degree of concurrency made possible by MVCC is a huge win in certain application spaces, such as high-traffic web sites.
This beta is in binary form, only, thus the provisions again reverse engineering, etc.
My understanding is that the Open Source release will come Q3.
You check your facts.
PostgreSQL has a nice little procedural language called PL/pgSQL that works great, and lets you add procedures written in SQL, Perl, TCL and C if you feel the need.
Postgres also supports triggers.
Interbase is a great little database system. So are recent versions of postgres.
And there are partial ports of Philip Greenspun's web development toolkit for each, though the Postgres version's far more complete and up-to-date. It can be found here.