First "Real" Benchmark for PostgreSQL

I do not think it means what you think it means. by Control+Group · 2007-07-09 08:17 · Score: 5, Insightful

however the test-hardwares of the other DB systems are somewhat different

Which makes the results pretty much useless. But, being the intrepid slashdotter I am, I went ahead and R'ed the FA anyway, in case I could glean some useful information from it.

Which revealed that the linked article doesn't actually contain any information whatsoever about Oracle* or MySQL, much less benchmarks on named hardware.

So...what am I supposed to get out of this, again? Or is this just supposed to be some kind of PostgreSQL love-in, so I should take my wet blanket elsewhere?

*Well, the second link contains someone claiming that Oracle is only 15% faster...but without providing any actual data.

--

Reality has a conservative bias: it conserves mass, energy, momentum...

Benchmarks are like skidmarks by Anonymous Coward · 2007-07-09 08:20 · Score: 0, Funny

They're all shitty.

Mod parent way up! by khasim · 2007-07-09 08:23 · Score: 2, Insightful

You cannot compare benchmarks without SOMETHING standard between them.

Okay, if they can't match the hardware (why not?) then focus on price points. I notice that they're looking at "$65,500 for the hardware". That's a LOT of hardware at today's prices.

I'm sure MySQL would (and will) come back with a "benchmark" on hardware costing $10,000.

There is nothing "real" about this "benchmark".

Re:Mod parent way up! by Anonymous Coward · 2007-07-09 08:40 · Score: 2, Interesting

I can't speak for complex queries, but here are some simple findings from my testing:

Inserting 20 million rows, all simple inserts, only one primary key (int) with autoincrement for mysql and a sequence for postgres:
Avg Mysql time per 1000 inserts: 3 seconds
Avg Postgres time per 1000 inserts: 15 seconds (and gets worse over time)

That was after increasing the caches and disabling fsync on postgres too.

I also did a delete then insert for both (to flush out already existing rows), with similar results.

Ended up just inserting into a mysql table with no keys/indexes at all for maximum speed (average of .4 seconds per 1000 inserts), but didn't test that with postgres. Decided I only needed a static table anyway, so mysql was more ideal for it's speed in my situation. Sure with I had used postgres on a few more complicated projects though...

Posting AC to avoid flames.
Re:Mod parent way up! by Vellmont · 2007-07-09 08:53 · Score: 4, Insightful

You cannot compare benchmarks without SOMETHING standard between them.

The thing that's standard is the benchmarking software.

If I were to buy a database server, do I really care which component of the solution is providing me with the great performance, or do I just want the performance? At the end of the day the only thing that really matters is the performance that comes out of the box.

It doesn't really matter if "Postgresql" is faster than "MySQL", because they always run on a certain physical computer. What matters is "I need to accomplish X,Y and Z. I have A dollars to spend. Which solutions accomplishes X, Y and Z the best within my budget? You can't separate the software from the hardware and get an answer that's very meaningful.

This benchmark isn't the last word on anything. Even a benchmark run on the exact same hardware means very little if you have a 2 core machine instead of 8.

--
AccountKiller
Re:Mod parent way up! by Control+Group · 2007-07-09 08:57 · Score: 1

You can't separate the software from the hardware and get an answer that's very meaningful

I take your point in general, but this statement is somewhat misleading. While you can't separate software from hardware entirely, you can be in a situation where you have a given supply of hardware, and need to know how best to use it - which amounts to much the same thing.

In that situation, knowing how each piece of software performs on a specific platform may be excellent information for you to have.

--

Reality has a conservative bias: it conserves mass, energy, momentum...
Re:Mod parent way up! by Vellmont · 2007-07-09 09:10 · Score: 1

you can be in a situation where you have a given supply of hardware, and need to know how best to use it - which amounts to much the same thing.

Sure, but you're talking about a specific piece of hardware. Sometimes you DO have a given hardware box and need to find software that works well on it. But how is a benchmark run on a totally different piece of hardware going to help you?

Benchmarks like these might give you kind of general ideas about the software, like "postgresql is in the same class as Oracle in terms of performance", but that's about it. It's a fuzzy answer, but in general I think that's all you're going to get from any Benchmark. I think that's what's really going on here, nothing more.

Solving real world problems are rarely as simple as the benchmarks. It's even tougher with something like a database where there's no standard intensive operations like Quake III, or lame mp3 encoder, or even gzip that people tend to care about.

--
AccountKiller
Re:Mod parent way up! by Actually,+I+do+RTFA · 2007-07-09 09:33 · Score: 1

I notice that they're looking at "$65,500 for the hardware".

Yes, but they are comparing it to $74,000 for Oracle's hardware. The part that worried me is that "all benchmark runs were extensively optimized by the Sun performance team, with the help of performance experts from the databases represented." And they said they spent 6 months optimizing it. My goal is to find the sweet spot that combines: Money, Performance, and Development (and Code Maintance) costs. If it's possible to get great performance out of something by making the code hard to develop and maintain, that's worth a lot less to me.

--
Your ad here. Ask me how!
Re:Mod parent way up! by Doctor+Memory · 2007-07-09 09:40 · Score: 5, Interesting

Inserting 20 million rows, all simple inserts, only one primary key (int) with autoincrement for mysql and a sequence for postgres:
Avg Mysql time per 1000 inserts: 3 seconds
Avg Postgres time per 1000 inserts: 15 seconds (and gets worse over time) OK, now do a seven-table join, including a self-join with a correlated subquery (MySQL does those now, right?). I think everybody knows by now that MySQL is pretty much untouchable as long as all you're doing is simple single-table stuff. Kind of like comparing a pickup truck to a moving van: if all you're doing is moving a couple of boxes around, then the pickup kicks. But when you need to move serious loads, then it's the pickup that gets to sit by the curb...

--
Just junk food for thought...
Re:Mod parent way up! by jaredmauch · 2007-07-09 09:54 · Score: 5, Interesting

AC, sorry, but I have a postgres install working where I get 70k inserts/second or more with a single index on the table during the day. The first insert of course is faster as the index doesn't exist yet. I'm not sure what you're doing, but I can tell you that I have tuned postgres by increasing some simple parameters. If you're using some Linux package, you're likely not seeing the benefits that are possible by stuff like changing the block size parameter in the source. Yes, it's kinda lame you have to do this, but at the same time, it's not too unreasonable. I'd like it if I could set this larger.
This is on 'decent' hardware running Solaris 10 (amd64). Obviously you need to tweak stuff like wal size, checkpoints, etc.. But getting this type of performance is not hard to do. I can scan an hours worth of data in a short amount of time. Each one of these 'hourly' tables contains roughly 30-32M rows. this is nothing to sneeze at from what I can tell. I haven't had a reason to re-evaluate mysql to see if there are enough tweaks to make it perform similarly, but if you're getting the crappy insert rate that you're talking about, you clearly need to change something as you're doing it wrong if you truly care about performance. E-Mail me if you're interested in my postgresql config files. I'm happy to share to minimize the FUD out there.
Re:Mod parent way up! by jguthrie · 2007-07-09 10:02 · Score: 1

If all you're doing is stuffing fixed-size records into a single table, why not just use a flat file and not screw around with all the SQL parsing crap that just slows you down? If you need to process it with SQL tools, later, you can always import it into an actual database for report generation and such.
Re:Mod parent way up! by jedidiah · 2007-07-09 10:19 · Score: 3, Informative

If you do that to even a "fast and robust" RDBMS server you are bound to be bludgeoned by the DBA.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:Mod parent way up! by jedidiah · 2007-07-09 10:22 · Score: 1

If you need one of Sun's key luminaries in order to get that level of performance it is also of limited value. I don't want to have to drag Burleson or Niemiec out to my shop just to get my database to run well. I need to be able to get the database to run well on my own. My company won't spring for Niemiec's pool boy, nevermind Niemiec himself.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:Mod parent way up! by ctr2sprt · 2007-07-09 11:40 · Score: 4, Informative

You can do way better than that with PostgreSQL, at least, and I suspect with MySQL as well. I wrote a benchmark similar to yours, but a good bit more complex. I had two tables, one of which was seeded and another which was populated by the benchmark. The benchmark table had six columns (int, timestamp, 4x bigint), a primary key (int + timestamp), four check constraints (on the bigints), a foreign key constraint (int, to the seeded table), and two indexes (one int, one timestamp). I would do a commit every 75k rows, with 24 such commits per iteration and 30 passes per benchmark run, so 54 million rows total. I also used a thread pool, and there are two reasons for that. First, some amount of parallelism improves DB performance. Second, it more accurately simulated our predicted usage patterns of the database. We ran my benchmark against PSQL and IBM DB2.

The results were interesting (at least, I thought they were). First, PSQL can only handle about 10 threads doing work at once. Past about 10 threads, the DB completely falls apart. DB2, however, could handle more busy threads than Linux could, with a very gradual (and linear) degradation in performance past about 25 threads. I stopped testing at 100 threads. Second, PSQL's inserts per second (IPS) rate cut in half by the end of the bechmark. DB2 followed a similar trend until about 5 million rows, at which point IPS went up to where it started and stayed there without moving. Third, DB2 was I/O-bound, whereas PSQL was CPU-bound. I suspect it's why DB2 was able to handle an order of magnitude greater concurrency: more threads just meant the CPUs had something to do while waiting on the disks. However, it does mean that PSQL might do better with faster CPUs, whereas DB2 would not (it'd just be able to handle more threads).

And the numbers: DB2 averaged 1100 IPS, PSQL 600. Note that for the first million rows or so PSQL was faster: it just eventually dropped down to ~400 IPS after ten million rows or so, killing the average. Of course, since this table would never have fewer than 54M rows - actually, it would typically have 160M - the IPS I got at the end was the one that mattered. Also, this was on a pretty weak server, at least for this kind of workload. With more (and faster) cores, more memory, and more spindles, I'm pretty sure you could increase those numbers by 50% or more. With tuning, perhaps that much again.
Re:Mod parent way up! by megalomaniacs4u · 2007-07-09 11:57 · Score: 3, Informative

As the DBA for Postgres DB where we do that for a website that searches over several million rows (with daily updates of up to 25% of the DB) in under 10 seconds. I can say that a similar query is ugly as hell but it only took a couple days tweaking to get the average case to under 3 seconds - certain parameters max out at 20 seconds.

Certainly postgres plays nice with self joins and natural joins as the query turned out faster than trying an iterative stored proc so we just access a view from the proc.
Re:Mod parent way up! by Anonymous Coward · 2007-07-09 12:01 · Score: 1, Informative

Heh. I should have mentioned my testing was on a 800 Mhz Athlon desktop I use as a spare "server" (wife won't let me put up a rack for real servers yet :(

It was also on a server loaded down by apache2/samba/X/mysq/psql and whatever else I've installed to play with over the past 2 years. So while not realistic as far as a production environment, I thought I'd share my experience since I thought postgres would have handled at least close to mysql. Never imagined the performance would be 5 times slower though. I'll shoot you an email for your config files though, since now I'm curious :)

Also, to the person who suggested I use a flat text file, I only inserted with no indexes for speed, I built the indexes again afterwards. Furthermore, I was already importing from a flat text file, and I wanted to do better than a linear search and also the DBs compress the data which is helpful in keeping disk space usage down (mysql did roughly 7 GB text to 3.6 GB MYD file).
Re:Mod parent way up! by Gorshkov · 2007-07-09 12:03 · Score: 2, Interesting

If you need one of Sun's key luminaries in order to get that level of performance it is also of limited value. I don't want to have to drag Burleson or Niemiec out to my shop just to get my database to run well. I need to be able to get the database to run well on my own. My company won't spring for Niemiec's pool boy, nevermind Niemiec himself.
You're missing the entire point of what he's saying.

When you see benchmarks run and comparisons made between different databases that are conducted by a single person or company, all things are seldom equal. The people involved have more expertiese with this db package and less with that ... things are not set up equally well.

For these published tests, Sun spent a lot of time optimizing postgress to make it fly .... just as the people at oracle spent a lot of time optimizing *their* database configuration for the tests that *they* publish.

So, all things ARE equal, in relation to the database tuning & setup - both were done by experts with the respective packages, and we don't have to wonder if some yahoo screwed things up, intentionally or otherwise, by missing a basic configuration option.
Re:Mod parent way up! by jedidiah · 2007-07-09 12:31 · Score: 3, Funny

I've done OLAP that would make your database server hardware melt and make a hole in your datacenter floor.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:Mod parent way up! by moderatorrater · 2007-07-09 12:37 · Score: 2, Insightful

Great analogy. Just remember that there are far, far fewer moving vans in this world for a reason and that they sit next to the curb more than the pickups.
Re:Mod parent way up! by Anonymous Coward · 2007-07-09 12:50 · Score: 0

I wouldn't let Burleson touch my instance with a ten foot pole.
Re:Mod parent way up! by thePowerOfGrayskull · 2007-07-09 13:17 · Score: 1

Maybe try with autocommit turned off in postgres? (I am assuming you used MyISAM table type in mysql).
Re:Mod parent way up! by timmarhy · 2007-07-09 13:26 · Score: 2, Funny

that would be because they are testing serious db applications, not your fucking toy shit. $65,000 on a server is no big deal at all.

--
If you mod me down, I will become more powerful than you can imagine....
Re:Mod parent way up! by prog-guru · 2007-07-09 13:57 · Score: 1

Don't do 1000 inserts in a row, the pgsql copy function kicks it's ass. I use it to do several thousand rows every 5 minutes in less than 5 seconds on each run all day (telephone records). mysql was the biggest support problem ever with the same dataset.

--
chris@xanadu:~$ whatis /.
/.: nothing appropriate.
Re:Mod parent way up! by nacturation · 2007-07-09 14:34 · Score: 1

Great analogy. Just remember that there are far, far fewer moving vans in this world for a reason and that they sit next to the curb more than the pickups. If you don't need the moving van, you can always get by with the pickup -- as long as you don't mind stopping traffic for hours if you eventually do need to make the switch to a moving van. Then hopefully your driver is rated to operate that class of vehicles. :)

--
Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
Re:Mod parent way up! by chthon · 2007-07-09 18:39 · Score: 1

Interesting, since DB2 was probably developed on a platform which is optimised for IO, while the x86 platform really sucks at IO, and the main postgreSQL development was done there.
Re:Mod parent way up! by arivanov · 2007-07-09 18:39 · Score: 4, Interesting
Who cares if MySQL does them or not. Show me the developers that can both develop applications and do SQL. That is a dieing breed. Most developers nowdays go for a really trivial schema and an abstraction layer. At that point the only thing that matters is row speed on simple table operations and there MySQL or in-memory OO database frameworks with a simple backing store wipe the floor. This is the reality of life. And it is not going to get better. If you look at the books on the market the only book that used to teach "proper" SQL (with joins and the lot) strictly from the context of application development was the old DB2 bible. It has not been reprinted since the late 90-es. All the rest that is out there is either heavily slanted toward the app side or towards the DB side (usually the latter). Add to that the fact that many universities try to teach "real life software engineering skills" instead of proper data structure and data manipulation classes and the picture is complete: http://www.joelonsoftware.com/articles/ThePerilsof JavaSchools.html. Add to that the fact that DBDs when you actually corner them to ask something meaningfull answer with SQL technobabble like in your post. To the average developer it sounds like fortran. And if it looks like fortran, walks like fortran and talks like fortran it gotta be fortran. From the point of view of a average software engineer SQL and especially stored procedures look like a blast from the past. He expects to see objects, constructors, destructors, private and public structures. And what does he see? He sees something that looks like written by his grandparents. As a result he turns around and starts doing delete/insert/last_insert_id instead of replace and sequential deletions in software instead of foreign keys. I have tried in the past to work with developers who write commercial apps on top of SQL to optimise their code. And I have wanted to scream all along. In 95% of the cases you deal with either one of the following:
- A nice schema designed once upon a time properly by a proper DBD that is vandalised in the application abstraction layer because the developers are are sorely pissed off by the endless wingeing of the SQL server and/or its abissmal performance. So they take the matters in their own hands and violate ACID by cashing and bypassing restrictions in the app. Sooner or later someone comes around and says - WTF, why don't we rewrite this all in software and sod off the expensive database. And surprise surprise it ends up being done in MySQL.
- An abissmal schema or no schema at all where all restrictions are done in the app. That is MySQL country all the way.
MySQL is a result of the way current software development is taught and done. Unless Jo the Average Developer starts understanding how to use SQL in his application (and he does not) and unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate. And if you think that MySQL is bad think twice. There are the object persistence frameworks and in-memory crap that follow in its wake.
--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
Re:Mod parent way up! by Anonymous Coward · 2007-07-09 19:07 · Score: 0

Uhm.. no, you seem to be missing the point of the parent you replied to as well.

"If" MySQL does 10k inserts per second on a specific hardware (IPS) that was possible only by enabling x, y and z using instruction from the manual, and Postgres can do 50k IPS on the same hardware by also tuning x, y and z but also having to recompile the whole thing after modifying the source code on a few files.. then it may not be that interesting.

Putting all that aside, a benchmark should reflect what the default installation (maybe with tweaking necessary parameters based on the DB nature) would do in a specific scenario.

Inserting 2 million rows a second to a table with no indexes or primary keys that consists of 3 columns means squat to me. Inserting x^10 number of rows by spending 6 months optimizing the damn thing with near code owner skills is just as bad if not worse.
Re:Mod parent way up! by Anonymous Coward · 2007-07-09 19:19 · Score: 0

Can I touch your instance with my ten foot pole?
Re:Mod parent way up! by Hathor's+Dad · 2007-07-09 20:42 · Score: 1

Is this a bit like, "back then they didn'y have the server power or the ease of language that mandated 'one true way'" ....When you have something that "works" (ie MySQL & your app) then there is little need to put more of the app into the DB as ...it already works....
Re:Mod parent way up! by Gorshkov · 2007-07-09 20:46 · Score: 1

Putting all that aside, a benchmark should reflect what the default installation (maybe with tweaking necessary parameters based on the DB nature) would do in a specific scenario.
Isn't that what I just said? What sorts of optimizations do you thing we've been TALKING about?

As far as benchmarking default installations is concerned ..... I'm guessing that you don't do a lot of database work. There is not a database out there that comes "pre-tuned". They ALL need to be adjusted, futzed with, experimented on, and played with in order to get the right configuration for your specific combination of hardware and db load.
Re:Mod parent way up! by arivanov · 2007-07-09 21:04 · Score: 4, Insightful

That is the problem - it does not in real life. Application works in developer hands, goes out in the field and breaks (seen that one time too many). Millions if not billions of dollars have been put to make sure that RDBMS transactions are atomic and preserve data integrity. No application level interface abstraction has ever afforded the expense and could ever afford the expense to do that. In every single instance I have looked at application developers replacing SQL ACID with "bake-their-own" system I have found cases of data integrity violations. In modern multithreaded (or web server based) apps the most common result from this is race conditions which are probably the hardest to debug problem in software.
The other common problem in using application level abstractions is performance. Once again - works in developer hands, goes in the field, gets real data loaded in it and all hell breaks lose. Similar reasons to ACID as the next biggest investment after data integrity in a database is in its ability to fine-grain lock data objects. If a developer tries to replace RDBMS locking in the application layer, he usually ends up with higher granularity lock that is more contended. In addition to that to avoid race conditions, developers usually deliberately create a bottleneck by muxing all RDBMs access to a single thread and a single access point to simplify locking. In fact probably one of the most beneficial uses of MySQL is its ability to support server-based fine-grained locks that are not tied to a specific data object. You can use these in global semaphors and global locking even in cases where POSIX locks do not work (f.e. across clusters).
Overall, yeah, MySQL and your app "already works". For Proof of Concept - maybe (in fact I use it myself). For real stuff - no, not really, unless you put a lot of work in the application layer. I have done that on quite a few occasions and the performance gains can be staggering compared to ACIDising your brain with a proper RDBMS, but the effort is hardly worth it in most real life scenarios. It also makes it considerably less maintainable.

--
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
Re:Mod parent way up! by bytesex · 2007-07-09 21:09 · Score: 1

Did you prepare that statement with the functioncall to the sequence. If not, why not ? The sequence functioncall address has to be precompiled for it to be comparable with what mysql does. Also, are you sure you did use 'int' for your primary key in postgres (and not 'number' or 'serial' ?) Did you try to turn autocommit off with postgres and then commit every 1000 inserts ? If not, why not - every serious database loader does it like that.

--
Religion is what happens when nature strikes and groupthink goes wrong.
Re:Mod parent way up! by Anonymous Coward · 2007-07-09 21:37 · Score: 0

Yea, but your OLAP would die from the molecular acid...

Dallas: I haven't seen anything like that except, uh, molecular acid.
Brett: It must be using it for blood.
Parker: It's got a wonderful defense mechanism. You don't dare kill it.
Re:Mod parent way up! by Anonymous Coward · 2007-07-10 00:19 · Score: 0

activist.
Re:Mod parent way up! by Ded+Bob · 2007-07-10 02:17 · Score: 1

Heh. I should have mentioned my testing was on a 800 Mhz Athlon desktop I use as a spare "server" (wife won't let me put up a rack for real servers yet :(

You should have had this in a prenuptial: "Permission by husband to do anything geeky to it". :)
Re:Mod parent way up! by rho · 2007-07-10 02:38 · Score: 1

More or less true, but still not good. Putting database work in your code is common, but it is bad. It's worse when you have to make changes. Bugs, security problems, inefficiencies--these result from poor planning at the application level.
It's hard to plan for a proper DB. It's really hard when you're working fast and alone. But long term you'll be thankful for it.
MySQL is a result of the way current software development is taught and done. Unless Jo the Average Developer starts understanding how to use SQL in his application (and he does not) and unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate. And if you think that MySQL is bad think twice. There are the object persistence frameworks and in-memory crap that follow in its wake.
It's not really the SQL that's so hard. You can get some pretty baroque looking queries if you try, but most JOINs and suchlike are easy to wrap your head around. It's the data modeling where people so often fall down, treating a DB as a spendier spreadsheet. Certainly doable, and perhaps even the correct thing to do, but when you start seeing "parameter1, parameter2, parameter3" as columns, you probably need to revisit your notes on normalization.

--
Potato chips are a by-yourself food.
Re:Mod parent way up! by kpharmer · 2007-07-10 04:16 · Score: 2, Insightful

> unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate

You do realize that in activities like reporting sql and its set-based operations are far, far, far faster and easier to work with than oo implementations, right?

You can set up a typical star-schema and have certain tools (like microstrategy) immediately recognize it and generate queries for you. These queries will typically perform just fine and allow very powerful and fast drill-downs, drill-across, etc.

And oo approach involving the marshalling of millions of objects related to entries in the database would take forever to build to this kind of flexibility and would run slow as molasses. Might make the oo purists happy, but the customers would never use the product. At least against non-trivial amounts of data.
Re:Mod parent way up! by ldholtsclaw · 2007-07-10 04:20 · Score: 1

How about a 17 table join including multiple joins with COALESCE to the same table (3 times)? MySQL seems to handle it just fine using the InnoDB with full transaction concurrency. It's also beaten out PostgreSQL time and again on production data. Prior to MySQL finally getting their stored procedures to work without crashing, I tried migrating a production database to PostgreSQL on a client system and, within two weeks, the posting procedures had gone from 45 seconds per batch to 2.5 hours due to the [lack of] speed of inserts (MySQL has never exceeded 30 seconds on the same data load). Needless to say, it was a disaster and had MySQL not managed to get the stored procedure bugs fixed in time we would have lost a client over the issue.

Sure, MySQL's stored procedure support is laughable but, at the end of the day, a client just wants to get their work done. MySQL accomplishes this and we've never had any issues regarding slowdown of the inserts or transactional problems either -- and this is with updates and inserts into about 10 different tables within batch transactions.

Maybe I'll try PostgreSQL again someday but I've already tried to do so 5 times and each and every one has always ended up being a disaster.
Re:Mod parent way up! by Doctor+Memory · 2007-07-10 06:00 · Score: 2, Insightful

Most developers nowdays go for a really trivial schema and an abstraction layer. At that point the only thing that matters is row speed on simple table operations and there MySQL or in-memory OO database frameworks with a simple backing store wipe the floor. Until, of course, they don't. All it takes is a couple of users who want to actually get information *out* of the database ("How many widgets do we typically sell in Poughkipsie in March? And when I say 'Poughkipsie', I mean the greater Poughkipsie metroplex.") and you're stuck building indexes and making joins in your code. Eventually your code either becomes unmaintainable, or collapses under its own bulk. Agile/XP developers like the DRY axiom: Don't Repeat Yourself. Why write code to do what the database already does?
DBDs when you actually corner them to ask something meaningfull answer with SQL technobabble like in your post. To the average developer it sounds like fortran. And if it looks like fortran, walks like fortran and talks like fortran it gotta be fortran. Um, are you sure you want to be bringing up FORTRAN as a counterexample when discussing performance? I'm not an HPC enthusiast, but I don't recall seeing Java or Python mentioned in the same sentence as Linpack or STREAMS. It's all FORTRAN (77, some 90) or "C/C++" (the C++ is silent). Just another case of "pick the right tool for the job".
From the point of view of a average software engineer SQL and especially stored procedures look like a blast from the past "Those who do not understand Unix are condemned to reinvent it, poorly" — Henry Spencer. Relational databases are based on set theory, and have proven their worth over the last thirty years. Neither the old CODASYL or the new object databases could compete. Anyone who claims to be a "software engineer" and can't understand SQL is a poser. If they can't learn old tech, how are they going to learn new tech? There is an "SQL for Dummies", you know...
unless SQL data representation grows up to modern non-fortran-like OO semantics MySQL will proliferate In that case, why even use MySQL? Fall back to MyISAM and do all the work yourself! Hell, you are anyway, why even stick an SQL abstraction layer in there? Some fast serialization logic and you're good to go. For toy apps (like those presented in programming books), you can get away with stuff like that. When you get into real projects (something that requires more than one developer more than a year to do), hand-rolled "abstraction layers" that can't guarantee consistency or even simple reliability (quick: what does your data layer do with uncomitted writes when the kernel panics?) just don't cut it.

--
Just junk food for thought...
Re:Mod parent way up! by ctr2sprt · 2007-07-14 04:11 · Score: 1

I don't know much about the inner workings of PSQL, but DB2 is highly parallel when it comes to I/O. This description of DB2's process model is pretty informative, though a bit overwhelming. In particular, you may want to look at some of the diagrams and the bit at the end about asynchronous page cleaners and I/O servers.
Re:Mod parent way up! by blasiusmaximus · 2007-07-24 10:51 · Score: 1

Have you considered using the PostgreSQL COPY function? It is about an order of magnitude faster then INSERT. The only caveat is that even COPY needs to update INDEXes if you have any, so with many indexes, you will see a more modest improvement.

The best way to truly compare by CaptainPatent · 2007-07-09 08:24 · Score: 3, Interesting

Because Sun systems will always be different from the x86 based cores that run MySQL and Oracle, I think the best way to compare such software would be by constructing servers of equal price and seeing how PostgreSQL fares. The true question on any business person's mind is "how much to implement?"

--
Well, back to rejecting software patent applications.

Re:The best way to truly compare by Doctor+Memory · 2007-07-09 08:37 · Score: 3, Interesting

Sun systems will always be different from the x86 based cores that run MySQL and Oracle Umm, wrong both ways. Oracle runs really well on Sun SPARC hardware (and I suspect MySQL at least runs), and Sun also makes x86-based servers (built with AMD's Opteron chips). It shouldn't be any trouble to benchmark all three on the same hardware.

Well, no technical trouble, anyway — I doubt Oracle would like to have its performance compared to two free-as-in-beer competitors. Even if it comes out on top, people will still be tempted to think "Jeez, with the money I save on Oracle licenses, I can buy a faster server and make up the speed difference"...

--
Just junk food for thought...
Re:The best way to truly compare by mooingyak · 2007-07-09 08:41 · Score: 1

This is wrong on a whole bunch of levels:

1. Postgres runs on, among other things, linux, and windows.
2. SOLARIS runs on, among other things, x86.
3. MySQL and Oracle run on, among other things, Solaris on a Sparc.

There is a basis for identical comparisons. I've done it.

OTOH, you got this one right:

The true question on any business person's mind is "how much to implement?"

--
William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
Re:The best way to truly compare by jhines · 2007-07-09 08:45 · Score: 3, Informative

If you read the details, while being Sun machines, they are Opteron based, so yeah they compare.
Re:The best way to truly compare by Anonymous Coward · 2007-07-09 09:25 · Score: 1, Informative

The JEE side used Opterons, the DB side used UltraSPARC: Sun Fire T2000 Server, 1 chip 8 cores 1.2GHz UltraSPARC T1.
Re:The best way to truly compare by PCM2 · 2007-07-09 09:52 · Score: 1

Wait ... you're saying MySQL and Oracle only run on x86? What rock have you crawled out from under?

--
Breakfast served all day!
Re:The best way to truly compare by Chris+Burke · 2007-07-09 10:11 · Score: 1

Well, no technical trouble, anyway -- I doubt Oracle would like to have its performance compared to two free-as-in-beer competitors. Even if it comes out on top, people will still be tempted to think "Jeez, with the money I save on Oracle licenses, I can buy a faster server and make up the speed difference"...

Well yeah, especially since the primary metric for TPC-C isn't TPM (transactions per minute) but TPM/$. If the cost of Oracle means you could throw more hardware at the free DBs and get better overall performance, it'd be reflected right there in the scores.

--

The enemies of Democracy are

Re:on the playground... by eneville · 2007-07-09 08:24 · Score: 1

boys use mysql

men use PostgreSQL,

and _____ use MSSQL.

flame on... =) flaming is ok... but stupid. and even more stupid is people who can code only for one db platform.

--
Why UNIX?

Re:tags by CaseCrash · 2007-07-09 08:24 · Score: 1

Yeah, why the hell is this tag (and others like it) showing up today? It's not coming from jokes in the comments. So what's up with these useless tags?

--
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."

Re:on the playground... by morgan_greywolf · 2007-07-09 08:25 · Score: 1

wussy bedwetter MSFT fanbois?

--
My blog

Really Long Tag... by andrewd18 · 2007-07-09 08:25 · Score: 0, Offtopic

Uh... WTF?

Re:I do not think it means what you think it means by Ngarrang · 2007-07-09 08:26 · Score: 3, Insightful

To paraphrase an old saying:

There are lies, damned lies and benchmarks.

--
Bearded Dragon

Re:on the playground... by Anonymous Coward · 2007-07-09 08:26 · Score: 1, Informative

and real DBA's use informix..

like BSD.. no it's not dead yet.

Re:on the playground... by nytrokiss · 2007-07-09 08:27 · Score: 1

What about IBM's DB? I hear it has 30% of the market however i have never seen it in action!

Bad firehose! by greg1104 · 2007-07-09 08:27 · Score: 5, Informative

Why this emaciated post made it while mine didn't I'll never know...here's how I submitted this story:

The current version of PostgreSQL now has its first real benchmark, a SPECjAppServer2004 submission from Sun Microsystems. The results required substantial tuning of many performance-related PostgreSQL parameters, some of which are set to extremely low values in the default configuration — a known issue that contributes to why many untuned PostgreSQL installations appear sluggish compared to its rivals. The speed result is close but slightly faster than an earlier Sun submission using MySQL 5 (with enough hardware differences to make a direct comparison of those results unfair), and comes close to keeping up with Oracle on similarly priced hardware — but with a large software savings. Having a published result on the level playing field of an industry-standard benchmark like SPECjAppServer2004, with documentation on all the tuning required to reach that performance level, should make PostgreSQL an easier sell to corporate customers who are wary of adopting open-source applications for their critical databases.

Re:Bad firehose! by MrNaz · 2007-07-09 08:33 · Score: 5, Insightful

I like yours better. The Slashdot editors need to have their balls cut off if they think the post that beat your onto the front page is better. Feel free to mod me down any time for bitching about this, but seriously, this post is SO much better than the one that made it.

--
I hate printers.
Re:Bad firehose! by MythMoth · 2007-07-09 09:04 · Score: 1

The accepted version is preferable to my eyes.

No offense, but yours is too verbose. I want to read a submissions that reassures me that there is Good Stuff Here [link], not one that stands as a substitute for the original.

--
--- These are not words: wierd, genious, rediculous
Re:Bad firehose! by A+nonymous+Coward · 2007-07-09 09:11 · Score: 1

Maybe yours was second. When they got the first one, it was good enough, no point in waiting for a better one that might not ever arrive.

--
Infuriate left and right

Re:on the playground... by CaseCrash · 2007-07-09 08:29 · Score: 5, Funny

boys use mysql men use PostgreSQL, and _____ use MSSQL. and people who like to collect a paycheck use MSSQL.

--
No, that link you posted to a web comic we've all seen a hundred times is not "obligatory."

What are the tuning parameters? by Anonymous Coward · 2007-07-09 08:30 · Score: 2, Interesting

For those of us who don't have dozens of hours to do the necessary research, can some postgresql gurus sum up some of the most significant tuning parameters so us mere mortals can see similar performance gains?

I realize that a large part of the answer is going be "it depends on application, your hardware, and you query types", but surely there must be some general tips that we can follow given various typical setups. MySQL, for example, ships with several different configuration files: One suitable for a small installation, one for a mid-sized installation, one for large installation, etc.

What tuning can someone do to tune postgresql's default (conservative) config file to make it perform better?

Re:What are the tuning parameters? by egoots · 2007-07-09 09:29 · Score: 1

hear, hear... mod the parent up.
Re:What are the tuning parameters? by Anonymous Coward · 2007-07-09 12:06 · Score: 2, Informative

Google is your friend...

http://www.varlena.com/GeneralBits/Tidbits/perf.ht ml

http://www.revsys.com/writings/postgresql-performa nce.html

There is also good material in chapter 4 (Performance) of "PostgreSQL" by Douglas and Douglas (Sams Publishing Developer's Library)
Re:What are the tuning parameters? by greg1104 · 2007-07-10 01:05 · Score: 1

http://www.westnet.com/~gsmith/content/postgresql/ has what you're looking for--"5-Minute Introduction to PostgreSQL Performance"--and points you in the right direction to dig deeper from there.

Re:on the playground... by Kalriath · 2007-07-09 08:32 · Score: 2, Informative

DB2? It's only useable on large mainframes (big iron, so to speak) from what I understand. Generally speaking only the REALLY large shops would use it, so I wouldn't be surprised you'd never seen it - neither have I. We're a pretty big organisation where I work, and we have a mix of Oracle, MSSQL, and Sybase servers.

--
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".

hardware by Deadplant · 2007-07-09 08:34 · Score: 0, Offtopic

The plural of hardware is hardware, not hardwares.
Does slashdot not have a "check spelling" feature on the submission page?

Re:hardware by morgan_greywolf · 2007-07-09 08:48 · Score: 2, Funny

nope. It's h4rdw4r3z!

--
My blog
Re:hardware by Anonymous Coward · 2007-07-09 08:51 · Score: 0

Mine seems to have a "spell checkings" button.
Re:hardware by Deadplant · 2007-07-09 08:59 · Score: 2, Funny

No! gad damn it!
"h4rdw4r3" is however an acceptable alternate spelling.
Re:hardware by rainer_d · 2007-07-09 09:18 · Score: 1

> The plural of hardware is hardware, not hardwares.
> Does slashdot not have a "check spelling" feature on the submission page?

You must be new here.

--
Windows 2000 - from the guys who brought us edlin

Re:I do not think it means what you think it means by KillerCow · 2007-07-09 08:34 · Score: 5, Informative

I think that somebody sent the wrong link and (surprise!) the editors didn't even follow it to check.

Here's a more useful one: All SPEC jAppServer2004 Results Published by SPEC

The benchmarks aren't standardized enough for any useful comparison. The hardware and configurations vary in almost every one.

Re:I do not think it means what you think it means by Ungrounded+Lightning · 2007-07-09 08:35 · Score: 4, Insightful

however the test-hardwares of the other DB systems are somewhat different

Which makes the results pretty much useless.

Not necessarily.

It's essentially useless for separating out how much of the performance difference is the result of the software's design, implementation, and tuning versus how much is due to the platform differences.

But such tests CAN be used to examine the performance of competing ENTIRE SYSTEMS, to inform choices between them.

They say: "Oracle on does THIS well, PostgreSQL on can be tuned so it does THAT well on the same benchmark."

This lets administrators (presuming they have access to the hardware info) get a bang-for-the-buck comparison.

For the rest of us, the interesting point is that PostgreSQL, running on its team's idea of realistic hardware, can produce performance in the same ballpark as Oracle running on Oracle's choice of hardware.

(Whether the necessary remaining data (what are hardwares x and y? how was PostgreSQL tunde) is published now, later, or never, is a separate issue. B-) )

--
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way

Re:on the playground... by Anonymous Coward · 2007-07-09 08:36 · Score: 0

Runs on Linux too...

Re:on the playground... by Kalriath · 2007-07-09 08:36 · Score: 2, Insightful

I dunno, I kinda like MSSQL. Hell, I use it alongside MySQL servers for my own projects (that, and having support for multiple platforms in your product is kinda a good idea). Sure, it's got horrific licensing (nowhere near as bad as Oracle's, though) but other than that, it's pretty good and reliable. I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.

--
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".

Re:I do not think it means what you think it means by Ungrounded+Lightning · 2007-07-09 08:37 · Score: 1

That should have read:

They say: "Oracle on {hardware x} does THIS well, PostgreSQL on {hardware y} can be tuned so it does THAT well on the same benchmark."

--
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way

SQL sucks by Anonymous Coward · 2007-07-09 08:37 · Score: 0

SQLite and Postgres are current my choices. What I really want are language bindings that preserve the structures, syntax and semantics of the language I'm using without the performance penalty of abstraction layers.

Re:SQL sucks by Anonymous Coward · 2007-07-09 08:58 · Score: 0

What I really want are language bindings that preserve the structures, syntax and semantics of the language I'm using without the performance penalty of abstraction layers.

It's called programming. You should try it.
Re:SQL sucks by mwanaheri · 2007-07-09 08:58 · Score: 1

then you're a good candidate to use an object-database. However, I don't know yet how well they scale. Apart from that, managing concurrent transaction might be a pain. I'm playing with db4o in my next project. Interesting database, but I will restrict it to single-user mode.

--
Idha khatabahum lijahiluna qalu salaman

Benchmarking Custom Versions by Doc+Ruby · 2007-07-09 08:38 · Score: 0

Has anyone used the PostgreSQL open source to refactor the DB to support just a subset of SQL and features (the most popular stuff that eg. "LAMP" uses), then benchmarked it vs the default distro, to show higher performance?

For that matter, has anyone merged any open source Java server container with PostgreSQL for higher performance of that use case, in an integrated architecture without network and other overhead for messages, and more atomic transactions?

--

--
make install -not war

Re:Benchmarking Custom Versions by Alphager · 2007-07-09 10:13 · Score: 1

Has anyone used the PostgreSQL open source to refactor the DB to support just a subset of SQL and features (the most popular stuff that eg. "LAMP" uses), then benchmarked it vs the default distro, to show higher performance?
I don't see how stripping out nused features would lead to a big increase in performance. Unused code just sits there in the RAM and does nothing to slow you down (_IF_ the application is well-designed)

For that matter, has anyone merged any open source Java server container with PostgreSQL for higher performance of that use case, in an integrated architecture without network and other overhead for messages, and more atomic transactions? Again, i think this would not increase the performance by a notable amount. The bottleneck nowadays lies in I/O; the two copy-commands needed to pipe something through the tcp/ip-stack are negligable.
Re:Benchmarking Custom Versions by Estanislao+Mart�nez · 2007-07-09 10:37 · Score: 1

Has anyone used the PostgreSQL open source to refactor the DB to support just a subset of SQL and features (the most popular stuff that eg. "LAMP" uses), then benchmarked it vs the default distro, to show higher performance?

You talk as if this were a trivial thing to do just because you have access to the sources.

--
Are you adequate?
Re:Benchmarking Custom Versions by Doc+Ruby · 2007-07-09 10:48 · Score: 1

Do you know how relational algebra and propositional calculus are implemented? There are many logic "ifs" and other evaluations in the code path to accommodate optional features. It's one reason why caching queries and results is so efficient. Those computations wouldn't even have to be run on uncached logic if the optional features were stripped out. It's not a matter of extra code stored in RAM, but more complex code paths.

And again with the Java/SQL integration: the extra layers to return from a Java scope, then jump the CPU to the SQL code, is inefficient. In tight loops, the least context switching shows real efficiency gains. Any simplification, even reducing handles to pointers for CPU address calculation, can improve performance. If the task is simple. Which is what most Web DB/object transactions are.

DB performance is a very tough game. Tiny gains result in many more sales. PostgreSQL is not as popular as others partly because there haven't been good performance metrics, so there hasn't been a big investment in performance metrics, so it stays relatively unpopular. But if it were tailored for the simplicity of most Web apps, it might have not only good performance, but the kind of marketable gains that Oracle and others will not revise themselves to produce.

--
--
make install -not war
Re:Benchmarking Custom Versions by Doc+Ruby · 2007-07-09 11:00 · Score: 1

No, I never said it was trivial, or any degree of difficulty or ease. I merely asked if it had been done, since the prohibition from proprietary source does not stop it.

You are projecting your inferences on what I said. Projecting a binary worldview, where coding is only either "impossible" or "trivial". All I said was it's not impossible, and wondered if anyone had done it.

--
--
make install -not war

Good, but it can be improved. by khasim · 2007-07-09 08:39 · Score: 5, Interesting

Get people from each group, give them the requirements and 5 different dollar amounts.

Let each team setup their systems how ever they want at each price point. Some will go with clustered servers. Some will go a single monster server. They know their products best so they'll be the ones best suited to choosing the configuration.

Then run the benchmarks. And keep hammering on them until AFTER the next patch release.

Yeah, it might run fast, but still be a bitch to patch/upgrade.

At $5,000 you might find that a cluster of MySQL boxes beats everything.

At $10,000 maybe something else is best.

$25,000

$50,000

$100,000

etc.

And finally, break it. Break it bad. What happens when something goes wrong? Oracle might cost a lot, but if they can come through with your data they might just be worth it.

If nothing else, you'll get the "best practices" nicely demonstrated by each group. :)

Re:Good, but it can be improved. by suv4x4 · 2007-07-09 09:27 · Score: 5, Funny

Get people from each group, give them the requirements and 5 different dollar amounts.
Let each team setup their systems how ever they want at each price point. Some will go with clustered servers. Some will go a single monster server. They know their products best so they'll be the ones best suited to choosing the configuration.

We gave each team $10000 and told them to build the best hardware and db setup they can:

** PostgreSQL got a small IBM Blade quad machine redundant setup:
"We're relying on standard industry solutions and reliability."

** MySQL clustered 4 PlayStation3 machines and wasted the rest on booze and women:
"We're practical, plus we know what is money best spent on!".

** Oracle purchased a 1200 square foot datacenter and installed a megacluster of 8132 quad-Xeon 64GB RAM 4TB disk machines. With $10'000...?!
"We... uhmm... we hit a great bargain, guys! You wouldn't believe it, but it's true!"
Re:Good, but it can be improved. by Shados · 2007-07-09 10:01 · Score: 1

The oracle thing made me laugh out loud =P
Re:Good, but it can be improved. by perlchild · 2007-07-09 10:28 · Score: 1

You do know that anyone publishing benchmarks of Oracle(and perhaps others) is breaking the EULA for Oracle, except Oracle itself...
Which means that noone will publish a benchmark of oracle db in which it did poorly...
Re:Good, but it can be improved. by NormalVisual · 2007-07-09 16:54 · Score: 1

From Oracle's OTN Developer EULA:

"You may not...disclose results of any program benchmark tests without our prior consent; or, - use any Oracle name, trademark or logo."

Now, let's look down a bit farther in the EULA...

"You may terminate this agreement by destroying all copies of the programs."

I'm no lawyer, but it seems to me that once you ditch your (free) copy of Oracle you're not bound by the license terms anymore. You don't have the right to use the software anymore, but you still have the results of the tests you ran before getting rid of it, no?

--
Please stand clear of the doors, por favor mantenganse alejado de las puertas
Re:Good, but it can be improved. by clambake · 2007-07-09 22:40 · Score: 1

Oracle might cost a lot, but if they can come through with your data they might just be worth it.

This is ironic... because:

Oracle

BEGIN;

DROP TABLE super_important_data;

ROLLBACK;

-- WTF, still lost all my data!?

Postgres:

BEGIN;

DROP TABLE super_important_data;

ROLLBACK;

Hey it rolled back the DROP TABLE, as common sense would dictate, SWEET!
Re:Good, but it can be improved. by Anonymous Coward · 2007-07-10 01:14 · Score: 0

I imagine, the results will be:

PostgreSQL posts some pretty competitive high-concurrency results, and nobody cares.

Boozed MySQL developers declare victory after benchmarking meticulously a single row insert, screaming "First Post!" concurrently.

Oracle recruits everybody who participated in the tests or is vaguely related with one of the projects.

SQLite versus Postgres by Anonymous Coward · 2007-07-09 08:40 · Score: 0

SQLite is faster than Postgres for many types of workloads.

Re:SQLite versus Postgres by Just+Some+Guy · 2007-07-09 08:54 · Score: 1

Absolutely! Of course, they don't really live in the same solution space so it's equally true to say that PostgreSQL is much faster than SQLite for many workloads.

--
Dewey, what part of this looks like authorities should be involved?
Re:SQLite versus Postgres by malraid · 2007-07-09 09:23 · Score: 1

Just try to hammer each database with 50 concurrent transactions to see which one scales better!!

--
please excuse my apathy
Re:SQLite versus Postgres by Ant+P. · 2007-07-09 09:30 · Score: 1

I started getting problems just rapidly F5ing a php script that used an SQLite DB. Can't say anything about PgSQL, because I couldn't figure out how to set that up.
Re:SQLite versus Postgres by glwtta · 2007-07-09 10:13 · Score: 1

I couldn't figure out how to set that up

Better hope you never have to use Oracle.

--
sic transit gloria mundi
Re:SQLite versus Postgres by glwtta · 2007-07-09 10:23 · Score: 1

I'm sure that's true, but that page is crap:

"SQLite 2.7.6 is significantly faster (sometimes as much as 10 or 20 times faster) than the default PostgreSQL 7.1.3 installation on RedHat 7.2 for most common operations."

A drunken badger is 10 to 20 times faster than a default Postgres 7.1.3 installation. The 7.x branch, and particularly the early releases, had pretty bad performance (for this type of queries, at least) - they didn't really start sorting that out until 7.4

Isn't SQLite a bit of a toy though? Something like HSQLDB or (better yet) Berkeley DB usually seems more appealing when you don't need a standalone DB server.

--
sic transit gloria mundi
Re:SQLite versus Postgres by Anonymous Coward · 2007-07-09 10:28 · Score: 0

Make sure your file system is mounted noatime, nodiratime. Then see if you can optimize your queries (transactions, bound params), attempt to limit the DB to a reasonable size and regularly vacuum if you're mainly updating existing fields.

I've used SQLite for a handful of small sites without problem.
Re:SQLite versus Postgres by Anonymous Coward · 2007-07-09 10:40 · Score: 0

> Isn't SQLite a bit of a toy though?

No it isn't, take a closer look
Re:SQLite versus Postgres by nuzak · 2007-07-09 10:43 · Score: 3, Funny

> Can't say anything about PgSQL, because I couldn't figure out how to set that up.

apt-get install postgresql-server too hard for you? Or are you having trouble double-clicking on the .MSI installer?

--
Done with slashdot, done with nerds, getting a life.
Re:SQLite versus Postgres by Ant+P. · 2007-07-09 11:24 · Score: 0, Flamebait

Neither. I gave up 4 hours after installing, realising postgres has no documentation worth mention. That and from what I've seen, I'd rather not be associated with its userbase.
Re:SQLite versus Postgres by Anonymous Coward · 2007-07-09 12:28 · Score: 0

Ciao, can't say as we miss whiny-ass wankers who can't be assed to do the basics.
Re:SQLite versus Postgres by ppanon · 2007-07-09 13:40 · Score: 1

Neither. I gave up 4 hours after installing, realising postgres has no documentation worth mention.
Well, it's normally included in most packages but, even if it wasn't, did you try here?
That and from what I've seen, I'd rather not be associated with its userbase.
The feeling is mutual, I'm sure.

--
Laissez lire, et laissez danser; ces deux amusements ne feront jamais de mal au monde. - Voltaire
Re:SQLite versus Postgres by MrNaz · 2007-07-09 14:29 · Score: 1

I've many times gotten support from the core devs who pop into the IRC channel to give support. Also, the idea that PG is hard to set up is a 5 year old story, long since obsoleted. PG now comes in RPMs, can be apt-gotten in Debian and has a .msi installer that is far easier than even MySQL's Win32 installer. If you are having trouble installing PostgreSQL, then you're obviously not worth anything as a sysadmin.

--
I hate printers.
Re:SQLite versus Postgres by Harik · 2007-07-09 14:46 · Score: 1

Yes it is, and no it isn't. It's really useful for "databases" that are only storing a small amount of data, or mostly static data. While I generally use Postgresql for any serious work, I've made trivial apps out of php+sqlite, because I can send someone the entire application as a single tarball. It's mostly useful to avoid having to write your own storage routines (which will invariably be buggier then something with lots of eyeballs on it).
Re:SQLite versus Postgres by Sxooter · 2007-07-11 05:29 · Score: 1

So, you can't read then?

MySQL's docs are set up as a tutorial, and I can't find anything easily in them. If you know anything about databases then reading MySQL's docs is painfully slow.

PostgreSQLs docs are set up with one section being the tutorial, one a user's guide, and one an admin's guide. I find it MUCH easier to find what I want with it, plus it's got a search engine sitting on top of the docs that's fast and accurate for finding the relevant information.

How someone can find MySQL easy and PostgreSQL difficult (or vice versa really) is beyond me. They're both dirt simple to set up and use. Just one is better at keeping your data coherent.

I repeat, if you couldn't find your answer in the pg docs, you can't read.

--

--- It is not the things we do which we regret the most, but the things which we don't do.

Re:on the playground... by Penguinshit · 2007-07-09 08:41 · Score: 1

big iron?

--

I have something in common with Stephen Hawking...

Re:I do not think it means what you think it means by thanasakis · 2007-07-09 08:42 · Score: 1

http://www.spec.org/jAppServer2004/results/jAppSer ver2004.html contains all the results. Unfortunately the different hardware configurations make it rather hard to draw any conclusions. Which begs the question, how did the submiter knew that these specific guys where biased or not? From what I can see, the whole setup is inherently biased.

Re:on the playground... by morgan_greywolf · 2007-07-09 08:43 · Score: 5, Informative

DB2? It's only useable on large mainframes (big iron, so to speak) from what I understand.

Um, no. DB2 these days runs on most major UNIX variants (HP-UX, Solaris, AIX, IRIX, etc.), Linux and Windows. It's used quite often, in fact. Most Enovia/VPM installations use DB2 backends, for instance. Modern versions use XML along with regular relational database stores and are very, very up-to-date technology-wise. Very scalable.

--
My blog

We finally have PROOF (but not real proof) by Pap22 · 2007-07-09 08:44 · Score: 3, Informative

This publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL, but almost as fast as Oracle (since the hardware platforms are different, it's hard to compare directly). This is something we've been saying for the last 2 years, and now we can prove it.

Postgresql 8.2 on UltraSPARC T1
MySQL 5 on AMD Opteron 285

The UltraSPARC has 8 cores on 1 chip and 16GB of memory.
The Opteron has 4 cores and 8GB of memory.

The UltraSPARC should smoke it every time.

Re:We finally have PROOF (but not real proof) by truth_revealed · 2007-07-09 09:05 · Score: 1

Postgres scored 778 on a probably more expensive machine with twice the RAM of the MySQL machine which scored 720. You call an 8% improvement "smoking it"?
Re:We finally have PROOF (but not real proof) by Anonymous Coward · 2007-07-09 09:06 · Score: 0

My understanding is that running mysql on anything with more than 4 cores shows a performance drop, so i can't see why you'd want to use more...
Re:We finally have PROOF (but not real proof) by dino2gnt · 2007-07-09 09:08 · Score: 1

Actually, depending on the amount of floating point work required, the UltraSPARC T1 may perform slower. They smoke on integer performance, but on floating point, they're noticeably slower than a comparable non-SPARC.

--
Future events such as these may affect you in the future!
Re:We finally have PROOF (but not real proof) by Ant+P. · 2007-07-09 09:20 · Score: 1

Sounds a lot like the advert in Microsoft's "Get The Facts" FUD that got pulled from magazines.
Re:We finally have PROOF (but not real proof) by Anonymous Coward · 2007-07-09 12:31 · Score: 0

I used to think the same thing until 9month ago. We got a T2000 (maybe?) with 16 Cores at 1GHz, 32GB or RAM. The performance of our Oracle datawarehouse dropped through the floor compared to our dual core dual CPU Intel boxes running at 3GHz with 4GB of RAM. The performance difference was sizable. Our workload did not perform well on the Sparc. You need to compare the same workload.
Re:We finally have PROOF (but not real proof) by ubernostrum · 2007-07-09 20:22 · Score: 1

My understanding is that running mysql on anything with more than 4 cores shows a performance drop, so i can't see why you'd want to use more...

Yup; it's fairly well known that MySQL doesn't get as much benefit from additional cores (see, for example, MySQL 5 only showing 6%-14% improvement when going from two cores to four, compared to PostgreSQL jumping 77% in the same situation) and has performance drop-offs beyond certain levels though Josh Berkus of PostgreSQL pointed out in response to that test that Postgres will also top out, just at a higher number of cores), so using a machine with fewer cores can actually favor MySQL.

The common wisdom among folks who know what they're talking about seems to be that lower-end hardware and certain use patterns (lots of single-table stuff or simple joins) favor MySQL, while beefier boxes and different use patterns (complex queries where the Postgres cost-based planner can shine) favor PostgreSQL.
Re:We finally have PROOF (but not real proof) by Anonymous Coward · 2007-07-10 04:25 · Score: 0

UltraSPARC T1s have 8 cores per socket
SunFire T2000s are single socket servers

So you have 8 cores per box, not 16.

Re:on the playground... by Anonymous Coward · 2007-07-09 08:45 · Score: 1, Informative

Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.

Databases are even being used in many situations that would be better addressed using flat files.

Bias by suv4x4 · 2007-07-09 08:47 · Score: 1

They certainly are not unbiased

I guess that's a deal breake right there, no?

The test was put together by PostgreSQL core developers at Sun. Didn't we agree earlier, when talking about Intel/AMD benchmarks, that vendor supplied tests are wildly inaccurate?

PostgreSQL should concentrate on more developer tools and better marketing. The "it's got a ton of features you don't need" on a cryptic site doesn't help its cause.

People use MySQL because there's a wide support and lots of dev tools for it, and because the kind of people going for MySQL needs to do simple selects and inserts most of the time.

Re:Bias by pavera · 2007-07-09 09:15 · Score: 1

I really don't understand this argument. I've used PostgreSQL and MySQL pretty much interchangeably for the last 7 years. I have never felt a lack of dev tools or documentation/help from postgresql. Maybe I'm just smarter than the average joe, but setting up and running postgresql is not that much more difficult than mysql.

For the features it provides I much prefer postgresql. sure, clustering is harder, but for the "easy" things that mysql is supposed to be good at, you don't need clustering either. Once you're getting into "Oh, I need 5 DB servers to handle this load" then you're talking about serious DBA work no matter what platform you're using. I'd rather by default have referential integrity. Sure MySQL has it now, but only if you get InnoDB working, which is at least as hard (I think harder) than getting a default PostgreSQL running.

What developer tools is postgresql missing? I have no problems creating, maintaining, editing, or viewing databases in PostgreSQL. There are drivers for every language I've ever used (java, python, ruby, c++, php), what exactly is missing? Why should they "market" they aren't like MySQL they don't have a commercial version for which they get paid.
Re:Bias by Bluesman · 2007-07-09 10:57 · Score: 1

I'm with you here. Nine years ago, PostgreSQL was so far ahead of MySQL in terms of features (basic things like foreign keys didn't exist in MySQL) it wasn't even a contest deciding to go with Postgres.

MySQL has only recently mostly caught up in the feature department, so for any real database work up until that time, Postgres was the only sane choice.

The awful thing still about MySQL is that because of the way it's put together, it's a crap shoot as to what features are going to be installed if you don't own the server you're working on. Any database that makes me wonder, "are foreign keys supported in this version?" is not worth using when there are alternatives.

I'm sure MySQL has its uses, but every time I work with it, I get pissed off at how many features are missing, or how badly it handles complex queries. The fact that, "Oh, the newest version of MySQL has that now too, if you use InnoDB tables!" is no consolation.

--
If moderation could change anything, it would be illegal.
Re:Bias by pavera · 2007-07-09 14:08 · Score: 2, Informative

The great (sarcasm) thing to me is they have added all these features (views, stored procs, triggers) in the last year or so, unfortunately if you want to use any of those features there are huge caveats when using their main selling point of "easy" clustering. Triggers, Views and Stored Procs all break replication under pretty normal use cases. So, if you want to use them you can't use the built in clustering.

I don't know if the same holds true for PostgreSQL, but I would be extremely surprised since Postgre has had all of those features for years and anyone designing a replication scheme would take them into account. MySQL in contrast had replication first, and now the designers of the additional features have to take the replication scheme into account (which is much harder I'd imagine, as views, triggers and stored procs all have some semblance of a standard, and at the very least have a well defined feature set, and sometimes you can't bend a feature set to fit into an already existing replication scheme)

Also, even if you own the server, converting an existing MyISAM DB to InnoDB can be a huge chore. I've done this for at least 3 customers, and every time it is a much more painful process than you'd think. Mostly because existing data in MyISAM almost always has referential problems after any real world usage (like its been live for more than a month). Someone has deleted a customer, and now you have orders that are orphaned, someone ran a bad update that set the FK wrong in the orders table, and MySQL happily obliged, and now you've got orphaned data... I mean to me the most BASIC functionality of an Rdbms is that it enforces the relationships you give it. I don't even count MySQL w/MyISAM as a database. Might as well just use a bunch of flat comma separated files for tables.
Re:Bias by Anonymous Coward · 2007-07-10 00:26 · Score: 0

I personally find it much easier.

Even compiling the thing up is a piece of cake. All you need beyond a basic (old school) UNIX installation is a compiler, make, m4 and bison (as I recall). Sure you can add SLL and readline stuff, but all features of the database (rather than the connection to the database) are there.

Re:I do not think it means what you think it means by Minwee · 2007-07-09 08:49 · Score: 1

As I recall, Oracle's choice of hardware consists of a very large chequebook and a stamp with your signature on it.

Elephant by suv4x4 · 2007-07-09 08:50 · Score: 3, Funny

Won't you guys agree, "elephant" doesn't exactly communicate "fast and modern" very well.
"Dolphin" comes a bit closer.

Who's coming up with those logos?

Re:Elephant by Anonymous Coward · 2007-07-09 08:55 · Score: 0

SQLite is small and nimble, the others are all bloated pieces of shit.
Re:Elephant by afabbro · 2007-07-09 08:57 · Score: 1

Elephant communicates "really long memory".

Which, of course, is what you want in a RDBMS.

Who's coming up with this education system?

--
Advice: on VPS providers
Re:Elephant by pavera · 2007-07-09 09:01 · Score: 3, Insightful

"Dolphin" also conveys "fun play thing" to me...

I'd prefer the elephant that never forgets.
Re:Elephant by suv4x4 · 2007-07-09 09:12 · Score: 1

Elephant communicates "really long memory".

Which, of course, is what you want in a RDBMS.
Who's coming up with this education system?
--

I'm not convinced.

Dolphins are way smarter than elephants: we'll need an elephant/dolphin benchmark.

I'll get some monkeys to setup one.
Re:Elephant by Doug+Neal · 2007-07-09 09:34 · Score: 1

SQLite is small and nimble, the others are all bloated pieces of shit. SQLite isn't very nimble on even a 1.5GB database.

It's the shiznit for storing config data in small Linux appliances though :)
Re:Elephant by AaronW · 2007-07-09 10:29 · Score: 1

And it cannot scale at all. For each update, it locks the entire database. Multiple queries are done by accessing the same file, hence data sharing only at the filesystem level.

Don't get me wrong, it is a nice database for some applications, but not for heavy duty applications.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re:Elephant by turing_m · 2007-07-09 11:47 · Score: 2, Funny

If the final decision in choosing an RDBMS comes down to the logo, the choice of database will be the least of your problems.

--
If I have seen further it is by stealing the Intellectual Property of giants.
Re:Elephant by Anonymous Coward · 2007-07-09 14:37 · Score: 0

Fat, bloated, slow and high maintenance also spring to mind...
Re:Elephant by runningduck · 2007-07-09 15:12 · Score: 1

For each update, it locks the entire database.
What are you talking about? Do you understand the concept of MVCC, real MVCC? You are sorely mistaken if you think PostgreSQL locks then entire database just because it stores all the data for a table in a single file (which btw, it does not).

http://www.developer.com/open/article.php/877181

--
-rd
Re:Elephant by AaronW · 2007-07-09 15:49 · Score: 1

I was obviously referring to the SQLite, not PostgreSQL.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re:Elephant by pavera · 2007-07-10 05:53 · Score: 1

Or maybe "I can move houses and trees for you"

Re:on the playground... by dekemoose · 2007-07-09 08:50 · Score: 1

At where I work we've got a number of clients running DB2 on x86 Linux platforms, as well as larger AIX platforms and of course the mainframe. Seems to have a lot of capabilities but is a bit more finicky than Oracle and MS SQL. It seems to like to dead lock unless the DBA has really made sure it is set up properly.

Re:on the playground... by teknopurge · 2007-07-09 08:52 · Score: 1

DB2 runs on distributed systems too - I'm using it on a project right now. It's not bad at all.

--
Website Hosting

Re:on the playground... by jshriverWVU · 2007-07-09 08:53 · Score: 1

Agree, and they're all SQL based anyway.

Re:I do not think it means what you think it means by Control+Group · 2007-07-09 08:53 · Score: 3, Insightful

Oh, I agree. A benchmark of whole systems can be just as (or more) useful as a benchmark of individual pieces of software, depending on what your goals are.

But what's been presented here isn't even that. Links #1 takes us to a SPEC benchmark of PostgreSQL. It doesn't provide any information about anything else; there isn't anything to compare the benchmark to. Link #2 provides an unreferenced statement about Oracle's marginally superior performance on much more expensive equipment.

So, perhaps, one can begin to draw conclusions about PostgreSQL vs Oracle in the contexts of full systems. But neither link #1 nor link #2 provide any information about MySQL (except the quote: "[t]his publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL").

Really, my criticism isn't of the benchmark (the data are the data, after all) or of the blog (one expects a vested PostgreSQL interest to comment on such a benchmark), but of the blurb here that either a) draws totally unwarranted conclusions, or b) depends on information it doesn't bother sharing.

--

Reality has a conservative bias: it conserves mass, energy, momentum...

performance isn't the issue by SEAL · 2007-07-09 08:55 · Score: 1, Interesting

Performance isn't what causes a lack of acceptance in the marketplace for PostgreSQL.

The problem is twofold:

MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.

Oracle supplies an enterprise level database that MySQL doesn't aspire to. PostgreSQL doesn't know where to fit in.

Do a little investigation on setting up PostgreSQL with fault tolerance and replication and you'll quickly see why large corporations cough up money for Oracle. Performance is one aspect of the price tag, but it is certainly not the only factor.

Re:performance isn't the issue by killjoe · 2007-07-09 09:16 · Score: 1

Postgres should aim itself as a replacement for MSSQL server.

--
evil is as evil does
Re:performance isn't the issue by pavera · 2007-07-09 09:25 · Score: 4, Informative

I've been using both MySQL and PostgreSQL for 7 years now... I've never felt a need for "developer support". Or if that phrase means "documentation + message boards" I've never felt a lack from postgresql. I also really don't get the PostgreSQL is hard to use argument. In every linux distro I've used in the last 7 years (redhat, centos, suse, fedora core, ubuntu) they are exactly the same... IE apt-get, yum, whatever, install postgresql-server/mysql-server. Then $startupscriptdir/postgresql start....

Do people really get that hung up on using postgres as the initial user instead of root?!? That is the ONLY difference that I can see.

Now I can agree that maybe postgresql doesn't really "target" an audience, but that is also because they are a true open source project. They don't have a commercial version. I really don't think Linus sits around saying "Ok, we need to add this feature so that more fortune 500's will adopt linux". Or "we need to add feature xyz so this will appeal to small businesses". MySQL has a "target" audience because they are selling something. If you aren't selling anything, by definition you don't have a target.
Re:performance isn't the issue by LurkerXXX · 2007-07-09 09:32 · Score: 4, Insightful

Actually I think PostgreSQL might have eaten a lot more of the MySQL market if they'd simply been faster to market with better admin tools and Windows support.

Lots of folks went with MySQL early because of those factors. They also then tended to write all their PHP, etc, applications to only talk to MySQL, thus making folks who might have preferred PostgreSQL use MySQL to run the app that they needed to run. Once that happens you are kind of in a Catch-22 place. Folks won't write the apps for PostgreSQL until it's used by a larger chunk of the market, but it won't take that large chunk because all the 'cool' apps were written MySQL only, so they have to run MySQL instead of PostgreSQL
Re:performance isn't the issue by SEAL · 2007-07-09 09:32 · Score: 1

I also really don't get the PostgreSQL is hard to use argument.

Did you try installing it on Windows a couple years ago? :-)

One reason MySQL quickly gained popularity is that they provided a Windows version early on. It took a long time before PostgreSQL finally got around to building a Win32 version with an installer. (Regardless of my opinions of running a database on Windows... the userbase is still huge).
Re:performance isn't the issue by Anonymous Coward · 2007-07-09 09:34 · Score: 0

All the postgres solutions (slony-I, slony-II, pgcluster, others?) are horrible to setup, and don't really allow for great high availability. If you're doing a ton of reads (and not so many writes) pgcluster is nice I guess. Also, recovering from an actual failure on all these can be very much hist and miss.

Lower end Oracle is cheap, but scaling up does get really expensive, but there isn't any "free" competition as you start scaling.
Re:performance isn't the issue by kpharmer · 2007-07-09 09:35 · Score: 2, Insightful

> MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast,
> easy to use database for those who don't need a whole lot.
> Oracle supplies an enterprise level database that MySQL doesn't aspire to.
> PostgreSQL doesn't know where to fit in.

This is an oversimplification. Each vendor sees itself in all markets:
- oracle/db2/sql server have free versions for tiny apps and very expensive versions for massive apps
- mysql says it doesn't want to do what oracle does, but also says that this is less than 1% of the market - and knows that plenty of smallish databases are on oracle
- postgresql like the others sees itself doing anything from very small databases to very large ones (often via Enterprise DB or other vendor extensions)

And using a single product for multiple sizes isn't illogical: if you have any very large databases (hundreds of gbytes or more) then you probably have a few dozen little ones as well. It's *far* easier to manage them all on oracle/db2/sql server - even with the small additional licensing costs - than to have a frankenstein collection of products to manage.

"Best tool for the job" is a good consideration when evaluating products (along with vendor viability, cost, etc, etc) - but once you've got a single tool in house to keep adding new products - each with their own licensing, support, patch, backup/recovery procedures, etc is a nightmare. Let alone actually federating your data - and having to test out how to virtualize or replicate data from oracle 10.x.x with mysql 5.y.y

> Performance is one aspect of the price tag, but it is certainly not the only factor.
Very true - and for that reason Postgresql has more going for it than many alternatives, like:
- best licensing options - you don't need to pay a lawyer to go over your contract or license like you should if you use oracle or mysql commercially. And there's no fear that the vendor will change its license terms once you're locked in and start charging an arm and a leg.
- very good foundation - postgesql isn't built from duct tape and bailing wire. The functionality within it is well tested and robust.
- great support for standard database features - whether its subselects, stored procedures, triggers, etc - it's very simple to move from oracle to postgresql.
- great ansi sql support - again, very standard sql - no unnecessarily propretary language elements.

So, yeah - just because Postgresql is performing well on some benchmarks that doesn't mean you should immediately throw out oracle in favor of postgresql. On the other hand, you also shouldn't discard it because it is a good general purpose database solution.
Re:performance isn't the issue by pavera · 2007-07-09 09:37 · Score: 1

Ok, well you've got me there... I've never run a DB server on windows, Oracle on Solaris, PostgreSQL and MySQL on linux...
Re:performance isn't the issue by SEAL · 2007-07-09 09:40 · Score: 1

In case I sounded negative before: I have used postgres in several projects, both personal and professional. I think it is robust and stable, and I mostly agree with what you said.

Would I use it to handle a massively-multiplayer game, or a financial institution? No. Does it fit in for a lot of other projects. Yep.

- SEAL
Re:performance isn't the issue by lakeland · 2007-07-09 09:49 · Score: 1

Hi Pavera,

I'm in a similar situation except I use MySQL. Lets look at why.

1) Ease of use: MySQL has readline, and supports lazier SQL programming. It has loads of syntactic sugar that lets you write simple things in whatever way makes sense to you. Adding users, changing passwords, backing up... it is all _easy_. Of course, postgres can do everything MySQL can too, the question is which is easier.

2) Extensive developer community. We use python and the MySQL/python integration is great. We have a few UDFs that are home-grown but some of them were just downloaded off the net and installed. I'm sure you can find far more for MySQL than for Postgres.

There are some features in postgres that I really wish I had in MySQL: proper commit and rollback, nice/easy triggers rather than the crude enum. Support for R-trees with more than 2 dimensions (but then postgres' 3 dimensions is only a small improvement). Overall though, MySQL just seems to work where postgres can be made to work...
Re:performance isn't the issue by dave562 · 2007-07-09 09:52 · Score: 1

That will never happen. MSSQL server exists because of Visual Studio and the ease of creating programs that rely on it as a backend. Which leads one back to the argument that I've come across a few times in this topic... Postgres is lacking a good development environment.
Re:performance isn't the issue by RelliK · 2007-07-09 10:10 · Score: 3, Insightful

1) Ease of use: MySQL has readline, and supports lazier SQL programming. It has loads of syntactic sugar that lets you write simple things in whatever way makes sense to you. Adding users, changing passwords, backing up... it is all _easy_. Of course, postgres can do everything MySQL can too, the question is which is easier.
PostgreSQL's command-line shell is far better than mysql. In addition to what mysql does, psql supports tab-completion and other nice things. It's been a while since I looked at mysql, but when I did the difference between the two respective command-line interfaces was like the difference between bash and sh.
2) Extensive developer community. We use python and the MySQL/python integration is great. We have a few UDFs that are home-grown but some of them were just downloaded off the net and installed. I'm sure you can find far more for MySQL than for Postgres.
Uh-huh. You are "sure". Several times I asked for support on postgresql mailing lists and the response has always been excellent. Usually I got answers within hours.
I think one of the reasons that mysql became ubiquitous is that it had proper windows support early on. So, just like windows, everyone uses mysql because everyone else does, and they are willing to jump through hoops to work around all the deficiencies the platform has, simply because they don't know any better.

--
___
If you think big enough, you'll never have to do it.
Re:performance isn't the issue by aled · 2007-07-09 10:21 · Score: 2, Interesting

MySQL, as others have pointed out, has better developer support and they know their target audience. They supply a fast, easy to use database for those who don't need a whole lot.

What is exactly easier in MySql?

--

"I think this line is mostly filler"
Re:performance isn't the issue by jedidiah · 2007-07-09 10:33 · Score: 1

Postgres makes Oracle seem user friendly.

That's saying a lot.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:performance isn't the issue by RelliK · 2007-07-09 10:34 · Score: 1

What is exactly easier in MySql?

I was often asking the same question. The only coherent answer I could discern was that mysql was easier to install on windows. Which was true: until 8.0, postgres required cygwin to work on windows. I guess it just never occurred to me that this might be an issue, since I couldn't fathom why anyone would want to do that. But the reality is that most people -- including developers -- use windows. The threshold required to get mysql to work is lower, and people are willing to put up with an inferior solution if it is easier to install.

Which is unfortunate. I was completely blown away by postgres from the first time I used it. After that mysql looked like a pathetic toy.

--
___
If you think big enough, you'll never have to do it.
Re:performance isn't the issue by nuzak · 2007-07-09 10:41 · Score: 1

> Would I use it to handle a massively-multiplayer game

SOE does. Are they big enough?

--
Done with slashdot, done with nerds, getting a life.
Re:performance isn't the issue by RelliK · 2007-07-09 10:45 · Score: 1

Postgres makes Oracle seem user friendly.

That's saying a lot.

I guess you never actually used postgres. Or oracle.

--
___
If you think big enough, you'll never have to do it.
Re:performance isn't the issue by CrazedWalrus · 2007-07-09 10:49 · Score: 1

I keep hearing this replication issue being a show stopper. Are people getting unnecessarily hung up on it?

I work in some massive Sybase shops. Sybase does replication, but it doesn't get used very often. Usually this stuff is just replicated at the storage level on the SAN. Database goes down, bring up the backup using the same SAN storage. No big deal.

Obviously SAN for small shops is out of the question, but we're talking about enterprise usage, where SANs or some sort of online disk replication are SOP.
Re:performance isn't the issue by lakeland · 2007-07-09 11:43 · Score: 1

Firstly, lets compare:
http://www.pgsql.cz/index.php/Project_of_UDF_and_i ts_realization_at_C_for_PostgreSQL
http://empyrean.lib.ndsu.nodak.edu/~nem/mysql/udf/

AFAICT, they're both about as ugly as each other for writing your own UDFs, my point was I can download a few hundred UDFs for MySQL but I don't seem to be able to do the same thing for Postgress.

--

But say I wanted to give postgres another whirl (mostly to see if it is faster than MySQL at what I normally do). I've found SQL Fairy, is that the best approach?

I don't tend to write SQL directly. I write python code that calls the company database wrapper, which calls MySQLdb, which calls DBI, which calls MySQL. Presumably shifting to postgres involves changing the wrapper to call PyPgSQL. Is that any good, or am I in for a world of pain? I use server side cursors extensively and query the tables, but otherwise most code is just passed straight to the interpreter.

PS: I don't suppose you know if R trees are being extended in postgres beyond three dimensions? If so that would be enough justification to make me want to put serious effort in.
Re:performance isn't the issue by lakeland · 2007-07-09 11:59 · Score: 1

No it isn't, Oracle is godawful to use.

It has the most awkward, painful table storage format.
ProC just sucks
Installation will leave you tearing your hair out.
Configuration is a black art.

There are many good things that can be said about Oracle - the use of indicies inside an index so you don't have to create really unnatural composite indicies... tidy SQL syntax... support for R trees of any number of dimensions... Effective subqurery optimisations... Triggers that work (compared to MySQL, not postgres). Ease of use is nowhere near the list.

Oracle is the main reason the role of DBA exists.
Re:performance isn't the issue by allanw · 2007-07-09 12:11 · Score: 1

Well, Postgres has support for Perl or Python or Ruby or Java, or in their pgSQL language for user-defined functions.. that's a huge plus compared to writing in C for simple things. http://www.postgresql.org/docs/8.1/static/plperl.h tml How's that for ugly? ;)
Re:performance isn't the issue by asdfghjklqwertyuiop · 2007-07-09 12:14 · Score: 1

MySQL has readline,

As does PG. With persistent history, tab completion, etc.
supports lazier SQL programming. It has loads of syntactic sugar that lets you write simple things in whatever way makes sense to you.

So you can write code that is vastly non-portable... what a useful feature.
Adding users,

CREATE USER joe;
changing passwords

ALTER USER joe WITH PASSWORD 'password';
backing up

pg_dump dbname
We use python and the MySQL/python integration is great.

We use python too. The PG/python integration is also great.
Re:performance isn't the issue by allanw · 2007-07-09 12:19 · Score: 1
I hate the mysql commandline. Here are the things I can't do in it (or I've been too lazy to try to figure out how):
- Use ctrl-c to reset the prompt (clears the current line). Ctrl-d will then actually quit the program. (not terribly important, but still convenient)
- Actual smart tab completion. SELECT * FROM in psql will show a list of tables/views ONLY. In mysql, it'll show all data structures, such as every single COLUMN for all tables. I can also do: SELECT * FROM table WHERE and get the list of columns for that table ONLY. Tell me if you're able to do that in mysql, because that would be really convenient.
- psql opens up less to show big data
- ... and a couple of things I can't think of at the moment.
Like I said, these features might be available for mysql; I just haven't been able to figure out how (and if so, it is much less apparent than psql)
Re:performance isn't the issue by allanw · 2007-07-09 12:22 · Score: 1

Ah crap, should have previewed. My second bullet point had my markers removed. It should have read:

Actual smart tab completion. SELECT * FROM in psql will show a list of tables/views ONLY. In mysql, it'll show all data structures, such as every single COLUMN for all tables. I can also do: SELECT * FROM table WHERE and get the list of columns for that table ONLY. This even works when you do joins. Tell me if you're able to do that in mysql, because that would be really convenient.
Re:performance isn't the issue by jedidiah · 2007-07-09 12:23 · Score: 1

I've used Oracle extensively and dabbled in both mysql and postgres. Just at the dabbling level there were enough BS nuissances to really put in perspective for me why people have all been using mysql (and not postgres). Postgres SQL syntax is much like using Modula-2: unecessarily anal and restrictive.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:performance isn't the issue by jedidiah · 2007-07-09 12:29 · Score: 1

Why would I care about the Oracle table storage format? You use the RDBMS to access the tables.

Pro*C may be bad. I don't really care. I don't have to use Pro*C. I don't even have to use OCI. If you really had any clue, you would be whining about OCI.

Installation will only be a bother if you are some computer-illiterate granny. Installation hasn't been a problem for years. Admittedly, it was something to write home about roundabout 1997.

If Postgres were so much not a "black art" then Sun could have framed this exercise in what a random undergrad could do with it.

--
A Pirate and a Puritan look the same on a balance sheet.
Re:performance isn't the issue by RelliK · 2007-07-09 12:51 · Score: 1

Postgres SQL syntax is much like using Modula-2: unecessarily anal and restrictive.

Uhhhm what syntax? ANSI SQL? Can you actually provide any examples, cause what you're saying doesn't make any sense.

--
___
If you think big enough, you'll never have to do it.
Re:performance isn't the issue by shish · 2007-07-09 13:14 · Score: 1
What is exactly easier in MySql?
- Inserting "foobar" into a char(5)
- Finding a shared web host who supports it
- Getting your head round the install (My switch to postgres was delayed by several months because there was never a huge "you need to be user 'postgres' to do admin stuff, user 'root' is not enough" sign, so I never managed to create a database to work with...)
--
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
Re:performance isn't the issue by pavera · 2007-07-09 13:44 · Score: 1

1) Ease of use: Adding users in postgresql is in my opinion much easier than mysql, createuser or if you've got the GUI, right click, add user... (you can't even add users in the MySQL GUI). Changing passwords, never really done it in either DB, but I can't emagine that is a "chore" in either. Backing up? as if pg_dump was any different the mysqldump (and no, just copying the mysql directory is not a safe backup). Seriously all of the things you mention are not harder in postgresql, maybe a little different, but not harder. As for your "syntactic sugar", and "lazier" programming scare me! I like ANSI standard SQL. I don't like 10 different ways to do a simple thing according to "developer taste". That is a recipe for disaster. Every time you hire a new dev, they complain about "Well, this is a silly way to do xyz, lets rewrite it so its 'right'". Also, its a nightmare for maintainability. If a new dev hasn't seen xyz done a certain way, they won't know what it is, or what the query is doing necessarily, not without doing research at least.

2) Dev community: I've never had a question on postgresql that I couldn't find an answer to, in fact, I've never posted to any message boards because either a) someone already had asked the question and had it answered, or b) it is in the documentation. Can't say that about MySQL. I've posted numerous times about things that PostgreSQL can do, and after searching for hours, I couldn't figure out how to do in MySQL. Invariably the answer to the question when I've posted is either a) a huge nasty cludge of a workaround or b) MySQL version x.y.z doesn't support that, build from source our latest dev snapshot and deploy that on your production server if you need that feature. c) That is in the roadmap, we'll support that feature someday. I program in python primarily too. the psycopg2 module has everything we need... What features does the MySQL "integration" have that psycopg2 doesn't? Both the drivers comply with the Python DB module spec... so basically they afford the same features (except psycopg2 always has transactions)

All in all, I really don't know why these myths about postgresql being hard to administer persist, I use postgresql with PHP, Ruby on Rails, and Django all the time, I've never run into a problem where I went "Oh if only I were using MySQL, I'd have xyz feature or I'd be able to use xyz module"

As for your final comment, I find the exact opposite to be true. When forced (by client requirements) to use MySQL I am constantly working around limitations in the DB, enabling InnoDB (because the 15 year old who installed their web server just used MyIsam) and they are wondering why they keep loosing information, or having orphaned children. I have quite a few more times than one hit unworkable problems in MySQL that PostgreSQL handles flawlessly out of the box.
Re:performance isn't the issue by Harik · 2007-07-09 14:08 · Score: 1

Um, you care about oracle's storage because it forces you to care. You can't do ANYTHING until it's properly configured. It's a serious bitch that way.

And you've seriously GOT to be trolling about postgresql. "user@host w/password on database A" is seperate from "user@host w/password2 at database B" is a mysqlism that drives me up the wall.

Great, virtualized user domains would be nice, but that's not what MySQL is giving you. It's a best-match regex that may change depending on the whims of RDNS (that you may not even have control over). Thanks, but no thanks.

I'll take "create user 'foo' with password 'bla'; grant all on database 'bar' to 'foo';" any day of the week. Both postgres and MySQL have neigh-identical backup, unless you are incapable of changing 'mysqladmin dump database' to 'pgdump database' in your scripts.

Database workload tuning is a black art on all databases. Some just hide the values in obscure non-populated-by-default registry keys and require paid support services to learn about them.
Re:performance isn't the issue by jadavis · 2007-07-09 14:20 · Score: 1

there were enough BS nuissances to really put in perspective

Please provide some constructive criticism. Most people who use postgres enjoy it because it has fewer nuisances than any other database (to them). If you provide the details of what is difficult and how your user experience could be improved, that is much more helpful than saying "postgres is hard to use".

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:performance isn't the issue by pavera · 2007-07-09 14:26 · Score: 1

I use psycopg2 as my python DB module... I don't know if it is "the best" but I've had no problems with it. It is the recommended module to use with Django (which is what I use it for mostly).

As for SQL Fairy, never used it... Obviously, once you are committed to a platform (any platform) the conversion costs increase. I don't think anyone is "recommending" drop everything and move to PostgreSQL. My point is simply if I'm starting a new project, that is the first DB I look at, and if it satisfies the requirements (99% of the time for me, one customer HAD to have the system on windows, like 5 years ago... so we used SQL Server), that is what I'll use, I haven't had any usability, lack of features, lack of support, or lack of performance problems with PostgreSQL, which seem to me to be the "problems" that most people spout off about when saying PostgreSQL isn't good.
Re:performance isn't the issue by evilviper · 2007-07-09 14:42 · Score: 1

Do a little investigation on setting up PostgreSQL with fault tolerance and replication and you'll quickly see why large corporations cough up money for Oracle.

No, you won't.

The free, BSD licensed PostgreSQL doesn't have much clustering capabilities (basically: read-only). There are, however commercial versions with full clustering support, that are a tiny fraction the cost of Oracle.

It is a fact that many large corporations with huge volumes of transactions use PostgreSQL instead of Oracle, their entire company dependent on it (a credit card company comes to mind).

--
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Re:performance isn't the issue by jjohnson · 2007-07-09 14:53 · Score: 1

Not these days, they're not :)

--
Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.
Re:performance isn't the issue by killjoe · 2007-07-09 15:04 · Score: 1

Postgres has ADO, ODBC, JDBC and .NET drivers as well as drivers for every other programming language in the world.

What does your choice of programming tool have to do with your database choice?

--
evil is as evil does
Re:performance isn't the issue by Anonymous Coward · 2007-07-09 18:09 · Score: 0

1.) EVERY program has readline. Ever heard of rlwrap? Besides, any serious query you write in a separate file anyway and simply run it.

2.) So you're not saying that Python has good RDBMS support, but that MySQL is well integrated into Python?? How does that make a difference? Is there a specific driver for MySQL, but not for other databases? And I'm used to people laughing about Java. Now let me laugh at Python for just a minute. With Java I don't even *notice* what DB I'm working with, seriously.
Re:performance isn't the issue by pavera · 2007-07-09 18:35 · Score: 1

You notice slightly which db you're working with in python, in that you have to load a driver, and if you're writing actual queries you have to write them following the syntax of the DB you're using. However, for most uses, most of the time, it doesn't matter which DB you're using. This guy just doesn't understand database abstraction but python certainly provides it.
Re:performance isn't the issue by smchris · 2007-07-10 00:14 · Score: 1

I agree on the Windows support. Particularly since it made it less risky for publishers to offer LAMP texts. People learn from what's out there.

Personally, my client/server introduction was Oracle and I find PostgreSQL much more "comfortable" than MySQL.
Re:performance isn't the issue by Anonymous Coward · 2007-07-10 02:09 · Score: 1, Interesting

PostgreSQL's command-line shell is far better than mysql. In addition to what mysql does, psql supports tab-completion and other nice things.
MySQL has supported tab-completion for a long time (2000 maybe?). I know it's there in the 3.x series. What are the "other nice things" that you speak off? You chide the parent because he's "sure" yet you won't list any specifics (ok, just one) yourself. Step up and tell us.
Re:performance isn't the issue by Anonymous Coward · 2007-07-10 03:00 · Score: 0

Uh-huh. You are "sure". Several times I asked for support on postgresql mailing lists and the response has always been excellent. Usually I got answers within hours.

Ha! If you were using MySQL it would have been 15 times quicker! ;)
Re:performance isn't the issue by AbbyNormal · 2007-07-10 04:37 · Score: 1

For my SMB, the easy ability to replication my database without any other addons, was a key reason for using MySQL.

--
Sig it.
Re:performance isn't the issue by wfberg · 2007-07-10 06:36 · Score: 1

Postgres SQL syntax is much like using Modula-2: unecessarily anal and restrictive.

Uhhhm what syntax? ANSI SQL? Can you actually provide any examples, cause what you're saying doesn't make any sense.

I think grandparent poster is referring to precisely that. You can do some pretty neat (but weird) stuff in MySQL. It'll allow you to use functions just about anywhere, and alias stuff. IIRC, from back in the day when I made a LAMP website, a query like thus would work on MySQL (contrived example):

SELECT SOUNDEX(a.name) AS s, b.zipcode FROM names AS a, adress AS b WHERE SOUNDEX(b.name) = s;

Also, IIRC MySQL doesn't bother too much with type conversions, you can concatenate and compare numbers and strings, etc.

I don't know about psql, but oracle or even MSSQL won't stand for that sort of thing! The only other "database" I've found so far that will allow you to pull such stunts is Access (using the Jet engine, not the MSSQL interface).

--
SCO employee? Check out the bounty
Re:performance isn't the issue by Sxooter · 2007-07-11 05:19 · Score: 1

Sadly, the simple, easy mysql replication is prone to silent failure, so you have to constantly check to make sure your replication really is replicating. Like a lot of features in MySQL replication is one of those that's better for filling in a check box on a spread sheet than actually providing reliable and functioning replication.

--

--- It is not the things we do which we regret the most, but the things which we don't do.

Re:I do not think it means what you think it means by Anonymous Coward · 2007-07-09 08:56 · Score: 0

"R'ed"

Why did you do that? It doesn't make any sense. Look there - I wrote "doesn't" - which means "does not"; that's called a contraction. What do you think R'ed means? The R in RTFA stands for Read; by adding 'ed onto the end were you saying you "Readed" the article? If so, are you Ralph Wiggum?

Re:on the playground... by teknopurge · 2007-07-09 08:56 · Score: 2, Funny

Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.

Databases are even being used in many situations that would be better addressed using flat files. That kind of mainframe/JCL talk is not welcome here...

--
Website Hosting

Re:I do not think it means what you think it means by PhrostyMcByte · 2007-07-09 08:56 · Score: 2, Insightful

If you want to setup a dedicated database server, you want to know what software with what hardware will run the fastest. So while the benchmarks may not be useful to people wanting to setup a small multi-purpose server, it can still be useful for some people.

Totally unbiased by Frosty+Piss · 2007-07-09 08:56 · Score: 0, Redundant

The test was put together by PostgreSQL's core developers working at Sun.

No question of bias there...

--
If you want news from today, you have to come back tomorrow.

DUH! by Slashdot+Parent · 2007-07-09 09:00 · Score: 4, Insightful

Why this emaciated post made it while mine didn't I'll never know. Yours wasn't posted because it didn't contain enough flamebait.

--
They don't grade fathers, but if your daughter's a stripper, you fucked up. --Chris Rock

MySQL 5 UltraSPARC T1 vs MySQL 5 Opteron by Pap22 · 2007-07-09 09:09 · Score: 1

http://en.wikipedia.org/wiki/UltraSPARC_T1#_ref-0
Led me to...
http://blogs.digitar.com/media/2/T2000_Experience. pdf

where a UltraSPARC T1 customer states that MySQL 5 is up to 13.5 times faster than MySQL 5 on an AMD Opteron.

The results of the original article's data were as follows:

Postresql 8.2 on UltraSPARC T1 - 778.14
MySLQ 5 on AMD Opteron - 720.56

So using my blatantly biased (yet at the same time factual) numbers, I correct the MySQL performance number to reflect what it would be on an UltraSPARC T1: 720.56 x 13.5 = 9727.56

9727.56 > 778.15

MySQL wins!

Re:I do not think it means what you think it means by Weslee · 2007-07-09 09:10 · Score: 1

All the details were published.

From the link in the article you can find the results, or just look here.

Hardware, kernel changes, PostgreSQL configuration, etc.
Its all on that page.

Re:on the playground... by Kalriath · 2007-07-09 09:19 · Score: 1

Ok, apparently I'm somewhat wrong there. I wasn't aware it ran nowadays on your regular desktop PCs with Linux. Though, I must admit I can't see it running very fast compared to software DESIGNED for that type of platform.

How exactly did that get modded informative if it's wrong?

--
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".

Which MySQL? by itsdapead · 2007-07-09 09:33 · Score: 4, Interesting

MySQL is modular - pick'n'mix data storage engines sharing a SQL front end. I can't find the bit in TFA that says which one they compared.

I've always suspected that most MySQL vs Postgres flame wars are based on comparing Postgres with the speed of MySQL/MyISAM (No transactions or relational integrity checks - so, big surprise, dead fast for simple queries) and then waving MySQL/InnoDB around when the functionality issue is raised.

MySQL/MyISAM hits the speed/functionality sweet spot for LAMP data-driven websites, is supported by lots of free webapp software and offered by most decent web hosting services. Comparing it speedwise with Postgres has always been pointless, though. If Postgres has caught up, colour me impressed, but if they're pro-Postgres I bet they're comparing with MySQL/InnoDB (which is a bit closer to like-with-like).

Never quite seen the point of MySQL/InnoDB really - all the advantages of MySQL/MyISAM minus the speed, support by popular webapps, availbility on low-cost hosts... and still lacks the features of Postgres.

--
In a survey of 100 programmers, 111111 thought that duck-typing was a good idea.

Re:Which MySQL? by Anonymous Coward · 2007-07-09 12:16 · Score: 0

This is a bit offtopic, but...
Has anyone used Ingres (who wasn't previously a paying customer)? It was open-sourced a few years ago but I've never heard of anyone adopting it. It seems to be even more obscure than Sybase.
Re:Which MySQL? by byennie · 2007-07-09 13:14 · Score: 1

The point IS pick'n'mix as you put it. InnoDB is more valuable because you don't have to use it for every table. When you need a simple, fast table with no transactions, use MyISAM. When I need transactions, InnoDB. Just need some in-memory lookups? Memory table engine. Clustered data? NDB.

There's a lot to be said for having the ability to choose your engine for each table, even if another product has more combined features in one engine. It even opens some interesting replication tricks if you know what you're doing. Run transactions on your InnoDB table, and then replicate it to a read-only MyISAM table on another server, etc - and it works quite painlessly.

Don't get me wrong, if Postgres has something you need and MySQL doesn't, then yeah- the multiple engines all add up to nothing for you. But for a huge variety of projects, they fit the bill quite nicely.
Re:Which MySQL? by mwkohout · 2007-07-09 13:45 · Score: 1

Totally off topic, but props:

Colo(u)r me impressed....nice Replacement reference.

Paul Westerberg, chief writer for the 'Mats lives in my neighborhood.
Re:Which MySQL? by jadavis · 2007-07-09 15:10 · Score: 3, Informative

MySQL is modular [ from grandparent post ]
The point IS pick'n'mix as you put it. [ from parent ]

I think this is a huge misconception, and completely backwards. MySQL allows you to change storage engines, but the behavior at the semantic level changes. That's the antithesis of modularity: the semantic behavior should remain constant, while the performance changes. If you change both, that's not modularity, that's a new configuration that breaks client applications.

MySQL is configureware, like PHP. Everything you get is a trade. Want full text indexes (MyISAM)? You have to give up transactions (InnoDB). But the marketing material always just says "Yup, we have full text indexes (MyISAM), SQL compliance (strict mode=on), transactions (InnoDB), large number of apps (strict mode=off)". Of course many of the features are mutually exclusive. When postgresql says it has a feature, it's really there.

Just need some in-memory lookups? Memory table engine. Clustered data? NDB.

Seems like the memory engine and NDB are the same thing: http://dev.mysql.com/doc/refman/5.0/en/mysql-clust er-overview.html. That means that your "cluster" is not much of a database, and is totally unacceptable for many applications that require a database. Power off == no more data.

Now compare with PostgreSQL. PostgreSQL has one "storage-engine", but there are many access methods to that storage engine. There's Btree, GiST, and GIN. On top of the access methods, there are also a lot of plans. There's sequential scan, index scan (good for lookups or some sorting needs), bitmap index scan (good for AND/OR with other bitmap index scans, and always orders the I/O in file order), hash join, hash aggregate, merge join, nested loop, group aggregate, etc.

Look at all those algorithms for accessing the single storage mechanism. It's amazing. MySQL doesn't have all those, and even if it did have the algorithms, it couldn't use them. How would MySQL know when to use a hash join and when to use a merge join? PostgreSQL collects statistics, calculates expected costs, and chooses the best plan based on the given query (called a cost-based planner). MySQL uses a rule-based planner (does it have an index? ok, let's use it then). To PostgreSQL, these two queries are different:

(1) SELECT * FROM parents WHERE number_of_children=2;
(2) SELECT * FROM parents WHERE number_of_children=20;

The stats collector would know that #1 will match many more records than #2. It will probably choose a sequential scan for #1, since it needs to read every page anyway. It will probably choose an index scan for #2. Now, if that's in a subselect, postgresql will know that #1 will return a lot of records, and maybe choose a different join algorithm than if it were #2. Again, PostgreSQL chooses these plans for you based on cost -- no special configuration required.

If you want modular, it's being able to replace a full text GiST index with a full text GIN index and the application never knows the difference except performance. And by the way, the DDL in postgresql is transactional, so you don't even have to restart the application even if you do some major restructuring of the table (like replacing a table with an updatable view).

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:Which MySQL? by byennie · 2007-07-09 16:54 · Score: 1

I think this is a huge misconception, and completely backwards. MySQL allows you to change storage engines, but the behavior at the semantic level changes. That's the antithesis of modularity: the semantic behavior should remain constant, while the performance changes. If you change both, that's not modularity, that's a new configuration that breaks client applications.
In theory, I see your point. It's not truly modular. In practice, it's handy. MyISAM and InnoDB tables work quite well together.

MySQL is configureware, like PHP. Everything you get is a trade. Want full text indexes (MyISAM)? You have to give up transactions (InnoDB). But the marketing material always just says "Yup, we have full text indexes (MyISAM), SQL compliance (strict mode=on), transactions (InnoDB), large number of apps (strict mode=off)". Of course many of the features are mutually exclusive. When postgresql says it has a feature, it's really there.
Right. And that's a good thing about Postgresql. Now tell me, what do you do in Postgresql when you just want a fast table with no transactional overhead (MyISAM)? Or memory only data? Access to the minimal features can be just as valuable as always having everything available. I could care less which company is more honest in their marketing. I guess the MySQL folks are much better marketers, at the cost of a little transparency.

Seems like the memory engine and NDB are the same thing: http://dev.mysql.com/doc/refman/5.0/en/mysql-clust er-overview.html. That means that your "cluster" is not much of a database, and is totally unacceptable for many applications that require a database. Power off == no more data.
No. The memory (or "heap" engine) is just that - a simple in-memory engine: http://dev.mysql.com/doc/refman/5.1/en/memory-stor age-engine.html NDB is the storage engine for MySQL Cluster, and it most certainly is acceptable for applications that need a high-availability clustered solution. The details are way beyond the scope of this post, but it's certainly no joke and is in use in some very high-demand applications (try high-traffic telecommunication apps).

Now compare with PostgreSQL. PostgreSQL has one "storage-engine", but there are many access methods to that storage engine. There's Btree, GiST, and GIN. On top of the access methods, there are also a lot of plans. There's sequential scan, index scan (good for lookups or some sorting needs), bitmap index scan (good for AND/OR with other bitmap index scans, and always orders the I/O in file order), hash join, hash aggregate, merge join, nested loop, group aggregate, etc.
That's nice. Good features. There are also lots of ways to optimize searches in MySQL. BTW, I like Postgresql. Never said I don't.

Look at all those algorithms for accessing the single storage mechanism. It's amazing. MySQL doesn't have all those, and even if it did have the algorithms, it couldn't use them. How would MySQL know when to use a hash join and when to use a merge join? PostgreSQL collects statistics, calculates expected costs, and chooses the best plan based on the given query (called a cost-based planner). MySQL uses a rule-based planner (does it have an index? ok, let's use it then). To PostgreSQL, these two queries are different: (1) SELECT * FROM parents WHERE number_of_children=2; (2) SELECT * FROM parents WHERE number_of_children=20; The stats collector would know that #1 will match many more records than #2. It will probably choose a sequential scan for #1, since it needs to read every page anyway. It will probably choose an index scan for #2. Now, if that's in a subselect, postgresql will know that #1 will return a lot of records, and maybe choose a different
Re:Which MySQL? by jadavis · 2007-07-09 17:49 · Score: 1

Now tell me, what do you do in Postgresql when you just want a fast table with no transactional overhead (MyISAM)? Or memory only data?

When you don't care much about transactions/ACID, you can use functions to manipulate external data (which can be in memory, on a separate server, or anything you want), and then use a set-returning function to return the data in the form of a relation (which you can join against, etc). I don't know whether that's a good solution for the general need you're describing, but it seems reasonable that postgres leaves it up you to use already-existing libraries to manage non-ACID data. You could attach it to memcached, or another database engine, or whatever suits your needs. If you want it to behave more like a relation, then you define the rules so that INSERT/UPDATE/DELETE behave like you want them to.

No. The memory...

Ok, I see that they are two distinct ways of storing data. I also see that memory tables only store fixed-length records and can't contain BLOB or TEXT data.

Also, keep in mind that PostgreSQL stores it's temporary tables in a similar way. I guess you could consider that a different "storage engine" too, except that it still allows variable-length types.

The optimizer is solid most of the time, and there are numerous ways to influence it's choices.

But does MySQL even have, for example, a hash aggregate, or does every GROUP BY require a sort? Can it turn an index scan into a bitmap and fetch the tuples in file block order? I don't know a lot about MySQL and the developments it's made recently, but I don't understand how it can use those types of plans if it doesn't use a cost-based planner. You may not need any of these things, but when you do, it can be a huge performance win.

Look, I like Postgresql quite a bit. I think you mistook me for someone who was saying that MySQL is great and Postgresql sucks. You can argue the merits of Postgresql 'till you are blue in the face, but it doesn't make MySQL any less useful.

MySQL may be useful, but I don't consider modularity to be a strong point of MySQL. I really just latched onto the one concept of modularity (which was expressed by the post to which you replied and seconded by your own post). I think that is a big misconception, and I stand by that claim.

That being said, you obviously are aware of the tradeoffs you're making. I am posting because I want to express my opinion to others who might not be aware of the limitations of MySQL's modularity, and the modularity that PostgreSQL does offer.

I don't see how you can say that MySQL storage engines aren't useful, since you've clearly never used them.

False on two counts: I never said they weren't useful, I said they weren't modular because they affect the semantics, and are therefore configuration; and I have used MySQL before. If I said something factually incorrect, please correct me.

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:Which MySQL? by byennie · 2007-07-09 18:34 · Score: 1

When you don't care much about transactions/ACID, you can use functions to manipulate external data (which can be in memory, on a separate server, or anything you want), and then use a set-returning function to return the data in the form of a relation (which you can join against, etc). I don't know whether that's a good solution for the general need you're describing, but it seems reasonable that postgres leaves it up you to use already-existing libraries to manage non-ACID data. You could attach it to memcached, or another database engine, or whatever suits your needs. If you want it to behave more like a relation, then you define the rules so that INSERT/UPDATE/DELETE behave like you want them to.
Well yeah, you *could*... or you just change one word of your table definition with MySQL. There are tons of ways to implement simple in-memory storage, including just using memcached directly. But the point is, just because it's not ACID doesn't mean I want to jump through hoops. Forget in-memory stuff, this is exactly why the lightweight MyISAM is nice to have around. I get everything I want from a simple non-ACID storage and I don't have to do anything special to get all the joins / relations I want.

Ok, I see that they are two distinct ways of storing data. I also see that memory tables only store fixed-length records and can't contain BLOB or TEXT data. Also, keep in mind that PostgreSQL stores it's temporary tables in a similar way. I guess you could consider that a different "storage engine" too, except that it still allows variable-length types.
Well I would hope one wouldn't use memory tables for storing BLOBs =). The point is, as usual, that they have a specific purpose instead of a general one. MySQL has temporary tables as well, and no, they aren't considered a different "storage engine".

But does MySQL even have, for example, a hash aggregate, or does every GROUP BY require a sort? Can it turn an index scan into a bitmap and fetch the tuples in file block order? I don't know a lot about MySQL and the developments it's made recently, but I don't understand how it can use those types of plans if it doesn't use a cost-based planner. You may not need any of these things, but when you do, it can be a huge performance win.
Sorry, I don't know off-hand. What I do know is that MySQL has good performance and a plethora of ways to optimize it. Just like Postgresql. I'm sure there are optimizations possible in each product that are not in the other. Cost-based planners are nice, but they are hardly the only factor in performance. Bottom line, both products can give you pretty impressive performance if you understand your data and how to tweak them.

MySQL may be useful, but I don't consider modularity to be a strong point of MySQL. I really just latched onto the one concept of modularity (which was expressed by the post to which you replied and seconded by your own post). I think that is a big misconception, and I stand by that claim. That being said, you obviously are aware of the tradeoffs you're making. I am posting because I want to express my opinion to others who might not be aware of the limitations of MySQL's modularity, and the modularity that PostgreSQL does offer.
Fair enough. I can't speak for others, and I have no doubt that MySQL storage engines are misunderstood much of the time. I just disagree that Postgresql is *more* modular rather than just having a different approach - but I doubt we will sway each other on that one.

False on two counts: I never said they weren't useful, I said they weren't modular because they affect the semantics, and are therefore configuration; and I have used MySQL before. If I said something factually incorrect, please correct me.
OK - maybe you didn't say they had no usefullness, but you did start this whole thread by contrasting MyISAM and InnoDB and saying that you didn't "see the point"
Re:Which MySQL? by jadavis · 2007-07-09 19:53 · Score: 1

OK - maybe you didn't say they had no usefullness, but you did start this whole thread by contrasting MyISAM and InnoDB and saying that you didn't "see the point" in InnoDB because of the tradeoffs introduced between the engines.

To set the record straight, it wasn't me who said that, it was "itsdapead".

I don't think my posts were anything more than a long-winded way to try to promote my own strict definition of "modular," with more than a hint of postgresql advocacy thrown in :)

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:Which MySQL? by byennie · 2007-07-09 20:06 · Score: 1

Ah, sorry, missed that. Thought it was all the same person. I'm ok with Postgresql advocacy. As long as you aren't pushing MS-SQL.
Re:Which MySQL? by midom · 2007-07-10 19:59 · Score: 1

One of things missed by PG camp is that MySQL/InnoDB has row visibility information in indexes. That makes index-only reads possible (what beats the crap out of any bitmap based reads). Also, index-based reads don't need sorting afterwards, as well as index-based GROUP BYs are done. Anyone with that in mind can make really efficient apps. Do note, that with enough of scatter of data bitmap-based physical-order reads start making no sense..
Re:Which MySQL? by jadavis · 2007-07-12 04:18 · Score: 1

That makes index-only reads

That's a valid point. I don't think it's missed by PG though, it's been discussed before on several occasions. Index-only scans are very similar to two other concepts:
* storing the entire table in a Btree. This makes the primary index useful, at the cost of secondary indexes and other overhead of using a btree.
* Maintaining some cluster order on the heap

Visibility in the index is not out of the question for PG, but there are some complications. I think the main problem is with keeping the visibility information up to date in the indexes when it changes in the tuple. If you have some implementation suggestions I'd be interested to hear them.

--
Social scientists are inspired by theories; scientists are humbled by facts.
Re:Which MySQL? by Walles · 2007-07-17 03:03 · Score: 1

I don't really see how "No transactions or relational integrity checks" could really be useful for any kind of data storage. One bug in your application and your database is hosed.
Under what kind of circumstances would somebody not see this as a problem?

--
Installed the Bubblemon yet?

Re:I do not think it means what you think it means by Qzukk · 2007-07-09 09:36 · Score: 1

(Whether the necessary remaining data (what are hardwares x and y? how was PostgreSQL tunde) is published now, later, or never, is a separate issue. B-) ) From the SPEC site, click on the "Disclosures" links to find out the hardware and software used for each part of the test. For instance, the postgres server ran on a SunFire T2000 with one 8 core (4 virtual threads per core) UltraSPARC T1 processor at 1.2GHz, 16GB of ram running 64-bit solaris 10, etc. The HTML Disclosure links to the "Disclosure Archive" which is a .jar with all of the configuration files used.

--
If I have been able to see further than others, it is because I bought a pair of binoculars.

Re:on the playground... by Fozzyuw · 2007-07-09 09:36 · Score: 2, Interesting

Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.

Databases are even being used in many situations that would be better addressed using flat files.

Now there's a good point. Is there any good documentation out there on databasing on what is the best solution for what kind of problem? Particularly, I would like to see something between flat-files and DBMS and then between different DBMS.

Though, technically, I'm stuck between whatever our 'host' is giving us (currently PostgreSQL, but usually MySQL, and I dabbled with MSSQL when an intern for the Government), I'd like to know when it might be better to actually just store some stuff in a flat file or even use an XML DB (which one collage professor loved to talked about as a holy grail).

Outside of "Basic Database Theory and Generic SQL" books, a programming class on Databases (how to build tree structures and navigate them, etc), I've never really got any 411 on the benefits of using MSSQL, Oracal, MySQL, PostgreSQL, etc and I've never gotten the change to learn more about advanced database features such as stored procedures or functions or whatever they might be called and what they're good for.

Recommended books, anyone?

Cheers,
Fozzy

p.s. I work mostly with internet applications.

--
"The past was erased, the erasure was forgotten, the lie became truth." ~1984 George Orwell

Re:on the playground... by kpharmer · 2007-07-09 09:41 · Score: 1

> Though, I must admit I can't see it running very fast compared to software DESIGNED for that type of platform.

Actually, it's probably the fastest solution out there for heavy reporting/analytical workloads on windows, linux or unix. Not sure about teradata and informix - haven't looked at them in quite a while.

Teradata, informix and db2 were doing the 'beowulf' thing years before anyone on slashdot asked what a beowulf cluster of these would be like. So, you can easily distribute data across a hundred db2 server blades to get very fast response times over terabytes of data. This can be far faster at the high-ends than the typical oracle scenario of range partitioning on a 32-way smp. And oracle grid is more failover oriented than performance oriented, so that doesn't really compete either, last I looked.

I've built and managed very large databases on oracle, db2 and sql server in the last five years. Of them all, db2 on unix was the easiest to build and manage. It did take a little extra time - since some of the tools are a little finicky, but the manageability and performance was far better than with either oracle or sql server. And with very large data - far, far better than with mysql or postgresql.

Re:on the playground... by cervo · 2007-07-09 09:51 · Score: 1

Please mod parent down. DB2 is not only for mainframes. I worked at a place which used a bunch of DB2 boxes running AIX.

While DB2 does run on mainframes, it also runs on many other operating systems, including Linux.

http://www-306.ibm.com/software/data/db2/9/

As you can see, DB2 runs on at least Linux, Windows, and Solaris. It also runs on AIX, even though I didn't see it mentioned in a quick glance on the page. Please check your facts next time.

Think of it as marrying Oracle Corporation. by Futurepower(R) · 2007-07-09 09:53 · Score: 1

It's like getting married. "How much money do you have?" "Well, that's what it costs."

Re:on the playground... by imemyself · 2007-07-09 10:02 · Score: 1

I get the impression that the core of it wasn't written by MS way back when, though.

MS licensed Sybase's database for Windows back in the late eighties/early nineties and eventually MS SQL Server grew out of that.

--
Every time you post an article on Slashdot, I kill a server. Think of the servers!

VACUUM FULL / VACUUM ANALYZE by Anonymous Coward · 2007-07-09 10:05 · Score: 0

I have used Postgres in two different production environments over the past 6 years, but they have both been environments where it was possible to effectively take the system down for a little while each night while a vacuum is done. The vacuums seem to be quite necessary, as without them, performance slowly and steadily degrades as the trees get funky and the tables grow larger and larger with unused update holes in them. It would probably be possible to hold off and do them weekly rather than daily, but the point is that they seem to be necessary, and they seem to take the system down while they're running. Between vacuums, the database is fast.

This leads me to believe that Postgres isn't suitable as part of a system that needs to be running 24/7, such as an online order processor. What am I missing?

Re:VACUUM FULL / VACUUM ANALYZE by Anonymous Coward · 2007-07-09 10:29 · Score: 0

You're missing that vacuums don't take the system down while they're running.
Re:VACUUM FULL / VACUUM ANALYZE by Per+Wigren · 2007-07-09 10:33 · Score: 2, Informative

What am I missing? You are missing that recent versions don't need that anymore as it will auto-vacuum for you incrementally in the background. You should still run a "vacuum full analyze" after schema changes though but you usually don't change the schema live on heavily loaded databases anyway...

--
My other account has a 3-digit UID.
Re:VACUUM FULL / VACUUM ANALYZE by AaronW · 2007-07-09 10:34 · Score: 1

I believe you no longer need to take the database offline and it now has an autovacuum daemon which automatically performs the operation when needed.

--
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
Re:VACUUM FULL / VACUUM ANALYZE by Anonymous Coward · 2007-07-09 10:40 · Score: 0

You're missing that vacuums don't take the system down while they're running.

Sure they do. On Postgres 8.1.0, if I start a "vacuum full transactions"
then this, if performed programmatically or in another psql session, will
hang until the vacuum is completed:
"select * from transactions where accountid=11011"
Re:VACUUM FULL / VACUUM ANALYZE by Anonymous Coward · 2007-07-09 11:50 · Score: 0

You're not supposed to run VAUCUM FULL often. In fact, it might even slow down the database (as it cleans up "dead" space, which could be used by further writes), and it takes much longer and locks the database while it's running.
Re:VACUUM FULL / VACUUM ANALYZE by Anonymous Coward · 2007-07-09 12:03 · Score: 0

To clarify, just run VACUUM, or turn on autovacuum as the other posters have suggested.
Re:VACUUM FULL / VACUUM ANALYZE by kestasjk · 2007-07-09 14:58 · Score: 1

Incidentally SQLite also requires regular VACUUMs.

--
// MD_Update(&m,buf,j);
Re:VACUUM FULL / VACUUM ANALYZE by jadavis · 2007-07-09 15:19 · Score: 1

You should still run a "vacuum full analyze" after schema changes

Do not run VACUUM FULL under normal circumstances. It's a special command that's used to return free space in a table to the operating system by compacting the file and eliminating internal free space. PostgreSQL will reuse this space anyway, so normal operation does not require VACUUM FULL, a plain VACUUM will suffice.

Schema changes may or may not affect the need to run VACUUM FULL, but if you are worried about the performance impact of VACUUM FULL, read the docs, because chances are you don't need it.

--
Social scientists are inspired by theories; scientists are humbled by facts.

Re:on the playground... by MarkLewis · 2007-07-09 10:07 · Score: 1

Microsoft bought Sybase SQL Server and released it as MSSQL 6. Since then, Microsoft has maintained the code independently of Sybase. So yes, it wasn't written by MS way back in the days. MS did what they always do and bought one of the underdogs, then marketed the snot out of it.

If you prefer MSSQL, then I'm assuming that you haven't developed any sophisticated transactions in a high-concurrency environment. It's painful to work around the limitations of the old system of hierarchal locks that MSSQL uses (as does DB2, and MySQL with anything except for InnoDB tables). It can be done, and the app can be coerced into working well if you base your design around MSSQL's limitations, but it requires careful planning. (This is somewhat outdated, since MSSQL did legitimately get much better in this regard with 2005)

From a development point of view though, once you've used any system which supports multi-versioning concurrency control PostGres, Oracle, MySQL with InnoDB, or even MSSQL 2005 using the new snapshot isolation features they bolted on, you'll never want to go back :)

Re:on the playground... by Anonymous Coward · 2007-07-09 10:10 · Score: 0

Yes, those folks at Yahoo, Ebay, Amazon, Google, and even Hotmail all run that lightening fast MSsql. You were talking about all those systems for tracking games scores with, yes? Because if you meant REAL work, then time to chuckle.

Re:on the playground... by Anonymous Coward · 2007-07-09 10:12 · Score: 1, Informative

Berkeley is extremely fast for simple key/value pairs. It's often embedded and has bindings to every major scripting language.

SQLite is a small and fast database that works from local files. Again it's often embedded and there are bindings for all major scripting hosts. You could run into concurrency issues using it under heavy load.

Flat files are simple and fast, you have the concurrency issue and need to lock the file for writes.

XML is typically flat files but sleepycat and others have XML databases. XML is mainly useful for interchange where several apps need access to the same data. Leaner formats like YAML and JSON are gaining ground for serialized data.

Then you get the full blown RDBMS which are first choice for large scale apps with multiple clients. Multi user ERP, accounts packages, busy web forums - it's the right choice for these apps. Using MySQL or Postgres to catalog your personal DVD collection or for a personal blog/forum with only a handful of users is worse than overkill - it's ridiculous.

I once did benchmarking by Stinking+Pig · 2007-07-09 10:15 · Score: 5, Interesting

I worked for a company whose product ran on MS-SQL, PostgreSQL, and Oracle. Should I explain why we didn't support MySQL or not? It'll draw fanboys either way. I used the same server, reinstalled the OS (Red Hat Enterprise 3 or Windows 2000) and database between each test, and rebuilt the application server to be extra sure.

Since it was more difficult to write Oracle-compliant SQL and we didn't have a lot of Oracle customers, the developers didn't care to spend time on it, and our stuff ran about 20 percent slower there. That's after a lot of tuning time, it was 50% slower on a default install. Oracle 9 took two days to install and tune, plus another two days of preparation. I was particularly underwhelmed that I had to deal with stupid errors like tarballs that extracted onto themselves and assumptions about the shell being used. At the time, Oracle was a very Solaris-like experience; user-unfriendly to the extreme.

Postgresql 7 ran great; it was neck and neck with MS-SQL in all tests, after proper tuning, and 30 percent slower on a default install. Postgres took half a day to install and tune, but it took me a week and conversation with the postgres mailing lists to find out what needed tuning. Still, we were able to put together a document that took users from bare metal to RHEL+Postgres in four hours if they had all their media handy.

Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. It also cost as much as paying me to do the entire set of tests. OS installation/patching times and tested workloads were the same for all three tests.

YMMV.

--
"Nothing was broken, and it's been fixed." -- Jon Carroll

Re:I once did benchmarking by glwtta · 2007-07-09 11:03 · Score: 0, Troll

Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. It also cost as much as paying me to do the entire set of tests.

Sounds great! Which version of Linux were you running it on?

Seriously, I think I'd be happily using many Microsoft products, if it wasn't for Microsoft the company.

--
sic transit gloria mundi
Re:I once did benchmarking by Slayer_X · 2007-07-09 11:03 · Score: 1

"Postgres took half a day to install and tune" ???
are you serious? I can install(from sources) and tune Postgres in a couple of hours.

btw, you CAN'T tune MSSQL :P

--
- Slayer_X
http://www.slayerx.org/
Lima
Re:I once did benchmarking by RedWizzard · 2007-07-09 13:12 · Score: 1

I was particularly underwhelmed that I had to deal with stupid errors like tarballs that extracted onto themselves and assumptions about the shell being used. Ok.
Microsoft SQL Server 2000 ran great with no tuning at all, and took fifteen minutes to install. So you did have Windows boxes but didn't try Oracle under Windows? It's pretty easy to install - probably around the same 15 mins as SQL Server.
Re:I once did benchmarking by Shados · 2007-07-09 14:28 · Score: 1

Half a day of work is usually 3.5 to 4 hours for most places, so that comes pretty close to your "couple of hours". And of course you can tune MSSQL, it has quite a few options to tweak and optimise ...it just tends to configure them correctly on its own.
Re:I once did benchmarking by Stinking+Pig · 2007-07-09 14:42 · Score: 1

I repeat: Still, we were able to put together a document that took users from bare metal to RHEL+Postgres in four hours if they had all their media handy.
I presume that you are not a user, and that you are not including OS installation.

--
"Nothing was broken, and it's been fixed." -- Jon Carroll
Re:I once did benchmarking by Stinking+Pig · 2007-07-09 14:47 · Score: 1

This was four years ago, I don't recall Oracle for Windows being an option at the time. I could be wrong. I know there were Oracle client tools for Windows because installing it destroyed the OS on one of my workstations :) I haven't had to install Oracle since, but I have worked with a few Oracle installations since then, but they were all on Solaris.

--
"Nothing was broken, and it's been fixed." -- Jon Carroll
Re:I once did benchmarking by WasterDave · 2007-07-09 15:20 · Score: 1

It also cost as much as paying me to do the entire set of tests.

Putting a value on your time? You must be lost, this is slashdot :)

Dave

--
I write a blog now, you should be afraid.
Re:I once did benchmarking by RedWizzard · 2007-07-09 16:38 · Score: 1

Oracle has been available for Windows for at least 10 years. I first used it on Windows with version 7. IIRC Oracle 7 was a bit of a pig to setup databases on but since version 8 it's a breeze.

Re:on the playground... by Spy+der+Mann · 2007-07-09 10:17 · Score: 1

I AM collecting paychecks, and I use mySQL.

Of course... with InnoDB.

Re:on the playground... by coyote-san · 2007-07-09 10:18 · Score: 1

Stored procedures are bits of code (previously database-specific, but there's often support for java, perl and python now) that are stored in the database and can be executed by the server. There are at least three times when they're a Really Good Idea:

1. when you want to make sure something is always done despite lazy or even hostile users. You attach the SP to standard operations. E.g., it's common to set a 'last modified date' on every insert and update to a table. Depending on the DB you could even do fancy stuff like sending an alarm when certain actions are performed. You can even support things like updateable views this way.

2. when you want to really lock down your database. You can revoke 'insert' and 'update' rights from everyone and force them to go through a stored procedure to insert/update data. No risk of SQL injection, you can do arbitrarily complex input validation, etc.

3. when you want to perform complex aggregation operations. I had a problem where I needed to create a table that contained the rolling average of the 30 most recent values for each date. It was much faster (orders of magnitude faster) to do this in a SP than programmatically on a client.

Many DB also allow you to create 'user-defined functions' and 'user-defined types', albeit at a far higher integration cost. The first allows you to add C functions to the database server itself. This would allow you to perform fairly complex operations, but the need for it has dropped significantly with direct support for perl/java/etc., but you can still use them when you don't want to make it easy for somebody to alter your procedures.

The latter (with appropriate user-defined functions) allow you to create new first-order objects in your database. E.g., I've written a user-defined type that can hold digital certificates and wrapped encryption keys. Standard databases can always store them as blobs, but with the UDT I can provide an extremely powerful set of tools to the stored procedures and sql queries.

--
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken

I love this by RelliK · 2007-07-09 10:18 · Score: 2, Funny

Back when mysql was faster than postgres for simple selects, mysql proponents were jumping up and down about performance. And, in fact, a very narrow definition of performance, with only simple reads. No writes. Single CPU. One client. You can't step too far away from those parameters because then mysql performance starts to suck.

Now that there is a respectable benchmark showing postgresql to be faster, performance is suddenly not the issue.

Yep. It's right up there with mysql developers claiming that transactions are useless back when mysql didn't have them.

--
___
If you think big enough, you'll never have to do it.

Re:I love this by SEAL · 2007-07-09 11:11 · Score: 1

Congrats on not even reading what I wrote.

a) I'm not a MySQL proponent.
b) When I mentioned performance I was discussing postgres versus Oracle.
b) I said performance is not the only issue for an enterprise database solution, which MySQL is not.

But I guess expecting someone to read parent posts on Slashdot is sortof like expecting someone to read the article.
Re:I love this by Anonymous Coward · 2007-07-09 11:24 · Score: 0

yeah yeah that should be a) b) c)

Re:I do not think it means what you think it means by jedidiah · 2007-07-09 10:24 · Score: 1

I am more interested in how it handles an axe wielding maniac going postal in the datacenter, or what would happen of a mile wide tornado struck town, or what would happen if there's a bug, or if the project becomes wildly successful and does 10x or 100x the processing.

--
A Pirate and a Puritan look the same on a balance sheet.

Version 8.1 :-) by Ayanami+Rei · 2007-07-09 10:25 · Score: 2, Interesting

8.1 automatically does table maintenance (vacuum, stats, etc.)
http://www.postgresql.org/docs/8.2/static/routine- vacuuming.html#AUTOVACUUM

You can tune it as to when to do such operations, or what max percent of normal I/O can be used for those tasks.

--
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON

Re:I do not think it means what you think it means by jedidiah · 2007-07-09 10:27 · Score: 1

Not necessarily.

Oracle will happily run on the cheap crap wintel servers that Postgres will. It will even do so under the typical OS of choice for running Postgres on cheap crap wintel servers.

Oracle will also theoretically run on some 64 cpu monster (from Sun, IBM or SGI).

--
A Pirate and a Puritan look the same on a balance sheet.

PostgreSQL's Core Developers @ Sun? by allenw · 2007-07-09 10:34 · Score: 1

I thought Sun paid Pervasive for their distribution in Solaris 10. Has this changed or by "core developers" do we really mean "people who commit the pkgs to the WOS" or "people who actually work at Sun who also just happen to work on Pg in their free time"?

Re:PostgreSQL's Core Developers @ Sun? by greg1104 · 2007-07-09 11:59 · Score: 2, Informative

Josh Berkus, the person whose blog was referenced here, is one of the PostgreSQL core developers, and he's currently employed by Sun to work on projects like this--which includes contributions back to the PostgreSQL development efforts. He's been doing a lot of work scaling performance upwards to larger capacity servers than the database has traditionally been deployed on.

Re:I do not think it means what you think it means by IdleTime · 2007-07-09 10:43 · Score: 1

PostgresSQL: 778.14
Oracle 10gR2: 6,812.79

Real close, I agree.. Postgres had a 40GB cache, Oracle used 21GB
See http://www.spec.org/osg/jAppServer2004/results/res 2006q4/jAppServer2004-20061121-00046.html

--
If you mod me down, I *will* introduce you to my sister!

Now THAT is a summary by p3d0 · 2007-07-09 10:51 · Score: 1

That is all.

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Tuning by Anonymous Coward · 2007-07-09 11:05 · Score: 0

I have heard from several different sources that postgres is much better than mysql etc, and all you have to do is to properly tune it. I have spent days converting databases, and trying to tune the parameters and queries. I have examined the query plans, and they seem to be pretty optimal to me, but still - I can't seem to beat MySQL.

So, why is it so hard to tune postgres? Would it be possible to make it autotune some parameters? Is there any outstanding tutorials out there somewhere were I can learn all this magic (yes, I have looked high and low but all I have found seem pretty basic)?

Re:Tuning by turing_m · 2007-07-09 11:31 · Score: 1

What sort of tables are you using in MySQL... InnoDB or something else? Only if you were using InnoDB tables would it be an apples to apples comparison.

--
If I have seen further it is by stealing the Intellectual Property of giants.

Re:on the playground... by Kalriath · 2007-07-09 11:23 · Score: 1

I have an intense dislike of Oracle. PL/SQL is a nasty language. And MSSQL and MySQL aren't even on the same level. MySQL is generally seen as one of the lower tier of "low-range database servers" while MSSQL is considered "mid-range". Oracle is considered "high-range", and it's ease of use is pretty much testament to that.

That being, the higher up the chain the harder to bloody configure and use.

--
For a site about things like basic rights, Slashdot users sure do like to censor "dissent".

Re:I do not think it means what you think it means by Jack9 · 2007-07-09 11:28 · Score: 1

Which makes the results pretty much useless.

Not necessarily.

Pretty much, it does.
Running on an Xbox versus an IPX, hmm. With no data, you assume the worst case scenario because any given variable is UNKNOWN. RAM, Processor, disk IO?

Correction, it certainly does.

--

Often wrong but never in doubt.
I am Jack9.
Everyone knows me.

Re:on the playground... by Anonymous Coward · 2007-07-09 11:34 · Score: 0

I'd add

4. You're supporting multiple front ends (e.g., web and native apps) or providing an SQL API and want to enforce business/data model logic at a common point, but generic SQL constraints aren't enough.

John.

Re:on the playground... by Vrallis · 2007-07-09 11:34 · Score: 1

I get the impression that the core of it wasn't written by MS way back when, though. And it sure wasn't built by the Windows team.
It was written by Sybase, bought by MS and forked somewhere around Release 10. MS then ripped out the good parts (i.e. the ability to run on real operating systems) and rebranded it. Can you tell I come from a Sybase shop?

Re:I do not think it means what you think it means by swusr · 2007-07-09 13:01 · Score: 2, Informative

Real close, I agree.. Postgres had a 40GB cache, Oracle used 21GB

PostgreSQL setup: 8 cores, 16GB RAM.

Oracle setup: 64 cores, 512GB RAM.

--
- Sw Usr

Non-native English speaker? by Estanislao+Mart�nez · 2007-07-09 13:23 · Score: 1

Do spell checkers actually catch that error?
I bet you the submitter's first language is German.
Don't be such a prick.

--

Are you adequate?

Re:I do not think it means what you think it means by timmarhy · 2007-07-09 13:31 · Score: 1

I think he is reffering to larry's insane licensing polices, such as dual core = 2 cpu's

--
If you mod me down, I will become more powerful than you can imagine....

Re:I do not think it means what you think it means by IdleTime · 2007-07-09 13:32 · Score: 1

Ooops, sorry, 1133 is the comparable result.

More than 50% higher performance.

--
If you mod me down, I *will* introduce you to my sister!

Re:on the playground... by Dan+Ost · 2007-07-09 13:52 · Score: 2, Interesting

The guidelines are pretty simple, but highly subjective.

Here are your options:
A. flat file
B. key-indexed (berkley db, dbm, gdbm)
C. feature/config light db (sqlite)
D. feature/config heavy db (postgrsql, oracle)

* Use A for configuration
* Use A if you have a small amount of data and can slurp it all into memory.
* Use B if you only access data via a known unique value and never compare records.
* Never use either A or B if the data is shared between processes unless file locking is okay.

For anything else, use C or D. If you've got a lot of familiarity with D, sometimes it's easier to use D than C, even if D seems like overkill. My recommendation is to get familiar with C and then worry about D once you understand the limitations of C. It'll help you appreciate what D gives you when you're ready for it.

--

*sigh* back to work...

Re:I do not think it means what you think it means by Anonymous Coward · 2007-07-09 14:09 · Score: 0

And, my personal favourite, "Oh, you bought all of your servers under our old licensing plan? Well we just changed it so please give us more money."

Re:on the playground... by jadavis · 2007-07-09 15:30 · Score: 1

Intelligent people use SQLite (or Berkeley DB) for apps that don't require a heavyweight DBMS.

Databases aren't just about performance. Type checking is a big reason why RDBMSs are more prevalent.

Type checking databases are more useful in the same way that a type-checked language is more useful than assembly. Assembly is very simple to program in, but all variables are just bytes. SQLite can't order by date, because dates are strings and don't order the same way. I'm not saying SQLite is bad (not at all, I think it's an interesting project), but simpler does not always mean better.

--
Social scientists are inspired by theories; scientists are humbled by facts.

Re:I do not think it means what you think it means by aliquis · 2007-07-09 15:31 · Score: 1

Well, for people who consider prices I guess it isn't since the oracle system cost more (exclusive software cost), thought then the choice of systems might have been "optimized" aswell.

Re:on the playground... by darnok · 2007-07-09 16:49 · Score: 1

I'm with the parent in terms of when to choose what...

For anything "small" (and you can put your own definition to that) and Web-based, SQLite is my first choice. Works well, solid, runs on any platform, and there's only a single file to worry about when it comes time to backup/restore so development version control is a non-issue. In particular, I'd suggest that SQLite would be a fine choice for just about any small-scale DB-backed Web site out there (again, choosing your own definition for the word "small").

For bigger stuff, I'd choose Postgres over MySQL, but that's more a personal preference than anything else at this point. There's pros and cons to both - coming from an Oracle background, I find the mentality behind Postgres easier to work with, but I'll happily work with MySQL when the situation requires it.

Oracle, DB2, MS-SQL - you've got to use them when the situation demands it. They all work great, scale well and have excellent support, but you pay for it. I wouldn't even think about using them with a startup, but many big corps won't consider anything else.

Re:on the playground... by Anonymous Coward · 2007-07-09 16:51 · Score: 2, Informative

SQLite has before insert triggers. Why doesn't MSSQL?

Re:I do not think it means what you think it means by westyx · 2007-07-09 17:14 · Score: 1

Try following the HTML link - http://www.spec.org/jAppServer2004/results/res2007 q3/jAppServer2004-20070606-00065.html

Is more indepth on the systems and tuning parameters used.

Re:on the playground... by olabri · 2007-07-09 19:14 · Score: 1

DB2 is used as (optional) storage engine on later versions of Lotus Domino, which makes it widely used in larger enterprises, not necessarily on large irons. http://www-142.ibm.com/software/sw-lotus/products/ product4.nsf/wdocs/nsfdb2

Why not have opensource benchmark? by TrueKonrads · 2007-07-09 19:54 · Score: 1

I think this sounds pretty obvious, probably has been done:

Have an open source suite that allows anybody to verify the test results. Let the experts tweak the configuration for each database. Provide free downloads to databases and/or easy install. This way entire community is tweaking the configurations for different servers and posting back results.

Not only this would be a more or less "proper" benchmark, but also the secrets of db tuning, if there are any, will be revealed to public. Vendors get to brag if they are in top3, rest get a nice motivation to improve.

Just my $2,000,000

--
Lone Gunmen crew.

Re:on the playground... by Senjaz · 2007-07-09 20:45 · Score: 1

Sadly has a certain amount of truth to it. I've been trying to convince my company to use PostgreSQL for years. Yet I'm still stuck having to use MSSQL. The it's free arguement doesn't work very well. MSSQL is free for developers when you already have the rather expensive MSDN subscription, and you can package MSDE with a runtime for free.

Sure it has a pile of restrictions and performance sucks but management doesn't see it that way. Nobody every got fired for buying IBM^W Microsoft...

--
Don't blame me - this .sig had steal me written all over it.

Mod parent(s) up, up, up by Anonymous Coward · 2007-07-09 21:49 · Score: 1, Informative

This is the best post I have read in ages. I am old school. I like DB's and when doing requirements gathering am already thinking in terms of tables and relational schema. In a really well designed relational schema it is possible to do very complex operations without ever resorting to plsql/transact etc. simply using plain sql (insert/update etc). I hate cursors/recordsets. As soon as I see these constructs in the db I generally find a VB/java programmer messing where he is not wanted.

PS: Oracle is not Fortran - PL/SQL is derived from Ada and therefore shares much of the Pascal/Modula feel.

More info on comparison by mahipv · 2007-07-09 23:20 · Score: 1

http://blogs.sun.com/tomdaly/

Re:on the playground... by eneville · 2007-07-09 23:34 · Score: 1

Agree, and they're all SQL based anyway. there is stuff on mysql/postgres that doesn't work on mssql. select * from foo limit 10, offset 10 etc. mssql just doesn't have pagination OOTB. that's just one thing that's so stupidly crap about it. pagination is so basic. in most cases the programmer just gets the entire lot back and only processes the required record sets.

--
Why UNIX?

ORM / Denormalizing etc. by dusty123 · 2007-07-09 23:37 · Score: 1

At first, many situations simply don't need a complex database schema. Think of a contact or email list in a simple web application. Such applications often need only one table. And therefore MySQL suffices. I would not call that bad design. Moreover, don't underestimate MySQL! Recent versions are not so much behind PostgreSQL and they perform reasonably well.

It's true that the SQL knowledge of many developers is quite limited. But for this reason, and, more important, due to the bad compatibility of SQL and OO design, ORM libraries such as Java Hibernate or Python's SQL Alchemy were developed that handle the SQL part. In this case we still need a decent database schema, but it may be described in the framework, which looks more familiar to developers. However, don't underestimate the generated SQL code! Complicated SQL statements may be generated and ORM libaries may make use of referential integrity, check constraints etc. and many other features a mature database has to offer. If designed right, a ORM-based design will have a decent database schema and will make use of many "database goodies".

ORM-generated SQL code may therefore stress the database in many ways - and therefore we still need good databases - regardless of the developer's SQL knowledge.

One pitfall we always seem to step into is that in case a good, new technology is invented, we think of it as a solution to all our problems. SQL is no different here, in the past we were so proud of designing normalized schemas that we forgot about the scalability. Such schemas - although perfect - are often today the reason for performance problems. Sadly, we often overreact and tend to throw away old technology and follow the new hype. And that's what we can see today, such as thoughtlessly vandalizing database schemas due to performance problems. Nevertheless, it is often appropriate to denormalize a database schema to improve the performance. If done right, I would not call this "vandalizing".

Re:I do not think it means what you think it means by perfczar · 2007-07-10 03:06 · Score: 1

Hardware concerns are valid. But more importantly, SPEC jAppServer is not a database benchmark. It's a Java App Server benchmark with a database component. To be fair, you have to change only the database, not the app server, hardware, and everything else too. Not to say that the conclusion is wrong, but it's very weird to say anything about database performance in this circumstance.

In our tests, which are database-only and data-warehouse focused, PG seems to be 3x-5x slower than Oracle on the exact same box, as it does not do data compression and is not as CPU-efficient. Which is quite respectable given the price difference, but may be a big enough gap to keep people who have money to spend coming back to commercial databases.

Uh, no, actually, there's not by p3d0 · 2007-07-10 07:35 · Score: 1

Obviously the vendor puts together the best submission they can for their own product. Oracle did the same for theirs.

--
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....

Re:damn you must be a real shitty database admin by DuckDodgers · 2007-07-11 02:38 · Score: 1

Get real, that does happen. We have a product, which is a set of surveys patients or their caregivers can take. Each survey event represents a survey taken by a patient or caregiver regarding one particular patient. Each survey event results in one associated report getting printed. Survey event completion also generates scores, and score values can trigger up to 2 additional reports. The additional report behavior is governed by a score to report mapping table. Also, each survey can have alternate reports specified based upon site and product settings.

When an administrator logs into the system, an 11 table join is done to generate the list of survey_reports. For sites with 3000 patients and 4500 surveys, it takes Postgres less than a second on old hardware.

Slashdot Mirror

First "Real" Benchmark for PostgreSQL

275 comments