PostgreSQL 8.3 Released

Will it be used? by Anonymous Coward · 2008-02-04 09:13 · Score: 4, Informative

I'm a postgresql fan, I've considered it a superior database for years.

However, it seems every client I come into contact with (I am a freelance software designer) seems to believe mysql is the only open source database available and certainly the best one for all jobs.

Mysql is great (or at least, was great) for fast connection times and speed but for a stable, feature-rich excellent database, postgresql has always been ideal.

It's just a shame no one seems to be aware of it.

Re:Will it be used? by cheater512 · 2008-02-04 10:25 · Score: 2, Informative

People like myself who design software requiring a database usually prefer speed over features.
Thats why MySQL is usually chosen.
Re:Will it be used? by LurkerXXX · 2008-02-04 11:56 · Score: 2, Informative

You people who design software should read up on this thing called data integrity, and enforcement of foreign key constraints.

MySQL is pretty bad at those, but if you use an innodb table and try to use them, you find it's no faster than postgresql. And still missing many many features that postgresql gives you.
Re:Will it be used? by jocknerd · 2008-02-04 12:01 · Score: 2, Informative

If you want raw speed, you should be using SQLite instead of MySQL. Plus, are you using the InnoDB database engine? If you are, then you are not getting MySQL's raw speed. That only comes with their native database engine. You know, the one that doesn't support much of SQL at all.

what do you mean by cross-database join? by Reality+Master+201 · 2008-02-04 09:40 · Score: 2, Informative

Are you thinking like you'd do in SQL server (IIRC) or MySQL, where you have a db reference in the table list (i.e., SELECT * from db1.table1, db2.table2 WHERE [join clause])?

you'd probably just want to use a schema for that; the concept maps more or less the same way.

Re:Multi-master replication built in by Acheron · 2008-02-04 09:48 · Score: 3, Informative

The 8.3 release notes list the Bucardo project http://bucardo.org/ for multi-master replication. I haven't used it... is there something that it is lacking that you think would be addressed by bringing it into the core code base?

Re:asynchronous commit by Anonymous Coward · 2008-02-04 09:51 · Score: 2, Informative

Asynchronous commit is very useful in applications where the thing that's important about the data is its statistical distribution, and not the individual data points per se.

Re:asynchronous committ by Simon+(S2) · 2008-02-04 10:02 · Score: 3, Informative

Sure. Like so many times in software development "only you can decide whether data loss is acceptable to you. But there are many classes of applications, including high-speed data ingest programs whose only goal is to get the stream of data into the database, where commit-but-don't-wait is not only acceptable but very desirable performancewise." (Tom Kyte)

--
I just don't trust anything that bleeds for five days and doesn't die.

Re:asynchronous committ by ivoras · 2008-02-04 10:13 · Score: 2, Informative

Are you sure this is such a disaster? As far as I can tell this only means that executing "COMMIT" doesn't block (wait) until the commit has actually happened but returns immediately, and the actual operation is performed "later". The data still goes through the journal (WAL), is still fsynced when needed, etc.

--
-- Sig down

Re:Time for a cross-DB comparison by naasking · 2008-02-04 10:15 · Score: 2, Informative

A couple such comparisons already exist. They may be a year or two out of date however.

--
Higher Logics: where programming meets science.

Re:Time for a cross-DB comparison by TheNarrator · 2008-02-04 10:31 · Score: 3, Informative

Here's a feature matrix of new features vs. old versions. It's easy to see that 8.3 is a huge upgrade.

http://www.postgresql.org/about/featurematrix

Re:Cross Database Joins?? by prog-guru · 2008-02-04 10:40 · Score: 3, Informative

I don't know, seems to me a cross database join is neither simple nor proper structure. If you are stuck with it schemas would probably work, painless in Active Record, perl DBI requires 'schema.table' syntax.

--

chris@xanadu:~$ whatis /.
/.: nothing appropriate.

Re:asynchronous committ by greg1104 · 2008-02-04 10:46 · Score: 4, Informative

Why would you risk losing data for speed? There are so many ways to tune things and speed things up without taking such drastic measures.

The new async commit feature bypasses the requirement that records physically hit disk in order to complete a commit. If you must wait for a disk commit (typically enforced by fsync), the maximum number of true commits any one client can do is limited by the rotation speed of the hard drive; typically an average of around 100/second for a standard 7200RPM disk with PostgreSQL. There is no way whatsoever to "tune things and speed things up" here; that's how fast the disk spins, that's how fast you get a physical commit, period.

In order to accelerate this right now one needs to purchase a disk controller with a good battery-backed disk controller and pray it always works. If it doesn't, your database might be corrupted. With async commit, you can adjust the commit rate to something your disks can keep up with (say 50/second) just with this software feature while still allowing a write rate much higher than that, and at no point is database corruption possible (from this cause anyway). This makes people who want to use PostgreSQL in things like shared hosting environments have an option that allows heavy writes even for a single client while having a reasonable data integrity policy--only server crashes should ever lose you that brief period since your last true commit. That's a fair trade for some applications (think web message boards for example) and lets PostgreSQL be more competitive against MySQL based solutions in those areas.

Re:How quickly they turn on you .. by glwtta · 2008-02-04 10:48 · Score: 2, Informative

8.2 was released over a year ago - this is not a minor revision.

--
sic transit gloria mundi

Re:Time for a cross-DB comparison by 0racle · 2008-02-04 11:12 · Score: 2, Informative

PostgreSQL vs. MySQL WIki

Seems to be features only, no performance.

--
"I use a Mac because I'm just better than you are."

Re:Cross Database Joins?? by Qzukk · 2008-02-04 12:00 · Score: 2, Informative

the simple solution is often easier and faster than the complex solution

When you want something that walks like a duck and talks like a duck, the simplest, fastest, and easiest solution is to get a duck. What you want is done by schemas in postgresql. If you're really doing the separate databases for "performance" reasons, then one presumes that at some point you're going to be putting the databases on separate servers, in which case you'll be wishing you had started with dblink in the first place.

What you're asking for may be "simple" for you, but what about the server side? Rewriting the entire database engine to have a monolithic superdatabase to store system catalogs and authorization information for the databases underneath is none of "simpler, faster or easier", while it does the exact same thing for you as having a "super" database with schemas underneath it using the search_path setting for applications that you don't want to rewrite with schema.table syntax for all of their queries.

--
If I have been able to see further than others, it is because I bought a pair of binoculars.

Why PostgreSQL doesn't have more of the market by einhverfr · 2008-02-04 12:11 · Score: 2, Informative

I have used PostgreSQL as my primary db since 2000 (version 6.5!) and I have watched it for a while.

PostgreSQL had a number of problems in the past which made it hard to work with including:
1) No ALTER TABLE DROP COLUMN support and other things needed for prototyping (fixed in 7.2 iirc)
2) Issues with dependency tracking in text SQL dumps (fixed in 8.0) meaning that some times one had to hack dumps to get them to restore properly.
3) maintenance tasks required exclusive locks on tables (corrected sometime in 7.x)
4) other things which generally made it somewhat user-unfriendly.

Note that all of the above issues have been corrected as of three years ago. However by that point a lot of the small less serious appliations were more or less wedded to MySQL. For large, shared databanks, PostgreSQL and Ingress II (and maybe Firebird) are the only serious choices. MySQL fails to be useful as soon as you end up requiring the sort of thing described in the title of Codd's original paper....

--

LedgerSMB: Open source Accounting/ERP

Re:asynchronous committ by Anonymous Coward · 2008-02-04 13:11 · Score: 1, Informative

the maximum number of true commits any one client can do is limited by the rotation speed of the hard drive; This assumes your disks are configured for write-through instead of write-back. I'd wager there are quite a few folks who don't understand how the difference relates to database integrity (I was at an Oracle training session recently, and the instructor had never heard of such a thing). A lot of times, the db guys and the systems guys live in completely different silos, and it's very probably that the disk systems have been tuned for maximum performance, without regard for the effect on data integrity. In this scenario, the db guys/gals may be, and often are, completely deluded about the solidity of their database (but they are probably quite happy about the performance).

Disks configured for write back with a big cache can handle db writes very quickly. An fsync won't save you, it just asks the controller to verify the write took place. If the controller lies, well, too bad. From the fsync man page: "If the underlying hard disk has write caching enabled, then the data may not really be on permanent storage when fsync() / fdatasync() return."

All that said, I do think there is a place for async commits / write-back caching. Not all data going into a database is mission critical; it depends on the app. This will be a real boon for packet logging, for example. If you work in a bank, well, that's obviously a different story.

Re:long live postgres by greg1104 · 2008-02-04 13:23 · Score: 3, Informative

This is one of those things that's painful only until you've crossed a certain threshold of information, then it never bothers you again. It's better now but still harder to put the pieces together than it should be.

Start with the documentation on creating a cluster: http://www.postgresql.org/docs/current/static/creating-cluster.html In 8.3 the default of using auth you mentioned has been removed, for the reasons you described. So it now runs as unsecured for local users by default and you have to worry about this yourself, which since it reduces the frustration at getting started was deemed an improvement.

That page suggests some options you can pass to initdb to adjust the default security level. Now you want to look at the initdb docs: http://www.postgresql.org/docs/current/static/app-initdb.html See that if you use -W/--pwprompt (same thing) you can assign a password to the database superuser at cluster creation time. If you do that, you can now change the default authentication scheme to password-based (-A md5 passed to initdb will do that), you'll be secure, and you'll have one user you can login as (postgres) to create more.

To see what other authentication methods are available and to learn what your options are look at http://www.postgresql.org/docs/current/static/client-authentication.html The one you really need to dive into is pg_hba.conf which is the magic text file to edit here. A new one of those will be put in the base directory of your new database cluster. Open that file up, look at the documentation, and you'll need to add a line to add network support like one those in the examples. Probably something like

host postgres all 192.168.12.0/24 md5

(allow access to anybody on the 192.168.12 subnet access the database with a password)

That should get you past the ugly initial hurdles. The next document you may need is how to add more users: http://www.postgresql.org/docs/current/static/sql-createrole.html

again look at the examples first and then backtrack to the parameters, will make more sense that way. After that you'll want to create more databases with createdb: http://www.postgresql.org/docs/current/static/app-createdb.html

And then you should be able to find your away around from there using the psql command line tool.

Note that once you get past accepting connections over the network, you could use a tool like pgAdmin III to handle the rest of this work using a slicker interface. There's even a copy of it bundled with the Windows installer you can use on such a client to administer a remote server running a real OS. It's of course possible to install pgAdmin manually on other platforms as well, see http://www.pgadmin.org/ for other verions (note that binary packages for platforms like RPM don't show up in every release, you have to go back to v1.8.0 to get the last full set of packages).

Re:Does it support multithreaded queries? by Just+Some+Guy · 2008-02-05 03:19 · Score: 2, Informative

It would seem not to. Yeah, I wish it had that, too. The other posters who keep telling you to get faster IO miss the idea of having extra CPUs handling locking, cache management, etc. so that even single running queries are faster.

--
Dewey, what part of this looks like authorities should be involved?

Re:whuh? by ahodgson · 2008-02-05 04:34 · Score: 2, Informative

A PostgreSQL schema is just a namespace qualifier; it functions just like MySQL's cross-database joins and is conceptually similar. It isn't a full copy of your database DDL.

Apples and oranges by einhverfr · 2008-02-05 07:09 · Score: 2, Informative

One of the issues with this discussion is that there is a disjunct in the terminology used by MySQL users and that used by the rest of us.

MySQL uses "database" and "schema" synomymously (note that their "information schema" is a separate "database"). In that sense, PostgreSQL has long had cross-schema joins in the same way MySQL does. It is just slightly harder to set things up so that you create tables in the right schemas. (hint: SET search_path='schema_name');

In this way, I do a *lot* of work using cross-schema queries. They work like a charm on PostgreSQL.

What PostgreSQL calls a "database" is analogous to the entire cluster of "databases" on MySQL. There is no analogy to "PostgreSQL cluster" on MySQL (i.e. a set of databases controlled by the same process, each of which contains multiple schemata). For that matter, last time I checked, there was no equivalent on Oracle either.

So "PostgreSQL doesn't have an equivalent to MySQL's cross-db queries" is an issue of terminology rather than substance.

At the same time, as nconway points out, PostgreSQL's MED solutions do not optimize well across distributed queries (i.e. where parts of the queries have to go through connections to other databases/servers). For MySQL users, these are like Federated tables, and I would expect similar (or worse) optimization problems in this area on MySQL. Hence when you are trying to use PostgreSQL as a distributed database, you are in for headaches. Fortunately this is not what the OP asked for :-)

--

LedgerSMB: Open source Accounting/ERP

Re:New HOT, faster Postgres by byoung · 2008-02-05 07:41 · Score: 2, Informative

I guess nobody appreciates humor here. Well, not if it's against the fanboys.

I'll stand by it though-- switching to MySQL from Postgres made my life significantly simpler:

1) you can install MySQL easily
2) MySQL has great vendor support
3) my experience is that MySQL performs significantly better in the general case (i.e. I'm not spending my entire life tweaking performance)

Many an honest thing were said in jest, I suppose.

23 of 286 comments (clear)