FYI, there are sucj recovery tools for PostgreSQL, but they are kept hidden away from general users and generally only considered useful to folks doing development who need to inspect what's in the table files. Do a google search for pgfsck...
The main thing point in time recovery allows you to do is to backup your database to another machine, and ship the WAL files there. The, when a meteor hits your main production server in New York, you can replay the wal files on the backup machine in San Francisco and be back up to the exact point the machine in NY took a strike and shipped the last few bytes of data across the wire in the WAL files.
Actually, they both handle load fairly well, it's how they handle overload that differs. PostgreSQL generally gets slower and slower until you think it's hung (it's not) and if you take the load off, it will finish up all the transactions etc...
MySQL, on the other hand, will tend to die, i.e. the mysqld will crash and stop running.
Now, that was a year or so ago, with v 3.x of MySQL and v 7.2 of PostgreSQL that I found the two to behave like that.
Actually, you can kill -9 postgresql, or pull the power plug in the middle 1,000 write transactions, and if you have a journaling file system, it will come right back up at boot up.
This isn't a theory, I've tested it. (note either you have to turn off the lying cache in IDE drives or use SCSI for this kind of power off reliability.)
However in the world of enterprise RDBMS's, that's one of the key points about stored procs is the ability to cache the execution plan of a stored proc; Oracle, MSSQL, Sybase, etc.
And it's a right bad idea.
You create your stored proc today, and it has a selectivity of 95% of a 10000 row table. The query plan right uses a seq scan on a small table.
Next year, you have 10,000,000,000 rows in that table, and the selectivity is now 0.05%. Should we use the same query plan?
And, and, and, and, I talked to the people in accounting and they said I needed to talk to you, but, you see, I haven't gotten my paycheck in several months now, and, if I don't get some satisfaction, I will burn this place down.
And another thing, you took my red stapler, but you see, it's the only one that doesn't jammed. I saved some of the old staples it uses so I don't have to switch to the new ones that jam up all the time.
Re:Does the on-disk format still change at every r
on
PostgreSQL 8.0 Enters Beta
·
· Score: 2, Insightful
Actually, it's not the format of the data store so much as changes to the system catalog, which are put in place during initdb.
There has been a project for upgrading in place, but it just hasn't had enough man power thrown at it to be a viable solution.
It's obvious from your reply you aren't very familiar with PostgreSQL.
1: PostgreSQL never stores a query plan, they're always dynamically generated, as the underlying data may change, and why go to the bother of having a cost based query planner if you're not going to use it.
2: PostgreSQL has set returning functions.
3: PostgreSQL does not have stored procedures, it has user defined functions. These functions can be used in place of what other databases call stored procedures.
True, but if you're too stupid to notice that you're logging into yahoo.somebodyelsesdomain.com when you're giving out your info then you probably shouldn't be allowed near the internet.
The real problem is the first click, the one that delivers a payload that just takes over the box, and that problem is for the most part, non-existent in any OS other than Windows / IE
No, I understand that kernel and process scalaing are inexorably linked at the hip. A kernel with no processes running on it isn't going to accomplish much,
I run ximiam on linux, so this affects me how again? Poor bastards out there trying to keep from having their windows box zombified, I dont' know how you even read email on that platform without getting an ulcer.
Have you ever submitted code to Linus, Alan, et. al.? They're total code nazis. Many many patches never make it in because it's insecure / poorly indented / poorly written.
I.e. the kernel hackers don't just toss patches at the kernel and see what sticks, they review patches very carefully. And toss out a lot of them.
Solaris scales to hundreds of processors out-of-the-box.
BZZZT! But thanks for playing. Solaris may well do better out of the box on hundreds of processors. However, any two installations using hundreds of processors are different enough that you'll have to do plenty of tuning on either kernel, Solaris or Linux, if you want them to run well.
PostgreSQL needs fair amounts of shared memory, but being process driven will likely scale better under Linux.
Java virtual machines, being threaded, will likely scale better on Solaris than on Linux.
They're different tasks that can use hundreds of CPUs for high load, but they each tax the OS differently, and therefore, neither one can scale properly "out of the box" beause there's simply no such thing.
Solaris may be a little better tested in this space, and a little simpler to configure, but it's not some magic solution you just drop in place and watch it work either.
If it takes fewer people to make quality open source software than it does to make quality closed source software, then Open Source development, by definition, must be more efficient. It uses fewer resources, and lowers cost.
Sounds like he's busy complementing Open Source / Free Software and he doesn't even know it.
Well, if you or anyone you know has a patch to do it, I'm sure it would be accepted as long as it worked well on all the major OSes that pgsql supports. But it just isn't a big enough fish to fry what with PITR, nested transactions, tablespaces and all the other improvements being made. But making a threaded process model work right now would be a LOT of work for a fairly small gain, given PostgreSQL's focus on primarily supporting large numbers of simultaneous users.
But once all the major parts of the SQL92 and SQL99 specs are implemented, I'm sure someone will start working on the more esoteric performance issues like this.
Multi threading isn't likely to show up any time soon. PostgreSQL's primary focus is on correct and reliable behaviour and it isn't going to change, and the currecnt view is that threads are not quite there yet in all the OSes they need to be there in, and that they aren't the cureall many folks believe them to be.
FYI, there are sucj recovery tools for PostgreSQL, but they are kept hidden away from general users and generally only considered useful to folks doing development who need to inspect what's in the table files. Do a google search for pgfsck...
The main thing point in time recovery allows you to do is to backup your database to another machine, and ship the WAL files there. The, when a meteor hits your main production server in New York, you can replay the wal files on the backup machine in San Francisco and be back up to the exact point the machine in NY took a strike and shipped the last few bytes of data across the wire in the WAL files.
There's a "sleepy vacuum" thing that was almost added to 7.4 where you can set vacuums to sleep a few milliseconds every few blocks.
It was added, it's just set to 0 by default, disabling it. But it's still in the code.
and provided one does NOT expose the Postgres OIDs, which _are_ wrong-headed in third manifesto terms.
FYI, OIDs are now optional, and may, in some future release, default to being off on a table at creation time.
To be a good OLAP database, ... index-only queries,
PostgreSQL cannot do this due to it's design. It's MVCC datastore system means that you can only get the right number out of the actual table.
PostgreSQL ONLY caches query plans for prepared queries, and only for that connection. It doesn't automagically cache query plans or result sets.
Actually, they both handle load fairly well, it's how they handle overload that differs. PostgreSQL generally gets slower and slower until you think it's hung (it's not) and if you take the load off, it will finish up all the transactions etc...
MySQL, on the other hand, will tend to die, i.e. the mysqld will crash and stop running.
Now, that was a year or so ago, with v 3.x of MySQL and v 7.2 of PostgreSQL that I found the two to behave like that.
Actually, you can kill -9 postgresql, or pull the power plug in the middle 1,000 write transactions, and if you have a journaling file system, it will come right back up at boot up.
This isn't a theory, I've tested it. (note either you have to turn off the lying cache in IDE drives or use SCSI for this kind of power off reliability.)
However in the world of enterprise RDBMS's, that's one of the key points about stored procs is the ability to cache the execution plan of a stored proc; Oracle, MSSQL, Sybase, etc.
And it's a right bad idea.
You create your stored proc today, and it has a selectivity of 95% of a 10000 row table. The query plan right uses a seq scan on a small table.
Next year, you have 10,000,000,000 rows in that table, and the selectivity is now 0.05%. Should we use the same query plan?
And, and, and, and, I talked to the people in accounting and they said I needed to talk to you, but, you see, I haven't gotten my paycheck in several months now, and, if I don't get some satisfaction, I will burn this place down.
And another thing, you took my red stapler, but you see, it's the only one that doesn't jammed. I saved some of the old staples it uses so I don't have to switch to the new ones that jam up all the time.
Actually, it's not the format of the data store so much as changes to the system catalog, which are put in place during initdb.
There has been a project for upgrading in place, but it just hasn't had enough man power thrown at it to be a viable solution.
Maybe it's time for you to volunteer?
Btw, does PostgreSQL have row-level locking yet?
...
You mean:
select * for update from table
That's been there for some time now.
Inheritance for tables, so you can agregate different types of data without needing to explicitly set all kinds of primary keys.
Unfortunately, the inheritance doesn't currently handle things like pks and unique indexes across the inherited tables. sigh.
Stop grumbling and go here:
c /F AQ/FAQ_DEV.html+Developers-FAQ
http://developer.postgresql.org/readtext.php?sr
and implement it. This ain't no commercial app, it's open source.
It's obvious from your reply you aren't very familiar with PostgreSQL.
1: PostgreSQL never stores a query plan, they're always dynamically generated, as the underlying data may change, and why go to the bother of having a cost based query planner if you're not going to use it.
2: PostgreSQL has set returning functions.
3: PostgreSQL does not have stored procedures, it has user defined functions. These functions can be used in place of what other databases call stored procedures.
Just FYI.
How is Bill supposed to afford that ivory back scratcher if you won't buy another version of Windows you don't need?
True, but if you're too stupid to notice that you're logging into yahoo.somebodyelsesdomain.com when you're giving out your info then you probably shouldn't be allowed near the internet.
The real problem is the first click, the one that delivers a payload that just takes over the box, and that problem is for the most part, non-existent in any OS other than Windows / IE
No, I understand that kernel and process scalaing are inexorably linked at the hip. A kernel with no processes running on it isn't going to accomplish much,
I run ximiam on linux, so this affects me how again? Poor bastards out there trying to keep from having their windows box zombified, I dont' know how you even read email on that platform without getting an ulcer.
I would not be surprised if SGI got NASA to agree to not give the source to anybody else (I believe that is legal within the GPL, but IANAL).
No, it's not legal, it's adding restrictions to redistribution, which the GPL explicitly forbids.
However, knowing SGI, and the way they've behaved, I would be very surprised if they tried to pull something like that.
The fact is, SGI has benefitted greatly from the GPL'd linux and it would be quite rude to try and make their changes proprietary.
Have you ever submitted code to Linus, Alan, et. al.? They're total code nazis. Many many patches never make it in because it's insecure / poorly indented / poorly written.
I.e. the kernel hackers don't just toss patches at the kernel and see what sticks, they review patches very carefully. And toss out a lot of them.
Solaris scales to hundreds of processors out-of-the-box.
BZZZT! But thanks for playing. Solaris may well do better out of the box on hundreds of processors. However, any two installations using hundreds of processors are different enough that you'll have to do plenty of tuning on either kernel, Solaris or Linux, if you want them to run well.
PostgreSQL needs fair amounts of shared memory, but being process driven will likely scale better under Linux.
Java virtual machines, being threaded, will likely scale better on Solaris than on Linux.
They're different tasks that can use hundreds of CPUs for high load, but they each tax the OS differently, and therefore, neither one can scale properly "out of the box" beause there's simply no such thing.
Solaris may be a little better tested in this space, and a little simpler to configure, but it's not some magic solution you just drop in place and watch it work either.
If it takes fewer people to make quality open source software than it does to make quality closed source software, then Open Source development, by definition, must be more efficient. It uses fewer resources, and lowers cost.
Sounds like he's busy complementing Open Source / Free Software and he doesn't even know it.
Well, if you or anyone you know has a patch to do it, I'm sure it would be accepted as long as it worked well on all the major OSes that pgsql supports. But it just isn't a big enough fish to fry what with PITR, nested transactions, tablespaces and all the other improvements being made. But making a threaded process model work right now would be a LOT of work for a fairly small gain, given PostgreSQL's focus on primarily supporting large numbers of simultaneous users.
But once all the major parts of the SQL92 and SQL99 specs are implemented, I'm sure someone will start working on the more esoteric performance issues like this.
Multi threading isn't likely to show up any time soon. PostgreSQL's primary focus is on correct and reliable behaviour and it isn't going to change, and the currecnt view is that threads are not quite there yet in all the OSes they need to be there in, and that they aren't the cureall many folks believe them to be.