PostgreSQL 9.1 Released
With his first posted submission, sega_sai writes "The new version of open source database PostgreSQL was released. This version provides many important and interesting features such as synchronous replication, serializable snapshot isolation, support of per-column collations, K-nearest neighbor indexing, foreign data wrappers, support of SELinux permission controls and many others. The complete list of changes is available here"
So...how does PostgreSQL compete with Oracle nowadays as far as features go (specifically, spatial, and data-guard-like replication)? Anybody here tried making the switch?
If you wish to use their replication implementation to increase performance, you will probably have to look elsewhere, I'm afraid. In the event that your primary server fails, you are required to promote one of the existing slaves to be the new primary server and all other slaves will require a fresh data dump from the new master. Maybe in another year (when 9.2 gets released) it will be ready for the masses.
I am not excited about any of these changes unfortunately, they are somewhat specialized, though having synchronous replication and serializable transaction isolation sounds more useful than other stuff.
But there are real things that are missing. Most obvious is distributing of one SQL request into parallel processes or threads to speed up query execution on multi-core systems (which are all multi-core today). The other is the entire issue of attempting to calculate execution time and failing in various cases in the planner, like the really sad cases of completely mis-handling of the mergejoin estimates, which then forces people to set enable_mergejoin to false unfortunately, it's a sledgehammer approach, but otherwise things that can execute in a few milliseconds can take tens of seconds and even minutes instead.
There are so many ways to improve performance and really kick it up, and instead there are more features added. I think database performance is now more important for PostgreSQL than features (unless this means introducing parallelization of single SQL requests.)
Otherwise it's a good database, it already provides tons of features. The one weird thing that I find though, is that for replication or hot stand by or just for creating a dynamic backup, the segments that are written to the disk are always of fixed size.
You can modify the size, which is 16MB by default, but you can only modify the size when you configure the source code before compiling it: configure --with-wal-segsize=1 - this configures the segments to 1MB, which allows the second drive to last that much longer if all you are doing is using a second drive to keep dynamic backup (and that asynchronous backup method, by the way, the problem that they are solving with "synchronous replication", it's that you either have these segments fill up, and then the segment is written to disk, or you wait until time expires for segment to be written to disk if you set checkpoint_timeout). I imagine treating fixed sized segments is easier than generating segments that are of exact size equal to amount of data that was produced in a time period, but it's a waste of disk though.
The other big thing that I would love to have in a database is ability to scale the database to multiple machines, so have a logical database span multiple disks on multiple machines, have multiple postgres processes running against those multiple disks, but have it all as one scalable database in a way that's transparent to the application. That would be some sort of a breakthrough (SAN or not).
You can't handle the truth.
Just one small nitpick...sqlite is really meant as an embedded database into an application, it's not a full-fledged database like any of the others mentioned (it doesn't have networking, for example). I suppose you could be scaling up from an embedded sqlite db, but that suggests your application has gotten so big that an external database is necessary.
It's also one of the backing store options for Apple's Core Data framework.
I hate MSSQL as much as anyone, but it does (in later versions at least) support cursors and ROW_NUMBER. It is a bit silly to not support FETCH FIRST in 2011, but hey, it's doing better than Oracle.
Finally! A year of moderation! Ready for 2019?
Actually, MSSQL11 will support FETCH FIRST.
Check it out (you'll have to scroll a couple pages down): http://www.codeproject.com/KB/database/Denali_Tsql_Part_2.aspx#3.3
Oops, I forgot to also mention that it will support LIMIT/OFFSET as well, which is noted in the same link. Sorry for double-post.
The odds of this happening are NULL...
"I say we take off, nuke the site from orbit. It's the only way to be sure."
Already we can tell you are doomed to failure. You are writing a script while ignoring the fact that this functionality already exists. Just cat /dev/random into your db.
spectateswamp, is that you? :)
A successful API design takes a mixture of software design and pedagogy.
But I run Debian Stable on all my servers.
Insensitive clod!
What installer?
They have rpms and debs, what more could you want?
I work with Postgres. I'm "for" Postgres. I think it's great. But you know what? I barely understand what most of these features are. And of the ones I understand, I have no plans of using.
Hate to be a troll, but databases are boring. And Postgres is boring. The database people I know are boring. The guy I know with the most Postgres knowledge I know, is someone I respect as a worker, but is also the most passionless person I have ever met. No one wants an exciting data storage product. They want it to be "mature" and "reliable" and "predictable." I bet that the better Postgres gets, the more boring it'll get. And these exciting "noSQL" databases will all die.
Democracy Now! - your daily, uncensored, corporate-free
I prefer source and my script
You can't handle the truth.
Don't run Postgres on Windows. That would be mindbogglingly stupid.
2 things:
1. You need more ram
2. you forgot shmall, which you are probably not hitting with so little ram.
that's part of install procedure I have for little boxes, that go into separate stores in a chain. They have maybe 2GB, and shmall is not set at all.
For the central servers the settings are different altogether, it's also 48-96GB RAM.
You can't handle the truth.
The other big thing that I would love to have in a database is ability to scale the database to multiple machines, so have a logical database span multiple disks on multiple machines, have multiple postgres processes running against those multiple disk
This exists for Postgres in the form of Yale's HadoopDB project: http://db.cs.yale.edu/hadoopdb/hadoopdb.html http://radar.oreilly.com/2009/07/hadoopdb-an-open-source-parallel-database.html
as well as for commercial forks of Postgres such as EMC's GreenPlum.
Because they already have a Windows server and they don't want to buy another machine?
Things are improving, although a bit slowly. Plus there are windows installers available at enterprisedb.com.
Use sqllite. If you need a real DB install it on a linux/BSD box. No need for the frontend to run on the same machine as the DB.
Eh? Where does that script mention amount of RAM? Shared buffers is usually recommended to be 25% of RAM, but that's a recommendation, not a rule. For some workloads, keeping small shared buffers is actually a good thing.
MGRID is also a PostgreSQL fork that scales to multiple machines: http://www.mgrid.net/
apt-get install -y libreadline5-dev /usr/bin/make /usr/bin/gmake
/usr/local/pgsql/postgresql-9.0.4.tar.gz /usr/local/pgsql/postgresql-9.0.4.tar
apt-get install -y zlib1g-dev
apt-get install -y zlibc
ln -s
gunzip
tar xvf
su postgres
gmake check
exit
echo >> /etc/sysctl.conf
# in postgresql.conf
chmod aguo+wx /etc/init.d/postgres
ln -s /etc/init.d/postgres /etc/rc1.d/K02postgresql /etc/init.d/postgres /etc/rc2.d/S98postgresql
...
ln -s
For god sake read the manual of your OS and use the packages!
For god sake? You think god would like me to use packages instead?
You can't handle the truth.
There is no "useless" crap being introduced here, these are higher end features that many need and have been waitnig. Just because you as say a small business or hobbyist or startup venture need a small set of database features, remember there are others who are needing bigger features, and as your needs grow over time so you may need these things too.
You can stick with a particular postgresl version of a distribution that is still getting fixes and patches, for example the postgresql 8.1.9 in Redhat / Centos / Scientific Linux 5.x
Meanwhile, there are applications in big business that need and want the synchronous replication. There are geolocative applications needing nearest neighbor indexing. There are applications that handle multiple character sets that need multiple collations of table data.
I myself have clients that need synchronous replication with a couple stand-by servers, we've been making do with an inferior "bandage application" approach to do the same thing.