PostgreSQL Inc. Open Sources Replication Solution
Martin Marvinski writes "PostgreSQL Inc, the commercial company providing replication software and support for PostgreSQL, open sourced their eRServer replication product. This makes PostgreSQL one step closer to being able to replace Oracle as the de facto RDBMS standard. More information can be found on PostgreSQL's website."
There is an enormous distance between "viable alternative" and "defacto standard" and the path between them is not paved with features.
While this is a good thing, pseduo replication is possible at the application layer. Think using PHP to squirt a table in one DB into anther one.
Now that Postgress can replicate at DB level other, more interesting things are possible. You can use replication for both failover and performance clustering.
I can't think of anything witty right now
Check monster.com. More companies look for people with mySQL experiences. Check the book stores. You will see more books about mySQL. Even though PostgreSQL has more features and is more promising and powerful, mySQL gets more publicity. This means that mySQL will be the open source database that will replace most commercial databases. It's sad but true.
Ever heard of InnoDB? MySQL lets you choose -- on a table-by-table basis -- exactly what parts of your application need to support transactions, foreign keys, etc.
Odd, neither Slashdot nor Yahoo! Finance seem to be having corruption problems...
Adding extra memory, CPUs, or slave servers obviously has no impact on server performance. (Yes, replication is... clumsy, at best, but depending on the application, it can work quite well.)
Compared to Postgres?
I've been waiting anxiously for this. Postgres will definately be running my startup now... lack of replication was the only thing holding Postgres back. Wonder how long it will take for this to migrate to debs stable branch...
-1 Uncomfortable Truth
You would still have to pay boatloads for support...even with postgres... Open Source does not mean 24/7 Support calls...
I can't say a bad thing about postgresql; this was really the only thing I felt the need for. For anyone who hasn't tried it you really should. Although I don't want to start a MySQL v postgresql flamewar, after trying both I think that postgres has the edge. Mysql was undisputably easier to work with and (at the time) was faster. PostgreSQL has moved on at a much faster rate though. In particular postgresql has solid support for transactions, large objects, subselects, object oriented tables. I'm convinced that if you use databases long enough you'll want every last one of these and won't be able to do without.
Carpe Daemon
This is indeed good news, as free software always is. But eRServer can only operate in single-master mode, which makes it unsuitable for high-availability kind of work. Single-master systems are good for load-balancing on installations where most of the queries to the DB are SELECTs.
eRServer comes a bit late. We already have PostgreSQL Replicator, which is multi-master. Unfortunately PG Replicator is not supported anymore. The latest version it can work with is 7.1, and the project's latest news are timestamped nearly two years ago.
They also have been working on a procedual language for PostgreSQL for server side triggers, and functions. Information can be found here, plPHP.
PostgreSQL has made some pretty nice advancements post version 7+, performance and feature wise. I worked on a intranet where the company spent lots of money trying to get an Oracle solution to work, but found it was way to slow. The suggestion of PostgreSQL, and MS SQL came up. We tested PostgreSQL, and it was acutually faster, and easier to maintain then our Oracle database. The best part was, it was free!
Every Super Villan uses Linux.
InnoDB Hot Backup is a tool which allows you to backup a running InnoDB database without setting any locks or disturbing normal database processing. You get a consistent copy of your database, as if the copy were taken at a precise point in time. InnoDB Hot Backup is also the ideal method of setting up new slaves if you use the MySQL replication on InnoDB tables.
For how many server computers you want to order an evaluation copy, or 1-year licenses (390 euros or 450 US dollars each), or perpetual licenses (990 euros or 1150 US dollars each); discounts are available for large volume orders.
This is from http://www.innodb.com/hotbackup.html
So I get up and take a step toward the coffee machine, I guess I am ALSO "one step closer" to China!
While this is a nice step forward, the real reason large sites utilize Oracle is because of synchronous replication.
The replication needs to be able to keep all data consistent across multiple servers, without any conflicts. Then, if a particular server goes down, the DNS can simply fail over to a second server.
Once the above has been achieved, then we have a viable alternative to Oracle.
> This makes PostgreSQL one step closer to being able to replace Oracle...
Please! While this may help win the hearts and minds of OOS geeks, it does little to improve their standing in the business world.
Oracle is as established in the database world as Microsoft is on the desktop. This alone would doom any OOS wannabe to quiet places like web server back ends where they already do well anyway ( e.g. mySql ).
Put aside the technical considerations, support, client base, etc and PostgreSQL still offers as much of a threat to Oracle as mySql or dBase. The only real threat I've seen to Oracle supremacy is Microsoft's SQL Server but, of course, that's only in MS shops.
Starting with the next release SAPDB will be rebranded as MaxDB by MySQL AB.
This will probably mean that PostgreSQL will have a very hard time competing with MySQL ! (also see the info on the SAPDB webpage)
At the same time the licencing will change to pure GPL (no more LGPL libraries !!!)
Please let me know if I'm wrong...
I visited the site, and the commercial site too and it seems this is only simple replication with the master being a single point of failure. F.E.
1. update a row in the master
2. master replicates the update to multiple slaves
3. clients perform select operations against the slaves (nice load balancing opportunity)
4. the master crashes
5. No one can write until the master comes back online.
Here are the steps that seem to be missing:
6. the slaves elect a new master
7. if the old master comes back up it must realize a new master is present and become a slave.
8. clients using JDBC would need some mechanism of finding out what the new master is when an update/insert/delete fails.
Cheers.
Schedule your world with ScheduleWorld.com http://www.ScheduleWorld.com/ (Java Web Startable)
I like PostgreSQL, and Open Source deserves capitalization, but I'd like to hear an enterprise DBA's perspective on if this really compares to Oracle's configurability, clustering capabilities, or the seamless swapping of redundant database packages when deployed on, say, an EMC 1000, for reliability and failover. BTW, for this request, "enterprise" = Fortune 100, not Joe's Web Hosting.
Like the subject says, I'm not a DBA, but I know some pretty heavy-duty ones that say nothing beats Oracle running on HP Superdomes with EMC storage.
We use PostgreSQL a lot. Since all the of the sites related to this story are in the process of being /.ed, can someone tell me what this replication thingy does?
[ shameless_plug ] /shameless_plug ]
StarFish is a block-level storage system allowing on-the-fly geographic replication, that would work with any database. It was OpenSourced by Lucent a few months ago. It won the Best Paper award at Freenix '03.
[
By the time PostgreSQL is capable of replacing Oracle as the de facto standard, MS SQL Server will already have done the job and be the new standard to beat. No matter how much people despise MS SQL Server, nor how much they over rate Oracle, it is now the standard for companies needing a low or mid range solution. As with all MSFT products, it improves with each iteration. Our needs aren't high-end, but it handles our 300GB databases with 170 million row tables remarkably well.
So now we have two major OSS databases with 99% of the features that commercial offerings have, and lots of features that they don't (I'm a MySQL guy, so I know what those extras are for that database, but knowing OSS development paces, I'm sure the same is the same for PostgreSQL).
I listen to folks at work talk about why we "need to move to a *real* database at some point", and it always comes down to the fact that they've bought into the marketting, and when they examine their reasons (if they are willing to), solutions like PostgreSQL or MySQL are a whole lot better choices than the "real" database choices out there.
Bravo guys!
I've got about 17 years experience with RDBMS', covering oracle, db2, postgresql, sql server, etc. So....
Postgresql looks like it's better positioned to eat SQL Server's lunch than oracle's to me. First Off, back in the day (3 years ago), when oracle was licensing by 'power unit' - it cost about $1000 / CPU / Mhz. So - a single CPU license for a 1-ghz machine would set you back $100k! Since then they've had to drop prices - because of the market, and because of DB2 - which is far less expensive.
Anyhow - if you're going to invest a cool million or two in a top-end, enterprise box - a Sun E10K or an IBM Regatta - then you don't rig up some cheap AC solution, use surplus wiring, or a free database on it typically. You put Oracle or DB2 on it. Sure - Postgresql is fine database and it'll save you some bucks, but when you're putting your reputation on the line and have build a business case that justifies (say) $2m in hardware, and probably $4m in labor - it's foolish to try to save $1m on oracle by going with postgresql.
Without *any* support for parallelism, without stronger replication, without better 3rd-party support (think of Toad for instance), without thousands of experienced develpers & dbas out there, without more robust availability functionality - it simply isn't ready to tackle the biggest projects. Or those projects with extremely high availablity requirements. Or relatively large reporting projects (no parallelism). Or just about any projects in a really dedicated single-vendor shop with its act together.
But that's ok - that leave 30-50% or so of the other database work that it can do just fine right now. That's a huge market. And unlike mysql - if your construct a database application using postgresql and then later want to port it to oracle, db2, or sybase - it's just a normal porting of the application. You construct the architecture in just about the exact same way (for most applications anyhow), so the porting is straight-forward. Not so in the case of mysql - where it's severe limitations result in applications doing a ton of the database work - and porting ends up being a complete rewrite.
Even though the article is about an improvement in the PostgreSQL community, the comments are mostly pgsql vs mysql. People in the bazaar need to have personal motivations to work on opensource projects, mostly to have something against Microsoft, but increasingly, it is becoming a series of team wars. Linux vs BSD, then we had KDE vs GNOME and now qmail vs postfix and mysql vs pgsql. More than a decade ago we had vi vs emacs and BSD vs SYSV.
What the posters here need to realize is that it is exactly this competition that is driving the projects. If MySQL was not given the press and did not have its cult following, we would not see this pace of development for pgsql. The developments for FreeBSD really improved to compete with Linux although their developers claim they are not competing... they do have the fear Linux will supplant them.
What is interesting to note is that in most of these project wars, both projects really survive and get two different niches of their own. This was true of bash vs csh, BSD vs SYSV, BSD vs Linux, KDE vs GNOME, and now MySQL will become the standard entry level database and pgsql the higher level.
I use pgsql because my databases have complicated requirements that MySQL cannot meet. Yet MySQL is the quick and dirty solution when I have to set things up fast. For all new learners I always suggest MySQL. For people thinking of replacing or duplicating their ERP systems, I always suggest pgsql. I even know how to program in sleepycat's db and know where it should replace mysql in smaller embedded systems and where the mysql license cannot be used.
I believe this competition is coming to a close since pgsql has taken a big lead over MySQL in features, and therefore made itself more difficult to deal with especially for newbies. All I can say about the postgresql replication is bravo, and hope MySQL doesnt follow suit so it remains the simple fast and easy database in its own niche.
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
PostgreSQL still needs better transaction handling before I'd call it ready for prime time. I found its limitations the hard way, attempting to port 13K lines of Oracle Embedded SQL. Imagine transaction A searching a database. Now you want transaction B to do something. It might be just two user queries open at the same time or one transaction reading one table while the second writes to another table. All well and good until one transaction closes. Then all transactions close! That just plain sucks.
However, I was able to port the entire application to Firebird--IMHO a far better database.
"Love is a familiar; Love is a devil: there is no evil angel but Love." --William Shakespeare ('Love's Labors Lost')
This is like shooting fish in a barrel. I often don't like to harp on the OSSDB fanboys, but Oracle's database solution is second to none, and continues to pull away from the pack.
First things first. Online replication is generally considered by professional DBAs a fools errand. You have to babysit and it fails at the drop off a hat for a variety of reasons. The are no good reasons to do replication in the manner they are talking about, unless that is your ONLY option.
There are however, reasons to replicate data. The reasons you want data replicated are usually for one of two reasons: availability or scalability.
To address availability Oracle provides several options that are just plain better than regular/triggered snapshot logging or materialized view refreshing over a network.
The best option is Oracle's Dataguard, which applies redo/archive logs to a duplicate remote databases. You can perform this option at the logical and at the physical level, and you can choose to maximize/guarantee the protection all the way down to best effort. This option provides the ability to have an absolutely current very warm site, a simple command and you're database is up and running.
As for scalability, again Postgres or mysql doesn't hold a candle. There are too many options to list, so I'll discuss the big ones.
Paritioning/sub-partitioning of data. The way Oracle lays out it's logical database block layer and physical OS block layer is absolutely perfect for being able to do anything you want with the database file layout. I can put my OLTP indexes and tables on fast raid10 devices, the historical and warehousing data on raid5 devices, but that's not all. I can increase parallelization of the hardware by putting a single table or index across N devices. The ability to sprinkle files and chop up data anywhere you want, is just one thing that makes Oracle configurable, scalable, and great.
Real Application Cluster (was Oracle Parallel Server). This is a for REAL clustering solution. Oracle allows several servers (can be dissimiliar in capabilities, i.e. some can have 64gig of memory and 12 processors, and then the others could be smaller dual processor machines.) to connect to the same storage (usually shared over a SAN or SCSI direct connect to EMC gear). Each of the servers is connected to a crossover/ipc LAN (we use gigabit) and now each of the servers has access to the same data. One node goes down or needs to go down for maintenance or reconfig, that's ok, the other nodes are online and traffic can be configured to automatically transfers over to the other nodes MID-TRANSACTION and picks up where it left off and the application is none the wiser (i.e. happens in seconds). The nodes share cached data over the fast network, so there is often little need to go to disk. This kind of scalability can not be found on any other database.
<rant>
The real gain for OSS and Oracle, is Linux and Oracle running on Linux. OSS databases are too immature to be let anywhere near real money. I'm not talking about ecommerce money, I'm talking about the millions and bajillions of dollars that flow like water through companies. Linux has Oracle validation and certification, which goes a LONG way in getting Linux into the real datacenters. The price point for the hardware, and the OS and the special deals that Oracle cuts for it are the true win for OSS. The performance is more than there for Linux/Intel solutions, and the price point for Intel hardware is very attractive to companies looking to cut expenses. You still have to pay homage to the Oracle and EMC gods, but even they have felt the crunch, and they too are providing competitive price points.
</rant>
So Postgres is one feature closer to what Oracle was several years ago. So what, this is embarassing. Mysql has had transactions for how long? a few days? Please people, Oracle is not resting on it's laurels waiting for anyone to catch up. They have real companies, with real money, that are real threats to them. IBM and Microsoft. Oracle, is pushing the edge on the database front, and doesn't show any signs of stopping.
No.
SQL Server will never run on any version of UNIX. AFAIK, there aren't even (MS-supported) SQL Server client libraries for non-windows platforms. I realize that FreeTDS is available, but such a library would never be used in a highly critical sector.
If you have to integrate multiple platforms, you cannot use SQL server. Closest similar product is Sybase ASE, but Microsoft broke Sybase compatibility on purpose.
I think if people understood how irrationally obstinate SQL Server's platform dependence was, they would look elsewhere. I hope that this attitude holds them below 10% penetration - it certainly has up to now. They are a bit player.
In any case, there is a cheap, new version of DB2 out for $500/copy.
This is not Highlander there can be more than one.
MySQL is a good solution for some tasks. Postgres is a good solution of some other tasks. I swear people are so odd. There can be room for more than one OS, Database, Office Suite, and CPU. I really like Postgres and use it for our in house database. I use Mysql for our website's database. Why? because it is what our ISP provides and it works.
How about this... Learn Both.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
I find it funny that so many journalists always say that KDE/GNOME or Postgres/MySQL should be stopped and only 1 solution should be allowed. Yet, at the same time, they quickly point to all the different applications in the windows space.
Myself, I use Linux/KDE/pgsql ~ 95% of the time. But there are times where I like BSD (awesome security), Gnome (i like their simple interface and their apps are nice in a number of areas), and Mysql (want a fast mostly read DB? Nothing beats Mysql in the true relational arena (dbm/gdbm/sleepycat can for simpler relations)).
Lets hope that real compitition never stops.
I prefer the "u" in honour as it seems to be missing these days.
Just to clear up any confusion:
While I apreciate your loyalty and I sincerely hope that both you and Mr. Ellison live long and productive lives, comments realy are a lot better when you have at least passing knowledge of the subject at hand.
What a rotten party, have we run out of beer or something?