PostgreSQL Inc. Open Sources Replication Solution
Martin Marvinski writes "PostgreSQL Inc, the commercial company providing replication software and support for PostgreSQL, open sourced their eRServer replication product. This makes PostgreSQL one step closer to being able to replace Oracle as the de facto RDBMS standard. More information can be found on PostgreSQL's website."
While this is a good thing, pseudo replication is possible at the application layer. Think using PHP to squirt a table in one DB into another one.
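A minimal sketch of that kind of application-layer copy, written in Python with in-memory SQLite standing in for the two databases (the comment talks about PHP and PostgreSQL; the `copy_table` helper, the `users` table and its rows are all made up for illustration). It is a wipe-and-reload, not real replication: nothing here tracks changes or handles writes that arrive mid-copy.

```python
import sqlite3


def copy_table(src, dst, table, columns):
    """Naive application-layer 'replication': wipe the copy and reload it."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = src.execute(f"SELECT {col_list} FROM {table}").fetchall()
    with dst:  # one transaction on the destination: all rows land or none do
        dst.execute(f"DELETE FROM {table}")
        dst.executemany(
            f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})", rows
        )
    return len(rows)


# Hypothetical demo data: two separate databases with the same schema.
source = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
for db in (source, replica):
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
source.commit()

copied = copy_table(source, replica, "users", ["id", "name"])
```

Run from cron or after each write, something like this gets you a read-only copy that lags the original by however long you wait between runs.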
Now that Postgres can replicate at the DB level, other, more interesting things are possible. You can use replication for both failover and performance clustering.
Check monster.com. More companies look for people with MySQL experience. Check the book stores. You will see more books about MySQL. Even though PostgreSQL has more features and is more promising and powerful, MySQL gets more publicity. This means that MySQL will be the open source database that will replace most commercial databases. It's sad but true.
Ever heard of InnoDB? MySQL lets you choose -- on a table-by-table basis -- exactly what parts of your application need to support transactions, foreign keys, etc.
Odd, neither Slashdot nor Yahoo! Finance seem to be having corruption problems...
Adding extra memory, CPUs, or slave servers obviously has no impact on server performance. (Yes, replication is... clumsy, at best, but depending on the application, it can work quite well.)
Compared to Postgres?
This is indeed good news, as free software always is. But eRServer can only operate in single-master mode, which makes it unsuitable for high-availability kind of work. Single-master systems are good for load-balancing on installations where most of the queries to the DB are SELECTs.
eRServer comes a bit late. We already have PostgreSQL Replicator, which is multi-master. Unfortunately PG Replicator is not supported anymore. The latest version it can work with is 7.1, and the project's latest news is dated nearly two years ago.
PostgreSQL still doesn't support inverse and shadow keys. Until they support them, we will continue to fork out 100.000 a year for Oracle.
Please let me know if I'm wrong...
I visited the site, and the commercial site too, and it seems this is only simple replication with the master being a single point of failure. For example:
1. update a row in the master
2. master replicates the update to multiple slaves
3. clients perform select operations against the slaves (nice load balancing opportunity)
4. the master crashes
5. No one can write until the master comes back online.
Here are the steps that seem to be missing:
6. the slaves elect a new master
7. if the old master comes back up it must realize a new master is present and become a slave.
8. clients using JDBC would need some mechanism of finding out what the new master is when an update/insert/delete fails.
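The missing steps 6 to 8 could look roughly like this. It is only a toy in-process sketch in Python, not anything eRServer does: the `Node` class, the "most updates applied wins" election rule and the node names are all assumptions, and it ignores the hard parts (network partitions, split brain, nodes disagreeing about who is alive).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_applied: int      # highest replicated update this node has applied
    alive: bool = True
    role: str = "slave"


def elect_master(nodes):
    """Step 6: promote the live node that has applied the most updates."""
    candidates = [n for n in nodes if n.alive]
    if not candidates:
        raise RuntimeError("no live node to promote")
    # Ties on last_applied are broken by name so the choice is deterministic.
    new_master = max(candidates, key=lambda n: (n.last_applied, n.name))
    for n in candidates:
        n.role = "master" if n is new_master else "slave"
    return new_master


def rejoin(node, nodes):
    """Step 7: a returning node defers to whichever master already exists."""
    node.alive = True
    has_master = any(
        n.role == "master" and n.alive and n is not node for n in nodes
    )
    node.role = "slave" if has_master else "master"


def current_master(nodes):
    """Step 8: what a client would ask after an update/insert/delete fails."""
    for n in nodes:
        if n.alive and n.role == "master":
            return n
    return None


cluster = [Node("db1", 120, role="master"), Node("db2", 118), Node("db3", 120)]
cluster[0].alive = False            # step 4: the master crashes
promoted = elect_master(cluster)    # step 6: db3 has applied the most updates
rejoin(cluster[0], cluster)         # step 7: db1 comes back as a slave
```

In a real deployment the client side of step 8 would be a lookup against some agreed-upon directory rather than a scan of a local list, but the flow is the same: write fails, ask who the master is now, retry there.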
Cheers.
>At the same time the licencing will change to pure GPL (no more LGPL libraries !!!)
And that is supposed to be a good thing?? Sure, if you're RMS, but most of us are not. So now one must open source the app, or pay. Great :-/
Mnesia.
It's not perfect for everything, but if you want scalability, failure resilience and excellent query time, it's well worth a shot.
As an added bonus, it gives you a good excuse to learn Erlang, so that LISP-weenies can sneer at you.
[shameless_plug]
StarFish is a block-level storage system allowing on-the-fly geographic replication, that would work with any database. It was OpenSourced by Lucent a few months ago. It won the Best Paper award at Freenix '03.
[/shameless_plug]
I think PostgreSQL is infinitely superior to SAPDB. Have you actually tried using SAPDB? I have tried it on Solaris, and it failed to run on our version of Solaris (I believe it was 8). I looked at their home page and documentation and I was left feeling that SAPDB is a half-baked, kludgey, crusty product.
PostgreSQL on the other hand just works, is fast and is smooth as silk. I think even plain old MySQL 4.x is better than SAPDB (yea, it lacks features, but it's a better product overall).
From the press release:
Does anyone know what the other four components are and whether they're already here?
replication = redundant database servers always having up-to-date data
You're kidding, right? I'll skip the redundant comments about transactions, etc, but I do have to comment on corruption and resource hogging. I have built numerous websites and client-server apps with MySQL. Despite the fact that some of these apps frequently add tens of thousands of records per minute, I have never experienced corruption (compared to SQL Server, which does corrupt at every chance) and have found that it uses far less resources than other RDBMS, to deliver results at least as fast.
<troll>Maybe you're just not building and optimizing your databases correctly, and PostgreSQL is making you look good...</troll>
-Ed
Web Design & Software Development
I've got about 17 years experience with RDBMS', covering oracle, db2, postgresql, sql server, etc. So....
Postgresql looks like it's better positioned to eat SQL Server's lunch than oracle's to me. First off, back in the day (3 years ago), when oracle was licensing by 'power unit' - it cost about $100 / CPU / MHz. So - a single CPU license for a 1 GHz machine would set you back $100k! Since then they've had to drop prices - because of the market, and because of DB2 - which is far less expensive.
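A quick check of that arithmetic, assuming a flat per-MHz rate with no discounts or platform multipliers (the function name and the flat-rate model are just for illustration). The $100k total for a single 1 GHz CPU corresponds to a rate of $100 per MHz:

```python
def power_unit_cost(rate_per_mhz, clock_mhz, cpus=1):
    """License cost under a flat 'dollars per CPU per MHz' model."""
    return rate_per_mhz * clock_mhz * cpus


one_cpu_1ghz = power_unit_cost(100, 1000)           # 100 * 1000 = 100_000
four_cpu_1ghz = power_unit_cost(100, 1000, cpus=4)  # 400_000
```

Scale that up to a box with dozens of CPUs and the license easily passes the million-dollar mark.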
Anyhow - if you're going to invest a cool million or two in a top-end, enterprise box - a Sun E10K or an IBM Regatta - then you don't rig up some cheap AC solution, use surplus wiring, or a free database on it typically. You put Oracle or DB2 on it. Sure - Postgresql is a fine database and it'll save you some bucks, but when you're putting your reputation on the line and have built a business case that justifies (say) $2m in hardware, and probably $4m in labor - it's foolish to try to save $1m on oracle by going with postgresql.
Without *any* support for parallelism, without stronger replication, without better 3rd-party support (think of Toad for instance), without thousands of experienced developers & dbas out there, without more robust availability functionality - it simply isn't ready to tackle the biggest projects. Or those projects with extremely high availability requirements. Or relatively large reporting projects (no parallelism). Or just about any projects in a really dedicated single-vendor shop with its act together.
But that's ok - that leaves 30-50% or so of the other database work that it can do just fine right now. That's a huge market. And unlike mysql - if you construct a database application using postgresql and then later want to port it to oracle, db2, or sybase - it's just a normal porting of the application. You construct the architecture in just about the exact same way (for most applications anyhow), so the porting is straightforward. Not so in the case of mysql - where its severe limitations result in applications doing a ton of the database work - and porting ends up being a complete rewrite.
PostgreSQL still needs better transaction handling before I'd call it ready for prime time. I found its limitations the hard way, attempting to port 13K lines of Oracle Embedded SQL. Imagine transaction A searching a database. Now you want transaction B to do something. It might be just two user queries open at the same time or one transaction reading one table while the second writes to another table. All well and good until one transaction closes. Then all transactions close! That just plain sucks.
However, I was able to port the entire application to Firebird--IMHO a far better database.
Right on target -- MSSQL isn't cheap, but compared to Oracle, it looks really good. Lots of companies already have MS infrastructure in place, so slapping SQL Server in there is no problem at all. It's also honestly not that bad of a product for the low to mid range, which is where a lot of the money is nowadays.
In my opinion, Postgres can, to a large degree, fill the same slot that MSSQL is filling now, but that just doesn't appear to be what's happening. Still, MSSQL does cost a pretty penny, especially if you're doing several installations. Perhaps as Postgres gets better, more people will realize that and make a switch.
This is like shooting fish in a barrel. I often don't like to harp on the OSSDB fanboys, but Oracle's database solution is second to none, and continues to pull away from the pack.
First things first. Online replication is generally considered by professional DBAs a fool's errand. You have to babysit it and it fails at the drop of a hat for a variety of reasons. There are no good reasons to do replication in the manner they are talking about, unless that is your ONLY option.
There are, however, reasons to replicate data. You usually want data replicated for one of two reasons: availability or scalability.
To address availability Oracle provides several options that are just plain better than regular/triggered snapshot logging or materialized view refreshing over a network.
The best option is Oracle's Data Guard, which applies redo/archive logs to a duplicate remote database. You can perform this option at the logical and at the physical level, and you can choose anything from maximum/guaranteed protection all the way down to best effort. This option gives you an absolutely current, very warm site: a simple command and your database is up and running.
As for scalability, again Postgres or mysql doesn't hold a candle. There are too many options to list, so I'll discuss the big ones.
Partitioning/sub-partitioning of data. The way Oracle lays out its logical database block layer and physical OS block layer is absolutely perfect for being able to do anything you want with the database file layout. I can put my OLTP indexes and tables on fast raid10 devices, the historical and warehousing data on raid5 devices, but that's not all. I can increase parallelization of the hardware by putting a single table or index across N devices. The ability to sprinkle files and chop up data anywhere you want is just one thing that makes Oracle configurable, scalable, and great.
Real Application Clusters (was Oracle Parallel Server). This is a for-REAL clustering solution. Oracle allows several servers (they can be dissimilar in capabilities, i.e. some can have 64gig of memory and 12 processors, and the others could be smaller dual processor machines) to connect to the same storage (usually shared over a SAN or SCSI direct connect to EMC gear). Each of the servers is connected to a crossover/ipc LAN (we use gigabit) and now each of the servers has access to the same data. One node goes down or needs to go down for maintenance or reconfig, that's ok, the other nodes are online and traffic can be configured to automatically transfer over to the other nodes MID-TRANSACTION and pick up where it left off, and the application is none the wiser (i.e. it happens in seconds). The nodes share cached data over the fast network, so there is often little need to go to disk. This kind of scalability cannot be found on any other database.
<rant>
The real gain for OSS and Oracle, is Linux and Oracle running on Linux. OSS databases are too immature to be let anywhere near real money. I'm not talking about ecommerce money, I'm talking about the millions and bajillions of dollars that flow like water through companies. Linux has Oracle validation and certification, which goes a LONG way in getting Linux into the real datacenters. The price point for the hardware, and the OS and the special deals that Oracle cuts for it are the true win for OSS. The performance is more than there for Linux/Intel solutions, and the price point for Intel hardware is very attractive to companies looking to cut expenses. You still have to pay homage to the Oracle and EMC gods, but even they have felt the crunch, and they too are providing competitive price points.
</rant>
So Postgres is one feature closer to what Oracle was several years ago. So what, this is embarrassing. Mysql has had transactions for how long? a few days? Please people, Oracle is not resting on its laurels waiting for anyone to catch up. They have real companies, with real money, that are real threats to them. IBM and Microsoft. Oracle is pushing the edge on the database front, and doesn't show any signs of stopping.
No.
pgAdmin II is somewhat similar.
And pgAdmin III just reached beta and is purported to be quite a bit better (than II).
If they are going to replace the bloody guts of MySQL with the bloody guts of SAP, or they are going to drop MySQL as we know it today and call SAP MySQL, then perhaps. Otherwise, I think MySQL will have the same level of threat that PostgreSQL will have. MySQL AB will just own two DBs, and one of them was a commercial product.
I'm not sure why you dislike Postgres so much, since your comment has like a "hahah screw you postgres" feeling to it. Whatever man.
Wow, so I can use a script to vacuum and analyze a few times a day, and maybe, if I really want to, use a script to reindex things every few months.
:)
OR, I can hire an Oracle DBA for $100/hr to slave away maintaining Oracle.
Woot! What a choice. That's it, I'm going back to Oracle. I really miss the constant bugs and patches and the obscene support costs just to call an 800 number and be told "yes, that's a bug, you'll have to upgrade/patch/sacrifice a goat".
I long for that piece of crap SqlPlus, I can't stand using a SIMPLE explain plan, being able to rename columns, JDBC drivers that work, installations that take 3 minutes on a bad day.
I just don't get that with Postgresql, no matter how hard I try to fuck it up.
Yeah, Oracle has some features that Postgres doesn't have, but after 8 years of using Oracle (since 7.1.4 or so through 8.1.7.4 currently), I don't really miss anything except PERHAPS the ability to allocate my own datafiles to distribute I/O.
However, I can't come up with any real reason for that - machines have gotten so fast now, I see vastly better performance on my new dual Xeon 0+1 Linux Postgres boxes than I do on my EMC-backed E6500 Oracle boxes.
I also miss dealing with Oracle sales - people that make used car salesmen seem honest, with obfuscated licensing practices that make Microsoft's BSA invasions and SCO seem reasonable.
I know plenty of crusty Oracle people that swear "dag-nabbits" and "gosh darnits" and spit on the floor when you mention something like Postgresql. I guess I'm just not old enough to be stuck in such a rut as to not give something else a spin.
I did, and I'll never go back.
MSSQL server is in a very odd place.
It's not as scalable as oracle or db/2 but at the same time it costs 15K per processor which is a hell of a lot of money.
Postgres has just about all the features (and more) of mssql server and it costs nothing. You can buy the EMS postgresql manager for a few bucks and it's much better than Enterprise Manager.
BTW, the next version of postgresql will have world class replication and database partitioning.
Honestly in the long run I don't see how people are going to justify spending 15K per processor on a database which only runs on windows, only works well with windows clients (the JDBC driver sucks and is slow as hell), has quite possibly the worst stored procedure language known to mankind. I work with it every day and every day it makes me want to poke sticks in my eye to distract me from the pain. Just today I got pissed because none of the string functions work against text fields (varchar only!). Postgres is so much more fun to work with.
War is necrophilia.
No, you aren't the first. And yeah, any SQL coder knows what you'd have done. Keep all the data that must change together in one table and use a single update statement to change it, relying on the fact that individual statements are atomic.
That way, if you were changing someone's address, and phone number, you'd be sure to have a valid address, and a valid phone number. They might be out of sync, but then you'd have a dirty bit to indicate it.
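For what it's worth, the single-statement trick looks something like this. The sketch uses Python with SQLite in autocommit mode as a stand-in for a database without transactions; the `clients` table, its columns and the sample values are made up. The only thing it leans on is that one UPDATE either fully applies or does not apply at all.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so no
# multi-statement transactions are involved: each statement stands alone.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute(
    "CREATE TABLE clients ("
    "id INTEGER PRIMARY KEY, address TEXT NOT NULL, phone TEXT NOT NULL)"
)
db.execute("INSERT INTO clients VALUES (1, '1 Old Road', '555-0100')")


def move_client(db, client_id, address, phone):
    """Change address and phone together in ONE statement."""
    cur = db.execute(
        "UPDATE clients SET address = ?, phone = ? WHERE id = ?",
        (address, phone, client_id),
    )
    return cur.rowcount


changed = move_client(db, 1, "2 New Street", "555-0199")

# A statement that fails (NULL phone violates NOT NULL) changes nothing,
# so the row never ends up with the new address and a missing phone.
try:
    move_client(db, 1, "3 Other Ave", None)
except sqlite3.IntegrityError:
    pass

row = db.execute("SELECT address, phone FROM clients WHERE id = 1").fetchone()
```

This only works while everything that has to change together lives in one row of one table, which is exactly why it falls apart once you have related rows elsewhere.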
It's pretty easy to do this, with minimal risk, on inserts. If you get into deletes you can quickly end up with addresses for clients who aren't in the DB anymore and so on.
However, the risk, while minimal, is there. This is an okay idea for goofing around on your website where all you lose is a link to a new picture or something, but is totally unacceptable for a commerce site.
However, your attitude is/was common. I used to see it in the context of people putting inline assembly in their C code. Whenever they knew a clever trick to save a cycle or two they'd use it. The silly thing is that compilers are usually better at the silly tricks. Compilers could use the LEA instruction to perform basic integer multiplication just as well as people (and did, when integer multiplication was slow and cycles mattered). Now, we're not all using the same CPUs as we were back in the 486 days. What's a quick thing on an Athlon might save work on a P4, and might hurt performance on a P3. Too many variables to keep in your head, and too ugly of code, for the incredibly minute gains. Especially since people didn't profile their code before going and optimizing it.
Now, the "proper" thing to do is write the cleanest code you can and let the compiler have a shot. If it runs too slowly, consider throwing more CPU at it. If that fails, profile it and try to fix it with a new algorithm. Only after all that, go in and hand-tweak certain sections.
No flames here! I like MS SQL server too but it does have its own hidden depths of weirdness from time-to-time.. and I'd happily tackle a tough install of Oracle if the result was a db that actually had very few bugs.
..gives an error that the string hasn't been closed.
My personal favourite (in MS SQL) is the way it parses chunks of SQL looking for GO statements. If you've got the word GO in a block comment or in a string, it treats it as a legitimate instruction!
PRINT CAST(@cnt AS VARCHAR) + ' RECORDS TO GO'
I'd rather not have to mess around with this sort of crap.
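To show how that kind of thing can happen, here is a deliberately naive batch splitter in Python. It is an assumption about the general mechanism (a purely line-based scan for GO that knows nothing about strings or comments), not the actual code in any Microsoft tool, and the sample script is made up.

```python
def split_batches(script):
    """Naive splitter: any line consisting solely of GO ends the current batch."""
    batches, current = [], []
    for line in script.splitlines():
        if line.strip().upper() == "GO":
            # No attempt to check whether we are inside a string or comment.
            if current:
                batches.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        batches.append("\n".join(current))
    return batches


# A multi-line string literal that happens to have GO on a line of its own,
# followed by a real batch separator and a second statement.
script = "PRINT 'READY, SET,\nGO\nNOW'\nGO\nPRINT 'done'"
batches = split_batches(script)
```

The first batch comes out as `PRINT 'READY, SET,` with the quote never closed, which is the sort of input that would produce an unclosed-string error when it is sent off on its own.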
"SQL has proven itself both faster and more stable than Postgres. I'm not a Microsoft fan, but if there's one thing they got right it's SQL Server."
I work with both every day. I don't know why you think they got SQL server right. Lots of lock escalation bullshit, weird locking semantics, goofy functions that only work on some text types and not others, 8K row limit, crappy SP language, the list goes on and on.
Things you totally take for granted on mysql or postgres trip you up on sql server.
Try passing some XML out of a stored procedure sometime and then tell me how great SQL server is.