PostgreSQL Inc. Open Sources Replication Solution
Martin Marvinski writes "PostgreSQL Inc, the commercial company providing replication software and support for PostgreSQL, open sourced their eRServer replication product. This makes PostgreSQL one step closer to being able to replace Oracle as the de facto RDBMS standard. More information can be found on PostgreSQL's website."
That will be an excellent and much appreciated addition to this excellent database. FP
There is an enormous distance between "viable alternative" and "de facto standard" and the path between them is not paved with features.
While this is a good thing, pseudo-replication is possible at the application layer. Think of using PHP to squirt a table from one DB into another one.
Now that PostgreSQL can replicate at the DB level, other, more interesting things are possible. You can use replication for both failover and performance clustering.
I can't think of anything witty right now
Check monster.com. More companies look for people with mySQL experience. Check the book stores. You will see more books about mySQL. Even though PostgreSQL has more features and is more promising and powerful, mySQL gets more publicity. This means that mySQL will be the open source database that will replace most commercial databases. It's sad but true.
So we can get them neatly sorted out of the way of the interesting comments. Thanks.
Ever heard of InnoDB? MySQL lets you choose -- on a table-by-table basis -- exactly what parts of your application need to support transactions, foreign keys, etc.
Odd, neither Slashdot nor Yahoo! Finance seem to be having corruption problems...
Adding extra memory, CPUs, or slave servers obviously has no impact on server performance. (Yes, replication is... clumsy, at best, but depending on the application, it can work quite well.)
Compared to Postgres?
You would still have to pay boatloads for support...even with postgres... Open Source does not mean 24/7 Support calls...
I can't say a bad thing about postgresql; this was really the only thing I felt the need for. For anyone who hasn't tried it, you really should. Although I don't want to start a MySQL v postgresql flamewar, after trying both I think that postgres has the edge. Mysql was indisputably easier to work with and (at the time) was faster. PostgreSQL has moved on at a much faster rate though. In particular postgresql has solid support for transactions, large objects, subselects, object oriented tables. I'm convinced that if you use databases long enough you'll want every last one of these and won't be able to do without.
Carpe Daemon
This is indeed good news, as free software always is. But eRServer can only operate in single-master mode, which makes it unsuitable for high-availability kind of work. Single-master systems are good for load-balancing on installations where most of the queries to the DB are SELECTs.
eRServer comes a bit late. We already have PostgreSQL Replicator, which is multi-master. Unfortunately PG Replicator is not supported anymore. The latest version it can work with is 7.1, and the project's latest news is timestamped nearly two years ago.
A Good Thing(tm)
Corporations are the people Open-Source needs to get on its side. (And, I might add, the OS community is doing a very good job here). They give a project name-recognition, thousands of users, good infrastructure, and credibility. PostgreSQL will hopefully begin to compete seriously with Oracle. Another feather in the Open-Source cap.
((lambda x ((x))) (lambda x ((x))))
PostgreSQL still doesn't support inverse and shadow keys. Until they support them, we will continue to fork out 100.000 a year for Oracle.
They also have been working on a procedural language for PostgreSQL for server-side triggers and functions. Information can be found here, plPHP.
PostgreSQL has made some pretty nice advancements post version 7+, performance and feature wise. I worked on an intranet where the company spent lots of money trying to get an Oracle solution to work, but found it was way too slow. The suggestion of PostgreSQL, and MS SQL came up. We tested PostgreSQL, and it was actually faster, and easier to maintain than our Oracle database. The best part was, it was free!
Every Super Villain uses Linux.
InnoDB Hot Backup is a tool which allows you to backup a running InnoDB database without setting any locks or disturbing normal database processing. You get a consistent copy of your database, as if the copy were taken at a precise point in time. InnoDB Hot Backup is also the ideal method of setting up new slaves if you use the MySQL replication on InnoDB tables.
For how many server computers you want to order an evaluation copy, or 1-year licenses (390 euros or 450 US dollars each), or perpetual licenses (990 euros or 1150 US dollars each); discounts are available for large volume orders.
This is from http://www.innodb.com/hotbackup.html
So I get up and take a step toward the coffee machine, I guess I am ALSO "one step closer" to China!
So stop using Oracle?
How does charging lots of money = evil?
The surprise isn't how often we make bad choices; the surprise is how seldom they defeat us.
While this is a nice step forward, the real reason large sites utilize Oracle is because of synchronous replication.
The replication needs to be able to keep all data consistent across multiple servers, without any conflicts. Then, if a particular server goes down, the DNS can simply fail over to a second server.
Once the above has been achieved, then we have a viable alternative to Oracle.
Hehe, you're silly.
Where I work, we have a data warehouse, which grows about 8 GB per day, and holds 13 months of live data. Let's see you hold that in RAM!
> This makes PostgreSQL one step closer to being able to replace Oracle...
Please! While this may help win the hearts and minds of OSS geeks, it does little to improve their standing in the business world.
Oracle is as established in the database world as Microsoft is on the desktop. This alone would doom any OSS wannabe to quiet places like web server back ends where they already do well anyway (e.g. mySql).
Put aside the technical considerations, support, client base, etc and PostgreSQL still offers as much of a threat to Oracle as mySql or dBase. The only real threat I've seen to Oracle supremacy is Microsoft's SQL Server but, of course, that's only in MS shops.
Starting with the next release SAPDB will be rebranded as MaxDB by MySQL AB.
This will probably mean that PostgreSQL will have a very hard time competing with MySQL ! (also see the info on the SAPDB webpage)
At the same time the licencing will change to pure GPL (no more LGPL libraries !!!)
Please let me know if I'm wrong...
I visited the site, and the commercial site too, and it seems this is only simple replication with the master being a single point of failure. For example:
1. update a row in the master
2. master replicates the update to multiple slaves
3. clients perform select operations against the slaves (nice load balancing opportunity)
4. the master crashes
5. No one can write until the master comes back online.
Here are the steps that seem to be missing:
6. the slaves elect a new master
7. if the old master comes back up it must realize a new master is present and become a slave.
8. clients using JDBC would need some mechanism of finding out what the new master is when an update/insert/delete fails.
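The eight steps above can be played out in a toy in-memory simulation. This is only a sketch in Python: the `Node` and `Cluster` classes and `write_with_failover` are hypothetical names made up for illustration, the election rule (promote the most up-to-date live slave, ties broken by name) is an assumption, and none of it reflects how eRServer or any JDBC driver actually behaves.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    last_applied: int = 0      # highest replicated update this node has applied
    alive: bool = True
    role: str = "slave"


@dataclass
class Cluster:
    nodes: list = field(default_factory=list)

    def master(self):
        """Return the live master, or None if there is none."""
        for n in self.nodes:
            if n.alive and n.role == "master":
                return n
        return None

    def write(self, update_id):
        """Steps 1-2: apply an update on the master, then copy it to live slaves."""
        m = self.master()
        if m is None:
            # Step 5: with the master down, nobody can write.
            raise RuntimeError("no master available; writes are blocked")
        m.last_applied = update_id
        for n in self.nodes:
            if n.alive and n is not m:
                n.last_applied = update_id

    def elect(self):
        """Step 6: promote the most up-to-date live node (assumed rule)."""
        candidates = [n for n in self.nodes if n.alive]
        if not candidates:
            raise RuntimeError("no live nodes to promote")
        for n in self.nodes:
            n.role = "slave"   # the crashed old master loses its role too
        new_master = max(candidates, key=lambda n: (n.last_applied, n.name))
        new_master.role = "master"
        return new_master

    def rejoin(self, node):
        """Step 7: a returning node comes back as a slave and catches up."""
        node.alive = True
        node.role = "slave"
        m = self.master()
        if m is not None:
            node.last_applied = m.last_applied


def write_with_failover(cluster, update_id):
    """Step 8: when a write fails, the client triggers an election and retries once."""
    try:
        cluster.write(update_id)
    except RuntimeError:
        cluster.elect()
        cluster.write(update_id)
    return cluster.master().name
```

A real system also has to deal with network partitions and two nodes both believing they are master, which this sketch ignores entirely.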
Cheers.
Schedule your world with ScheduleWorld.com http://www.ScheduleWorld.com/ (Java Web Startable)
I like PostgreSQL, and Open Source deserves capitalization, but I'd like to hear an enterprise DBA's perspective on if this really compares to Oracle's configurability, clustering capabilities, or the seamless swapping of redundant database packages when deployed on, say, an EMC 1000, for reliability and failover. BTW, for this request, "enterprise" = Fortune 100, not Joe's Web Hosting.
Like the subject says, I'm not a DBA, but I know some pretty heavy-duty ones that say nothing beats Oracle running on HP Superdomes with EMC storage.
We use PostgreSQL a lot. Since all of the sites related to this story are in the process of being /.ed, can someone tell me what this replication thingy does?
MySQL and PostgreSQL have entirely different purposes driving them. MySQL = fast, small, low-footprint at the sacrifice of features. PostgreSQL = full-featured at the sacrifice of performance (though, just a little performance). So when you say a leap "ahead" of MySQL, you'll have to qualify that. Myself, I prefer PostgreSQL, simply because of the features. I USE MySql because it is fast and 99.9% of hosting providers offer no other alternative.
-----
Web Hosting @ HostForADollar.com
Mnesia.
It's not perfect for everything, but if you want scalability, failure resilience and excellent query time, it's well worth a shot.
As an added bonus, it gives you a good excuse to learn erlang, so that LISP-weenies can sneer at you.
I had a bit of a giggle when I read the parent, but seriously, guys (those replying to him/her): this will seem like a troll, but I don't care.
Have you actually used MySQL and tried to break it? The damn thing is hopeless in comparison to PostgreSQL, Oracle and even SQL Server:
InnoDB transactions don't include the DDL so your create table/index etc... WON'T roll back when you cancel a transaction - so really mysql transactions are for inserts, updates and deletes ONLY. Don't give me this crap about innodb being the be all and end all..
it will not perform validation checking on dates correctly, inserting 29/02/2003 works! It allows you to insert 00/00/0000 when that doesn't even EXIST!
it doesn't obey the datatypes you tell it to use and will happily insert 100.00 into a numeric(4,2) field but no -100.. why? cause the programmers use an extra bit for signing and instead you'll wind up with -99.99. This is correct (although your data is fucked) but what's with 100 being legal?? It will even allow you to insert a CHARACTER into a numeric field WITHOUT complaints - I want my database to tell me when something is wrong and enforce my business rules.
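To make the complaint concrete, here is a small Python sketch of the checks the poster expects the database to enforce: calendar-valid dates and values that fit `numeric(4,2)`. The function names `parse_date` and `check_numeric` are hypothetical, and the code only illustrates the rules; it says nothing about how any particular database implements them.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def parse_date(text):
    """Parse DD/MM/YYYY, rejecting dates that do not exist on the calendar."""
    day, month, year = (int(part) for part in text.split("/"))
    # date() raises ValueError for 29/02/2003 and for 00/00/0000.
    return date(year, month, day)


def check_numeric(value, precision=4, scale=2):
    """Return value rounded to `scale` places, or raise if it cannot fit."""
    try:
        number = Decimal(str(value))
    except InvalidOperation:
        raise ValueError(f"{value!r} is not a number") from None
    if not number.is_finite():
        raise ValueError(f"{value!r} is not a finite number")
    # numeric(4,2) leaves two digits before the point, so |x| must be < 100.
    limit = Decimal(10) ** (precision - scale)
    if abs(number) >= limit:
        raise ValueError(f"{value!r} overflows NUMERIC({precision},{scale})")
    quantized = number.quantize(Decimal(1).scaleb(-scale))
    if abs(quantized) >= limit:   # e.g. 99.999 rounds up to 100.00
        raise ValueError(f"{value!r} overflows NUMERIC({precision},{scale})")
    return quantized
```

With these rules, 99.99 and -99.99 are accepted, while 100.00, -100 and a stray character are all rejected instead of being silently altered.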
You should always try and build as much of your rules into the db app server - that's what it is - an application server, don't put all your logic in your client app. I'm not surprised that Slashdot is fine and all - all the logic is probably in the perl.
I'm sorry for the trollish tone but I could NEVER recommend someone use MySQL. Now MaxDB might be different and I'm all for it if it is but let's just hope it doesn't inherit too much from the MySQL codebase...
And then watch the DBA break down and cry as the UPS fails during a blackout...
I am TheRaven on Soylent News
"PostgreSQL Inc, the commercial company providing replication software and support for PostgreSQL, open sourced their eRServer replication product"
Many thanks you guys....we've been waiting for this a long time
What a rotten party, have we run out of beer or something?
troll troll troll.
Er, well, I have mysql installed on my windows dev machine and it's currently consuming 2.848 MB. I suppose that doesn't fit into 640k, but there you go.
Troll.
Invoicing, Time Tracking, Reporting
Until SAP, PeopleSoft, and Oracle applications support it, which will happen, respectively, probably no time soon, probably never, and never, it won't "replace Oracle as the de facto RDBMS standard."
Orange whip? Orange whip? Three orange whips.
[ shameless_plug ]
StarFish is a block-level storage system allowing on-the-fly geographic replication, that would work with any database. It was OpenSourced by Lucent a few months ago. It won the Best Paper award at Freenix '03.
[ /shameless_plug ]
I know that in business it's down to risk management, but if you really need 24/7 support to keep your database running, that tells me your DB Admins aren't up to the task, and that your DB Software is too flaky. After all, it's just another piece of software...
Forget thrust, drag, lift and weight. Airplanes fly because of money.
And you're clueless. There are some technologies that can dump the hd storage devices (no pun intended) and unify the whole memory speed hierarchy. If that happens, say goodbye to rdbms's and swallow something like object prevalence.
From the press release:
Does anyone know what the other four components are and whether they're already here?
What do you mean they cut the power? How can they cut the power, man? They're animals!
By the time PostgreSQL is capable of replacing Oracle as the de facto standard, MS SQL Server will already have done the job and be the new standard to beat. No matter how much people despise MS SQL Server, nor how much they overrate Oracle, it is now the standard for companies needing a low or mid range solution. As with all MSFT products, it improves with each iteration. Our needs aren't high-end, but it handles our 300GB databases with 170 million row tables remarkably well.
But....but....but....it's easy and free.
You'll have that sometimes...
So now we have two major OSS databases with 99% of the features that commercial offerings have, and lots of features that they don't (I'm a MySQL guy, so I know what those extras are for that database, but knowing OSS development paces, I'm sure the same is true for PostgreSQL).
I listen to folks at work talk about why we "need to move to a *real* database at some point", and it always comes down to the fact that they've bought into the marketing, and when they examine their reasons (if they are willing to), solutions like PostgreSQL or MySQL are a whole lot better choices than the "real" database choices out there.
Bravo guys!
Geez... Now what would be the point of object prevalence? Perhaps the idea is to make sure the objects prevail, no matter what happens. Perhaps you don't know what you are talking about?
pg_dump -f mydb.dump -U mydbuser mydb
Allows you to back up a running PostgreSQL database.
No locks.
Does not disturb normal database processing.
Consistent copy.
Free.
Original Ingres product -> more features -> Postgres -> compliance with SQL standards -> PostgreSQL
You're kidding, right? I'll skip the redundant comments about transactions, etc, but I do have to comment on corruption and resource hogging. I have built numerous websites and client-server apps with MySQL. Despite the fact that some of these apps frequently add tens of thousands of records per minute, I have never experienced corruption (compared to SQL Server, which does corrupt at every chance) and have found that it uses far less resources than other RDBMS, to deliver results at least as fast.
<troll>Maybe you're just not building and optimizing your databases correctly, and PostGreSQL is making you look good...</troll>
-Ed
Web Design & Software Development
It's a damn good thing you posted anonymously. How DARE you say a SINGLE BAD word about MySQL!! How dare you. It is a proven scientific fact that things like triggers, subselects, stored procedures, etc. are only needed by TERRORISTS. Do you hear me? Terrorists. If you had posted using your real name, God help you. You had better switch back to MySQL, use it, and stop asking so many damn questions. DO YOU UNDERSTAND
"You would still have to pay boatloads for support...even with postgres... Open Source does not mean 24/7 Support calls..."
Let's see, with Oracle, I can call 24/7 and get access to a complete idiot with a phone who can say he's escalated it and give me a cute little number. I have to trust them that they do actually have qualified engineers working on it and not some lame-o that just got hired last week. The amount of time it takes to reach a senior developer is unknown.
With PostgreSQL, most emails to psql-general are looked at by a VERY experienced crew (many of the lead developers), who are very happy to help you out if you are polite.
If you need more than that, it is available, but a lot of the "support" that people need to pay for with Oracle simply isn't required w/ PostgreSQL or open-source in general.
Engineering and the Ultimate
WHO MODDED THIS UP!?? TROLL!
It doesn't. However, Oracle _is_ evil, at least for their Enterprise Applications. Their sales reps are complete liars - they are trained to always say "yes" to "can it do X". Why? Because if it's not in the base install, you can code it yourself. Well, Golly Gee, don't I purchase an enterprise application so I don't have to code it myself?
In addition, you can't really take it for a test drive, because it takes MONTHS to set up, even for small installations (it was a year process for a 300-employee company I used to work for, but we also had customizations as well).
Also, they never feel a need to actually ship a _working_ product. The version we used would not let you enter an order quantity over 9. 9??? Was this tested by ANYONE? I've heard worse stories from their salesforce management software. They basically ship code straight from the developers to the end-users, and allow YOU to test it. And, when you discover a bug, it may take several months for them to get around to a fix, even if you have it escalated to SEV-1 (okay, maybe SEV-1s are fixed within a few weeks, but still...).
And their consulting staff... Ahhh!! We were paying hundreds of dollars an hour for someone who had only used Oracle for 8 months! This was from Oracle headquarters. The guy had only done COBOL programming before, and had never clued into the concept of local variables. We had to recode everything he did.
And, on top of that, we paid a HUGE amount of money for this privilege.
It's not really the money that's the problem. It's that you might expect such service from a two-bit company that charged you $10/hour, but to be charged hundreds an hour for a totally crappy product, you just feel screwed in the end.
Engineering and the Ultimate
Here's a pretty good article comparing MySQL and Postgres.
I've got about 17 years experience with RDBMS', covering oracle, db2, postgresql, sql server, etc. So....
Postgresql looks like it's better positioned to eat SQL Server's lunch than oracle's to me. First off, back in the day (3 years ago), when oracle was licensing by 'power unit' - it cost about $1000 / CPU / MHz. So - a single CPU license for a 1 GHz machine would set you back $100k! Since then they've had to drop prices - because of the market, and because of DB2 - which is far less expensive.
Anyhow - if you're going to invest a cool million or two in a top-end, enterprise box - a Sun E10K or an IBM Regatta - then you don't rig up some cheap AC solution, use surplus wiring, or a free database on it typically. You put Oracle or DB2 on it. Sure - Postgresql is a fine database and it'll save you some bucks, but when you're putting your reputation on the line and have built a business case that justifies (say) $2m in hardware, and probably $4m in labor - it's foolish to try to save $1m on oracle by going with postgresql.
Without *any* support for parallelism, without stronger replication, without better 3rd-party support (think of Toad for instance), without thousands of experienced developers & dbas out there, without more robust availability functionality - it simply isn't ready to tackle the biggest projects. Or those projects with extremely high availability requirements. Or relatively large reporting projects (no parallelism). Or just about any projects in a really dedicated single-vendor shop with its act together.
But that's ok - that leaves 30-50% or so of the other database work that it can do just fine right now. That's a huge market. And unlike mysql - if you construct a database application using postgresql and then later want to port it to oracle, db2, or sybase - it's just a normal porting of the application. You construct the architecture in just about the exact same way (for most applications anyhow), so the porting is straightforward. Not so in the case of mysql - where its severe limitations result in applications doing a ton of the database work - and porting ends up being a complete rewrite.
Even though the article is about an improvement in the PostgreSQL community, the comments are mostly pgsql vs mysql. People in the bazaar need to have personal motivations to work on opensource projects, mostly to have something against Microsoft, but increasingly, it is becoming a series of team wars. Linux vs BSD, then we had KDE vs GNOME and now qmail vs postfix and mysql vs pgsql. More than a decade ago we had vi vs emacs and BSD vs SYSV.
What the posters here need to realize is that it is exactly this competition that is driving the projects. If MySQL was not given the press and did not have its cult following, we would not see this pace of development for pgsql. The developments for FreeBSD really improved to compete with Linux although their developers claim they are not competing... they do have the fear Linux will supplant them.
What is interesting to note is that in most of these project wars, both projects really survive and get two different niches of their own. This was true of bash vs csh, BSD vs SYSV, BSD vs Linux, KDE vs GNOME, and now MySQL will become the standard entry level database and pgsql the higher level.
I use pgsql because my databases have complicated requirements that MySQL cannot meet. Yet MySQL is the quick and dirty solution when I have to set things up fast. For all new learners I always suggest MySQL. For people thinking of replacing or duplicating their ERP systems, I always suggest pgsql. I even know how to program in sleepycat's db and know where it should replace mysql in smaller embedded systems and where the mysql license cannot be used.
I believe this competition is coming to a close since pgsql has taken a big lead over MySQL in features, and therefore made itself more difficult to deal with especially for newbies. All I can say about the postgresql replication is bravo, and hope MySQL doesn't follow suit so it remains the simple fast and easy database in its own niche.
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
You are incorrect, and if you think about it, you'll realize that it's very simple to make sure that changes you make only have an effect on the state of your data once they are fully integrated.
One of my favorite ways to do this is with a table that consists only of a primary key and a "dirty bit". It makes my work harder, but the number of places that you make lots of changes that all depend on each other SHOULD be low. If that's the case, then the complexity that managing this state information adds will not be too painful, and actually makes you THINK about your data.
When you select data out of your database, you just make sure that you don't pull any data that's dirty. I've done this many times, and it keeps my eye on what I'm actually doing and why those changes matter to each other. That alone has saved me hours (perhaps days) of debugging.
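A minimal sketch of the dirty-bit pattern described above, using Python's built-in sqlite3 in autocommit mode so that no transactions are involved. The table and column names (`orders`, `order_lines`, `order_state`, `dirty`) are invented for illustration; the poster does not say what schema was actually used.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode: every statement
# takes effect on its own, as on a database without transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE orders      (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (order_id INTEGER, item TEXT);
    CREATE TABLE order_state (order_id INTEGER PRIMARY KEY, dirty INTEGER NOT NULL);
""")


def add_order(order_id, customer, items):
    """Write a multi-statement change, hidden behind the dirty bit until done."""
    # Mark the order dirty before touching anything else.
    conn.execute("INSERT INTO order_state VALUES (?, 1)", (order_id,))
    conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer))
    for item in items:
        conn.execute("INSERT INTO order_lines VALUES (?, ?)", (order_id, item))
    # Only the final statement makes the whole change visible to readers.
    conn.execute("UPDATE order_state SET dirty = 0 WHERE order_id = ?", (order_id,))


def clean_orders():
    """Readers join against the state table and skip anything still dirty."""
    return conn.execute("""
        SELECT o.id, o.customer
        FROM orders o
        JOIN order_state s ON s.order_id = o.id
        WHERE s.dirty = 0
        ORDER BY o.id
    """).fetchall()
```

If the writer dies partway through `add_order`, the half-written rows stay in the tables but remain invisible to `clean_orders`; cleaning them up is left to the application.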
PostgreSQL still needs better transaction handling before I'd call it ready for prime time. I found its limitations the hard way, attempting to port 13K lines of Oracle Embedded SQL. Imagine transaction A searching a database. Now you want transaction B to do something. It might be just two user queries open at the same time or one transaction reading one table while the second writes to another table. All well and good until one transaction closes. Then all transactions close! That just plain sucks.
However, I was able to port the entire application to Firebird--IMHO a far better database.
"Love is a familiar; Love is a devil: there is no evil angel but Love." --William Shakespeare ('Love's Labors Lost')
What's your beef with SQL Server? I worked with Sybase for years, and SQL Server is much better. As soon as the two of them split, every decision Sybase made was the wrong one. Replication and data transformation are dead simple in SQL Server. If it can handle a 300 gig database, it's more powerful than I'll ever need. It's easy to maintain with utilities like the SQL Profiler and the index tuning wizard. I'm not a DBA, and with SQL Server I don't need to be. Yeah, yeah, yeah, Microsoft shill, whatever....
two questions:
Is it a bug or just a known limitation?
Is this still an issue with current version?
*sigh* back to work...
What if your network connection flakes out for a second? Then what? That's why there is such a thing as replication, for all the times when simple copying doesn't work for some reason - there needs to be extremely thorough error handling and methods for getting the data back in synch.
This is like shooting fish in a barrel. I often don't like to harp on the OSSDB fanboys, but Oracle's database solution is second to none, and continues to pull away from the pack.
First things first. Online replication is generally considered by professional DBAs a fool's errand. You have to babysit it, and it fails at the drop of a hat for a variety of reasons. There are no good reasons to do replication in the manner they are talking about, unless that is your ONLY option.
There are, however, reasons to replicate data, usually one of two: availability or scalability.
To address availability Oracle provides several options that are just plain better than regular/triggered snapshot logging or materialized view refreshing over a network.
The best option is Oracle's Data Guard, which applies redo/archive logs to a duplicate remote database. You can perform this option at the logical and at the physical level, and you can choose the level of protection, from maximum/guaranteed all the way down to best effort. This option provides the ability to have an absolutely current, very warm site; a simple command and your database is up and running.
As for scalability, again Postgres or mysql doesn't hold a candle. There are too many options to list, so I'll discuss the big ones.
Partitioning/sub-partitioning of data. The way Oracle lays out its logical database block layer and physical OS block layer is absolutely perfect for being able to do anything you want with the database file layout. I can put my OLTP indexes and tables on fast raid10 devices, the historical and warehousing data on raid5 devices, but that's not all. I can increase parallelization of the hardware by putting a single table or index across N devices. The ability to sprinkle files and chop up data anywhere you want is just one thing that makes Oracle configurable, scalable, and great.
Real Application Cluster (was Oracle Parallel Server). This is a for REAL clustering solution. Oracle allows several servers (can be dissimilar in capabilities, i.e. some can have 64gig of memory and 12 processors, and then the others could be smaller dual processor machines.) to connect to the same storage (usually shared over a SAN or SCSI direct connect to EMC gear). Each of the servers is connected to a crossover/ipc LAN (we use gigabit) and now each of the servers has access to the same data. One node goes down or needs to go down for maintenance or reconfig, that's ok, the other nodes are online and traffic can be configured to automatically transfer over to the other nodes MID-TRANSACTION and pick up where it left off and the application is none the wiser (i.e. happens in seconds). The nodes share cached data over the fast network, so there is often little need to go to disk. This kind of scalability can not be found on any other database.
<rant>
The real gain for OSS and Oracle, is Linux and Oracle running on Linux. OSS databases are too immature to be let anywhere near real money. I'm not talking about ecommerce money, I'm talking about the millions and bajillions of dollars that flow like water through companies. Linux has Oracle validation and certification, which goes a LONG way in getting Linux into the real datacenters. The price point for the hardware, and the OS and the special deals that Oracle cuts for it are the true win for OSS. The performance is more than there for Linux/Intel solutions, and the price point for Intel hardware is very attractive to companies looking to cut expenses. You still have to pay homage to the Oracle and EMC gods, but even they have felt the crunch, and they too are providing competitive price points.
</rant>
So Postgres is one feature closer to what Oracle was several years ago. So what, this is embarrassing. Mysql has had transactions for how long? a few days? Please people, Oracle is not resting on its laurels waiting for anyone to catch up. They have real companies, with real money, that are real threats to them: IBM and Microsoft. Oracle is pushing the edge on the database front, and doesn't show any signs of stopping.
No.
Support contracts would seem to be a good way to gauge enterprise use and give some idea where the money^h^h^h^h^h I mean where people were putting their attention, and 'voting' in a way for better future releases. Support experience should ideally trickle back into new releases.
Be Free: Free Software Tuition
It didn't have row-level locking before? Okay, I can see that. But it didn't have FOREIGN KEYS??? And people used to bristle whenever someone suggested it wasn't a real database....
This wouldn't affect RDBMSs at all. Object prevalence is very fast, but the hierarchical databases before RDBMSs were very fast as well. The reason that RDBMSs took precedence was not speed, it was data integrity and data manipulation. It still has these advantages. If you just view an RDBMS as a bit-bucket, you should have just used hierarchical databases all this time, as they are much faster.
Engineering and the Ultimate
It is a limitation with PostgreSQL 6. I thought at first that I must be doing something wrong. Then I found Postgresql: Developer's Handbook by Ewald Geschwinde, et al (published by SAMS) at a local B&N and looked up transaction management. It spelled out the bad news in a way the PostgreSQL docs only hinted at--sure enough, you can't independently manage transactions. Close one and you close them all.
I haven't heard of anything being done to add this capability to PostgreSQL 7. I hope someone puts it on top of their list. There were things I preferred about embedded SQL with PostgreSQL vs. Firebird but PostgreSQL's limited transaction management killed all my work with it.
"Love is a familiar; Love is a devil: there is no evil angel but Love." --William Shakespeare ('Love's Labors Lost')
It is a known issue. However, nested transactions are currently being worked on...
LedgerSMB: Open source Accounting/ERP
Mr. or Ms. Coward is on to something here. Competition drives open source development teams, but, ideally, the competition doesn't drive them into head to head battles in the end, but drives them into different niches.
For my part, I run a site on top of MySQL. I started using it as a newbie to Linux RDBMSes. It's never going to be so big that I'll need to run anything else. However, I've got a new project on the horizon where Postgres is the perfect solution. The best thing is that in a Linux environment, there's much less lock-in than in other environments, so you aren't forced to be a partisan about things like this.
Online citizen journalism from the inner city: The View From The Ground
The increased load of checking this bit constantly is faster than the load of using transactions?
What about the decrease in code maintainability?
Also, what about updates to rows? If you insert a new row it's easy to mark something as not ready, but if you're changing multiple pieces of data, transactions are almost the only way to go. Otherwise you have to manually implement rollback in your code, and at that point, who's to say you won't lose the connection, or the DB won't shut down, at that exact moment?
Transactions aren't that hard to work with, (Gee this depends on something I did above, I better wrap it in a transaction.) That combined with Row Level locking, you should rarely, if ever run into a deadlock.
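For comparison, here is roughly what "wrap it in a transaction" looks like, again as a Python sqlite3 sketch. The `accounts` table, the `transfer` function and the CHECK constraint are assumptions chosen to force a failure halfway through a two-row update; the point is only that the database undoes the first update itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(src, dst, amount):
    """Change two rows; either both updates apply or neither does."""
    try:
        # The connection as a context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back.
        return False


def balances():
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

No hand-written undo logic is needed: when the second update fails, the first one is rolled back by the database, which is the case the dirty-bit approach has to handle in application code.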
Sorry, but me being a DBA, you just struck a cord with this one, You implemented a semi-cleaver hack, on top of something that you should have abstracted to the database, in the name of a performance increase you probably didn't need. If a developer where I work submitted a new schema that used that method without a REALLY good reason, It would never make it to a production server, the data just couldn't be trusted in the case of a failure.
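For anyone who wants to see the multi-row case concretely, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in engine (an assumption made so the example runs anywhere; the thread is about PostgreSQL, MySQL and Oracle), and the accounts table and transfer function are hypothetical names invented for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two rows; both updates apply or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")

# Hypothetical schema, in-memory so nothing touches disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

transfer(conn, 1, 2, 30)       # succeeds: both rows change together
try:
    transfer(conn, 1, 2, 500)  # fails midway: both updates are undone
except ValueError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)
```

The failed transfer leaves no half-applied state and needs no dirty bit; the rollback is the database's job, not the application's.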
Not functional yet, but heh, every project's got to start somewhere. </shameless plug>
SQL Server will never run on any version of UNIX. AFAIK, there aren't even (MS-supported) SQL Server client libraries for non-Windows platforms. I realize that FreeTDS is available, but such a library would never be used in a highly critical sector.
If you have to integrate multiple platforms, you cannot use SQL server. Closest similar product is Sybase ASE, but Microsoft broke Sybase compatibility on purpose.
I think if people understood how irrationally obstinate SQL Server's platform dependence was, they would look elsewhere. I hope that this attitude holds them below 10% penetration - it certainly has up to now. They are a bit player.
In any case, there is a cheap, new version of DB2 out for $500/copy.
You're reaching for a problem. Suffice to say that I've been doing this for years, and in those cases where I care, I do this and it works just fine, and plenty fast enough. YMMV
Everywhere else, I structure my data so that atomicity on a statement level is all that I care about.
Way cool.
However, the only other RDBMS's to do this (to the best of my knowledge) are all closed source or otherwise very much non-free.
File under 'M' for 'Manic ranting'
Yeah, that would be a mistake. I suggest not doing it that way.
Honestly, has no one ever written code like this? I cannot be the first person on the planet to have decided that letting the system decide how I might want to approach data integrity was a bad thing....
That's also true for Oracle. DDL statements just aren't transactional, and that's not a problem. (Some might be in PostgreSQL...I seem to recall something weird about that. But I don't consider it an advantage.) If you care about DDL statements in transactions, you're doing something seriously wrong. It's something that you do manually (not from an application!). You don't have simultaneous DDL statements going on. And you should be able to figure out the opposite statement yourself.
As for the rest, that's pretty bad. I design my databases with data types for a reason, so when the database doesn't enforce it...ugh. And inserting anything other than what I say (-99.99 instead of -100) is no good, even if I told it to do something impossible (it should throw an error instead).
This from a proud user of PostgreSQL (and Oracle at work).
You should always try to build as many of your rules as possible into the db app server. That's what it is, an application server; don't put all your logic in your client app.
Yeah, the DB vendors love for you to do this because it locks you in to their DBMS -- portability of stored procedure (etc) code not being one of the database world's high points.
That said, you ought to be able to rely on the DBMS to enforce at least some constraints, and you should be taking advantage of at least the minimal subset of same that all real DB servers support. (And if you're building an enterprise-specific application where portability is much less of an issue than performance, go ahead and embed the code in the DB.)
-- Alastair
This is not Highlander; there can be more than one.
MySQL is a good solution for some tasks. Postgres is a good solution for some other tasks. I swear people are so odd. There can be room for more than one OS, database, office suite, and CPU. I really like Postgres and use it for our in-house database. I use MySQL for our website's database. Why? Because it is what our ISP provides and it works.
How about this... Learn Both.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
> I structure my data so that atomicity on a statement level is all that I care about.
So, you throw away the R in RDBMS, just for a small potential performance gain. You then hack at basic transaction features. You don't have atomicity, but you have a way of telling if you've screwed up a record.
But then what? What if you do screw up a record? Do you have a query that runs at startup and goes and deletes these now unsafe records? Is it okay that you've mangled a record (ie, dirty bit set, data may or may not have been written)? These can't be important records then. Not customer data or anything.
You'd be better off taking the money that pays for this extra, wasted coding and simply buying a slightly faster CPU, or more RAM, if you feel speed is an issue. That way you'd get the safety of transactions. You could also use the relational aspects of the database and make your queries run a *lot* faster. (You can't index properly (quickly) between tables unless they're keyed with a real foreign key, not just the same piece of data in both tables but unlinked.)
So when data are in the process of changing, they just don't show up in queries? That's horrible. And what happens if the system goes down in the process of changing? You can't roll back; the old data are lost. And you have to manually go in and clear the dirty bit for the broken, half-new data to be even remotely accessible. Or code logic checking for this everywhere, which would be a nightmare of duplicated, unnecessary code.
With this method, you will eventually run into a case where you will have to pull an old backup because a transient failure caused your stupid method to corrupt a lot of data. Transactions are pretty much essential to ensure correctness; you'd have to reimplement the transaction system to get its guarantees, and it's much smarter to use the one that's already there and tested.
And what about procedures that need a consistent view of the database? There's more in ACID than the "A". Transaction isolation is necessary for a lot of applications. It can ensure that multiple queries run against the same set of data.
If people aren't thinking about their data and doing stupid things, that's entirely separate from their using transactions. There are methods for rigorously proving that your transaction use is correct. I strongly recommend to you that you study them. You sound like you care a lot about correctness. You're not achieving it at all now. If you used transactions, you could.
Does mySQL have an embedded SQL preprocessor yet?
Until it does, my C-with-embedded-SQL code will continue to run fine with Postgres, Oracle, Sybase, DB2 and other real database servers, but not mySQL.
(Okay, there are a couple of very minor differences between some of the above, in particular slight variations on the EXEC SQL CONNECT syntax between Oracle and Postgres, but that's easy to code around. The rest is the same.)
-- Alastair
By forcing myself to deal with those consequences manually by doing my own locking and my own data-integrity management, I find that I can rely on my data far more than most people can, and the likelihood that one of my programs is going to "go bad" and rip out whole transactional units just because a non-essential field was initialized oddly is much, much lower.
And since your ad hoc locking and transaction management code never contains bugs (unlike the piddly implementation in the RDBMS that's only been tested by millions of users), everyone lives happily ever after.
No, you aren't the first. And yeah, any SQL coder knows what you'd have done. Keep all the data that must change together in one table and use a single update statement to change it, relying on the fact that individual statements are atomic.
That way, if you were changing someone's address, and phone number, you'd be sure to have a valid address, and a valid phone number. They might be out of sync, but then you'd have a dirty bit to indicate it.
It's pretty easy to do this, with minimal risk, on inserts. If you get into deletes you can quickly end up with addresses for clients who aren't in the DB anymore and so on.
However, the risk, while minimal, is there. This is an okay idea for goofing around on your website where all you lose is a link to a new picture or something, but is totally unacceptable for a commerce site.
However, your attitude is/was common. I used to see it in the context of people putting inline assembly in their C code. Whenever they knew a clever trick to save a cycle or two they'd use it. The silly thing is that compilers are usually better at the silly tricks. Compilers could use the LEA instruction to perform basic integer multiplication just as well as people (and did, when integer multiplication was slow and cycles mattered). Now, we're not all using the same CPUs as we were back in the 486 days. What's quick on an Athlon might save work on a P4, and might hurt performance on a P3. Too many variables to keep in your head, and code too ugly, for the incredibly minute gains. Especially since people didn't profile their code before going and optimizing it.
Now, the "proper" thing to do is write the cleanest code you can and let the compiler have a shot. If it runs too slowly, consider throwing more CPU at it. If that fails, profile it and try to fix it with a new algorithm. Only after all that should you go in and hand-tweak certain sections.
Honestly, has no one ever written code like this? I cannot be the first person on the planet to have decided that letting the system decide how I might want to approach data integrity was a bad thing....
Yeah, I think you might be alone here. I cannot imagine how I could safely update records in my database without transactions. It's not unusual for me to have 25 or so dependent queries from eight application servers (plus external sources) that absolutely must be done atomically. No amount of coding at the application level could get this done safely.
RDBMS theory works, and people spend a lot of time at that layer making it work so you don't have to.
Similarly, filesystems work, and people spend a lot of time at that layer making it work so you don't have to (but you wouldn't be the first person to use a raw partition on a UNIX system thinking you could do better than a filesystem).
-- The world is watching America, and America is watching TV.
I find it funny that so many journalists always say that KDE/GNOME or Postgres/MySQL should be stopped and only 1 solution should be allowed. Yet, at the same time, they quickly point to all the different applications in the windows space.
Myself, I use Linux/KDE/pgsql ~ 95% of the time. But there are times where I like BSD (awesome security), Gnome (I like their simple interface and their apps are nice in a number of areas), and MySQL (want a fast mostly-read DB? Nothing beats MySQL in the true relational arena (dbm/gdbm/sleepycat can for simpler relations)).
Let's hope that real competition never stops.
I prefer the "u" in honour as it seems to be missing these days.
Still, I guess I learned a lot from going with PostgreSQL so I'm not too upset.
When you select data out of your database, you just make sure that you don't pull any data that's dirty.
To reiterate what an earlier poster asked -- how do you stop somebody else from pulling dirty data?
If you and your app are the only client the DB has, that might be okay. It's hardly a general solution though.
-- Alastair
My only hope is that someone at PostgreSQL Inc will read what you are saying. The PostgreSQL ORDBMS needs the same hype that Linux has already gotten. Otherwise, sooner or later on /. we'll see stupid "PostgreSQL is dying" comments in the same style as the ones posted about *BSD.
Seriously, BSD is not that bad, so why is it not as popular as Linux? B/c there is virtually no publicity around it. Say "open source operating system" to any boss, and the reaction will be "Linux?". Now say "BSD". The answer will be "BS-who?". In the same way, if you say "open source DBMS" to any boss, the reaction will be "MySQL?". But if you say "PostgreSQL", the reaction will be "Postgres-who?".
I understand that PostgreSQL Inc needs money for publicity, and the only way to get big money is to make PostgreSQL interesting to big software corps like IBM or HP. Why did IBM choose to support Linux rather than BSD? B/c the BSDL doesn't protect their source code contributions. Here might be the key: the license. Perhaps PostgreSQL Inc should reconsider their BSDL license. Well, that might be impossible, as most of their core team hate the GPL (that's why they usually hate Linux).
Less is more !
... More likely the authors had taken the Graduate Records Exam after college and done very poorly, so they decided to write database software instead.
Thus it really is Post GRE SQL.
Dude-
As a long-time Oracle user, developer, and DBA, I can say that I think you are misguided in this attempt.
In essence, in a few (relatively) short lines of code you are attempting to replace that which Oracle has spent many years developing and proving in thousands of installations.
Your "lack of faith" (for lack of a better term) of Oracle's transactional capabilities, in my opinion, is misguided. As well, your implementation has integrity and performance concerns. (Mind you, if you're doing small, uncomplicated apps, it might not be that big of a deal, but if you're using Oracle, then I'm assuming that it IS an issue).
Your implementation will not have the capabilities or features of Oracle's implementation (distribution, fault-tolerance, high availability, performance, etc.), is open to potential data corruption due to failure, and it will take much longer to develop the application. You say you work hard on the implementation, but I'm thinking that you're working TOO hard.
Also, transactions and rollbacks are a great tool if/when used appropriately.
$0.02 (CDN)
Just to clear up any confusion:
I'm developing a high volume, high availability site where SELECT speed and scalability are very important. I originally investigated Postgres because it supposedly uses less locking than MySQL does (when using the InnoDB transactional tables) but I decided against Postgres because of lack of a free replication solution. It would be easier to do a huge, multi-server site like Hotmail, Ebay, finance.yahoo.com, etc. if you have replication.
I'm now wondering if I should make the switch to Postgres. Where are some of the most recent, reliable benchmarks for MySQL vs. Postgres SELECT performance using transactional tables for typical web applications?
Also, I'm using some MySQL specific features like AUTOINCREMENT. Is the Postgres trick to doing AUTOINCREMENT just as fast as MySQL (or fast enough, anyways)?
Does Postgres have something like MySQL's convenience variables?
I think I might be much happier with Postgres's feature richness if I can learn more about it . . . .
from the mySQL docs: "Note that MySQL allows you to store certain 'not strictly' legal date values, for example 1999-11-31. The reason for this is that we think it's the responsibility of the application to handle date checking, not the SQL servers. To make the date checking 'fast', MySQL only checks that the month is in the range of 0-12 and the day is in the range of 0-31."
The 0/0/0000 is the same as a NULL, IIRC
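To make the difference concrete, here is a small sketch contrasting the range-only check described in that quote with a calendar-aware check. It is plain Python with no database involved; loose_check and strict_check are hypothetical helper names, and loose_check only mimics the rule as quoted above, it is not MySQL's actual code.

```python
from datetime import date

def loose_check(year, month, day):
    """Range-only rule as quoted: month within 0-12, day within 0-31."""
    return 0 <= month <= 12 and 0 <= day <= 31

def strict_check(year, month, day):
    """Calendar-aware rule: the date must actually exist."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False

# November has 30 days, so 1999-11-31 is not a real date.
loose_result = loose_check(1999, 11, 31)
strict_result = strict_check(1999, 11, 31)
print(loose_result, strict_result)
```

Under the loose rule the application has to catch 1999-11-31 itself; under the strict rule the bad value never gets stored.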
if (!signature) { throw std::runtime_error("No sig!"); }
More apps should do this, such as DB2. Point is gotta think outside the box a little.
Hardly - I'd say it is MySQL that owns the low end, not MSFT. PostgreSQL is competing with MySQL and MSFT on the low end, and Oracle and DB2 on the high end.
I work for the City of Newport News, Va. We're using PostgreSQL for some stuff here and getting ready to use it for some high profile application. Unfortunately, we're using SQL Server along with Laser Fiche. Did you guys write your own imaging software?
Perhaps I'm too stupid to understand, but here goes.
Why would you need/want a transaction for a query? I don't see any benefit there. If you're not updating any data, you're wasting time and resources using a transaction.
I'm not arguing that a lack of concurrent transactions doesn't drastically reduce the usefulness of the database. I just think that you used a bad example...
They who would give up an essential liberty for temporary security, deserve neither liberty nor security
That being said, nowadays I tend to use MySQL for a simple datastore (eg, limited writes and light reads) or DB2 or Oracle for heavy duty work. All of the above three could learn a thing or two about user friendliness... I especially like DB2's error messages: "Err 560: DB2MAN" (made up error message, but representative of the cryptic shit DB2 throws out).
I am about to start some work on Postgres again after a couple of years away from it. I am glad to see it has progressed.
Maybe that was a bug in earlier versions, but it works for me using Query Analyzer and SQL Server 2000.
DECLARE @cnt INT
SET @cnt = 2
PRINT CAST(@cnt AS VARCHAR) + ' RECORDS TO GO'
Returns: "2 RECORDS TO GO" (and no error messages)
Transaction? No. But a LOCK can be handy (if implemented correctly, of course) to do a query of data from a fixed point in time. Perhaps that's what the OP meant. I sincerely hope so.
Before we start, remember: a database is a collection of data. A DBMS is the system used to store, retrieve, maintain, etc. the data IN the database. A Relational Database Management System is a DBMS which relies on predicate logic, logical data independence, and set theory to maintain the data in the database. Codd invented this in (I think) 1969/70.
This makes no difference if you write your application correctly and check your data going in for VALIDITY. It shouldn't be the DB server's job to enforce this...
No. No, no, no, No, NO, NO. This backwards, uneducated thinking is why DB-driven applications are increasingly becoming more bloated, buggy, and just plain wrong.
The Relational Database Management System (RDBMS) is NOT just a place to stick data in. Codd, fed up with basically inventing a DBMS every time an application was written and being forced to implement checking and other things in the application, penned his ideas on Relational Theory.
The RDBMS, through application of predicate logic, guarantees correctness (consistency) of your data. It's that simple.
If you don't allow the RDBMS to guarantee your data is correct, then you are a fool, ignorant, or both.
the database shouldn't know that such and such is not a valid doohicky because the wotsit is set to foo.
*jumping up and down mad*
(ignoring the misuse of 'database') YES IT SHOULD! That's the WHOLE POINT of a DBMS! Certainly a good portion of logic should live in your application. But 'business rules', constraints, anything that deals with your data etc. BELONG in the RDBMS, because how else are you to ensure that the data you pull out of it is correct? You can't.
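As a rough illustration of rules living in the DBMS, here is a minimal sketch of declarative CHECK constraints. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption for portability of the example), and the orders table with its columns is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the rules are declared once, next to the data.
conn.execute("""
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        quantity INTEGER NOT NULL CHECK (quantity > 0),
        status   TEXT    NOT NULL CHECK (status IN ('open', 'shipped', 'cancelled'))
    )
""")
conn.execute("INSERT INTO orders (quantity, status) VALUES (3, 'open')")

# Any client, in any language, gets the same refusal for bad rows.
rejected = []
for quantity, status in [(0, "open"), (5, "lost")]:
    try:
        conn.execute(
            "INSERT INTO orders (quantity, status) VALUES (?, ?)", (quantity, status)
        )
    except sqlite3.IntegrityError:
        rejected.append((quantity, status))

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, row_count)
```

The point of the sketch is only that a second application hitting the same table cannot bypass the rule, which is the case the application-side approach has no answer for.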
Thanks,
--
Matt
If the app attempts to put bad data in the db, you're in trouble anyway. No amount of db-logic is going to solve this.
I suspect this boils down to dbas vs software developers, and the right answer is somewhere in the middle. But I don't see a good reason to put data integrity logic (beyond transactions) in the database and I do see a good reason (db platform independence) to put it in the application. This is assuming you have only one application that accesses the db, as is often the case.
I feel better getting that off my chest. Now let me school you little punk beotch newbie dba-wannabes on the whats and wherefores of enterprise class database administration. Rule #1: Protect Your Data! There is no rule #2. The real reason so many corporations use Oracle is that, if you know what you're doing (and so many of you obviously haven't the first clue), Oracle RDBMS will always be able to recover any committed transaction no matter how severe or catastrophic the failure. Can you say, "archive log mode?" If not, I cannot in good conscience say that you can protect a company's most valuable asset--its data.
And another thing, while I'm hot on a rant, protecting the data against server or disk failure is one thing, but protecting it against the vagaries of doofus programmers is quite another thing entirely. A good dba accomplishes this via something called database constraints, at which Oracle excels. Can you say, "foreign key?" If not, eventually your database will resemble swiss cheese, with less referential integrity than President Bush's Iraqi WMD speeches.
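For readers who have not used them, here is a minimal sketch of what a foreign key buys you. Python's built-in sqlite3 module stands in for the engine (an assumption; SQLite needs enforcement switched on per connection, which Oracle and PostgreSQL do not), and the customers/addresses tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE addresses (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        street      TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO addresses (customer_id, street) VALUES (1, '1 Main St')")

errors = []
try:
    # No customer 99 exists, so this address would be an orphan.
    conn.execute("INSERT INTO addresses (customer_id, street) VALUES (99, 'Nowhere')")
except sqlite3.IntegrityError:
    errors.append("orphan insert")
try:
    # Deleting the customer would strand the address that points at it.
    conn.execute("DELETE FROM customers WHERE id = 1")
except sqlite3.IntegrityError:
    errors.append("parent delete")

print(errors)
```

Both statements are refused by the engine itself, so no amount of sloppy client code can leave addresses pointing at customers who are gone.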
Finally, everyone is always comparing MySQL and PostgreSQL's runtime performance to Oracle's. Please keep in mind that a database with only a handful of tables and no referential integrity is little more than a file system, and I do not care how many millions of records you stuff into the tables. Grownup databases contain hundreds of tables with multiple schemas and very complex data models. Can MySQL or PostgreSQL handle thousands of requests per second on a grownup database? I would really like to find out for myself.
Finally finally, does PostgreSQL's feature list truly compare with Oracle's? Here is a short list of Oracle features that I cannot live without:
So Postgres is one feature closer to what Oracle was several years ago.
And, while this may come as a surprise to you, most people today are running Oracle using the same feature set that Oracle had several years ago: software doesn't get hacked up to use every latest feature every time Oracle ships a new version. Therefore, when you (a UNIX/Oracle specialist) are saying that Postgres is where Oracle was a few years ago, that means that Postgres is technically good enough for most applications.
Oracle is pushing the edge on the database front, and doesn't show any signs of stopping.
Perhaps not. But the ability to absorb, and need for, new features in database based applications is slowing down.
A transaction can provide the ACID properties throughout the duration of the transaction. Most importantly for read-only operations, you have Isolation. Isolation lets you perform your queries without being affected by ongoing writes.
This is better than a lock, which another poster suggested, as a lock will block all writers, while a transaction need not block anybody.
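Here is a small sketch of that read-only case. It assumes SQLite in WAL mode (via Python's built-in sqlite3 module) as a stand-in for an MVCC engine; the mechanics differ from PostgreSQL's, but the visible effect is the same kind of thing: the reader keeps a consistent view and the writer is not blocked. The foo table and the temp-file path are made up for the example.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None: the module issues no implicit BEGIN, so
# transaction boundaries below are exactly the ones we write.
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE foo (n INTEGER)")
writer.execute("INSERT INTO foo VALUES (1)")

reader = sqlite3.connect(path, isolation_level=None)
reader.execute("BEGIN")  # the snapshot is taken at the first read below
before = reader.execute("SELECT COUNT(*) FROM foo").fetchone()[0]

# A concurrent write commits right away; the open reader does not block it.
writer.execute("INSERT INTO foo VALUES (2)")

during = reader.execute("SELECT COUNT(*) FROM foo").fetchone()[0]
reader.execute("COMMIT")
after = reader.execute("SELECT COUNT(*) FROM foo").fetchone()[0]

print(before, during, after)
reader.close()
writer.close()
```

Inside the transaction both reads agree with each other even though a write landed in between; the new row only shows up once the reader's transaction ends.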
I honestly do not see MySQL as a serious RDBMS engine. It doesn't support many things necessary in order to really manage the data in the backend. The whole "integrity is optional" stance regarding transactions, foreign keys, etc. is a problem for many serious applications, and PostgreSQL fills this need very well.
OTOH, MySQL is an amazing lightweight datastore system exactly because it treats integrity as optional. Memory-only tables are certainly not ACID compliant, but they can be nice programmers' tools.
LedgerSMB: Open source Accounting/ERP
> I don't see a good reason to put data integrity
> logic (beyond transactions) in the database and I
> do see a good reason (db platform independence) to
> put it in the application.
Been doing this for over half my life now - relational databases have been around since 1980, and haven't really changed that much. Meanwhile I've seen a wide variety of languages come and go. The reality is that the application is far more volatile than the database.
So, you're far more likely to want to add a .NET application to your Oracle database currently front-ended with Java than you are to swap the Oracle database for SQL Server. And when you add that second application - you'll almost guarantee inconsistent data.
Stored Procedures aren't very portable, but then again simple check constraints and simple procedures (consisting mostly of sql) are. Easily built, easily ported, and allow n-number of applications front-ends.
That's the way to go if you want to see your data survive technology changes.
Online replication is generally considered by professional DBAs a fool's errand.
Not if you use a capable replication solution (see rep server and open switch).
Thanks,
--
Matt
Strictly speaking, it still doesn't. Using InnoDB tables kills most of the advantages of using MySQL, so what's the point?
sic transit gloria mundi
That's the reason MS SQL Server is gaining ground over Oracle, and what PostgreSQL should target in order to achieve higher annual growth.
In plain relational databases Oracle is better and Postgres is more cost effective; however, the hot stuff now is business intelligence and data mining.
We are Turing O-Machines. The Oracle is out there.
2. PostgresQL does support different concurrent transactions on different connections to the database or different users
This is the kicker. I'm still on 7.2, maybe this is better in 7.3 and 7.4, but if you have connection A to database and connection B to database, you wind up with
A: BEGIN
A: DELETE FROM FOO...
A: INSERT INTO FOO...
B: SELECT * FROM FOO... (or maybe an INSERT)
A: >
before you can commit, so you lose the transaction. Depending on the frequency of data writing, you might have to give up transactions entirely if you want to write, or forever be betting the race condition.
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
the last 'A:' said 'TRANSACTION ABORTED' before slashcode had its way with it...
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
I think you may have misconstrued what I was saying. Read what I said, "...Postgres is one feature closer to what Oracle was several years ago." In other words Postgres today, isn't even close to what Oracle was several years ago.
You bring up another good point that only surprises me in that it is in support of Oracle. Oracle has backwards compatibility, and a migration option to get customers on the new version. The same feature sets are available across all newer versions of Oracle. You are right that software doesn't get hacked up to use the latest features, but I'll be damned if I have to hack around incomplete SQL support, not-good-enough backup and recovery options, limiting/poor performance tuning variables and views, and virtually nonexistent scalability.
Oracle is there for when organizations need to grow up, and take advantage of real features such as rock solid point in time backup and recovery, built in scalability (not something that is bolted on and an after thought). These are features that are absent from open source offerings, and demanded by companies that recognize a need for them.
In favor of Postgres, theirs is the only open source contender I have ever seen that as a development culture kind of "gets it" as far as where a database engine needs to be. (i.e. Postgres seems like Oracle compared to MySQL.) However, they are still on the D- end of the bell curve.
Postgres is technically good enough to provide a SQL interface for manipulating data for most applications. However, this is only half the answer for professional organizations, but worse it's akin to saying "Look, I can brush my teeth and tie my own shoes." It's not impressive, it's expected.
<letmetellyouastory>
I used to be a mysql fan boy. It was really cool. I felt cool, because I could do all of these things, and feel really clever because I could do them for free. Then I got a job as a webmaster (used to mean *nix/oss/web/network ninja) for a real company, and all they used was Oracle. They just wanted me to make those annoying web customers shut the hell up. It was about a month's worth of work, and then it was boredom.
I started volunteering to do other work that popped up during the Monday meetings. Applying patches doing admin work to Unix OSes I hadn't touched (i.e. HPUX, AIX, OSF). I then started watching the Oracle guys, and figured out what they did. I told them, that looks like MySQL. They thought I was cute, when I explained to them what it was. They then explained that MySQL sounds like a SQL interface to flat files, which it mostly was, and still is.
Long story short, I started installing Oracle on our sandbox machines, and trying to do things with it. I thought it was incredibly complex for what it was doing. The DBAs took notice that I was becoming mildly interesting, and had the company pay for Oracle training.
When I went to training I was enlightened. Not because the training is stellar, but because I then realized all of the stuff I had given no thought to. Every day I was thoroughly impressed with what was going on. "This is how a database is supposed to be run," I thought. MySQL is light years behind. If you really think about what is going on and how they are doing things, you begin to marvel at the ingenuity and understand the reason someone is rightfully asking for money.
I had my classes, I took the certification tests (mandatory to be in the DBA department), got my pay raise bump, and I haven't looked back.
</letmetellyouastory>
No.
I know. I know. It's like I said. Embarrassing. For a long time the MySQL culture didn't recognize the need for, or even understand, ACID or anything transaction related. It was "a waste of time", I think they said, since an application developer could attempt to build their own implementation into their application if they needed it.
<sarcasm>
No need to put cruft like transactions or SQL92 support into an RDBMS
</sarcasm>
No.
I'll have to take a look at this. I don't know enough about Sybase to speak intelligently about it.
However, my experience with online replication is of the weird things that bite you in the ass, like storage constraints, network problems, changed passwords, weird locking issues, and other bugaboos that made it a big pain in the ass.
It was a far better idea to replicate the entire database offsite via redo/archive logs, than to try to run many online at the same time and keep the transactions in sync between them. This is just my experience your mileage may vary, and Sybase may have a for real solution to this problem, however, I am suspect when they mention store and forward queues as "the answer".
No.
Want to do hot backups? Then all of your tables must be transaction-aware InnoDB tables (and you have to pay for the privilege of course). Ever perform multi-row inserts? All of the affected tables should be transaction-aware InnoDB tables. If any of your data relates to each other (*cough* relational database *cough*), you need foreign keys. Here come those InnoDB tables again. Got multiple clients modifying the data at the same time? InnoDB again.
So basically what you are saying is that InnoDB tables should be used unless the table in question has no data interaction whatsoever with any other table, should only ever have one-row-at-a-time inserts, single-user access, etc. So...ummm...why would anyone use non-InnoDB tables?
I don't browse Yahoo Finance so I can't comment on that site or the scope of MySQL's use. As far as Slashdot...ummmm... Are you kidding!?! You are using Slashdot as a metric for the reliability of MySQL? Slashdot is your example of 24/7 uptime and consistency? The site that's regularly down and relies on static renderings of their site to avoid advertising the outages: this is one of your great examples?
I must say that I greatly enjoyed InnoDB's benchmark page. The comparison in E-Week is conveniently listed next to the PostgreSQL comparison giving the impression that E-Week also prefers MySQL over PostgreSQL even though PostgreSQL wasn't in the review. Nice marketing indeed. That aside, I also like the PostgreSQL comparison section: only two years old after all. Neither package has changed substantially in two years have they? <sarcasm>And sure, I could be persuaded that the one test query is indicative of all normal queries one would encounter in normal database operation.</sarcasm> Could it be that perhaps a certain group went hunting for items that InnoDB was markedly better at and avoided the items in which it was weaker. Naw. Couldn't be.
And this part I love the best: not only was the hardware the same, but the tuning techniques were identical. 24MB shared memory buffers. That's it on a 512MB test server with two tables of 100,000 rows. Hunh? No, that doesn't sound like they tuned MySQL overall to the detriment of PostgreSQL. How well do you think MySQL+InnoDB would fare if the app tunings were done with PostgreSQL in mind and just mapped the same settings to MySQL+InnoDB?
Note: The shared buffer setting has little to do with the overall memory usage in a working system (and even less in this benchmark). Why? After shared buffers are filled up, the OS will start to aggressively cache filesystem access (like for example, the ever-accessed database tables). MySQL is written with this situation in mind. PostgreSQL depends more heavily on the shared buffers setting (among others). If both DBs are using all available memory -- which is likely when querying a couple of 100K-row tables -- which database do you think will perform much better with the artificial constraint of a 24MB shared buffer?
And don't start with the "harder to configure" crap. This wasn't an out-of-the-box install of MySQL+InnoDB here. Someone specially configured the 24MB buffer. It took as much effort as it would to properly configure PostgreSQL. There is a wealth of information about tuning PostgreSQL for hardware, for good database organization techniques that work for more than just PostgreSQL, and of course the main PostgreSQL technical document site. (All of this was found in less than two minutes.)
- I don't need to go outside, my CRT tan'll do me just fine.
How is my post a troll? The information is taken from their website.
And this needs to be repeated far and loud: If you aren't using 100% InnoDB tables, you are royally screwed. Folks still running any MyISAM tables take note!
It should be noted, however, that PostgreSQL was proportionally even further behind Oracle in the past. They're far behind, but I think they're running faster.
"defacto standard" is a bit of hyperbole. I like the PostgreSQL team's insistence on transactions and integrity, unlike MySQL's original denial that these mattered (which pretty much destroys any credibility they might have had in my mind, even if they now support transactions with InnoDB). But PostgreSQL is surprisingly primitive in some respects.
I was trying to write an OLTP application with 7.3.4 and the current API does not support bind variables. Most OLTP queries will use the same SQL repeatedly, with some variables changing for each transaction.
The difference between sending:
('insert into foo values (:1)', 42)
('insert into foo values (:1)', 137)
('insert into foo values (:1)', 69)
and
'insert into foo values (42)'
'insert into foo values (137)'
'insert into foo values (69)'
is that in the second case, as the SQL text varies for each request, it has to be reparsed and a new optimizer plan recomputed for each query, adding tremendous overhead.
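To make the contrast concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in, since that runs without a server; it is not the PostgreSQL API, SQLite's placeholder is `?` rather than `:1`, and the function names are made up for illustration. Whether a parsed statement or plan actually gets reused depends on the driver and the server; the sketch only shows that the SQL text stays constant in one case and changes in the other.

```python
import sqlite3


def insert_with_bind(values):
    """Insert each value using one fixed SQL text plus a bound parameter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table foo (i int)")
    # The SQL text is identical on every iteration; only the bound value changes.
    sql = "insert into foo values (?)"
    for v in values:
        conn.execute(sql, (v,))
    rows = [r[0] for r in conn.execute("select i from foo order by i")]
    conn.close()
    return rows


def literal_statements(values):
    """Build the second style: a different SQL text for every value."""
    return [f"insert into foo values ({int(v)})" for v in values]


if __name__ == "__main__":
    print(insert_with_bind([42, 137, 69]))
    for stmt in literal_statements([42, 137, 69]):
        print(stmt)
```

With three values, the first style sends one distinct SQL string three times, while the second sends three distinct strings, each of which has to be parsed from scratch.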
PostgreSQL 7.4 (currently in beta) fixes this, but to me it shows a certain level of immaturity in the product for high-performance applications.
I am sure PostgreSQL will get there eventually, but it will take a while. UNIX did not become an enterprise-class OS overnight either.
I grabbed full copies of Oracle 9i (3 cds) last night so I'm just awaiting the opportunity to install and try them out.
Transaction? No. But a LOCK can be handy (if implemented correctly, of course) to do a query of data from a fixed point in time. Perhaps that's what the OP meant. I sincerely hope so.
Yes, a transaction. I suggest you read up on MVCC, which is supported by PostgreSQL, Firebird, and Oracle.
I don't understand what you mean. Can you post some SQL code, along with the result compared to what you expect?
Here's what I did:
A: create table foo(i int);
A: insert into foo values(1);
A: insert into foo values(2);
A: insert into foo values(3);
A: BEGIN;
A: delete from foo where i = 2;
A: insert into foo values(5);
B: select * from foo;
A: COMMIT;
and everything went through without errors. I now have a table 'foo' with tuples 1, 3, and 5.
What would you expect?
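For anyone who wants to replay the A/B sequence above without setting up a server, here is a rough sketch in Python using the standard-library sqlite3 module with two connections to one temporary database file. That is an assumption made for convenience: SQLite is not PostgreSQL, and its locking is not MVCC, so this only reproduces the visible outcome of this particular sequence. The function name is made up.

```python
import os
import sqlite3
import tempfile


def run_sessions():
    """Replay sessions A and B; return what B sees before and after A's COMMIT."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None means transactions start only on an explicit BEGIN.
        a = sqlite3.connect(path, isolation_level=None)
        b = sqlite3.connect(path, isolation_level=None)

        a.execute("create table foo(i int)")
        for v in (1, 2, 3):
            a.execute("insert into foo values (?)", (v,))

        a.execute("BEGIN")
        a.execute("delete from foo where i = 2")
        a.execute("insert into foo values (5)")

        # B reads while A's transaction is still open.
        during = sorted(r[0] for r in b.execute("select * from foo").fetchall())

        a.execute("COMMIT")

        # B reads again after A has committed.
        after = sorted(r[0] for r in b.execute("select * from foo").fetchall())

        a.close()
        b.close()
        return during, after


if __name__ == "__main__":
    during, after = run_sessions()
    print("B during A's transaction:", during)
    print("B after A's commit:", after)
```

In this sketch B's first select returns 1, 2, 3 (A's uncommitted changes are not visible to it), and the second returns 1, 3, 5.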
Jeff
Social scientists are inspired by theories; scientists are humbled by facts.
Two other MySQL products I found interesting (neither of which is open source at this time):
- CLUSTERING IN TUNE WITH APACHE AND MYSQL (Free registration might be required. Also see Emic Application Cluster for MySQL)
- InnoDB Hot Backup (with point in time backup)
The rest of this comment is quoted verbatim from InnoDB News: MySQL/InnoDB-4.0.1 and Oracle 9i win the database server benchmark of PC Magazine and eWEEK. February 27, 2002 - In the benchmark eWEEK measured the performance of an e-commerce application on leading commercial databases IBM DB2, Oracle, MS SQL Server, Sybase ASE, and MySQL/InnoDB. The application server in the test was BEA WebLogic. The operating system was Windows 2000 Advanced Server running on a 4-way Hewlett-Packard Xeon server with 2 GB RAM and 24 Ultra3 SCSI hard drives.
eWEEK writes: "Of the five databases we tested, only Oracle9i and MySQL were able to run our Nile application as originally written for 8 hours without problems."
The whole story. The throughput chart.
Trusted Computing FAQ | Free Dawit Isaak!
Screw free JBuilder. Use Eclipse. Or try JDeveloper from Oracle. It's free for dev use.
My blog: http://jkratz.dyndns.org/~jason/blog/
Good point! Makes me feel better about the way I'm leaning - for the particular applications I'm working on (see sig), I know the app will outlive the db, so I'm safe.
For the vast majority of business applications, I think you're 100% correct.