Slashdot Mirror


Open Source Database Clusters?

grugruto asks: "A lot of open source solutions are available to scale web sites with clusters but what about databases? I can't afford an Oracle RAC license but can I have something more reliable and fault tolerant than my single Postgres box? I have seen this recent article that looks promising for open source solutions. Do anyone have experiences with clusters of MySQL , Postgres-R, C-JDBC or other solutions? How does it compare to commercial products?"

350 comments

  1. Bailing wire and duct tape by Anonymous Coward · · Score: 4, Funny

    Works everytime.

    1. Re:Bailing wire and duct tape by Anonymous Coward · · Score: 2, Funny

      Clusters? Usually, we just tape a bunch of cats together...

    2. Re:Bailing wire and duct tape by Anonymous Coward · · Score: 0

      Makes 'em easier to herd, all taped up like that.

    3. Re:Bailing wire and duct tape by Vaevictis666 · · Score: 2, Informative

      Off-topic, but it's Baling wire. As in the wire used to hold together bales of hay and whatnot.

    4. Re:Bailing wire and duct tape by DamienMcKenna · · Score: 1, Informative

      Actually its "baling twine", they don't use wire for it as the wire would cut through the materials being bound. And twine is cheaper.

    5. Re:Bailing wire and duct tape by Read+Icculus · · Score: 2, Informative

      This is just as off-topic as the grandparent, but having grown up around plenty of farms I've used and seen baling wire more than a few times. It is indeed actual wire, (galvanized steel according to one site that still sells it), and it works just fine for tying up a bale of hay. Some people still use it, but it has mostly been replaced by plastic twine, as it is indeed cheaper and easier to use. Here's a link to a fellow waxing sub-poetic on the bygone days of baling wire - Read All About It. And this is a site that sells baling wire, and has a few pictures - REC.

      --
      Anti-social? My code is just platform-specific.
    6. Re:Bailing wire and duct tape by Anonymous Coward · · Score: 0

      AC'ing 'cos it's so offtopic, but Baling wire is definitely used for some materials (ie: bales of recycled cardboard cartons).

    7. Re:Bailing wire and duct tape by BJH · · Score: 1

      It most definitely is baling wire. Do a Google search for it if you don't believe me.

    8. Re:Bailing wire and duct tape by chewy_2000 · · Score: 0

      Just to continue this completely off topic thread.. I've used baling twine every time I've harvested the bales of hay. Nasty red stuff that frays very easily, but you can just cut it off with a knife.

    9. Re:Bailing wire and duct tape by nolife · · Score: 2, Funny

      Damn, I thought is was barbed wire. No wonder I was the only one having a hard time.

      --
      Bad boys rape our young girls but Violet gives willingly.
    10. Re:Bailing wire and duct tape by Anonymous Coward · · Score: 1, Funny

      Am I offtopic? I thought this was a farming thread. :)

      The hay (rolls and bales) came with that cheap fiberous twine (grass of some sort), but ours wasn't red. Different parts of the world, I suppose.

      Some places used the plastic strapping, but that wasn't very popular with the farmers.

      I've seen twine, plastic strapping, steel strapping, and metal wire, depending on what the application is.

      I find, when shorting out equipment, baling wire works best. Just loop it across every bare soldier joint you can find, and apply power. Twine just doesn't have the same effect.

      Insulated wire is good if you're running a lead from the ignition coil to someone's seat. ZZzzap! Ahh, the good ol' days, when you could play a practical joke without being called a terrorist.

      "Hey Bob, catch this capacitor!"

      **OUCH!!**

      WhooHoo!

      "Hey Bob, touch this wire, tell me if it's live."

      {{ZZAP}}

      "Hey Bob, try this Windows machine out, tell me if it crashes."

      Doh.

      Silly Bob, he'll never learn.

    11. Re:Bailing wire and duct tape by Anonymous Coward · · Score: 0

      Wow! Imagine a beowulf cluster of cats!

    12. Re:Bailing wire and duct tape by (startx) · · Score: 1

      I know your just joking, but bailing wire and duct tape are what's holding my car together. Bailing wire is holding the line from my gearbox to the stick, and duct tape is patching a hole in the bumper from when my sister borrowed the car 2 years ago :-)

      %lt;/OT&gt

    13. Re:Bailing wire and duct tape by (startx) · · Score: 1

      wow, I totally screwed the < and >. I need sleep.

    14. Re:Bailing wire and duct tape by Anonymous Coward · · Score: 0

      Hay-wire? :)

    15. Re:Bailing wire and duct tape by mini+me · · Score: 1

      Nasty red stuff that frays very easily

      We started using blue twine this year. Same stuff though. I have no idea why the colour changed, though it's probably a different supplier.

    16. Re:Bailing wire and duct tape by innosent · · Score: 1

      Hey, at least barbed wire is twisted pair!

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    17. Re:Bailing wire and duct tape by pmz · · Score: 1


      I prefer Quaker State and ball bearings, myself.

    18. Re:Bailing wire and duct tape by mph · · Score: 1
      Off-topic, but it's Baling wire. As in the wire used to hold together bales of hay and whatnot.
      Well, that explains why the damned boat is still full of water. Right tool for the job and all that.
  2. transactionality is hard by solarisguy · · Score: 5, Insightful

    For what it's worth, the commercial solutions are hard to setup, unstable and terribly difficult to maintain, and this is after a small fortune has been invested in making them work. Not to knock the open source solution, but it's hard to beleive that something that is infrequently used and difficult to understand will be truly production quality if you want to use it for money.

    1. Re:transactionality is hard by subk · · Score: 3, Funny

      Is it just me or is that last sentance rather hard to parse?

      --
      Now, if you'll excuse me, I have backups to corrupt.
    2. Re:transactionality is hard by sys$manager · · Score: 4, Insightful

      Oracle 9i RAC running on Veritas Foundation Suite HA+ Database Edition is a snap. Parallel server in 8 was hard to set up and unstable though.

    3. Re:transactionality is hard by tickticker · · Score: 1
      For what it's worth, the commercial solutions are hard to setup, unstable and terribly difficult to maintain, and this is after a small fortune has been invested in making them work. Not to knock the open source solution, but it's hard to beleive that something that is infrequently used and difficult to understand will be truly production quality if you want to use it for money.

      There are so many things wrong with this comment, I can't begin to take it apart piece by piece.

      -There is no sig to see here.... move along, just move along now

    4. Re:transactionality is hard by Anonymous Coward · · Score: 0

      it is just you.. from my perspective

    5. Re:transactionality is hard by snero3 · · Score: 1

      Don't know which databases you are using, but oracle's RAC is not all the hard to set up and is very reliable. I admit that I am only using it on linux so it might be different for solaris, HP-UX etc..

      --
      It said "windows 98 or better" so I installed Linux
    6. Re:transactionality is hard by djbckr · · Score: 2, Informative
      Our shop is running 9i RAC. It's hard to set up, (for a newbie at clustering).
      Now that I have, it's pretty cool and quite stable. We've tested transparent failover a few times (once due to an instance failure) and nobody notices. Amazing.

      In my opinion, it's worth the cost. We'll have to agree to disagree with open source solutions. For those that can't afford it, I suppose the alternative is the better solution.

    7. Re:transactionality is hard by ba_hiker · · Score: 1

      Well not only is tranactionality a problem at the time of a failure, it impacts recovery too. When a failed node is restarted, it must be brought up to date and the initial replication structure restartd, without loosing anyting. Additionaly a falure must be allowed durring this process. It gets complicated, and any solution based on rsync is not a solution.

    8. Re:transactionality is hard by Rorschach1 · · Score: 4, Insightful

      I never set up a pre-9i cluster on Windows, but I ran Parallel Server in 7.1 on OpenVMS and it worked great. Of course, OpenVMS has had real clustering for a long time - Windows still isn't anywhere near where OpenVMS was 20 years ago.

    9. Re:transactionality is hard by murali_v82 · · Score: 1

      Oracle 8 triggers an incident in my mind. When we were looking for alternates to Postgres for high-availablity, i had been to oracle and talked to some tech guys about Oracle 8 solutions. They gave us a great marketing material and presentations on Parallel server. Later i found that in India (where we wanted to deploy this) there was not a single installation of parallel server and world-over less than dozen installations. This is a few years back.
      Whatever solution is chosen, it is best to talk to a referenceable client who has used this solutions and try and get as much first hand info as possible.

    10. Re:transactionality is hard by sys$manager · · Score: 3, Insightful

      Nothing will ever touch VMS clustering, it's kind of sad that 20 year old technology is so much more stable than modern technology. I mean VMS has only advanced two major versions in ten years.

    11. Re:transactionality is hard by s00p41337h4x0r · · Score: 1

      This is exactly correct. This is exactly why open source systems will never replace commercial ones, as shown in other complex pieces of code such as operating systems.... or large servers... or compilers... or.... Wait, what was the question?

    12. Re:transactionality is hard by Anonymous Coward · · Score: 0

      Not to knock the open source solution, but it's hard to beleive that something that is infrequently used and difficult to understand will be truly production quality if you want to use it for money.

      Why don't you ask the Star Wars Galaxies people how their Oracle solution is working out? It's always going down, and slow as hell. No doubt put together by people who think one big giant database at the center of everything is always the way to go.

    13. Re:transactionality is hard by mentin · · Score: 0
      VAX got clustering in VAX V4 (Sep 1984), this is 19 years ago. And it become OpenVMS in Nov 1992 - 11 years ago. See http://www.decus.de:8080/www/common/vms/qaa/vmsboo k.pdf

      So yes, Windows is not anywhere near OpenVMS clusters 20 year ago - unlike them, Windows does exists!

      --
      MSDOS: 20+ years without remote hole in the default install
    14. Re:transactionality is hard by mentin · · Score: 1
      That remind me of an old joke when Oracle guys made lots of presentation about their future products but did not have any good current software (not that it changed much):

      Q: what is the best hardware to run Oracle
      A: overhead projector

      --
      MSDOS: 20+ years without remote hole in the default install
    15. Re:transactionality is hard by Anonymous Coward · · Score: 0

      Nit, meet pick.

    16. Re:transactionality is hard by nettdata · · Score: 2, Funny

      I mean VMS has only advanced two major versions in ten years.

      Yeah... then all the maintainers went off and started Debian.

      --



      $0.02 (CDN)
    17. Re:transactionality is hard by hughk · · Score: 1
      They will. Actually VMS Clustering is about 22 years old now.

      The problem is that a lot of Linux research comes out of academia where VMS was seen as lumbering and slow. I have worked mostly in commercial environments where VMS meant 5 nines-plus uptime and the security of knowing that if some idiot pulled the plug on your CPU, another would smoothly take over.

      Now people are asking for better reliability from Linux maybe someone will start to look at some cool stuff like the Distributed Lock Manager as a first step and then some way of cluster serving locally attached disks like MSCP by building the locking into a file system.

      --
      See my journal, I write things there
    18. Re:transactionality is hard by Tenareth · · Score: 1

      Of course, list price of RAC is $40,000 per CPU... not exactly cheap. :)

      --
      This sig is the express property of someone.
    19. Re:transactionality is hard by leandrod · · Score: 1

      > the commercial solutions are hard to setup, unstable and terribly difficult to maintain, and this is after a small fortune has been invested in making them work

      This is simply not true about Digital (now Oracle) RDB on VMS clusters, nor about IBM DB2. Perhaps not even about Oracle 9. I mean the unstability and difficulty part, price is high obviously.

      Obviously part of the difficulty and price is inherent to the problem, as there are not only products but lotsa services associated.

      > Not to knock the open source solution, but it's hard to beleive that something that is infrequently used and difficult to understand will be truly production quality if you want to use it for money.

      PostgreSQL is pretty mature, with a commercially supported 2-years old solution having just been freed, and thus in widening use.

      --
      Leandro Guimarães Faria Corcete DUTRA
      DA, DBA, SysAdmin, Data Modeller
      GNU Project, Debian GNU/Lin
    20. Re:transactionality is hard by pspinler · · Score: 1

      You mean like SSIC clusters ? It's a project to take Digital's cluster technology and port it to linux.

      They've got the IBM version of the DLM, two cluster filesystems (CFS for locally attached storage, and GFS for cluster shared storage), transparent process migration AKA mosix, and lots lots more.

      Keen stuff, check it out.

      --
      The biggest problem with communication is the illusion that it has occurred
    21. Re:transactionality is hard by Anonymous Coward · · Score: 0

      From an application development perspective a DB cluster can introduce some subtle and extremely difficult to diagnose bugs around data concurrency. For what its worth we have an Oracle 8i cluster and plan to get rid of it.

    22. Re:transactionality is hard by BuffPustule · · Score: 1
      I beg to differ: you overly generalize. The ease or difficulty of setup is entirely dependent on the particular requirements, such as topology of the replicated machines (hub and spoke, server to server, multimaster fully connected networks, uni- or bi-directional data flow), allowable propagation time, network speed, whether you need the full set of data at each site, and so on.

      For relatively straightforward projects I have set up commercial replication products in a couple of hours. You do need to know the requirements in great detail up front, however, and that means you have to understand the data model, the table relationships, the kinds of transactions written by the application, how the users and the administrators need the replicated system to run, etc.

      It helps if you can work closely with someone who knows the data model inside out and someone who understands the app inside out (preferably the same person!).

      All too often, people end up trying to shoehorn a replication product (usually the "free" one that comes with the RDBMS) without really understanding how replication will affect the app. The most complicated installs I saw were in cases like this, or where the application could not be modified to correctly handle the fact that the underlying data were being replicated. And in many of these cases, no replication product would ever work.

      (In case you're wondering, I once wore many hats at PeerDirect, which has a replication engine that works on several RDBMS', including PostgreSQL, Oracle, etc., but I no longer work there).

      The upshot: do a lot of research into how you need the app to behave, how the overall system should handle site downtime, how the system should behave once a downed site comes back online, and then make sure the replication product will do what you need. Some products really are a pain in the arse; others aren't, and it all comes back to those pesky requirements.

  3. Check out Emic Networks by venom600 · · Score: 5, Informative

    We've been evaluating the Emic application cluster for MySQL and have had pretty good results. It's a new product (so YMMV), but it looks promising.
    Emic Networks

    1. Re:Check out Emic Networks by grugruto · · Score: 2, Interesting

      One main issue I see with the Emic solution is that it does not support transactions. I saw their demo at the last LinuxWorld in SF and they are just using a multicast layer to broadcast the queries to all nodes (they don't parse SQL so they can't handle transactions properly).
      Moreover, if you have queries like UPDATE ... WHERE date=NOW() , you will just get a different result on every node! At least, solutions like C-JDBC replaces macros such as NOW or RAND on the fly so that all databases are consistent.

    2. Re:Check out Emic Networks by deKernel · · Score: 1

      My guess is that you should not be doing a 'WHERE data=NOW()' in that type of enviroment. The data should be calculated first, the query sent out to be run.

    3. Re:Check out Emic Networks by egon · · Score: 1
      From Emic's product page:

      By forming virtual database entities, which can be either local or distributed, EAC eliminates the need for a shared storage device and provides failover protection while preserving all transaction ACID requirements.

      Sounds like it would be covered to me...

      --
      Give a man a match, you keep him warm for an evening.
      Light him on fire, he's warm for the rest of his life
  4. Clustered JDBC by davidpfarrell · · Score: 1, Informative

    If you're using Java, you might want to check out the Clustered JDBC project

    -D

    --
    Cube On! (http://stores.ebay.com/PuzzleProz)
    1. Re:Clustered JDBC by Anonymous Coward · · Score: 1, Funny

      You mean C-JDBC, as linked in the story text? Wow, nice sleuthing to find that one.

    2. Re:Clustered JDBC by Trejkaz · · Score: 2, Informative

      Funny, that's what the article was about. c-JDBC is an implementation of RAIDb.

      --
      Karma: It's all a bunch of tree-huggin' hippy crap!
  5. MySQL Replication by infernalC · · Score: 4, Insightful

    MySQL has very nice replication functionality, and, in certain circumstances, you can even set up replication rings. It is somewhat flexible about the topology you choose to use, so pick the one best for your application. Load balance ala DNS and you're in business.

    1. Re:MySQL Replication by sys$manager · · Score: 2, Insightful

      Yes, as long as you don't give a damn about ACID compliance.

    2. Re:MySQL Replication by Anonymous Coward · · Score: 1, Informative

      And most websites don't. Unless you need true transaction support then this solution will work. If you need transactions and clustering then you probably need to find a way to pay for an Oracle setup.

    3. Re:MySQL Replication by gregfortune · · Score: 1

      Rings? Are you implying *all* servers involved in the replication process could handle writes rather than a master that handles writes and a bunch of slaves that handle all the read access? If this is true, point me to some docs :) That would be too cool.

    4. Re:MySQL Replication by Hamstaus · · Score: 5, Informative
      Rings? Are you implying *all* servers involved in the replication process could handle writes rather than a master that handles writes and a bunch of slaves that handle all the read access? If this is true, point me to some docs :) That would be too cool.

      Here you go.

      The part you are probably interested in is this:
      You should run your slaves with the --log-bin option and without --log-slave-updates. This way the slave will be ready to become a master as soon as you issue STOP SLAVE; RESET MASTER, and CHANGE MASTER TO on the other slaves.
      Note that if you decide to "ring" your server setups, then you are not necessarily helping distribute the load, you are simply creating redundant masters in the case that your primary machine becomes unavailable. Also, you'll have to write your own monitoring scripts. MySQL says they are working on some tools for this... I'm excited to see what they come up with.
      --
      I moderate "-1, Fool"
    5. Re:MySQL Replication by TheLinuxSRC · · Score: 1

      PostgreSQL has just open-sourced replication software as well....

      www.pgsql.com

      I grew up poor... I can't afford a sig....

    6. Re:MySQL Replication by linhux · · Score: 2, Insightful

      I used to say the same a few years ago. But nowadays I say that most website that actually has some kind of functionality at all above "insert-update-select" an article or similar, needs - or at least wants - transaction and isolation support. It makes it much easer to create logical database schemas without having to tuck everything into a single table just to make sure you only need one atomic INSERT. Instead you can spread things out and allow for a lot more flexibility - and then you'll want transactions to maintain consistency among the tables.

    7. Re:MySQL Replication by techwolf · · Score: 3, Informative

      Bah, DNS isn't load balancing.

      LVS + MySQL works really well. We've got grouped clusters of databases that we can allocate more/less resources to as needed. Reporting cluster for the slower queries, faster cluster for the real-time queries and a few specific application clusters.

      Replication keeps them in sync but there isn't a good HA solution available for the master database yet. Perhaps in MySQL 5.0. In the meantime, use DRBD + heartbeat for near HA.

      --
      I don't do this for karma, I do it for cash. It's much better.
    8. Re:MySQL Replication by csnydermvpsoft · · Score: 2, Informative

      Note that if you decide to "ring" your server setups, then you are not necessarily helping distribute the load, you are simply creating redundant masters in the case that your primary machine becomes unavailable.

      Not necessarily. The largest part of most database access is reads - searching, retrieval, etc. This often times vastly outnumbers writes, depending on the application. Reads do not have to be replicated, giving a big performance boost.

    9. Re:MySQL Replication by toddhunter · · Score: 1

      Just remember though that mySQL is NOT free for commercial applications. You will need a license for each server you setup for replication, and this can get very expensive very quickly.

    10. Re:MySQL Replication by fava · · Score: 4, Informative

      Almost right. MySQL is free to use in a commercial application, its not free to distribute or embedd in a commercial application.

      MySQL is dual licenced, and one of those licences is GPL. You can use mySQL for free anywehere and in any manner that conformed to the GPL.

    11. Re:MySQL Replication by strobert · · Score: 3, Insightful

      sorry, maybe it is just me, but the whole "ARRGG IT AIN'T ACID" is a lot of hype to me. ACID boils down to transactions. plain and simple.

      And I have found in many applications it is easier to deal with transaction type data consistency at the app layer instead of the db one.

      knowing that a DB transaction is complete doesn't help you if for in order to move forward you have to have db ops done in mtuliple servers and/or a change happen with an external vendor.

      And generally some bad code/process will at some point munge your data in a similar way as if you had a db crash in the middle of a transaction.

      I have generally seen for most applciations you are better off just coding things to treat outside input (including data from a db) as evil until you
      have verified it and cope with the abnormalities.

      Yes there are exceptions, but ACID tends to be a knee jerk reaction, and most people realyl need to be askign themselves what it ACTUALLY buys them.

    12. Re:MySQL Replication by Daniel+Phillips · · Score: 4, Informative

      sorry, maybe it is just me, but the whole "ARRGG IT AIN'T ACID" is a lot of hype to me. ACID boils down to transactions. plain and simple.

      Perhaps you need a deeper understanding.

      ACID tends to be a knee jerk reaction, and most people realyl need to be askign themselves what it ACTUALLY buys them.

      It buys them a database that you they can expect to still be there, sound and consistent, after the machine blows a fuse in the middle of 200 simultaneous updates. It buys them a database that doesn't accumulate rot over time because somebody deleted a customer at the same time somebody in another city entered an invoice. It buys them queries that give the right answer, because of only ever seeing the database in a consistent state, even while other queries running at the same time are only partially completed.

      Basically, it gives them a database capable of completely correct operation, not just mostly correct. Of course that may not matter to you, in that case I have a faulty pacemaker to sell you.

      --
      Have you got your LWN subscription yet?
    13. Re:MySQL Replication by Brian+Blessed · · Score: 1

      Almost right. You said:
      You can use mySQL for free anywehere and in any manner that conformed to the GPL.

      So this means that it can be distributed and embedded in commercial applications.

    14. Re:MySQL Replication by Anonymous Coward · · Score: 0

      No, that does NOT conform with the GPL.

    15. Re:MySQL Replication by pmz · · Score: 1

      I have a faulty pacemaker to sell you.

      Imagine...Windows CE Pacemaker Edition with bundled IIS.

      MS Technical Support: "Yes, sir, that fluttering sound is simply the Microsoft Butterfly working hard for you! The tunnel vision shows our advanced Microsoft technology that simplifies your life in real-time is working flawlessly!"

    16. Re:MySQL Replication by techwolf · · Score: 1

      No, there is no internal validation in MySQL that a write on one server doesn't conflict with another on a second (or third) master database.

      You can configure such a setup, but the results would be more prone to error the higher your modify queries get.

      You can still branch out replication (A -> B -> C), but all writes still need to go to the master. The two popular designs are A -> n or, if trying to decrease the load on A, you do A -> B -> n (where n is any number of slaves). You introduce 2 SPOF with the later, which is why I would recommend against it.

      Imagine a scenerio with 5 masters, all taking writes. The overhead of checking 4 other machines for each write would most likely negate any distribution of load benefits you get (and would probably degrade performance).

      I don't think there can be any doubt about MySQL's place. Slashdog uses it.

      --
      I don't do this for karma, I do it for cash. It's much better.
    17. Re:MySQL Replication by Anonymous Coward · · Score: 0

      But you forget that MySQL is available under a commercial license as well.

      Pay $$$ and you can then include it in an embedded commercial app.

    18. Re:MySQL Replication by Brian+Blessed · · Score: 2, Informative

      No, that does NOT conform with the GPL.

      Yes it does. Where in the GPL does it say that the software cannot be used in commercial applications?
      It does not! It only requires that you provide the source (which has nothing to do with whether the software is commercial or not).

    19. Re:MySQL Replication by Sxooter · · Score: 1

      Actually, I've been discussing the C part of ACID with one of the MySQL folks, and they're close, but I don't think they're quite there, and here's why.

      Consistent means that when doing a transaction, the database should move from one consistent state to the next, with all constraints applied. This is according to the database theory college text book Fundamentals of Database Systems by Elmasri and Navathe. Basically, the problem MySQL has is that you can apply constraints that are improperly processed. I.e. if you insert a number like 12345678.12 into a numeric (5,2) you'll get 999.99.

      Since this "constraint" is conveniently ignored, you no longer are guaranteed a consistent database. Furthermore, you can't even apply constraints like "check id >1000 AND id

      The fellow I was discussing this with at MySQL maintains that the C in ACID only applies to foreign keys, but hey, they ain't known for their indepth understanding of database theory, now are they? ;^)

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    20. Re:MySQL Replication by strobert · · Score: 1

      Oh, don't worry I understand the IDEA of ACID. I just know that in pratice you rarely have that "prefect" of an application usage. Some constraints won't have been there from day 1, some get dropped, you have to import data from other companies, new features get added so that older records are missing some now mandatory columns, etc. Also having lots of contraints in the DB can kill you on performance. I know of one large DB company that deals with constraints in the application layer due to performance reasons.

      The blow a fuse in the middle of 200 transactions is the D part. I understand that one. From personal experience even some of the "big guys" like Oracle don't hold up to the D. I wasn't personally the DBA on that mess but was working with them. Short story is Oracle wigged out and corrupted DB files (running on on the whole VOS inititive setup) and we had to go to backup files.

      So while the thoery makes some sense, in practice I haven't seen the real gain over the long haul (for the ACI aka transaction part). for the D part, the vendor I have actually had the best business luck with has been MySQL first, MSSQL/Sybase second, and Oracle third.

    21. Re:MySQL Replication by Beliskner · · Score: 1
      Basically, it gives them a database capable of completely correct operation, not just mostly correct. Of course that may not matter to you, in that case I have a faulty pacemaker to sell you
      Trash. Even the NY Stock Exchange has only 99.99% uptime (it was knocked offline by the WTC attacks, and will be knocked offline in future by N Korea's/China's nuclear strike)

      The vast vast majority of databases worldwide contain static data or data that is under glacial change. Similarly dot-com companies always represented a very small sector; the food sector like Walmart groceries plus restaurants dwarfs the entire IT sector.

      Most companies take their database offline to make a backup, the exception is Banks and financial areas.

      --
      A caveman dreams of being us, the incalculable power and riches. We dream of being Q, then what?
  6. -1:Troll by stratjakt · · Score: 4, Insightful

    Do anyone have experiences with clusters of MySQL , Postgres-R, C-JDBC or other solutions? How does it compare to commercial products?

    They don't compare to commercial products. I know it isn't what you want to hear, and there are hundreds of kids here to tell you different, but they just dont compare. Those kids database experience doesn't extend past an address book.

    Even if you manage to get them to technically keep up, transaction wise, to Oracle or SQL Server, the ACID enforcement isn't there, the syntaxes are kludgy. Gack.

    My company ships products with SQL Server or Oracle as the back end. I've tried to put together an OSS solution so I could impress the big boss with millions of bucks of saved license fees. They just aren't anywhere close IMO.

    Run a SQL Server farm on the back end if you cant afford an Oracle license. Don't be an OSS idealogue in the business world, you end up unemployed.

    --
    I don't need no instructions to know how to rock!!!!
    1. Re:-1:Troll by venom600 · · Score: 5, Insightful

      ACID enforcement isn't there
      Actually ACID compliance is getting pretty darn good in databases like MySQL. Care to elaborate about what ACID compliance issues you have?

      Don't be an OSS idealogue in the business world, you end up unemployed.
      Actually, in our flailing economy 'OSS idealogues' as you call them are making a lot of head-way. OSS now has a viable alternative to *just about* any commercial enterprise software out there.

    2. Re:-1:Troll by stratjakt · · Score: 1

      ACID problems? Well, getting transaction support on one machine was enough of a hassle. It was just a bitch and constant swearing session. It could be done, but in the end the performance wasn't where it should be - adding servers to the cluster didn't scale it up as fast as it should have, IMO.

      I realize that stuff has improved in the year since I've seriously looked at it, but I'm doubtful it's reached the level of Oracle or SQL Server.

      OSS now has a viable alternative to *just about* any commercial enterprise software out there.

      There are a lot of great OSS products out there, it's just that none of them are enterprise class database engines. Watch 404 messages from websites for telling clues - mysql always fails before apache.

      --
      I don't need no instructions to know how to rock!!!!
    3. Re:-1:Troll by axxackall · · Score: 1, Informative
      ACID problems? Well, getting transaction support on one machine was enough of a hassle.

      Transaction problems on a single database? That explains a lot. I bet you used MySQL. Sounds pretty typical for it.

      Ever tried PostgreSQL?

      --

      Less is more !
    4. Re:-1:Troll by Elladan · · Score: 2, Interesting

      Oh come on. 99% of the time, this "failure" is due to the admin having the number of concurrent mysql sessions set lower than the number of apache sessions. Since they'll never hit that situation in their half-assed testing with one browser...

      And how exactly do you intend to compare the situation where MySQL saturates to the situation where apache saturates remotely, exactly? If apache falls over, you're getting no connection at all. Perhaps the database is working great, you'll never know. :p

    5. Re:-1:Troll by zulux · · Score: 2, Troll


      Oracle has some amazing features, but PostgreSQL kicks the crap out of MS SQL Server -

      MS SQL skips record when it queries - more info here

      MS SQL crashes for no fucking reason.

      MS SQL requires x86 hardware - No Sparc, No POWER, No MIPS. Just crappy x86.

      There is no 64 bit version os MS SQL.

      PostgreSQL has a very robust multi-version concurency controll mechanism - somthing MS SQL could only dream of.

      And if your *REALLY* need to scale PostgreSQL - run is on a SUN/SGI/IBM.

      Not a bunch of fucking Intel toys.

      --

      Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

    6. Re:-1:Troll by venom600 · · Score: 4, Insightful

      I realize that stuff has improved in the year since I've seriously looked at it, but I'm doubtful it's reached the level of Oracle or SQL Server.
      You should look again. MySQL, for example, has full transaction support with InnoDB table types.....AND it's pretty damn fast.

      Watch 404 messages from websites for telling clues - mysql always fails before apache.
      I'm sorry, but that doesn't seem like a very accurate way of measuring database reliability. One of the cool (and sometimes harmful) things about open databases is that there is no entry fee...meaning anybody and their brother can set up a MySQL server. This means that the number of ill-managed MySQL servers out there probably out-numbers Oracle or SQL Server installations (which, typically, have a somewhat knowledgeable admin behind them) by 10 to 1. A MySQL database managed by somebody who knows what they are doing will go head to head with Oracle or SQL Server installations which are also managed by someone who knows what they are doing.

    7. Re:-1:Troll by Kunta+Kinte · · Score: 4, Insightful
      Run a SQL Server farm on the back end if you cant afford an Oracle license. Don't be an OSS idealogue in the business world, you end up unemployed.

      And I would fire the IT guy who causes my company to spend $10,000 for SQL Server in a situation where the free MySQL or Postgres would do.

      Just focus on the right tool for the job. If the database is a simple one. If it is regularly backed up and your company can stand a small period of downtime, why on earth would you buy Oracle or MS SQL Server?

      This is not to say that MySQL is unreliable. I have *never* seen MySQL crash, or lose any of my data. So it would be silly of me to go with Oracle, just because everyone else is doing it.

      The right tool for the job people.

      --
      Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
    8. Re:-1:Troll by Sevn · · Score: 1

      Love your sig. Is that from Carl from ATHF? I could have swore that's what he said when he had the Foreigner power belt on.

      --
      For every annoying gentoo user, are three even more annoying anti-gentoo crybabies. Take Yosh from #Gimp for example.
    9. Re:-1:Troll by Lumpy · · Score: 3, Insightful

      They don't compare to commercial products. I know it isn't what you want to hear, and there are hundreds of kids here to tell you different, but they just dont compare. Those kids database experience doesn't extend past an address book.


      yeah, nobody would ever run a high traffic website on OSS database.....

      Anyone know of a high traffic website that uses per and OSS database servers in a cluster?

      Oh yeah.... this place

      --
      Do not look at laser with remaining good eye.
    10. Re:-1:Troll by Anonymous Coward · · Score: 0

      A MySQL database managed by somebody who knows what they are doing will go head to head with Oracle or SQL Server installations which are also managed by someone who knows what they are doing.

      Pass me some of what you're smoking.

      Puff Puff Give, Puff Puff Give...

    11. Re:-1:Troll by Anonymous Coward · · Score: 0

      High traffic yes.

      Large amounts of data? No.

      Slashdot hardly needs a sophisticated database.

    12. Re:-1:Troll by Anonymous Coward · · Score: 0

      comments have been refered to thosekids.org

    13. Re:-1:Troll by Anonymous Coward · · Score: 0

      Are you nuts?

      count the number of stories that are available in the database.... they still hold stories from 1999 with all the comments,moderation,etc...

      Last time I read one of Rob's reports on slashdot they had 10 terabyts of data in the database.. and that was 2 years ago.

      no that's not "ALOT" but it's nothing to sneeze at.

      I dare any MS-SQL solution to handle the load that slashdot get's.... Hell I can tell you what it would do, as I watch MSSQL suffer daily struggling to handle only 500 connections on a slow 2.2Ghz Xeon 2 processor box with ony 4 gig of ram and scsi ultra 160 drives in a raid 5.

      Ms-SQL certianly cant cut it... the only choice for a real database in commercial is Oracle.

    14. Re:-1:Troll by jamie · · Score: 4, Informative
      "Last time I read one of Rob's reports on slashdot they had 10 terabyts of data in the database.. and that was 2 years ago. no that's not "ALOT" but it's nothing to sneeze at."

      Nah, our DB totals only about 6 GB. Slashdot isn't an especially big database.

      Its only claim to fame is that it delivers about 30 dynamic pages a second, 12 hours a day.

    15. Re:-1:Troll by Tmack · · Score: 4, Interesting
      I would have to second this. I use MySql at work as the main database for the NOC and service activations and circuit delivery groups. The database (running off an old Sun Netra box) handles the load of all the scripts (mostly perl) used by all those groups. This includes scripts that monitor circuit status (ala Netcool), test new circuits, keep track of customer installations, change requests, troubles, router configs, etc... The MySql server has never caused dataloss, and the only instances where it "crashed" were errent querys in alpha CGI script releases that caused basically an infinite loop around a search on the 20K+circuit entries on a non-Indexed field, that a simple restart of the mysqld fixed. Even when the Beta version was released running on a linux P4 box we never had issues, as opposed to the Oracle backended system used for the main corp. database that regularly causes much frustration among co-workers (not to mention the internal conflict between 2 development teams (corp vs us) trying to control the access and data of the corp database vs the ease of development of new utilities to make Customer installation and support easier.

      TM

      P.S.Cant wait for our Sun V280r shows up!

      --
      Support TBI Research: http://www.raisinhope.org
    16. Re:-1:Troll by afidel · · Score: 2, Informative

      There is no 64 bit version os MS SQL

      Bullshit, it's been out for months, see This article. As to the rest of your argument check out TPC-C results and say that MS SQL doesn't scale, it's the second highest scorer and has 6 of the top 10 results. This is a real world load testing benchmark that many companies base purchasing decisions on. (ok the MS solutions are a little unusual in that they are shared-nothing but the other competitiors are free to do likewise).

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    17. Re:-1:Troll by mentin · · Score: 1
      There is no 64 bit version os MS SQL.

      Not true.

      --
      MSDOS: 20+ years without remote hole in the default install
    18. Re:-1:Troll by justin.warren · · Score: 4, Informative
      An appropriate subject line.

      Some of us who compare OSS databases to commercial ones have experience that extends past address books. And no, I'll pass on the DSW if you don't mind.

      My main problem with PostgreSQL is the query optimiser. Oracle's query optimiser is definitely superior as Postgres occasionally comes up with some peculiar query plans. In a product I'm involved with, we hand tune our SQL from the ground up, so this is less of a problem for us. I find the two products to be pretty comparable in other aspects, though I haven't tried Postgres-R yet.

      I haven't played with MySQL since back when you couldn't do sub-SELECTS, so I have no idea how much it's progressed since then.

      At this stage, I'd suggest you stick with a commercial product for replication or clustering for high end work. Clustering and replication is still the bleeding edge for OSS, so use it with caution on non-critical tasks. Having said that, these are complex tasks you're talking about, and even the commercial products have their own peculiarities at times. High volume replication using Oracle materialized views over database links comes to mind.

      --
      Just because you're paranoid doesn't mean they're NOT after you.
    19. Re:-1:Troll by tzanger · · Score: 1

      Uh, that's exactly what he was talking about; the OP had asked what mysql transaction problems he had...

    20. Re:-1:Troll by zjbs14 · · Score: 1
      Just to add to the other replies, the first point was referring to a bug that was reproduced once in the now over 5-year old and four major revision old version of SQL Server.

      Way to prove your case there.

      --
      No sig, sorry.
    21. Re:-1:Troll by Anonymous Coward · · Score: 0

      Is that you John Wayne?

    22. Re:-1:Troll by yerfatma · · Score: 1
      Is that you John Wayne?

      "Who said that? Who the fuck said that? Who's the slimy little Communist shit twinkle-toed cocksucker down here, who just signed his own death warrant?"

    23. Re:-1:Troll by the_mad_poster · · Score: 1

      *cough*8k record limits*cough*

      And it crashes for a reason. The goddamn thing can't escalate locks properly to save its life.

      Still... it's not the worst DBMS on earth. It's just like most of Microsoft's crap: not worth the price tag (at least SOME people can justify Oracle's outrageous fees).

      --
      Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
    24. Re:-1:Troll by GileadGreene · · Score: 1
      Its only claim to fame is that it delivers about 30 dynamic pages a second, 12 hours a day

      Just out of curiosity, what kind of page rate does it sustain during the other 12 hours of the day? :-^)

    25. Re:-1:Troll by ThisIsFred · · Score: 1
      My company ships products with SQL Server or Oracle as the back end. I've tried to put together an OSS solution so I could impress the big boss with millions of bucks of saved license fees. They just aren't anywhere close IMO.


      Your livelihood depends on MS and Oracle products. Wow, you're an unbiased source for information. For an empirical study of a successful OSS solution, learn something about the site on which you are posting.

      Why are you posting this drivel here anyway? The original thread asks what open source clustering solutions are to be had. You just posted what amounts to a Microsoft advertisement.
      --
      Fred

      "A fool and his freedom are soon parted"
      -RMS
    26. Re:-1:Troll by fupeg · · Score: 2, Insightful

      ACID transactions using InnoDB, and performance is quite good on MySQL 4.0. My company uses a MySQL/InnoDB server for the loading of content from our partners to our site. The loader is an extremely multi-threaded piece of code that can put quite a strain on database. The loader also writes part of the data it loads to an Oracle 9i database. The MySQL db is running on a dual processor machine (the same machine the loader is running) and the Oracle db runs on its own 4-processor machine (both run Linux, though the Oracle one uses a slightly older kernel as required by Oracle.) When we crank up the loader, it's the Oracle server that becomes unresponsive first. MySQL scales better, at least in this case. We've had no problems with ACID transactions, and the loader code involves several long, distributed transactions (across dbs) running with isolation level of read-committed. We also use a third party search engine that uses SQL Server for persistence. This is typically the bottleneck in the system. When we talk about improving scalability and performance, the MySQL db is the last thing ever mentioned.

    27. Re:-1:Troll by pyrrhonist · · Score: 0, Redundant
      MS SQL requires x86 hardware - No Sparc, No POWER, No MIPS. Just crappy x86.
      There is no 64 bit version os MS SQL.

      Bullshit! On the download page, there's binaries for all of the following OSes:

      • Linux (x86, S/390, IA64, Alpha, SPARC, and AMD64)
      • Windows (x86)
      • Solaris (SPARC 64-bit, SPARC 32bit, and x86)
      • FreeBSD (x86, and SPARC 64-bit)
      • MacOSX (PPC)
      • HP-UX (RISC, PS-RISC 1 and 2, PA-RISC 2 64-bit, and IA64 64bit)
      • AIX (RS6000)
      • QNX (x86)
      • Netware (x86)
      • SCO (x86)
      • OpenBSD (x86)
      • Irix (MIPS)
      • Dec OSF (Alpha)
      Then there's a source tarball, source RPM, and a zip file so you can compile your own (which I've done on Solaris 64-bit several times).
      All in all, I's say MySQL runs on a pretty diverse set of systems.
      --
      Show me on the doll where his noodly appendage touched you.
    28. Re:-1:Troll by Anonymous Coward · · Score: 0

      *) Let's be clear on the definition of 'cluster': The loss of a node should not result in any data loss.

      *) The database must be able to be backed up without shutting it down. Not an export or dump, a copy of the entire collection of database files. That leaves out postgresql (pg_dumpall is not a backup. try to dump a 50g database and you block vacuum. An the postgres doc: '.. you may have to edit the dump!!!')

      *) Mysql (and sapdb starting in q4 2003) ARE NOT FREE. unless you give away your software, you must pay MYSQL AB.

      *) MS lies, lies, lies. Examine their full disclosure TPC-C document. The rip the concurrency mechanism out of their dbms and put loops in their code to sleep if it deadlocks.

    29. Re:-1:Troll by Anonymous Coward · · Score: 0

      I have never seen MySQL crash or lose data either. Are all these other 'crashing' horror stories just from people who don't have a good handle on computing?

    30. Re:-1:Troll by IANAAC · · Score: 2, Insightful
      One of the cool (and sometimes harmful) things about open databases is that there is no entry fee...meaning anybody and their brother can set up a MySQL server.

      Not to knock MySQL, but this is actually part of hte problem. You get anybody and their brother setting something up that then crashes and burns.
      I don't care what anybody says, setting up a database is nothing trivial.
      Regarding your comment that MySQL would go head to head with Oracle or SQL Server is, frankly, laughable.
      I would suggest, partucularly to someone familiar with Oracle, PostgreSQL over MySQL any day. In addition to MySQL's much-hyped transactions (which have been in PostgreSQL forever), PostgreSQL has a procedural lanaguage with which Oracle folk would feel quite at-home.
      Let's not forget that PostgreSQL's SQL implementation is much more standards compliant.
      Again, this is not to rant on MySQL, but it kind of irks me that someone can slap up a MySQL DB and claim to know all about databases.
    31. Re:-1:Troll by Anonymous Coward · · Score: 0
    32. Re:-1:Troll by Anonymous Coward · · Score: 0

      MySQL distibutes MS SQL now??

    33. Re:-1:Troll by AKAImBatman · · Score: 1

      PostgreSQL has a procedural lanaguage with which Oracle folk would feel quite at-home.

      Just to comment on this, as much as I love PostGres, it would be nice if they added true support for procedures instead of just the simple "function definitions" they have now. Other than that, their PL/SQL implementation isn't half bad. :-)

      Let's not forget that PostgreSQL's SQL implementation is much more standards compliant.

      Amen brother! I still want to know who's bright idea it was to use backticks instead of quotes. I talked with one of the MySQL developers (here on /. none the less) and he "promised" to improve ANSI support in the future. Right now there is supposedly a switch to turn on ANSI mode. IMHO, that only makes things worse as all the tools built for MySQL will then break. They need to just bite the bullet and be compliant. Even if they have to fork into a new development branch. (YourSQL maybe? :-P)

    34. Re:-1:Troll by An+Onerous+Coward · · Score: 1

      You've got speed down. Now let's work on uptime. :)

      --

      You want the truthiness? You can't handle the truthiness!

    35. Re:-1:Troll by stanwirth · · Score: 3, Insightful

      My main problem with PostgreSQL is the query optimiser. Oracle's query optimiser is definitely superior as Postgres occasionally comes up with some peculiar query plans.

      I had the same experience. You basically have to optimse large queries combined with joins and subselects on Postgresql yourself -- and often with Oracle, as well, if its for tables with > 1-10M records. ish. You might want to check out DB2. Awesome clustering -- IMHO more sophisticated and flexible than Oracle's. YMMV depending on the application, as always. Also, if it's a development environment, you can test DB2 and Oracle on linux boxen to your heart's content for the same price as PostGreSQL -- free .

      MySQL may be able to handle subselects, but it's still struggling with triggers and stored procedures.

    36. Re:-1:Troll by nconway · · Score: 1
      My main problem with PostgreSQL is the query optimiser. Oracle's query optimiser is definitely superior as Postgres occasionally comes up with some peculiar query plans.


      Can you elaborate on the problems you've had with the PostgreSQL query optimiser? It's worth noting that the query optimizer in PostgreSQL 7.4 (which is in beta right now) should be significantly improved, but if you've had specific problems, I'd be interested in hearing about them.



      (FWIW, in the future reporting bugs about poor optimization of a given query would be nice, too :-) )

    37. Re:-1:Troll by Doctor+Faustus · · Score: 1

      And how often does it go down?

    38. Re:-1:Troll by gfody · · Score: 1

      I haven't looked into MySQL but I see "basically an infinite loop" and "20K+ entries" and cringe.

      20K is no huge number in my book. Being able to run a table scan (you can't always just put an index) should not throw you for a reboot, hell especially on a P4 box. I'm running relatively older hardware (P3 xeons) and I know of a few places I'm doing table scans on subsets of indexes that have upwards of a million records that return in under a few seconds.

      --

      bite my glorious golden ass.
    39. Re:-1:Troll by defile · · Score: 1

      ACID problems? Well, getting transaction support on one machine was enough of a hassle. It was just a bitch and constant swearing session. It could be done, but in the end the performance wasn't where it should be - adding servers to the cluster didn't scale it up as fast as it should have, IMO.

      It was pretty easy to achieve simply from reading the MySQL documentation on InnoDB. About my only gripe with MySQL is that you need to stop all traffic to the server in order to start off replication (unless your environment facilitates ibbackup) for the first time.

      There are a lot of great OSS products out there, it's just that none of them are enterprise class database engines. Watch 404 messages from websites for telling clues - mysql always fails before apache.

      I have a business of 20 employees, 40 vendors, and about 275,000 customers depending on MySQL in high load environments. It may fail before Apache, but it's only like 100x more complicated than Apache is. And with fairly reasonable hardware (an under $10k investment) we can keep MySQL uptime above 99.9% a year.

      I suspect most of the businesses demanding "enterprise reliability" with Oracle don't actually receive anything close to that kind of uptime anyway -- be it due to Oracle's inherent crappiness or some other completely unrelated source. ie, a blackout, a worm disables your clients, a worm makes your web site unreachable due to routers overloading, or your sys admin by mistake set the wrong DNS entries and it'll be 48 hours before all of your client's resolvers flush their caches.

    40. Re:-1:Troll by Anonymous Coward · · Score: 0

      Obviously you haven't used PostgreSQL in the last few years because you'd be aware that the 8k limit is non-existent now. And as far as locks are concerned - I've had a 3gb database with web and odbc/vb clients accessing it quite regularly, some rows in the millions and I've NEVER experienced a crash hmmm 3 month, 5 month uptimes (only had to reboot because of kernel recompile) or strange locks. Next thing you know you'll claim it's slow compared to other db's - have you ever configured the thing correctly or do you run your systems with default settings all the time?

      Do your research before you flame on. Ignorance is no excuse.

    41. Re:-1:Troll by localman · · Score: 1

      Those kids database experience doesn't extend past an address book.

      In general this may be true, but I currently manage the MySQL DB for a site that does in excess of 5 mil/mo in sales. You can go a lot further than you might think with OSS solutions. We are in the process of setting up replication/load balancing for future growth.

      With our revenue, one might wonder why we'd balk at picking up Oracle or someother. The answer is that we started on the cheap three years ago and haven't found any compelling reason to switch. No religion -- just feel that we've got the right tool for the job.

      And yes, coding for bad data in the application layer seems more robust in the long term than relying on the DB to keep things clean. Just my 2 cents on the ACID issue.

      Cheers.

    42. Re:-1:Troll by Anonymous Coward · · Score: 0

      Well...
      There is this small used car ad system [mobile.de].

      They happen to use MySQL.

      They also happen to be, according to [ivw] (who monitor these things), the web site with the most page impressions in all of Germany. In August 2003, this amounted to 562.971.634 page impressions.

      The site is, naturally, completely database driven.
      So yes, there are fairly large setups using open source databases.

    43. Re:-1:Troll by Anonymous Coward · · Score: 0

      >> Do anyone have experiences with clusters
      >> of MySQL , Postgres-R, C-JDBC or other
      >> solutions? How does it compare to commercial
      >> products?
      >
      > They don't compare to commercial products.
      > I know it isn't what you want to hear, and
      > there are hundreds of kids here to tell you
      > different, but they just dont compare.

      Sadly, that has been my experience also.

      I currently admin Oracle and MySQL. Sorry to say it, but MySQL is a toy compared to Oracle.

      This all depends on how important your data are to you. If loss of data == loss of life, limb or property, get a real database server.

      If you are writing a chat page or a blog or similar, MySQL can be sufficient. Backup regularly. Cache your data in the web server code. Expect to spend time fixing problems.

      I have no experience with Postgres; it does worry me a bit that it is spoken of with the same ...shall I say zeal... as MySQL.

      Go for free software: FreeBSD, Mozilla, OpenSSH, emacs, etc etc (gotta love BSD's /usr/ports). But for a mission critical database server, prepare to be disappointed by the free ones.

    44. Re:-1:Troll by Daniel+Phillips · · Score: 1

      ok the MS solutions are a little unusual in that they are shared-nothing but the other competitiors are free to do likewise

      Omigawd. At least you're honest about that. So it's fast as long as you rewrite all your applications and aren't worried about ACID.

      In which case I'd wonder why you don't use MySQL.

      --
      Have you got your LWN subscription yet?
    45. Re:-1:Troll by mcrbids · · Score: 1

      PostgreSQL has a very robust multi-version concurency controll mechanism - somthing MS SQL could only dream of.

      MVCC is truly awesome. It isn't until you do a full dump of a live database, knowing that all your data is referentially complete, while it's busy handling dozens of other requests, simultaneously, that MVCC really shines.

      It just plain rocks. When you do a query of any kind, anything from "select name from users where id=12355" to a full dump, you get the state of the database as of the moment that the query was submitted.

      Transactions underway but not yet complete are ignored. Transactions that begin, or actions performed while the database dump is underway is similarly ignored.

      The result? You can take a live, working database server and back it up at 3:00 in the afternoon, during heavy load, and have a clean dataset with high integrity.

      My only complaint about PG is lack of replication - I'd KILL for decent replication. Oh, and I just don't "do" compiling if I can possibly avoid it. As soon as you compile, then you have to worry about compile flags, dependencies, and all that jazz, which usually isn't a problem but just increases the workload.

      Make a
      postgresql-replication-7.3.4-51.rpm

      and I will be one happy dude.

      -Ben

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    46. Re:-1:Troll by Anonymous Coward · · Score: 0

      Much less than K5.

    47. Re:-1:Troll by Anonymous Coward · · Score: 0

      Please forgive him, he's such a high UID.

    48. Re:-1:Troll by jadavis · · Score: 1

      instead of just the simple "function definitions" they have now

      what would you like them to do? PostgreSQL already has "table functions" that return result sets, and has normal functions, aggregate functions, and they can all be done in a lanugage of your choice.

      You can also do prepared queries which take parameters (which can be somewhat like a table function or a view).

      What functionality or flexibility is missing?

      They need to just bite the bullet and be compliant. Even if they have to fork into a new development branch.

      We may like them to, but that doesn't mean that they need to. Think about it this way, what does compatibility gain mySQL? They aren't exactly converting people away from Oracle (as postgres is). As far as their concerned, they're doing just fine with the lock-in strategy. It makes other databases look like something's missing also when you try to convert an app to something like postgres. Once I got a sql error saying that postgres didn't like a command starting "INSERT IGNORE". The first thing I though was the same as you probably are thinking now: what the hell is "INSERT IGNORE"?! It turns out to be, as you might expect, a totally useless extension that causes nothing but confusion and inconsistency. If you use it you're probably confused about how your application works. However, it is a good example of MySQL's embrace-and-extend development process that makes it hard for people to use their data with any other software.

      My point is that they aren't going to go out of their way to help their competitors. The PostgreSQL developers seem to be more interested in working well with other software (like in a mixed database environment) by being as compatible as possible and not adding too many strange features. After all, the end goal is (or should be) to help people manage their data best.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    49. Re:-1:Troll by nettdata · · Score: 1

      yeah... I've never seen a UID break 700k before.... wow...

      --



      $0.02 (CDN)
    50. Re:-1:Troll by Anonymous Coward · · Score: 0

      I ALMOST completely disagree. There are places for oracle, and there are places for OSS. I work for a large ISP that only buys 1 piece of software for its servers... oracle. Yes, we use it for our financial and customer service databases, but when it comes to fast transactions, we have found that it is much cheaper for us to use MySQL (i.e. for our email/web servers). By cheaper I mean we use several smaller servers running either linux, or FreeBSD and MySql with replication. We are able to process over 10 million email messages per day on a system that is completely OSS. Oh, and I still have a job ;)

    51. Re:-1:Troll by leandrod · · Score: 1

      > what does compatibility gain mySQL?

      It is not about what it gives MySQL, but users. I for one want standards compliance so that I can change DBMSs as a user, or port my software as a vendor.

      --
      Leandro Guimarães Faria Corcete DUTRA
      DA, DBA, SysAdmin, Data Modeller
      GNU Project, Debian GNU/Lin
    52. Re:-1:Troll by larien · · Score: 1
      And if your *REALLY* need to scale PostgreSQL - run is on a SUN/SGI/IBM
      *sigh* the poster was asking about clustering for replication/failover, not scalability.

      PS: I was going to correct your (you're) grammar mistake before I noticed the other errors and figured I couldn't be bothered.

    53. Re:-1:Troll by the_mad_poster · · Score: 1

      Boy I hate when people post "retorts" AC...

      I'm backing him up, nitwit. Hence "And there's a reason it crashes" which references the original poster's gripe about MS SQL always crashing. Part of the reason it crashes is because it can't escalate the locks for shit. I use PostgreSQL or Firebird anytime I have the option

      Read and understand the context before you flame on. Being an AC is no excuse.

      --
      Alito: A vote for Alito is a punch in the eye to put that bitch back in her place!
    54. Re:-1:Troll by Anonymous Coward · · Score: 0

      -1 Pinhead Basic reading comprehension has passed you by.

    55. Re:-1:Troll by jamie · · Score: 2, Interesting
      It's been a long time since Slashdot went down for any significant amount of time.

      We do planned code upgrades once a week and have to kick each webserver, but the load balancer keeps the site up transparently. We probably lose a total of a few hundred incoming connections each time we do that (a total of maybe 5 seconds worth, once a week).

      In the last year, I think there was once that we had to roll code back and were probably down for a few minutes, and I think one other time when we were down for an hour, I forget what exactly.

      And then of course we've had network troubles occasionally, but that could be us or it could be you :)

      None of that has been because of database failure (to get back ontopic sorta :) ... the MySQLs just all keep humming.

    56. Re:-1:Troll by robby2 · · Score: 1

      I agree (mostly).
      404's because of MySQL problems aren't rare but almost always are caused by bad administration. This is a somewhat educatsites selling some cheap products.ed guess based on the fact that the site's I see the errors on are mostly amature sites running on home systems or bussiness sites selling some cheap products.
      El-cheapo systems are maintained by El-cheapo administrators.

      However, if the system is not supposed to be administrated El-cheapo way, there is probably also money available to buy Oracle licences.

      So seeing MySQL failing a lot says nothing about the MySQL product itself...

      Robin

    57. Re:-1:Troll by EastCoastSurfer · · Score: 1

      20K is no huge number in my book.

      Heh, I agree. I manage and maintain apps/dbs that insert upwards of ~300k rows per month. It does go back to the addage of using the right tool for the job though. mySQL probably works great as a web site backend or an MSAccess replacement. I haven't seen anything yet showing it could handle really large amounts of data though.

    58. Re:-1:Troll by AKAImBatman · · Score: 1

      You can also do prepared queries which take parameters (which can be somewhat like a table function or a view).

      What functionality or flexibility is missing?


      Not much. Just the annoyance of having to run a select statement to execute my procedure. Most procedural SQL dbs, allow you to call procedures in this fashion:

      "exec My_Proc @Blah = '1234', @Foo = '2003-09-12', @Bar=7"

      This works nicely, because I can happily insert optional parameters and not have to change my calls. On top of that, I can call them through a normalized coding interface, like the JDBC Procedure object. Functions really don't work that way, but they do make a nice alternative to subselects and case statements. (Both of which PostGreSQL supports, BTW. :-)) I suppose it's possible that I'm just misinterpretting the docs, so feel free to correct me if I'm wrong.

      Think about it this way, what does compatibility gain mySQL? They aren't exactly converting people away from Oracle (as postgres is). As far as their concerned, they're doing just fine with the lock-in strategy.

      That lockin strategy is exactly why I wouldn't choose MySQL. (That is, besides its many other faults.) That lockin strategy prevents me from writing a generic application that's portable between PostGreSQL, MySQL, Oracle and others. Either I have to write my application to work specifically for them, or I have to ignore them. If my customers have big bucks, I'll ignore them.

      See my sig? I produce a tool for managing databases. I can't even begin to tell you how *#$#% annoying MySQL was to implement. They were determined to do just about everything different from everyone else. In order to make them work right, I had to unnecessarily complicate my code just to support all of their idiosyncrasies. The only other database that was anywhere near as annoying to implement (from my "tool vendor" perspective) was MSSQL because of their [tablename] notation instead of "tablename". None the less, that was their *only* major variation from the SQL standard, so it wasn't too bad.

      MySQL is not a serious database contender, and I don't see them being one for a long time.

    59. Re:-1:Troll by einhverfr · · Score: 1

      Actually ACID compliance is getting pretty darn good in databases like MySQL. Care to elaborate about what ACID compliance issues you have?

      Well, depends. ACID is completely optional in MySQL which can have some severe drawbacks if you want to use it for something serious. But then, MySQL is designed from the ground up to be flexible and extensible AT THE EXPENSE OF ACID ENFORCEMENT. Want to know what I mean?

      HEAP tables CANNOT be ACID compliant because the D stands for durability. The ACI all have to do with transactional support which are not supported on all table types. So, if you are doing many updates in a transaction, and some of them are in nontransactional tables, what happens when you have to rollback?

      But then, IMO, this has never been what MySQL is for. It is a quick and dirty database solution for developers who want a simple and easy to learn tool. It does this very very well. Another problem is if you try to insert 1000.00 into a numeric (4, 2) field and get 99.99 :) Bye bye accounting.... The heavier weight solutions will refuse to insert and raise an exception :) This saves your accounting database from becoming useless because of a slight programming mistake....

      If you want ACID enforcement, Firebird and PostgreSQL are better bets. There are commercial replication solutions for Firebird, and several open source ones for PostgreSQL. They are no where near the commercial ones YET, but they are getting closer.

      --

      LedgerSMB: Open source Accounting/ERP
    60. Re:-1:Troll by altamira · · Score: 1
      My only complaint about PG is lack of replication

      On Aug 28, there was a press release regarding the eRServer on the PostgreSQL website (link here). Project page for eRServer is here. Haven't checked it out yet though.

      -Jan
    61. Re:-1:Troll by Sxooter · · Score: 1

      QUOTE:
      Run a SQL Server farm on the back end if you cant afford an Oracle license. Don't be an OSS idealogue in the business world, you end up unemployed.
      UNQUOTE

      You do know that afilias runs the .org and the .info domains in Postgresql using the erserver replication system?

      Which means when you type in "http://ask.slashdot.org" your request is resolved by Postgresql using a (now) OSS solution for replication?

      I haven't seen the .org or .info systems go down lately, seems to be working well enough.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    62. Re:-1:Troll by Sxooter · · Score: 1

      Being able to call stored procs in that way isn't nearly as big a deal as the lack of packaging for postgresql. It's nice that in Oracle you can install and uninstall a package that has all that built in and ready to go, and that they can bascially live in their own name spaces.

      with the introduction of schemas, it may be possible for someone to make a packaging system for Postgresql.

      Oh, and just an additional point, postgresql supports way more than just the one stored proc language.

      Stored procedures / user functions can be written in C, Python, Perl, PHP, PL/PGSQL, PL/SQL, Ruby, R, and a couple more I can't remember the name of right now.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    63. Re:-1:Troll by Anonymous Coward · · Score: 0

      My databases are so HUGE!!!

      And they can touch anything else in the world, except themselves...

    64. Re:-1:Troll by Sxooter · · Score: 1

      Gee, how about the .org and .info TLDs. They both run on Postgresql using ERServer for replication.

      that big / critical / high performance enough for ya? :-)

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    65. Re:-1:Troll by Sxooter · · Score: 1

      Have you looked at Postgresql?

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    66. Re:-1:Troll by Sxooter · · Score: 1

      Just wondering, did you tune Postgresql at all? Set effective_cache_size, shared_buffers, sort_mem, cpu_index_tuple_cost, cpu_tuple_cost etc... to appropriate values for your data set and load?

      did you run analyze?

      Did you have a vacuum job cronned up to run every X hours?

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    67. Re:-1:Troll by AKAImBatman · · Score: 1

      h, and just an additional point, postgresql supports way more than just the one stored proc language.

      I'm well aware of that. Although, in reality PostGreSQL supports one language (PL/PGSQL) and allows plugin modules to be linked and called. I tried setting up the C/C++ interface once. Less fun than a trip to the dentist. :-/

      Being able to call stored procs in that way isn't nearly as big a deal as the lack of packaging for postgresql.

      This is true. But to be perfectly honest, you may actually be better off with Oracle if you need that kind of scalability. PostGreSQL is getting there, but I'd still pay for Oracle where it makes sense. :-)

    68. Re:-1:Troll by afidel · · Score: 1

      Actually if you read the report they maintained ACID compliance, that's a requirement of the test.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    69. Re:-1:Troll by LadyLucky · · Score: 2, Interesting
      Actually ACID compliance is getting pretty darn good in databases like MySQL. Care to elaborate about what ACID compliance issues you have?

      Bull pucky: From someone who with their only deployment of MySQL into a live environment went completely pear shaped, MySQL crashing several times per day. The damned thing doesn't report ANYTHING to the error log, except "I'm starting up again, and oohhh look at all that corrupt data, I hope I can do something about that!". I would never touch the database again, not with a 10 foot bargepole.

      We're dropping that pile of crap faster than you can click the hyperlink on the MySQL website which says it may take up to two weeks to get any kind of support even in the case of an emergency.

      We're now using MSDE for low powered embedded installations that the MySQL crowd had pushed prior to this. Who would have thought, use the Microsoft solution because the open source one doesn't cut it.

      Sorry, it's been a long week of conference calls and VPNs in the middle of the night because MySQL decided to crash once again.

      MySQL isn't there, and it's lost all trust from anyone who knows about databases.

      --
      dominionrd.blogspot.com - Restaurants on
    70. Re:-1:Troll by Anonymous Coward · · Score: 0

      of course this is coming from a chick. you shouldnt paint your nails in the server room, duh! no wonder nothing works.

    71. Re:-1:Troll by justin.warren · · Score: 1
      I'll have to dig up the details on what was sub-optimised in Postgres. Not being a guru DBA, I was simply comparing the same SQL in Postgres vs. what I thought the most intelligent query would be. From memory it was something about using appropriate indexes on a multi-table join. We've sinced fixed the database design so that the join isn't necessary. PostgreSQL arrays rock!

      It is entirely possible that it was sub-optimal database tuning, as mentioned by another poster. I'll be sure to check out the new optimizer when I get back to the SQL side of the project. Also, it wasn't enough of a problem for us to log a bug, since we assumed that the appropriate wizards were probably already working on such things. We're not on such a tight release schedule that we need such things fixed ASAP.

      Having read all the replies to my post (wow) I realise I've only scratched the surface of Postgres. I look forward to delving more into its depths.

      --
      Just because you're paranoid doesn't mean they're NOT after you.
    72. Re:-1:Troll by Tmack · · Score: 1
      I should probably clarify that a bit. The query matched across several tables, linked by non-indexed keys, and ran against the same table twice. The error in the query caused every row in the first instance of one table to match every row in the second instance of the same table, generating 20K*20K=400M+ matches on that table alone. The mysqld restart was actually just a "show process-list" then "kill [PID]" of the errent process... My mind was a bit off when I posted this.

      TM

      --
      Support TBI Research: http://www.raisinhope.org
  7. Not personally, but by revividus · · Score: 5, Interesting
    I've been looking into MySQL for a bit, and I saw this article recently, which is directly concerning clustered database servers running MySQL.

    Maybe it will be of interest...

    1. Re:Not personally, but by g_goblin · · Score: 0

      Only 400 simultaneous connections... That's not that many.

      I've run a lot of different DBMS and find SQL Server to be the best based upon Performance and Price.

      My hardware guy and I clustered our database w/ SQL 2000 Enterprise over a year ago and it has only gone down once and that was for maintenance. We spent $65K on the hardware, OS and SQL 2000 licenses. To do the same with Oracle on an x86 platform was going to cost almost $200K

      I don't doubt the capabilities of PostGres and MySql, but when your DBMS is doing memory dumps, you don't want to have to scour google for an answer. You want the vendor on the phone as fast as possible and MS, in my exeperience, is very good at phone support when it comes to SQL Server.

      Eventually OSS DBMS will catch up ,but right now I don't see how u can use replication as a viable clustering alternative.

      Just my .02

    2. Re:Not personally, but by egon · · Score: 1

      If you don't want to scour google, purchase a support contract. I don't know about postgres (though I would suspect it's available) but I know for certain that mysql has such a thing available.

      If the support is import, purchase it. It'll still come out significantly less expensive than going with MS SQL, or so I would imagine.

      --
      Give a man a match, you keep him warm for an evening.
      Light him on fire, he's warm for the rest of his life
  8. Clustering is never fun by scosol · · Score: 2, Insightful

    Open-source or not...

    I would say just get a bigger box for your PostgreSQL solution and do semi-realtime remote replication on the tables you dont want to lose.

    --
    I browse at +5 Flamebait- moderation for all or moderation for none.
    1. Re:Clustering is never fun by Anonymous Coward · · Score: 0

      yep. we've weaned a few corps off oracle to pg.
      not a hard sell, my dears.

      CoyBoy

      -

  9. Huh? by Wakko+Warner · · Score: 4, Funny

    You can "cluster" MySQL? Does it involve "rsync" and "cron"?

    - A.P.

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
    1. Re:Huh? by Anonymous Coward · · Score: 0

      Please go Read The Fine Manual at www.mysql.com. MySQL v4.x already supports replication, which is the basis for clustering.
      Just add some load balancing and you're done.

    2. Re:Huh? by Anonymous Coward · · Score: 0

      5, Funny? More like -1 troll.

  10. The big problem is replication by MarkusQ · · Score: 5, Interesting

    IMHO, the biggest problem is replication; keeping them all consistent in the face of asyncronous updates. It can also reduce/eliminate the advantages of clustering if you have a significant number of updates compared to the number of quieries.

    I guess the best answer depends on how dynamic your data is. If it's static, there are all sorts of easy answers. If all the updates come from a central source, or on a predictable schedule, you're almost as well off. If updates come from the great unwashed but the data can be partitioned in some way (say, geographically) you can still do it. If updates come from all over but queries can be centralized, or if your database is tiny, or if latency isn't a problem, or if you have a machine that prints money, it can still be done.

    If you want to do everything for everyone everywhere, right now if not sooner, for under twenty bucks, you're screwed.

    So, what are your needs?

    -- MarkusQ

    1. Re:The big problem is replication by wfrp01 · · Score: 3, Informative

      PostgreSQL has released their replication technology under an open source licence.

      --

      --Lawrence Lessig for Congress!
    2. Re:The big problem is replication by MarkusQ · · Score: 1

      PostgreSQL has released their replication technology under an open source licence.

      Cool beans. I hadn't even noticed that. I swear, take your eyes off the internet for a weekend or two and you've got catching up to do. And since the last time I looked into the situation was about six months ago, so I'm probably hopelessly out of date.

      -- MarkusQ

    3. Re:The big problem is replication by tmuller · · Score: 1

      The biggest problem with replication is that you don't realize that there already exists a product that does EXACTLY what you are bitching about.

      For the past 5 years or so, Informix (remember them?, now IBM) has had "time based" replication, where the "time" of the transaction dictates who's data is more current and gets to update the replicants. It means that the systems have to be "sync'd" with NTP to maintain that everyone is saying the same thing, but other than that, you get replication across all replicants, because they are ALL masters.

      Next time, please check into this before making this type of statement.

    4. Re:The big problem is replication by Anonymous Coward · · Score: 0

      this is an old problem (solving consensus and doing message ordering) that has been solved many times. i used free tool called ensemble for a commercial project a few years ago which did all that you describe and it worked perfectly for at least our project which was a large scale wide area distributed data store with dynamic content.

    5. Re:The big problem is replication by MarkusQ · · Score: 1

      1. Last I heard, Informix was not open source, which was a key part of the question.
      2. Just because a packaged "solution" exists, it doesn't mean the problem ceases to exist. For example, real-time replication takes bandwidth. For some systems, this is going to eliminate any advantage from clustering (which is why shared-storage solutions also exist).
      3. I wasn't "bitching" about anything. I was pointing out the major area where I have seen problems when people attempted to cluster database servers. Saying "watch out for the grues" isn't bitching (that would be something like "Who let all these @#$#! grues in here?").
      4. The whole point of "Ask Slashdot" is that people can provide information they personally know (thus at low cost to them); it doesn't obligate us to go out and do people's research for them.
      5. The decafinated brands taste just as good, and you can catch more flies, etc.
      --MarkusQ
    6. Re:The big problem is replication by nconway · · Score: 1

      FYI, "PostgreSQL" is an open-source database, whereas "PostgreSQL Inc." is a Canadian-based company that offers PostgreSQL support, hosting, and development. The replication product, which was previously being sold, has been released by PostgreSQL Inc..

    7. Re:The big problem is replication by caluml · · Score: 2, Funny
      I swear, take your eyes off the internet for a weekend or two ...

      Your request to surrender your Slashdot licence has been noted, and accepted.

    8. Re:The big problem is replication by Anonymous Coward · · Score: 0
      If you want to do everything for everyone everywhere [...] you're screwed.

      Well, the biggest problem isn't replication is the updates. Configure a centralized database and then try to make more updates than queries using the ACID propreties. Updates will cause the abortion of your queries. So the issue you raise is not new for replication :-P

    9. Re:The big problem is replication by MarkusQ · · Score: 1

      Configure a centralized database and then try to make more updates than queries using the ACID propreties. Updates will cause the abortion of your queries. So the issue you raise is not new for replication

      You can get around this by serialization (e.g. a request fifo), but you can't get around the bandwidth/multi-server syncronization issues in the same way.

      -- MarkusQ

    10. Re:The big problem is replication by Anonymous Coward · · Score: 0

      I'm sure that was causing a lot of confusion.

  11. Karl rules. by Anonymous Coward · · Score: 0

    Nuff said. Your sig, like Loverboy at Madison square, kicks ass.

    1. Re:Karl rules. by stratjakt · · Score: 1

      Everytime I hear that song I totally think about that passed out broad covered in vomit.

      --
      I don't need no instructions to know how to rock!!!!
  12. PostgreSQL and pg_dump by zulux · · Score: 4, Interesting



    Check out the new replication at postgresql.org: it's master -> multiple slave replication.

    Then have your slave database query the master database - and if it no longer responds, it could promote itself to master.

    The replication is the easy bit - the slave promotion is the hard and gritty bit.

    --

    Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

    1. Re:PostgreSQL and pg_dump by tcopeland · · Score: 4, Informative

      Another hard bit is that the Postgres replication doesn't support sequences - see the details in the aptly named "Things to Remember" section of the installation documentation.

      So if your master fails, presumably you have to recreate the sequences starting at a number high enough to avoid conflicting numbers before switching over to a slave. Seems like this could be a problem.

      Nonetheless, Postgres is cruising away on RubyForge; 300,000 records and counting...

    2. Re:PostgreSQL and pg_dump by zulux · · Score: 2, Informative


      I love PostgreSQL sequences - I think definatly a feature I would miss.

      PGSQL stores it's sequences seperate from the tables. When you need a sequence, you can have PGSQL create one at table creation time, or you can link your table to an already available sequence.

      The sequences are under your full controll - you can reset them, roll them back or set them to any value you want.

      Clever things you can do with this:

      Set one server to have high squence numbers and another to have lower ones - in the middle of the night they can swap data that they have dominion over. This poor-man replicaiton works well over flaky internet connections. Imagine a remote office with a 56K modem that swaps over in the middle of the night.

      When you need to update the entire database to take advantage of another type of data abstrations - you can have your new style of tables use the old sequence numbers as well. The older tables grow and the new tables grow - but they never share a sequence number. The every night you can import the newer sytle of table with the old style data - you can let some users use the old front end, and test the new front end and the entire migration can be seamless.

      --

      Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

    3. Re:PostgreSQL and pg_dump by jbelton · · Score: 1

      > the slave promotion is the hard and gritty bit For controlling the master roll, check out heartbeat (http://linux-ha.org/). There may be existing scripts to allow automatic failover of PostgreSQL when the new replication logic is being used.

    4. Re:PostgreSQL and pg_dump by Anonymous Coward · · Score: 0

      if all you need is a high enough starting number you can do a max(column)+1 query before you open it up to changes.

      if you really care about not re-useing the sequence number for a while you can look at trying to replicate the sequence table onto the slave

      David Lang

    5. Re:PostgreSQL and pg_dump by tcopeland · · Score: 1

      > do a max(column)+1 query before you
      > open it up to changes

      Yup, right, that makes sense. Seems like the thing to do would be to have a script to do that for every table prepared in advance. That way when the master failed one wouldn't be caught short.

      > trying to replicate the sequence table
      > onto the slave

      Hm, yup, maybe, and then just add 1000 or some other safe interval to the sequence to catch any extra txns.

  13. MySQL + BigIP by zarthrag · · Score: 2, Interesting

    What *I* would probably attempt would be to setup a replication ring, and use a bigIP to make them all look like the same server. Then you get your load balancing, and scalability. I have yet to try this, but I will in the (very) near future.

    --
    Why can't all fpga/microcontroller manufacturers just release free optimizing compilers???
    1. Re:MySQL + BigIP by innosent · · Score: 1

      I was thinking along the same lines, but you'd have to handle the replication issues, which MySQL is still pretty weak on. PostgreSQL would probably be better in that case. If speed is really an issue (and downtime isn't as bad), you could split the data logically between systems, and use stored procedures running on a server that executes commands on multiple servers concurrently.

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
    2. Re:MySQL + BigIP by dusty123 · · Score: 1

      This will only work when you use only SELECT statements. While load balancing is possible with Linux (there is some neat stuff in Linux-2.6) this still does not solve the database replication issue.
      Every UPDATE or INSERT statement has to be replicated to the other servers.

  14. High Availability by mcdrewski42 · · Score: 5, Insightful


    HA is always crapshoot/tradeoff between cost and risk. Throw enough $ at the problem and you'll approach 100% availability.

    I know that 'more robust' is a nice thing to want, but you really need to think about what you really need. If it takes 15 minutes to switch over to a backup copy (using some magic RAID disk mirroring maybe?) and 15 minutes to restart the app and let it checkpoint it's way up to a decent operational speed again, is that good enough?

    If it takes an hour, how about that?

    How much time/heartache or money is it worth for you to have system downtime, and how much are you willing to expend to reduce it by 5, 15, 30 minutes?

    So, there's really a continuum of availabilty you have to pick your point in. At the low end, you have no backups and recreate everything from scratch. At the high end you use Vendor X's real clustering solution and 24x7 monitoring, then have zero downtime even in a disaster. Somewhere in the middle is you.

    Now I realise this an overtly commercial view of things, but if needs be replace money with effort and season to taste.

    --
    /* affect != effect */ void affect(int *thing,int effect) { *thing += effect; }
    1. Re:High Availability by Anonymous Coward · · Score: 0

      Beautifully written.
      However that 30 minutes often blows out to the time it takes to restore (often longer than the backups). like 4-5 hours.
      30 minutes will grow, grow grow, and the database boys will need time to do compresses/reorgs, once you start adding new fields from the waterfall development model.

    2. Re:High Availability by Anonymous Coward · · Score: 0

      HA is always crapshoot/tradeoff between cost and risk. Throw enough $ at the problem and you'll approach 100% availability.
      No, not really. Read Jim Gray's book on Transaction Processing or read his seminal paper "Why Computers Stop". Using simple two-way clustering or three-way with voting you can raise your availability a thousand fold. So clustering (especially if you can combine it with transactions) is a really good thing to do.
      Also, the "why bother" opinion isn't worthwhile since everyone knows that already. When people are asking for high availability, they usually have a good reason for it.

    3. Re:High Availability by mcdrewski42 · · Score: 1

      Nice try Troll, he's working for MICROSOFT! :)

      No, seriously, thanks for this. For anyone out there also interested, the ...

      I'll give it a read.

      --
      /* affect != effect */ void affect(int *thing,int effect) { *thing += effect; }
    4. Re:High Availability by mcdrewski42 · · Score: 1

      Whoops - should have previewed. Link is here...

      --
      /* affect != effect */ void affect(int *thing,int effect) { *thing += effect; }
    5. Re:High Availability by cookd · · Score: 1

      Two concepts are often confused: High Availability and Scalability. Probably because they have some solutions (clustering) in common. But they aren't the same, and I got the impression that the Comment was asking more about Scalability.

      I think HA is a somewhat easier nut to crack than scalability. Make everything redundant, keep the copies in sync, and you're probably ok. Or as you mentioned, come up with a way that any single point of failure can be handled within a reasonable amount of time.

      Scalability is a bit more difficult, esp. when the database must be updatable, or where the queries must be real-time and in sync with each other. You can't simply have mirrors or replacement components -- the systems have to talk to each other.

      --
      Time flies like an arrow. Fruit flies like a banana.
    6. Re:High Availability by innosent · · Score: 1

      Another thought....
      If you have a transaction server, where you can split the load logically to the servers where the data is located, why couldn't you get both? If you have two major tables (as an example), and a few dozen trivial ones, split the tables between them (either at the table level, or at the row level, according to a hash on the key), and use queries on the transaction server to run commands on the individual servers.

      If a server is unavailable, go to it's alternate, etc... This way, a complete copy of every table should be on 2-3 machines, but the load from an actual query run in your application is split (on average) evenly between all the servers in the cluster. If only 2-3 machines have the table, replication is easier, but you still maintain availability, and by splitting the data (since smaller tables/fewer queries mean less work), you scale performance pretty linearly. Sounds like a win-win to me, but I haven't tried it yet. Do any OSS transaction servers (for distributed transaction coordination) exist? If not, you'd have to do most of the coordination manually, but it shouldn't be that bad, as long as you do everything through well-designed stored procedures.

      --
      --That's the point of being root, you can do anything you want, even if it's stupid.
  15. Well.... by agent+dero · · Score: 2, Insightful

    If you're working with enough data that would require a CLUSTER, then I would suggest a commercial product.

    But if you need that SPEED, but not a lot of data storage, I'd say a decent sized MySQL cluster would cover you, depending on what your needs are.

    If you are in the position to actually need a cluster to do that much work, you should be able to get something commercial and more large-scaled oriented

    --
    Error 407 - No creative sig found
  16. eRserver by linuxwrangler · · Score: 5, Interesting
    I have found PostgreSQL to be nearly bullet-proof. I routinely have connections up for months at a time (that's individual persistent connections - the server is up much longer and the connections usually get dropped when I upgrade the client software). Still, sh*t happens and replication has been a sore point for many databases both open and commercial.

    You should investigate eRserver. It was originally a commercial replication product for Postgres but has been open-sourced. I haven't tried it yet but it's on my to-do list.

    --

    ~~~~~~~
    "You are not remembered for doing what is expected of you." - Atul Chitnis
    1. Re:eRserver by TheFuzzy · · Score: 5, Informative

      Well, the .ORG domain runs on PostgreSQL + eRServer, so that's one scalable solution ...

    2. Re:eRserver by murali_v82 · · Score: 1

      Couple of years back i was involved in the development of India's leading adserver. We used postgres as the database. We had this algo (requirement) to ensure that ads were served in the ratio of leftover page-views booked. This required that we update into the database numbers of ads served, as this will affect which ad would be served in the next page view. We used postgres initially. After every ad was served we updated back to the database. Software worked like a dream in the test environment. We launched the adserver and started bringing web sites live on it. It worked fine the first few weeks. Within no time, i was getting frantic calls that the Postgres server has shutdown and programs were bombing (charecterized by broken gifs is different web sites).
      We struggled through a week of sleepless nights to fix the problem. Basic problem was, if you try updating online, continuously, the database just can't take it. I know for sure postgres couldn't but haven't checked with commercial dbs. Being wiser we moved to a more sober solution using tied-hashes with Perl combined with compromises in requirement (update the ads served every half-hour). Solution worked like a dream. We scaled upto 8 million ads a day with a single intel box.
      I have found some of my old postings on this.
      Take a hardlook at your problem to see whether you REALLY require real-time updates.

    3. Re:eRserver by Sxooter · · Score: 1

      Let me guess, you forgot to vacuum, right?

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
    4. Re:eRserver by Sxooter · · Score: 1

      Oh dear GOD! That was way back in January of 2000! No wonder you were having issues. You should realize there's been a lot of work done on Postgrsql since then, and it is MUCH more robust with respect to vacuuming and index growth.

      I'd recommend testing it again. Real testing. You should hit it with millions and millions of updates, just like the live site, and see how it runs. It really is light years ahead of where it was back then.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
  17. Shared storage? by crstophr · · Score: 5, Informative

    You can make a High Availability cluster out of most any software if you have some kind of shared storage.

    People have used firewire drives connected to two different computers to accomplish this cheaply. Oracle is giving away a cluster filesystem (so they can sell RAC on linux) there is OpenGFS as well for filesystem usage.

    Just write some basic monitoring scripts that will bring up your postgress database on the second server should the first one fail. Just make sure those scripts completely take down the old database on the first server in the case of a partial failure. Having two databases try to open the same data would be a really bad thing.

    Here are some links to articles that should help:

    Overview

    Howto

    Cluster Filesystem

    These are mainly geared for Oracle/RAC, all you need is the firewire shared storage and cluster filesystem. You're on your own to write the monitoring and failover scripts. Hope this helps. --Chris

    1. Re:Shared storage? by Lumpy · · Score: 1

      People have used firewire drives connected to two different computers to accomplish this cheaply.

      I have done this with scsi and it works great. each computer needs a different device id (that's why they let you change it :-) and all 4 servers were tied to the same scsi drive rack.

      the only bitch was to be sure the drive chassi was powered up first and then each computer HAD to be powered up at the same time.

      Hell I saw networking done in linux via the scsi bus that way...

      --
      Do not look at laser with remaining good eye.
    2. Re:Shared storage? by Kashif+Shaikh · · Score: 2, Insightful

      Oracle is giving away a cluster filesystem (so they can sell RAC on linux) there is OpenGFS as well for filesystem usage.

      I was saying "Wow! Oracle has released a clustered filesystem!", until I discovered it only works with shared-storage. Meaning it won't create a filesystem image across a cluster network, where data is distributed. But rather the cluster filesystem is stored in a centralized location, but can be accessed by multiple members of the cluster at the same time for both read and write.

    3. Re:Shared storage? by ba_hiker · · Score: 1

      This works fine until a disk system fails or a drive crashes. RAID is somewhat of an aid here, but all RAID systems have some single point of failure.

    4. Re:Shared storage? by vidarh · · Score: 1
      Why do you assume all RAID systems have some single point of failure? Ever locked at IBM's ESS/Shark? Multiple power supplies. Two independent AIX servers acting as storage controllers. At least two separate bays of SCSI or fibrechannel controller cards (if you connect your server to both bays, you can hotswap controllers in one of the bays without taking your system offline). A large number of hot swappable disks, raid'ed. Both AIX servers can access all drives, and will take over all access if one AIX server fails.

      And to guard you against catastrophic failures of the entire ESS it can replicate the entire dataset over fiber to a second ESS in a different location.

      It's all a question of what price you're willing to pay.

    5. Re:Shared storage? by flopiano · · Score: 1

      That's the solution my company is going to adopt (hopefully within the end of the year).

      We run Postgres on a pair of RH Advanced Servers mounting disks from a SAN (Only one of the two nodes runs the postgres service at a given time, and only it mounts the partition with the data). A custom application accessing the database runs on the other node, mounting another disk partition (or both run on the same when one node is down).

      This is only an HA cluster (the DB runs only on one node) but the service switches to the other node if the first goes down.

      The application we run is not too critical (I mean, we can tolerate a downtime of a couple of minutes when the service switches from one node to the other).

      Anyhow, it is not for an e-commerce web site, so we don't have thousands of transactions per minute, but still it is for commercial purposes (real money is involved).

      Surprisingly, from what I read so far, it seems that commercial solutions based on OSS are not so common yet.

      --
      sigh

    6. Re:Shared storage? by DJerman · · Score: 1
      I was saying "Wow! Oracle has released a clustered filesystem!", until I discovered it only works with shared-storage. Meaning it won't create a filesystem image across a cluster network, where data is distributed. But rather the cluster filesystem is stored in a centralized location, but can be accessed by multiple members of the cluster at the same time for both read and write.

      Yah, that's clustering. The other thing you're calling a cluster is only replication. In a cluster every node sees a consistent image all the time, and access from any node to any data is consistent (and read/write). Locks propagate across nodes and read-consistency is maintained. No OS DB is capable of clustering, as far as I know, nor are they likely to be (reliably, on recent hardware) until some more patents expire.

      When you can show me the results of the write that occured on the node next door a split millisecond ago, even though I have an old version in my node's cache, I'll start to beleive in it. Until then, it's the big boys for clusters.

      --
  18. ACID is in place by linuxislandsucks · · Score: 1

    ahem have youchecked nay opensource db lately?

    most of them have acid enforcement despite your claim to the contrary..

    --
    Don't Tread on OpenSource
    1. Re:ACID is in place by Unordained · · Score: 1

      Agreed. Firebird (here) and PostgreSQL(here) both do. And it works. InnoDB in mysql tries really hard, but it's behind, and it slows down MySQL (compared to ISAM/VSAM) enough you might as well use another database that, at the same speed, has more features.

  19. Emic, InnoDB Hot Backup by vinsci · · Score: 5, Interesting
    Two MySQL products I found interesting (neither of which is open source at this time): The rest of this comment is quoted verbatim from InnoDB News

    MySQL/InnoDB-4.0.1 and Oracle 9i win the database server benchmark of PC Magazine and eWEEK. February 27, 2002 - In the benchmark eWEEK measured the performance of an e-commerce application on leading commercial databases IBM DB2, Oracle, MS SQL Server, Sybase ASE, and MySQL/InnoDB. The application server in the test was BEA WebLogic. The operating system was Windows 2000 Advanced Server running on a 4-way Hewlett-Packard Xeon server with 2 GB RAM and 24 Ultra3 SCSI hard drives.

    eWEEK writes: "Of the five databases we tested, only Oracle9i and MySQL were able to run our Nile application as originally written for 8 hours without problems."

    The whole story. The throughput chart.

    --

    Trusted Computing FAQ | Free Dawit Isaak!
    1. Re:Emic, InnoDB Hot Backup by rushfan · · Score: 1

      I get a 404 when trying the URL. Someone have a URL that works? (or is eWeek hosed?)

      Rob

    2. Re:Emic, InnoDB Hot Backup by vinsci · · Score: 2, Informative
      Sigh, the folks at eWEEK have revamped their website and in the process managed to kill most old links...

      Use this link to the article instead:
      Database Server Clash Revisited
      http://www.eweek.com/article2/0,4149,1238712,00.as p

      --

      Trusted Computing FAQ | Free Dawit Isaak!
    3. Re:Emic, InnoDB Hot Backup by vandan · · Score: 1

      Go to the MySQL web site. They either have a link that works, or have mirrored the article.

    4. Re:Emic, InnoDB Hot Backup by wilhelm9 · · Score: 1

      There are a few annoying facts with this benchmark.

      First of all, the download package includes setup instructions for most of the other products _except_ MySQL. It is not possible for me to learn database setup options and test for myself. For all I know, if the benchmark is not audited by an organization like TPC, the "winning" product might as well run their transactions at a lower level of isolation or enable certain disk buffering options that many products have.

      Secondly, the entire Nile application is written around a concept of dynamically generating SQL statements, and execute them. Most more sophisticated products have elaborate schemes to allow applications to precompile statements and to reuse compile queries. As it happens, the MySQL JDBC driver used by the benchmarketers does not support parameter markers. Even if an application would issue SQL-statements that were reusable, the MySQL JDBC-driver would replace all parameter markers with the actual parameter value at execution time, and them compile the entire query. SQL products in general offers the possibility to very easily compile ad-hoc queries, but it does not necessarily give the optimum performance.

      Thirdly, the nile application makes pretty sure to avoid MySQL pitfalls, such as large result sets. I didn't see any code at all specifally designed to avoid the other products pitfalls. Either, they don't have any, or the benchmark figures they published have the pitfalls "embedded".

      I'll give one credit to MySQL though, it appears as it is compiling their SQL-dialect pretty quickly.

    5. Re:Emic, InnoDB Hot Backup by Anonymous Coward · · Score: 0

      Next time, you may want to scroll beyond the title: From the article: http://www.eweek.com/article2/0,4149,1250820,00.as p The key issue, according to DataDirect, was the use of a batch SQL statement (one containing more than one command) combined with a bidirectional cursor. In particular, one commonly used page issues a SQL command to retrieve a set of books in a catalog query. The application limits the number of books returned to 500 by placing "set rowcount 500" and "set rowcount 0" commands around the SQL select statement. "This statement causes problems for the driver because the server will not give the driver a server-side scrollable cursor if the statement is a batch," said Royce Willmschen, director of research and development at DataDirect, in Morrisville, N.C. "Therefore, we must emulate the scrollable cursor on the client side and process through the result set to find the last row." Willmschen suggested using JDBC to programmatically set a limit on the number of rows returned rather than doing this in SQL directly. Microsoft tested our application in its own lab and determined that replacing the "set rowcount" commands with an alternate syntax, "select top 500," resulted in much-improved performance ( see Microsoft SQL Server performance chart ). This change folded the query limit constraint right into the select statement, eliminating the batch SQL statement. As in the IBM DB2 case, this Microsoft test should not be directly compared with our numbers, but it's clear the change did have a substantial positive impact on performance. Microsoft tested on a four-way server using slightly slower Intel Corp. Xeon CPUs than we used (550MHz versus 700MHz) and recorded a peak throughput of about 370 pages per second versus our peak throughput of 220 pages per second. Microsoft didn't have exact performance figures for the application on its setup with our original "set rowcount" version, so we don't know precisely the performance delta this change introduced. However, the difference was substantial, according to Microsoft officials: "As I recall, it was over a doubling of performance with that change--it was a huge difference," said Greg Leake, director of Microsoft .Net Competitive Labs, Developer Division, in Redmond, Wash. One other important subtlety that Microsoft noticed and we'd like to pass along was a difference between the JSP (JavaServer Pages) version and the ASP (Active Server Pages) .Net version of the test application we used. In the JSP version, we did catalog queries using an "=" exact-match operator. With the ASP.Net version, a "like" and "%" wild-card operator was used that places an extra CPU burden on Microsoft SQL Server. So, RTFA before posting

  20. What is slashdot doing? by rtnz · · Score: 5, Interesting

    What does Slashdot do for this? I recall way back in the day there was some information about what the Slashdot tech looks like, anyone have info regarding their database setup? L

    1. Re:What is slashdot doing? by sbszine · · Score: 2, Informative

      Slashdot runs MySQL db on a couple of boxes. Check the FAQ and the IRC interview log. According to the FAQ, Slashdot is / was financially contributing to replication in MySQL.

      --

      Vino, gyno, and techno -Bruce Sterling

    2. Re:What is slashdot doing? by tf23 · · Score: 1

      Last I recall this was discussed they said they had a beast of a quad-cpu mysql4 server as primary, with slaved replicators for read-only things.

      They're using innodb tables more and more (see the slashcode), along with http://www.danga.com/memcached/

  21. is it just me? by gfody · · Score: 3, Funny

    or does this term sound kind've like a made up buzzword like ".NET powered Java schemas!" or "SOAP servlet toaster oven with X-M-L!"

    --

    bite my glorious golden ass.
  22. Replicated MySQL by Jack+Auf · · Score: 3, Informative

    Using one server as a master and n servers as slaves. Just make sure to write everything to the master. Replication to the slaves generally takes about a second or maybe two depending on load.

    OK, not quite the same thing but this works quite well for ready heavy applications, and is very reliable unless you get a slave out of sync.

    This was on v3.n.n - the good folks at MySQL have made many improvements to the replication facilities in the 4.n series I believe.

    --
    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety" - BF
    1. Re:Replicated MySQL by Anonymous Coward · · Score: 0

      What kind of asshole buys an SUV in the middle of an oil war? All too often I'm ashamed to be an American.

      So what car am I going to put my 23" rims on? Somehow I don't think the Toyota Prius faggots like you drive will work.

    2. Re:Replicated MySQL by one_who_uses_unix · · Score: 1

      The kind of asshole that lives off 2 miles of dirt road that becomes impassable to non-4WD vehicles once winter sets in and has 3 kids (often plus his brother's 3 kids) to get to school every morning.

      If you can suggest an alternative vehicle that can carry 7 people (safely/seatbelted) and can make it over our roads then I will happily buy it.

      --
      KK4SFV
    3. Re:Replicated MySQL by Anonymous Coward · · Score: 0

      how about a horse and buggy, jerk-o

    4. Re:Replicated MySQL by KnightStalker · · Score: 1

      Ooh, look, a shiny thing! *CHOMP*

      I recommend a hovercraft. Massive geek factor -- and no washboarding! Or maybe that SUV comment was a rhetorical question and wasn't really aimed at you.

      Just yesterday, I saw a woman screaming at some poor teenager, because the woman, driving an Cherokee in downtown Portland by herself, had nearly run over the teenager, riding some kind of razor-scooter thing. The scooter had scratched the truck's paint, you see. It was probably the teenager's fault.

      --
      * And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
  23. three types of clusters by u19925 · · Score: 4, Informative

    there are basically three type of clusters:

    1) shared nothing: in this, each computer is only connected to each other via simple IP network. no disks are shared. each machine serves part of data. these cluster doesn't work reliably when you have to aggregations. e.g. if one of the machine fails and you try to to "avg()" and if the data is spread across machines, the query would fail, since one of the machine is not available. most enterprise apps cannot work in this config without degradation. e.g. IBM study showed that 2 node cluster is slower and less reliable than 1 node system when running SAP.

    IBM on windows and unix and MS uses this type of clustering (also called federated database approach or shared nothing approach).

    2) shared disk between two computers: in this case, there are multiple machines and multiple disks. each disk is atleast connected to two computers. if one of the computer fails, other takes over. no mainstream database uses this mode, but it is used by hp-nonstop. still, each machine serves up part of the data and hence standard enterprise apps like SAP etc cannot take clustering advantage without lot of modification.

    3) shared everything: in this, each disk is connected to all the machines in the cluster. any number of machines can fail and yet the system would keep running as long as atleast one machine is up. this is used by Oracle. all the machine sees all the data. standard apps like SAP etc can be run in this kind of configs with minor modification or no modification at all. this method is also used by IBM in their mainframe database (which outsells their windows and unix database by huge margine). most enterprise apps are deployed in this type of cluster configuration.

    the approach one is simpler from hardware point of view. also, for database kernel writers, this is the easiest to implement. however, the user would need to break up data judiciously and spread acros s machines. also adding a node and removing a node will require re-partitioning of data. mostly only custom apps which are fully aware of your partitioning etc will be able to take advantage.
    it is also easy to make it scale for simple custom app and so most of TPC-C benchmarks are published in this configuration.

    approach 3 requires special shared disk system. the database implementation is very complex. the kernel writers have to worry about two computers simultaneously accessing disks or overwriting each others data etc. this is the thing that Oracle is pushing across all platforms and IBM is pushing for its mainframes.

    approach 2 is similar to approach 1 except that it adds redundancy and hence is more reliable.

    1. Re:three types of clusters by Pro_Piracy_Guy · · Score: 3, Informative
      approach 3 requires special shared disk system. the database implementation is very complex. the kernel writers have to worry about two computers simultaneously accessing disks or overwriting each others data etc. this is the thing that Oracle is pushing across all platforms and IBM is pushing for its mainframes.

      I recently attended an Oracle Convention, and the one thing everyone (except Oracle) will admit about RAC is that it is very difficult to implement and very very expensive. Of the many vendors, the cheapest RAC solution I came across was in the $30,000 to $50,000 range (scaleable turn-key solution - price does not include Oracle license fees). Most of the reps I spoke with said unless you are a huge enterprise with lots of cash to blow, RAC is NOT the way to go.

      Just my $0.02

    2. Re:three types of clusters by ba_hiker · · Score: 1

      Well clusters of type 1 don't really implement any redundency at all. A failure of any node in the system, makes some of the data unavailable. So this might not be the best for the original posters problem. It does allow a high degree of scaleablity in terms of database size and performance (assuming you solve the moves and changes problem). In fact if access to all of the data is a requirement, the more nodes in a type 1 (shared nothing) cluster the lower the reliablity.

    3. Re:three types of clusters by Anonymous Coward · · Score: 0

      You can't seriously be saying this. fault tolerance, high availability and high performance is not cheap. For 30-50K is cheap. Keep in mind this stuff used to cost 500K just a few years ago.

    4. Re:three types of clusters by Malc · · Score: 1

      Hang on, so which model are we using? We have a two machine active-passive cluster with a shared disk array. It uses Microsoft Cluster Service. The shared resource are only assigned to one machine at a time. When fail-over occurs, the other node takes over all the shared resources (disk, IP address, etc) and brings up SQL Server. It takes 45 secs to 2 mins, often dependent upon how many transactions SQL Server has to roll forward or back during startup. It's not your model 1, and you said no mainstream DB uses mode 2. It's definitely not model 3 though.

      Incidentally, by adding a second disk array we could convert this in to an active-active cluster with almost instaneous failover as both servers would have SQL Server online. Our active-passive cluster works well for our needs, and doesn't seem as complicated as you make out - I suppose MSFT have done well at hiding the details.

    5. Re:three types of clusters by the_arrow · · Score: 1

      2) shared disk between two computers: in this case, there are multiple machines and multiple disks. each disk is atleast connected to two computers. if one of the computer fails, other takes over. no mainstream database uses this mode

      Well, what about Windows Cluster Service and Oracle Failsafe? The company I work for use it at a couple of airports and it works pretty good. The downside is that the failover is not transparent (all clients connected gets disconnected and have to reconnect).

      An open-source solution that works like this should be quite simple to set up with the help of the Linux High-availability project.

      --
      / The Arrow
      "How lovely you are. So lovely in my straightjacket..." - Nny
    6. Re:three types of clusters by Anonymous Coward · · Score: 0

      You're using type 2, active-passive. The active-passive part makes it a lot simpler as no-one needs to worry about simultaneous access. It also means you get zero performance improvement from clustering, as you only have one box banging the data out.

      That kind of HA is fairly cheap (although the simultaneous disk connection is a specialty item). But it's not scalable, as you're only as good as your biggest box, and failover mode degrades the performance of the passive server (assuming it has some other function besides standby). MCS is not needed, btw, if you're willing to wait for manual intervention. Oracle is quite capable of running a standby database that will "wake up" on command, in a cheaper type 1 setup.

      This (type 1) could be emulated by replication (although Oracle does it with transaction-replay on the passive box), if the Open Source DB's get 2pc working right. I hear PostgreSQL it working on it, and I don't know about MySQL -- SAPDB might be a contender too, I've never tried it out.

      Performance clustering (for type 2 or 3) requires a lot of cross-talk to communicate locks, pass queries, or communicate cache content between nodes. Oracle 9i RAC and DB2 on 390 are the best for this, DB2 because the hardware supports it so well and Oracle because they got very, very clever about it (and gigabit ethernet can narrow the gap between bus latency and network latency). Don't expect to see open source playing well in this realm unless someone's running it on mainframes and/or stepping on patents :)

  24. Das DB by Flip+Chart · · Score: 3, Informative

    What about SAPDB isn't it a potential choice. I thought I read somewhere that MySQL and SAPDB were merging. Chech it out http://www.sapdb.org/

    1. Re:Das DB by Anonymous Coward · · Score: 0

      but mysql and sapdb are not free. unless your software is gpl'd.

  25. MySQL replication: Flawless (so far) by allankim · · Score: 5, Informative

    I've been running a 3-4 node MySQL 3.23.x cluster on Slowlaris 9 since January. It has survived several catastrophic power outages and numerous other insults without a hiccup. Load is fairly light (about 3,000 updates daily and a similar number of queries on each server) so YMMV.

    1. Re:MySQL replication: Flawless (so far) by WellAren'tYouJustThe · · Score: 0, Funny

      Well aren't you just the porn king of your mom's basement.

  26. ZEO (Zope Enterprise Option) by Wheat · · Score: 3, Informative

    ZEO will allow you to scale the ZODB (Zope Object Database) across multiple processors, machines, and networks. Although the ZODB is a Python object database, so it's probably not an option to port your current database. There are other limitations of the database - it's not always the fastest, it's an object database so concepts like foreign keys are not fully there, but it can give you high availability. As of new Zope 2.7 in beta though, ZEO is quite easy to set-up, and it is open source.

  27. [OT]: Your sig by Elwood+P+Dowd · · Score: 0, Offtopic

    All you grammar nazis point correct me when I use affect as a noun, or effect as a verb, but I don't think even most grammar nazis really understand the difference.

    Check some definitions. Both words can be used as nouns or transitive verbs. It's complicated.

    The first definition of "affect" is "To have an influence on or effect a change in."

    Anyway. I guess not that complicated. But not the stupidest mistake someone could make.

    --

    There are no trails. There are no trees out here.
    1. Re:[OT]: Your sig by Anonymous Coward · · Score: 0
      It's true that affect and effect can both be used in noun or verb form. However, the verb "affect" doesn't mean the same thing as the verb "effect," though the meanings are, as you note, at least related. The noun "affect" (roughly meaning "emotion") and the noun "effect" (several meanings, but most commonly "the result of an action or influence") aren't interchangeable at all.

      -Your friendly neighborhood linguist

    2. Re:[OT]: Your sig by mcdrewski42 · · Score: 1

      But even grammar nazis still have to work within the 120ch sig limt... :(

      --
      /* affect != effect */ void affect(int *thing,int effect) { *thing += effect; }
  28. PostgreSQL eRServer 1.0 + Backplane by pabos · · Score: 4, Insightful

    Two options I haven't seen anyone mention yet are PostgreSQL eRServer 1.0+ (see PostgreSQL news item "PostgreSQL now has working, tested, scalable replication!" from August 28, 2003 or a lengthier press posting "PostgreSQL, Inc. Releases Open Source Replication Version") and Backplane.

    eRServer has been in development for over two years, is used in production settings and is released under a BSD license (as with PostgreSQL). It uses a single master/multiple slave asynchronous replication scheme. There are cautions in the release that replication may be difficult to setup.

    Backplane seems to be particularly well-suited to clustering data quickly across a WAN. A quote may explain it better:

    The Backplane Open Source Database is a replicated, transactional, fault-tolerant relational database core. Currently supported on Linux and FreeBSD, Backplane is designed to run on a large number of small servers rather than a small number of large servers. With Backplane, it is possible to spread the database nodes widely, allowing database operations to work efficiently over WAN latencies while maintaining full transactional coherency across the entire replication group.
    Backplane's native quorum-based replication makes it easy to increase the database capacity (by simply adding a new node), re-synch a crashed server, or take down multiple nodes for maintenance (such as an entire co-location facility) - all without affecting the database availability.

    I haven't used either yet, but you may wish to give them a look.

  29. Postgresql has free mature replication by KPU · · Score: 1, Redundant

    Recently announced on the PostgreSQL website is commercially developed free and open replication for PostgreSQL. erserver is available for download. It is single master, many slave replication only.

  30. Talk to the folks at deviantart.com by cubal · · Score: 3, Informative

    deviantart.com, IIRC, runs about 3 mysql servers behind a load-balancing cache/server, so have had to deal with a lot of the difficulties involved in that.

  31. interesting press release by Kunta+Kinte · · Score: 2, Interesting
    MySQL Teams With Veritas, SGI on Clusters - http://www.eweek.com/article2/0,4149,1208538,00.as p and http://www.mysql.com/press/release_2003_23.html

    Supposedly should be out by now.

    --
    Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
  32. Have you considered LDAP? by Anonymous Coward · · Score: 0

    I'm not 100% sure about this, but from what I researched, LDAP (openLDAP, etc) is built for replication. Also, LDAP is very fast at reading data and its "object-oriented" kinda, so you can do some things with it you can't do with relational databases. This is at the risk of using something entirely different (different query language, different administration, what happened to SQL, etc)

    If relational is the way you need to go, I use SQL Server and it can handle replication as well as spanned filestores. Outside of that, PostgreSQL is my next favorite. Its fast, object-oriented (kinda) and is easy to setup. I just wish they would come out with an easier installation and interface so I can start using it more!

    Anyway, Postgres sounds most promising. I like it better than MySQL and it is more enterprise-ready.

  33. Re:eRserver, more info. by ron_ivi · · Score: 3, Interesting
    .org and .info are both using it.

    The press release of ER Server becoming open source is quite informative (karma?) as well.

    Marc of PostgreSQL Inc's an incredible resource on the postgresql mailinglists too; and PostgreSQL Inc has a really cool policy that allowed them to do donate their code to the community that way:

    From their release: " "DATELINE FRIDAY, DECEMBER 15, 2000 Open Source vs. Proprietary: We advocate Open Source, BSD style :) We will consider and develop short term (up to 24 month) proprietary applications and solutions where there is a strong business and intellectual property case to be made. *All" proprietary developments that we are involved in *will* become open source within two years of implementation, without exception." ".

    Also cool, they provide hosting http://www.pgsql.com/hosting/ which donates "25% of all profit from these services ... directly back into the PostgreSQL Project. "

    Ron

    I'm not affiliated with them in any way, just appreciative of Marc's contributions on the mailingslists and to postgresql as well.

  34. How much reliability do you *need*? by crmartin · · Score: 1

    It's always hard to get this across to my clients, but you need first of all to answer the question "how much reliability/availability do you need?"

    Think of it this way: if what you need is no better than 99 percent, you need to be able to fail over fast enough to only have 864 minutes of downtime a week. Of course, that's about 14 minutes, so you can practically handle it by doing a hand cutover.

    On the other hand, if you need availability of 99.999 percent ("five nines") you can only afford to have about 40 seconds a week total downtime.

    If you need true five-nines, you need to look at some of those nasty commerical apps. If you can slack off from there, the PostGreSQL and nySQL replication schemes work just fine, and you can use DNS remapping to do the failover. (I've done it with mySQL, it worked; think about needing a heartbeat to detect if the master server has gone down.)

  35. If you are developing a new app, by Anonymous Coward · · Score: 1, Insightful

    check out prevayler

    Seems like an excellent alternative to the traditional database route (though I myself have not yet used it in an application)

    Here is a developerWorks article about Object Prevalence (of which prevayler is an implementation): An introduction to object prevalence

  36. Agreed. by oneiros27 · · Score: 4, Interesting

    Availability is one of the basic issues when sizing your system. [ie, can you have it down at night for a cold backup, or does it have to be available 24x7? Can you even get a maintenance window once a month?]

    As with sizing your UPS and/or generators, you need to determine what the cost to your business is for downtime.

    Now, yes, you might have some issues in SLAs that spell out how much it'll cost you, if you have to refund customers's money [for service based orgs]-- or how much profit you'd lose if your customers couldn't purchase items [for sales based orgs]. But unfortunately, you have to also consider the recovery costs, the costs of damage to your reputation, etc.

    If it's not worth your purchasing an Oracle or other, more expensive database, there's good odds that it's not worth the headaches of maintaining a high availability cluster with automatic failover. Instead, you can mirror the data, and keep transaction logs that you can replay.

    You can have a spare system on standby, that you can keep updated on a regular basis (again, your cost of downtime, and the necessary time to recover the system will affect your choices), and when your main system should fail, you can push the most recent diffs to your standby, reconfigure the application servers to recognize the new server as the old one, and you're back in business.

    It requires a bit of planning, and making sure that the necessary manual steps are well documented [so that anyone can do it, should the server outage be caused by something serious enough to take out your administrator, too], but it's easier and cheaper to build and maintain than a true cluster.

    --
    Build it, and they will come^Hplain.
  37. This answer is infront of you by snero3 · · Score: 1

    Why don't you ask the guys from slashdot(creator/s). I feel pretty sure that they are not running a single intance of MySql to support slashdot.org.

    Here is a good one to start with CmdrTaco They might take a while to get back to you but I am sure it will be worth your while.

    --
    It said "windows 98 or better" so I installed Linux
  38. Clusters... by Anonymous Coward · · Score: 0

    ...When you say "database clusters" I imagine a herd of mares. It's getting sick. But it's too nice to spoil it with some psychiatrist...

  39. Re:What the author really wants by mcpkaaos · · Score: 2, Funny

    I get the feeling that in a room with more than one door, it takes you all day to find your way out. :)

    --
    It goes from God, to Jerry, to me.
  40. Need to define the problem better by koreth · · Score: 4, Interesting
    Why do you want clustering? Do you need to scale up transactions per second? If so, are these primarily reads or writes? The answer to that question can make a huge difference in your clustering and replication strategy.

    Clustering read-mostly data for performance reasons is relatively easy; for many applications, where a second or two of staleness on the replicated databases is acceptable, you can make do with a bunch of independent copies of the database, with all updates going to an authoritative database and getting replicated out from there asynchronously.

    If your data can be partitioned cleanly -- that is, if you have groups of tables that are never joined with tables in other groups -- then you can perhaps get some benefit from putting different data on different servers, with no replication required. Obviously that's only worthwhile if the query load is comparable between groups.

    If, on the other hand, you require ACID-compliant updates of all the replicants as a unit, you're entering difficult territory and you might have no choice but to go with a commercial solution depending on the specifics of your needs.

    At just about all of the places where I've done database programming where this has come up, we ended up buying a much beefier database server (lots of processors and memory, good I/O bandwidth, redundant networking and power supplies) with disk mirroring, rather than get into the headaches of replication. A big Sun or HP server is certainly more expensive than some mid-range Dell or no-name PC, but it may end up being cheaper than the engineering time you'd spend getting anything nearly as robust and high-performance on less expensive hardware.

    I've also found that very often when there's a database bottleneck that looks like it requires bigger hardware, the problem is the data model or the queries (unnecessary joins, no indexes where they're needed, poorly-thought-out normalization, etc.) or the physical layout of the data (indexes competing with data for access to the same disk, fragmentation in indexes/data, frequently-used tables spaced far apart on disk.)

    If I'm dealing with Oracle, sometimes the solution is as simple as adding an optimizer hint to make the query do its joins in a sensible way. Sometimes denormalization is helpful, though you want to be careful with that. Sometimes a small amount of data caching in the application can mean a tremendous decrease in database load. And so on.

    If you can tell us more about the specifics of your situation, there are lots of people here who can offer more specific advice.

  41. DB2 ICE sets TPC-H performance standard on Linux by Anonymous Coward · · Score: 2, Informative

    Don't know how DB2 ICE would do compared to Open Source soloutions but take a look at the interesting results of the recent TPC-H benchmark performance testing on Clustered and non-Clustered 100GB and 300GB configurations. It appears that the IBM DB2 Integrated Cluster Environment (DB2 ICE) for Linux is heads above the rest.

  42. Amazing -you are wrong about almost EVERTYTHING by Glasswire · · Score: 1

    MS SQL requires x86 hardware - No Sparc, No POWER, No MIPS. Just crappy x86.
    Crappy x86 ...Intel Itanium 2 specifically is the leading TPC-C platform (both 1st (HP-UX/Oracle) and 2nd place (with the dreaded MS-SQL 64-bit version) and Intel cpus taking 8 of the top 10)

    There is no 64 bit version of MS SQL.
    Wrong again. See #2 above

    And if your *REALLY* need to scale PostgreSQL - run is on a SUN/SGI/IBM.
    Wel, aside from the TPC-C Top 10, it's interesting to note that if you trust SGI, their whole next generation is based on Itanium 2 and IBM even sells Itanium systems. Sun, well we they're in their death throes...

    Not a bunch of fucking Intel toys.
    Looks like pretty powerful toys. Perhaps you should play with some so you'll know what you're talking about.

    1. Re:Amazing -you are wrong about almost EVERTYTHING by zulux · · Score: 1

      Crappy x86 ...Intel Itanium 2 specifically is the leading TPC-C platform

      Itaniam is not x86.

      AFAIK The HP Superdomes run multiple images. That's great for TPC-C performace, not so go for everything else.


      There is no 64 bit version of MS SQL.


      Oh... great... it looks like it made it out of Alpha rather quickly.

      Hope you like being a beta tester. Enjoy.

      --

      Moneyed corporations, non-working 'poor' and criminal prisoners are turning productive citizens into tax-slaves.

  43. My baling wire story by Anonymous Coward · · Score: 0

    Once I was home visiting from University, hanging out drinking with a couple friends on a weeknight. They were town boys, whereas I had grown up in the country. I somehow talked them into the idea that it would be cool to go check out the old house where I had grown up.

    It was a crisp clear early fall night. We parked a half mile down the road at the intersection of the concession roads, and walked. At this corner there was a tiny graveyard -- 4 stones -- and I set the mood by telling about how it was a family whose house burned down with them in it and how it was said you could see them some nights, running down the road in ghostly flames.

    We snuck into the pine woods behind the house where I grew up and I showed them my childhood haunts in the moonlight. Rabbits started drumming, and my friends were freaked out. It sounds pretty spooky when you don't know what it is. We wandered over into the hayfields in what had been the old family farm. They'd been cut a month or so earlier, and we walked along the crick telling ghost stories.

    One of my friends started telling the campfire classic about the man with the hook hand and just when he got to the punchline he started jumping and going "Ah! Ah! Ah!"

    We got him calmed down, and figured out what had happened: right at the punchline of the hook man story he'd stepped on an old piece of baling wire, and it had slid up into his pant leg!

  44. Re:DB2 ICE sets TPC-H performance standard on Linu by dougnaka · · Score: 2, Interesting
    I run DB2 on Linux.
    It's been the largest pain in the ass I've ever had managing servers.
    MySQL spanks DB2, as does postgreSQL.
    Our DB2 on Linux crashed so much we spent months before we had a production ready system. We were replacing PostgreSQL and we had to rethink everything. It couldn't handle our insert load, and we were going from 4 dual 733 intel boxes to two large quad xeon boxes with 15,000 rpm disks.
    We spent $100,000 on DB2 license (that with the discounted half price DB2 EEE for linux). We are now in the process of migrating to MySQL after some large benchmarks. With a few simple indexes MySQL inserts twice as fast as DB2 and selects in 0.00 seconds on any row, vs. DB2's .460 seconds for any row in a 22 million row table.
    Throw in the support scam they pulled on us, and IBM is a joke of a company. If they weren't pushing Linux they'd annoy me more than Microsoft does. The support scam went like this. We purchased 8 CPU licenses for DB2 EEE In 8/02. In 3/03 we start recieving calls from salesmen to get our upgrade business since our 1 year support contract expires on 5/1/03. I call IBM with a serious chip on my shoulder and get the story that our anniversary date automatically defaults to any dates held by previous contracts, "it's easier that way". We had some AS/400's (talk about poor performing overpriced junk). So they wanted about $50K for "support" for another year. We declined their offer and considered suing. At $50k/year losing 4 months of support isn't acceptable to a small business.
    So I am bitter at IBM. But not without reason. During our first 3 painful months deploying DB2 I opened 15 PMR trouble tickets. Of the 15 I resolved 14 while either on hold or waiting for a call back from them. ALL of the PMR's were opened with status "critical, production down". The last PMR IBM claimed to either be a bug in the Linux kernel or in DB2, they didn't know, but when I pressed, they did offer a patched version that we could "try out" on our production box to see if it worked. Throw in that clustering didn't work as advertised (not at all under moderate load), and DB2 is a pile of junk.

    As the IT geek the fault landed squarely at my feet, so I did some thorough investigation and benchmarking. default config DB2 is considerably faster than both PostgreSQL and MySQL at everything but inserts. But throw in a few indexes and MySQL and PostgreSQL owns DB2's sorry excuse for a database.

    I AM bitter, and this probably is flamebait. But I'm past caring about IBM and their scam operation. I'm sticking with what works, and so far NOTHING from IBM has worked.

    I wasted 3 months of 7 day work weeks averaging 12 hour days on DB2 and it's so called Linux support.

    end

    --
    My Linux Command of the Day site : LCOD
  45. IBM's DB2 on OS/390 is pretty good :) by Anonymous Coward · · Score: 0

    I know it's probably irrelevant ... but I think IBM's DB2 on OS/390 rocks. You can have a couple of DB2 members on separate LPARs, which should reside on separate mainframes for a high availability (HA) setup.

    1. Re:IBM's DB2 on OS/390 is pretty good :) by Anonymous Coward · · Score: 1, Interesting

      We run a four-way data sharing cluster using DB2. Works very well us. It doesn't do peer recovery - one a instance fails, the survivours process the rollbacks and release locks. As far as I know only IMS does peer recovery.

      Of course this has very little to do with OSS.

  46. Off topic, but it's not. Why PDF and not HTML? by jerryasher · · Score: 2, Offtopic

    PDF blows.

    I hate PDF links. On Windows the experience is great, let's come to a complete halt as I watch CPU load hit 100%, wait for a splash screen, and watch the damned thing decide to show me the text at 245% zoom.

    What a load of shit.

    What's wrong with HTML as a virus free, pleasant to experience, documentation format?

    Just say no to PDF.

  47. Glasswire is an idiot by Anonymous Coward · · Score: 1

    Just a little bit of poking around the TPC website -
    look at the applications they used for testing; M$SQL server/DB2/Oracle9i. This thread is about OSS databases. Did TPC test with Postgres/MySQL? No. Why? Because IBM/Oracle/M$ are TPC partners. Gee, why do you think we don't see OSS products running circles around these bloated corp. products on these
    alpha-stage systems running Itanium???
    BECAUSE THIS SUPPOSED "TEST" ORGANIZATION IS FUNDED BY THESE COMPANIES you moron.
    Oh, and M$ SQLserver being 64-bit makes me laugh. Who is running, really running, SQLserver in 64-bit mode on an M$ wanna-be 64-bit mode OS on 64-bit hardware?
    Sun/SGI/IBM all have *REAL* 64-bit OSes running on
    *REAL* 64-bit hardware.
    Don't bring your M$-blind corp. drivel here.

    Your first clue about TPC is that they run their site on Assh*le Server Pages.....

  48. Right tool for the job? by kpharmer · · Score: 2, Interesting

    > The right tool for the job people

    Right, and a myoptic application of the above advice would lead to a dozen different database products in a typical department. They'd all be the right tool for some job - unless you're hoping to reuse skills, reuse backup solutions (TRM for DB2, Veritas for Oracle, etc), have any hope of reliable integration, etc.

    So, yeah - get the right tool for the job. But before you right that out you need to take a big step back and get a sense of what your strategic direction is, and what are all the implications of such a decision.

    I know a lot of folks converting mysql to other solutions right now - because some junior guy figured it was the best solution. It might have been for the app - but it wasn't for the department. Which is like winning a battle but loosing the war.

    1. Re:Right tool for the job? by gfody · · Score: 1

      heh.. 23. Scalability "do we want scalability?"
      *scratches all the free sql servers off the list*

      --

      bite my glorious golden ass.
  49. Re:DB2 ICE sets TPC-H performance standard on Linu by kpharmer · · Score: 2, Informative

    Ouch, sounds like you should have gotten an experienced dba to set it up for you. DB2's too complex to go with simple defaults, and clustering is definitely a high-skills endeaver.

    As far as insert loads go, we've seen 500 rows / second on five year old hardware without any problems. Although that's far short of what DB2 is capable of, it's fine for a sustained load. Beyond that batch loads hit 15,000 rows per second easily on the same box.

    And as far as pricing goes, today you could get DB2 Express for those little dual-cpu boxes for just $500. A really fast four-way will cost you $32,000 - still way shy of $100k. You don't need to hit that kind of pricing unless you're doing inter-partition parallelism. And as I mentioned above - that's just not worth doing unless you've got the right skills to pull it off.

  50. Here kiddie, kiddie by C10H14N2 · · Score: 2, Insightful
    Great, we had to go there. "Kiddie" and then resort to recommending MS SQL, which has proven itself to be, shall we say, not the most "reliable" on the planet and elaboration shouldn't be necessary in these parts. Oddly enough, although MS SQL was "licensed" (a-la CPM, sigh) from Sybase, people often simultaneously disparage Sybase and praise MS SQL, despite the fact the Sybase licenses are roughly half of MS SQL. So we're left, essentially, with Oracle, which the vast majority of businesses will find not the least bit cost effective, for that matter even necessary, certainly not at twice to four times that of the next commercial competition. No wonder people look for open source solutions.


    At any rate, many "big kids" are using the most unfairly bullied product, slandered most likely because it is a software boy-named-sue, MySQL. Why not have a read before taking childish pot-shots:


    http://www.mysql.com/press/MySQL_userlist.pdf

    In the end this silly "I'm a big boy because I use oracle and your a little gurly kiddie because you don't" bullshit is just empty bravado. Businesses generally attempt to find the most cost effective means to meet a need and often Oracle ends up being like buying a stealth fighter to deliver a pizza. It often just doesn't make sense even for a big kid with billions of dollars, which might be why the $30B+ multinational BASF uses PostgreSQL.


    Frankly, after the named-user license Oracle sold the State of California, no matter how idiotic the clearly comatose contract negotiators were, one would be remiss to not consider other companies with slightly less egregious behavior on record.

    1. Re:Here kiddie, kiddie by AKAImBatman · · Score: 1

      In the end this silly "I'm a big boy because I use oracle and your a little gurly kiddie because you don't" bullshit is just empty bravado. Businesses generally attempt to find the most cost effective means to meet a need and often Oracle ends up being like buying a stealth fighter to deliver a pizza.

      That's why REAL businesses drop the bullshit and use Pervasive (BTrieve). The best part about it is that you can mention BTrieve to just about any IT manager and he'll start waxing on about how great it was "back in his day". Guaranteed sell in just about any company. :-D

    2. Re:Here kiddie, kiddie by LurkerXXX · · Score: 1
      The problem here is you are talking about the need for clustering. That's generally a higher end function. Yes, you don't buy a stealth fighter to deliver a pizza, but most pizza shops don't keep a fleet of spare delivery cars in the back in case their regular car breaks down.

      Listen, I like opensource. I use opensource a lot of places. I use MySQL for lots of little blogs and little things. But when working for clients where serious money is on the line, I use Oracle or SQL server.

      Oh, and you have to take that userlist at MySQL with a grain of salt. Google does use MySQL for a few little jobs, but they sure as heck aren't running their big search engine on it. They use it for little, non-critical stuff. Like reasonable folks do. That doesn't make it a powerhouse database, sorry.

      Oh, BTW. BASF uses Oracle as well. Imagine, a big ol company like that using multiple databases? Maybe they use the high end one for the critical stuff? Nah.

  51. Cluster for MySQL Described by Mine_Field · · Score: 4, Informative

    Here is a description of a Cluster created on MySQL with Linux boxes - similar to Google. http://www.dwreview.com/Product_Reviews/Review_Dat a.html and http://www.dwreview.com/Data_mining/Intelligent_Da taMining.html

  52. OUCH anything cheaper by Anonymous Coward · · Score: 0

    I want to experiment!

  53. Prove this guy wrong and make millions by Anonymous Coward · · Score: 0

    If you can avoid the Oracle and SQL fees, go ahead ... show us, I want to save money.

    Slashdot is full of people who are quick to savage anyone who knocks OSS, yet rarely provide decent examples proving their argument!

    I want an OSS database cluster, now put up, or shut up!

    1. Re:Prove this guy wrong and make millions by ThisIsFred · · Score: 1

      Read the rest of the threads, you find examples there, as well as some good recommendations. When someone asks "what is a good OSS database clustering solution," and you feel compelled to answer, "MS SQL," don't bother answering.

      That was my purpose for posting the original flame.

      --
      Fred

      "A fool and his freedom are soon parted"
      -RMS
  54. I've used MySQL and PHP on a reasonably big site. by Anonymous Coward · · Score: 4, Informative

    I maintain a site that does a fair bit of traffic (Daily avgs: files served = 1.8 Million, bandwidth = 20 Gigs)

    We have 1 "master" MySQL server which gets all updates and inserts, etc. We have 2 "slave" servers which each take a signifigant portion of the select queries. All machines run the same 4.0.x version of MySQL. (Web access is PHP on Apache) All machines are dual x86s packed with RAM.

    Setting up replication is pretty easy. And for the most part things are pretty nice. The load average drops a lot on each machine when we add a new slave. (Oh don't forget to enable query caching.)

    We have had some problems though. Because the site gets so much traffic sometimes queries take a while to run and to propagate to the slave servers. This means if you update your data (via the master) and then do a select from one of the slaves your change may not show up yet. For most web apps this might not seem really big.

    But it leads to the web users changing things and not seeing the results right away. So they figure the site is "broken" and they repeat what they just did only to have it take place twice. If you have your browser "refresh" the page first usually the data has come through but many people don't do this. The result is they don't feel their account has been credited or something. These kinds of bugs are hard to track down too.

    I wrote a program to check repetatly (sleeping from 1/4 to 1/25 of second in between) and the slaves were almost always in perfect sync with the master. (as per MySQL's binary log position indicator). That was really impressive however there are times when the servers are under load that the slaves will be out of sync for 30 to 60+ seconds! (Measuring in the tens of thousands of byte offest differences in the binary log position.)

    The solution we've been using is that any time there is an update to the database and the imediate page seen next by the user relies on the changed data we do the selects from the master server. This seems to work for now but I'm not sure how long we will be able to scale this way.

    In summary so long as the laod on the machines stays around 1.0 or lower everything runs pretty smooth. If the loads hit 3 to 5 or higher then people notice (or rather mention) that things seem odd. (By the way those are linux load averages which IIRC is different than under Solaris.)

    What I would like to see is a virtual server type system where one machine accepts all queries and hands them out to a set of replicating servers without requiring the application to know about it. This is nice for developing applications but the real reason is the master can then prevent the syncing issues discussed above.

    SF

  55. What's the application? by BanjoBob · · Score: 2, Informative

    Enterprise database solutions are quite varied. Is it a data warehouse or something financial or ???

    You pick the right tool for the job. I've seen massive databases on Sun Enterprise E-6500s and Oracle do a LOT if the database is properly configured. But one structure doesn't work for all applications. Do you use stored procedures? Do you index? Do you require triple replication to reindex one system have a backup and a live production system? Do you need remote fail-safe operation? These types of questions need to be answered before you settle into one solution.

    --
    Banjo - The more I know about Windoze, the more I love *nix
  56. mysql is not free. by Anonymous Coward · · Score: 1, Informative

    unless your software is gpl.

  57. Re:DB2 ICE sets TPC-H performance standard on Linu by dougnaka · · Score: 2, Informative

    Our fast four way was under $32k, I threw in the price of the x300 storage array we bought with it and the 4 CPU licenses for DB2 EEE 7.2 we bought for $11.5K each (half off the SRP).
    Our db2 does over 15,000 rows/second in BATCH mode. It was a sad day when we had to log our transactions to text files for batch processing.
    We did end up hiring a good DBA to help us with our DB2. It's worth noting that we didn't have this extra cost or need with our PostgreSQL setup.
    I'm curious about anothers experience with DB2 on Linux, as I assume you're running on. Tell me, what versions do you run? What kernel? What kind of reliabity do you get?
    We initially ran DB2 on a Redhat 7.3 setup with a severely modified kernel, 2.4.15 I think it was. We went to RHAS 2.1 with the RedHat kernel after so many stability problems in DB2. The new version didn't fix the problems, it only threw in a slew of new problems relating to the hardware. Our new setup is Gentoo 1.4 with a 2.4.21 kernel. It runs much faster, sees all HT enabled processors and throws no APIC errors, and hasn't crashed.
    So, what are your experiences with DB2 on Linux. If you're not runing on Linux, what are you running on?

    --
    My Linux Command of the Day site : LCOD
  58. The only hard thing is shared storage by dido · · Score: 1

    In my experience the hardest part about setting Oracle 9i RAC (or even Oracle 8 parallel server) on Linux is the Fibre Channel arbitrated loop system that generally accompanies it. I've performed such a setup both on IBM FastT-type and Compaq/HP StorageWorks hardware, with both QLogic and Emulex HBA cards, and that's where all the dancing around happens. Driver support on Linux (especially for HP) is spotty, and not exactly straightforward to come by. However, once the Fibre setup works perfectly, everything else follows quickly enough.

    And these things are godawful expensive as well. The fabric switch alone usually costs more than the entire cluster of computers which are connected to it (and these are by no means cheap machines... they're IBM xSeries or Compaq/HP DL-type enterprise servers), as does the actual disk array, even if it's as small as 200 GB. The entire shared storage setup can come up as the largest single cost in an Oracle RAC or other proprietary enterprise database installation that makes use of a similar system.

    However, I imagine that none of these open source database clustering solutions can hold a candle to RAC in terms of performance. They all seem to make use of replication, which means that they will be far, far cheaper than a shared storage setup, but that is where a fibre channel disk array is difficult to beat. Fibre Channel has data transfer rates measured in GB/sec... RAC has a dedicated cluster filesystem or makes use of raw devices, with its own distributed lock manager that prevents the machines in the cluster from simultaneously writing to the storage and corrupting it. As I imagine, such a system is difficult to write and test in an open source context, as it again requires any of the horrendously expensive enterprise shared storage setups on the market (even multi-tailed SCSI disk arrays, while cheaper than fibre, are still quite pricey nevertheless).

    --
    Qu'on me donne six lignes écrites de la main du plus honnête homme, j'y trouverai de quoi le faire pendre.
    1. Re:The only hard thing is shared storage by brokenin2 · · Score: 1
      As I imagine, such a system is difficult to write and test in an open source context, as it again requires any of the horrendously expensive enterprise shared storage setups on the market.


      Or you could wait a month or so, and just use the new version of DRBD that enables a Primary/Primary configuration. It may not be what you'd want to use to run your cluster, but it's a software solution for development purposes.



      A lot of the locking issues have probably been dealt with by projects like GFS/OpenGFS already. It's basically a shared storage filesystem, but the differences between a filesystem and a database are sometimes quite negligible. I've never used it, so I can't say how well it works, but it is probably a good starting point. The primary/primary configuration of DRBD is being worked on specifically to support things like OpenGFS.

  59. Re:Off topic, but it's not. Why PDF and not HTML? by nmaeone · · Score: 1

    While I'm a huge advocate of virus free and pleasant experiences or alternatives and although my argument may not be all that cogent, how would you propose another alternative to those VARs out there that want to utilize the wonder of e-mail to send PDFs such as quotes to clientele? Without allowing the ability for them to be modified? Printed for whatever reason? Password Protected (without webserver htaccess ability)? I know and am fully aware that any *nix based platform, l33+ h/\x0r w/\R3Z, or a simple screenshot could theoretically bypass such security, but your average quote or proposal recipient probably isn't technical enough to do so. In contrast, with unsecured documents, they surely could click and type away! PDFs have a place, and although I mostly agree with your argument, [X]HT[X]ML (IMO) isn't the answer to all scenarios. Best tool for the best job and/or circumstances I always say.

    -A

  60. load balanced databases by Anonymous Coward · · Score: 0

    maybe not totally cost free as we spent a couple days writing some code to build a solution using erlang/otp and postgress that does the job fine -- has been running for almost a year non-stop. also a collegaue used open source tool called spread to do the same thing for free too and he told me it was really easy to set up.

  61. Re:Off topic, but it's not. Why PDF and not HTML? by shibashaba · · Score: 2, Interesting

    You can't embed fonts and images inside of html documents.

    On linux(with my blazingly fast duron 650) a 500 page pdf I made with OpenOffice takes a few seconds to load in konqueror. I had downloaded all the indiv. web pages that made up the book(wasn't avail as one file), used cat to put them together and then waited 20 minutes for open office to load it. Mozilla took about the same time to load the same file, and konqueror was a little bit less than ten minutes. God knows how long IE would have taken, if it would have loaded at all. While were getting off topic here, Word2000 would only bring up the first web page becuase and ignored all the rest in the file. God only knows what IE would have done.

    Basically, for big files PDF is the only option as far as I'm concerned. I am sorry that Microsoft and the creaters of pdf can't provide you with a decent computing experience for such basic tasks. There's only $50 billion dollars and decades of experience between the two companies, these poor guys are doing all they can.

    --
    ---------- Open Source is capitalism applied to IP.
  62. Complete Irony... by dillpick6 · · Score: 1

    This is from the unixtips.org pannel on the main page: --------- Warning: mysql_connect(): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (61) in /usr/home/http/portico/random/ascii.php3 on line 7 Warning: mysql_select_db(): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (61) in /usr/home/http/portico/random/ascii.php3 on line 8 Warning: mysql_select_db(): A link to the server could not be established in /usr/home/http/portico/random/ascii.php3 on line 8 Ack! Please notify schvin@schvin.net that there is a serious problem on portico. Thanks. ------------ Does anyone else find this ironic? :/ I am reluctant to say now, but I am a MySQL advocate...

    1. Re:Complete Irony... by techwolf · · Score: 1

      So the local MySQL db isn't running. This doesn't, IMO, indicate any lack of ability for the product to cluster, replicate or even function as a commercial DB.

      It sounds more like unixtips.org doesn't have HA. Don't blame the product for being poorly implemented. :)

      --
      I don't do this for karma, I do it for cash. It's much better.
    2. Re:Complete Irony... by dillpick6 · · Score: 1

      I was aware it wasnt running and that it had no relevance to this exact article, but I just found it to be a funny coincidance. Sorry to bother you...

  63. ACID Compliance in MySQL? You jest? by tjstork · · Score: 1



    Can MySQL as of yet do this:

    INSERT INTO TABLE XYZ
    DELETE FROM TABLE QRT
    ROLLBACK

    and keep the state of XYZ and QRT consistent with respect to other tables?

    I think I could make the argument that if you are running a database that never crashes on top of an operating system that never fails, you can be innately ACID if you do your middle tier correctly... but THAT is for another time.

    --
    This is my sig.
  64. Plenty of poorly managed SQL Server installations. by tjstork · · Score: 2, Interesting


    One of the telltale signs of a SQL Server installation is the frequent "deadlock" messages. I would say that if you are going to complain about transaction handling in MySQL, even the standard version that doesn't have it, you should probably complain about the transaction handling in SQL Server. If it deadlocks, and does not deadlock avoid, then it ain't an enterprise solution.

    --
    This is my sig.
  65. SAPDB seems right by christooley · · Score: 2, Informative

    According to the FAQ it supports clusters/high availability of several types (towards the bottom), has Oracle 7 compatability, and has the option to upgrade to commercial support (something available for Postgres, MySQL and most others as well). It's got an install base of users used to large environments and has been reasonably proven in the field. Just a thought.

  66. Yes, SAPDB and MySQL are both free by jtheory · · Score: 1

    SAPDB (was made open source by SAP a year or 2 ago) definitely bears some research, but I haven't used it myself so I can't comment (anyone else?).

    I can give you some quick info, though (partly to respond to a sibling post claiming that MySQL and SAP both cost money if your software isn't GPL).

    From sapdb.org (various pages):
    -- From Q4 2003, SAP DB will be rebranded as "MaxDB" and offered as a MySQL AB product.
    -- SAP DB can be used free of charge in non-SAP environments.

    And using MySQL *is* free as long as you don't distribute it and/or sell an application that requires it. From the MySQL licensing page:
    -- If you use MySQL in conjunction with a webserver and develop the needed tools/applications by yourself, or you use applications licensed under the GPL or compatible OSI licenses, then you do not have to pay for a MySQL license. This is true even if you run the system on top of a commercial web server.
    -- Internet Service Providers (ISPs) often host MySQL servers for their customers. With the GPL license this does not require a license.

    Note: obviously, do your own research for your particular situation... PLUS it may well be worth it to pay the license fees (only a few hundred bucks per server) to get support.

    --
    There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
  67. Re:Off topic, but it's not. Why PDF and not HTML? by jerryasher · · Score: 1

    Right, well I didn't say they weren't the right tool for the job right now.

    I said the user experience, the disruptive splash screen, the disruptive way the machine grinds to a halt, the uncanny mechanism it has for displaying everything at 180% or larger, its stupid non-understanding of the page up/page down keys blows great big donkey dick.

    It's a huge slow stinking pile of shit. That's acrobat. Huge. Stinking. Slow. Pile of Shit.

    Okay Adobe? Googling up acrobat sucks? That's your program. It's shit. Fuck you Adobe for violating your users. Rot in hell.

    Hell, I'd rather just be sent tiffs than get a pdf. But sending them in emails is different than placing them at the end of a URL. My expectation for URLs is that things that are documents can be opened silently in the background in a new tab and are not bug the shit out of me ware. PDFs ARE bug the shit out of me ware, and I resent that.

    So it's just a warning and a suggestion. I publish my resume in text, html, word, and pdf. So folks can pick what they want. I don't think it's too difficult anymore to publish in multiple formats. If you want me to read what you write, respect me and don't make it hard for me.

  68. Java's turn... by jerryasher · · Score: 2, Funny

    Here's another experience I love.

    I'm in mozilla, and I middle click a link. Somewhere in the background a page starts loading in a new tab. No problem. I will continue reading. Ah, background loading into new tabs.

    Then.

    The.machine.comes.to.a.halt.

    It.takes.me.awhile.to.realize.this.

    but.

    I.can.not.scroll.I.can.do.nothing.

    And I know what's happened. Some moron has a java applet displaying something wonderfully important like the fucking time in their little corner of hell, and if I wait about thirty more seconds, I'm going to hear a little pop, and I just know the sound of that pop is like the sound of a dick popping out of my anus, cause I know that java has just ass raped me, my browser, and my machine.

    Pop! Your clock is now ready sir!

    O U C H ! ! ! ! Rapist.

    1. Re:Java's turn... by axxackall · · Score: 1
      I.can.not.scroll.I.can.do.nothing.

      And I know what's happened. Some moron has a java applet displaying something wonderfully important like the fucking time in their little corner of hell, and if I wait about thirty more seconds, I'm going to hear a little pop, and I just know the sound of that pop is like the sound of a dick popping out of my anus, cause I know that java has just ass raped me, my browser, and my machine.

      Keep your paints off and someone will find the way into your ass. That's the Windows way.

      I don't have such experience. In my Mozilla all javascripts, java applets and plugins are disabled by default (until I decide to enable when I really want it). Any content, which is not HTML (with images), gives me more hassle then it brings any useful information. 90% of information I need from the web is available in a plain HTML format. The rest is in pdf, ps, sxw and doc formats which I still read when I decide so - after mozilla downloads it to the disk and opens in an external application, like xpdf.

      Keep your paints on and your ass will be safe. That's the Linux way.

      --

      Less is more !
  69. MySql Clusters work great!!! by texasrocket · · Score: 3, Informative

    I have personally installed, setup and maintained a 5 (3 slaves, 1 master/slave, 1 slave/master) node cluster using Heartbeat and MySql replication. It works great!! My guess is that 80% of MY Mysql usage is content and needs READ-ONLY access. So I have 3 slaves that are used in a Read-Only cluster. The master is one of 2 other machines and ALL WRITES go to it. In the event of a MASTER db going down, the remaining slave promotes itself and updates the other slaves to point to itself. Been working great fo 8 months!!!

  70. Re:I've used MySQL and PHP on a reasonably big sit by Anonymous Coward · · Score: 1, Insightful

    Are your slaves using InnoDB or MyISAM? If the latter have you checked if the 30+ second delays are caused by the replication thread begin blocked by a read lock? The course grain (table level) locking of MyISAM make its impossible to get consistent behavior under load in a read/write environment (even if the only writer is the replication thread). Use InnoDB instead, it is amazing.

  71. You need a license only if: by infernalC · · Score: 2, Informative

    You need a license only if you choose to distribute the software. If this is an in-house application, simply obtain copies of MySQL Standard/Max (GPL) directly from MySQL mirrors for each server. Since you do not perform the distribution, you are in good shape (see MySQL License Policy - Licensing -2).

    However, the folks at MySQL AB are very decent folks who offer great support and warranty for their product and who have to feed their families, and licenses are cheap. IMHO, buy at least one license for a master and one for a slave. That way you get support for the program in each role.

    1. Re:You need a license only if: by toddhunter · · Score: 2

      the problem I have found with mySQL is the 'internal' distribution clause in their license. This is different to standard the standard concept of distributing it externally.
      Also from their examples page
      http://www.mysql.com/products/licensing-examples .html
      If you are selling a non-GPL application that requires MySQL and works with a webserver, then you need to provide commercial MySQL license(s) to your customers.
      Now this is all well and good, the mySQL guys should earn good money for what they do. But if you do contact them, all you will get is a response saying 'well, you better buy a commercial license just in case'. I don't like that. If they are going to charge for something then do it and make it clear what they are trying to do. IMHO they are using doubt over the GPL to sell licenses

  72. Well you sort of can! by codepunk · · Score: 3, Interesting

    I run two types of clusters, one of them is a RAC 9i on Linux. Nothing and I mean nothing has the functionality of RAC 9i. You can put a bullet through one of the nodes right in the middle of a query being returned and still get your records just like nothing ever happened. The other database I run is a postgresql on redhat advanced server and the database files are sym linked into the san (this is high availability only) . If I had to do it again I would not use postgresql because it scales for shit and I cannot under any circumstances keep it up in a 24/7 configuration. The database needs to have vaccuum run on it once a day and I have to do that manually because half the time it fails. Running a vaccum on the database while clients are connected basically locks everyone tight until it is finished.

    If you cannot spend any money and wish a fast, scalable and higly available system my advice is first sapdb and or mysql and advanced server on some sort of shared scsi.

    Now all of you big postgresql advocates flame away but it does not change the facts. I love the database but if you need heavy lifting it just does not cut the mustard.

    --


    Got Code?
    1. Re:Well you sort of can! by Anonymous Coward · · Score: 1, Interesting

      There is a nice sample in the LVM HOWTO on how to make a snapshot of a certain state of the filesysystem. This is what I use to back up databases that need to be running. Might be an option for you? Lvm rocks.

    2. Re:Well you sort of can! by Sxooter · · Score: 2, Interesting

      Just wondering, but are you on an older flavor of Postgresql? Most of the issues you mention (i.e. vacuum slowing things down) have been fixed for quite some time.

      Also, if you haven't bothered to tune your postgresql.conf file on an older install, it will run for shit. I.e. the default settings are for a small workgroup type setup, not enterprise class stuff.

      Keep in mind, Afilias runs the .org and .info tlds on postgresql, so it can't be impossible to get 24/7 operation out of, or the .org domain would be offline several times a day.

      --

      --- It is not the things we do which we regret the most, but the things which we don't do.
  73. In databases, you get what you pay for by Anonymous Coward · · Score: 1, Interesting

    I live in a mixed environment of a couple of commercial databases and a couple of popular no-cost ones.

    The bad news is, you get what you pay for.

    You'll have open source aficionados telling you how [insert brand] free db is k00l. Sadly, in reality, the few most popular such packages are simply way behind commercial systems. Some commercial systems are also way behind other commercial systems... (I mean, Windows as a db server platform? Get real & get sober, dude.)

    I wish I could recommend a free or at least cheap db server. But if you care about your data so much that you are seriously going into replicated systems, the couple of most popular free packages at least aren't there yet even in basic ACID reliability.

    I wish I could tell you different, and I wish the company where I work wouldn't have to pay the megabuck-class "maintenance fees" for our commercial dbs.

    1. Re:In databases, you get what you pay for by La+Camiseta · · Score: 2, Informative

      But if you care about your data so much that you are seriously going into replicated systems, the couple of most popular free packages at least aren't there yet even in basic ACID reliability.

      What are you talking about? PostgreSQL has supported ACID reliability for years.

      Plus, PostgreSQL also now supports replication, the same as the one that PostgreSQL, INC. has been selling as an add-on for years (they finally opensourced it).

      Another database that I would check out is SAPdb. SAP originally created it to be a competitor to Oracle, so that their customers wouldn't have to buy Oracle databases (read: pretty complicated setup, but worth it). But now they've opensourced it too, and as far as I know, it supports replication. And in the next release when MySQL takes over (Q4 2003, it'll be renamed MaxDB, and MySQL will be working on the code as well as SAP), it will have a proxy available so that you can just use MySQL database drivers to access it.

    2. Re:In databases, you get what you pay for by g_goblin · · Score: 0

      If Windows is so bad, then state your case why. I would like to know the reasons behind your statement

  74. MNESIA from ERLANG by WetCat · · Score: 2, Informative

    Check Mnesia DB from Erlang package. It's not relational, but has high-availablility replication, conflict management, etc. It's reliable and tested. By Ericsson.
    Good license.

  75. Re:SLASHDOT MYTH #3 VS. REALITY by Anonymous Coward · · Score: 0

    Fairly Well Crafted Troll. However, your cheating by omission.

    back to REALITY

    IBM Research

    Austin, Texas
    San Jose, Ca
    Westchester County, NY

    Microsoft Reseach

    Redmond
    Silicon Valley (and the Newest MSR Lab)

    GE Reseach

    Niksayuna ,NY

    Lots of other companies have labs worldwide, but that doesn't mean they are shutting down their labs in America and opening where labour is cheaper (not necessarily, im sure that perhaps its true in a few places) but rather spreading their location (and therefore business influence) worldwide. Try and troll better, as your fairly pathetic.

  76. Livejournal.com is clustering MySQL by BiOFH · · Score: 2, Informative

    http://www.livejournal.com/community/lj_maintenanc e/60984.html

    --
    - I am made of meat.
  77. Re:Off topic, but it's not. Why PDF and not HTML? by stoborrobots · · Score: 1

    No, PDF's are a useful document format. _ACROBAT_ is what you are complaining about...

    For me, moz silently downloads the file in the Download Manager, then spawns xpdf... No browser lockup, no 100% CPU... In windows, GSView achieves the same effect. No need to complain just because you choose to use an inferior (for your purposes) product...

    Why not just switch to another PDF viewer... There are tons out there...

  78. have you considered... by velska · · Score: 0

    microsoft sql server [BWEG]??? :o

    this mischievous comment prompted by the redmond folks' ad i was horrified to encounter on this page.

    *free* trial w/ the 'softies? that's about the most you'll end up paying for "free".

    --
    --v
  79. Re:I've used MySQL and PHP on a reasonably big sit by DaMoose · · Score: 2, Insightful
    OMG - as I was reading your post, I starting to think you must be one of my developers - we had the identical configuration and issues with Superdudes.net.

    It got to the point where the slave servers (P4, 2GB RAM, Hardware RAID) could not keep up with the Master replication _and_ service SELECT queries. The data was too big for RAM (filesort) and the drives were not fast enough (2 drives mirrored). The Master is dual PIII 2Ghz, 2GB RAM , and fast RAID 5 hardware.

    I ended up solving the problem with a hardware upgrade. I replaced all 4 servers with 1 Quad-Opteron 1.8GHz, 16 GB RAM, and _VERY_ Fast RAID 10 across 9 fast drives.

    Please feel free to check it out. For the first time in a long time, I'm not affraid that the MySQL server will be the bottlenect in this very dynamic web site.

    We use Linux, Apache 1.3.x, MySQL 4.0.x, and PHP 4.x to build the pages and generate XML to our Flash MX applications.

    Superdudes.Net

    Flash heavy signt. Free registration required to access the coolest features (those which beat up the MySQL server).

  80. Re:transactionality is hard -- Question by jadavis · · Score: 1

    Here's what I don't understand about multi-master database clusters:

    In a multi-master situation, if you connect to either database it should give you exactly the same results. Well, it can't do that unless all masters are online with respect to the other masters.

    So, for any multi-master it seems almost as if you have to trade some kind of reliability.

    And let's say you have two databases and both are up, but neither one can see the other. How are you supposed to provide an answer from either one? If it's an INSERT, it certainly can't commit, no?

    --
    Social scientists are inspired by theories; scientists are humbled by facts.
  81. Re:Anonymous Coward is an idiot by mentin · · Score: 1
    You don't understand what TPC council does. They don't run tests. They verify and publish results supplied by vendors. Postgres or MySQL is not there simply because neither ever submitted any configuration capable of running TPC benchmark to TPC council.

    Now about the real reasons why you would not see Postgres/MySQL results any time soon there:

    The most obvious one is that running TPC is costly. If you look at top solutions prices, it is around $5-10M, about half of which is hardware. I doubt Postgres or MySQL have money to run these tests.

    But it is also obvious that this is only part of the answer - if you look at who submits results, usually it is not DB vendor, usually it is hardware vendor. So if HP or IBM (top hardware vendors) believed they could get reasonable results with Postgres or MySQL, they would definetely do this (especially because they could save on software part of total solution price). The fact that they don't do this proves neither of these databases is ready for complex databases and huge TCP loads.

    --
    MSDOS: 20+ years without remote hole in the default install
  82. Re: Oracle Query Optimizer by dusty123 · · Score: 1

    While the Oracle query optimizer is probably superior of postgreSQL I would like to point out that it's still not the "cream of the cream". Be aware that Oracle cannot distribute one SQL statement to multiple processors/nodes. Other database systems like DB2 or Teradata do this automatically while you have to write SQL/PL in Oracle to achieve this. I always wonder why the "measurement" for OSS databases is Oracle and never DB/2.

  83. Re:DB2 ICE sets TPC-H performance standard on Linu by porlw · · Score: 1

    We've been running DB2 on Linux for nearly 4 years now - it works fine on our quad Xeon box. At first I had a few problems sorting out the weird Mainframe jargon IBM uses for configuration, but once I got it running it hasn't really given any problems.

    DBs aren't very big - data for dynamic web pages at 2Gb, and a warehouse at 4Gb.

    I'm not a DBA, and it's so low maintenance that really I know very little about it at all, it takes care of itself.

    Certainly works better than the SQL Server we have of equivalent vintage, which seems to have a problem scheduling queries - one big query brings the whole system to a crawl. On the DB2 server a big query increases latency slightly, but that's about all you notice.

  84. How about replacing oracle? by Anonymous Coward · · Score: 0

    We use oracle. We have several tb of data which get queried by analysts and productions jobs. Oracle is falling over (in part due to weak db tuning/design) and terrdata is in the running to replace it.

    Any thoughts on better, cheaper routes to take?

  85. Re:Off topic, but it's not. Why PDF and not HTML? by Anonymous Coward · · Score: 0

    You can't just string Web pages together using cat, silly. There can only be one root element per HTML page (between the and tags). The correct, conforming behavior would have been to ignore all those zillions of other pages you tacked on. I'm surprised all the open source solutions didn't throw up an error message or complain somehow, while Word did the "right" thing. This may be due to the common OSS development value of convenience over overly restrictive correctness, but it still seems wrong in my book.

  86. MySQL features by ravenlock · · Score: 0

    The production version of MySQL still has no subqueries, they're due in 4.1. Some types can be emulated with a left join-ish thingy (IIRC), but that's one big minus for MySQL. Disregarding that, it's a kick-ass db.

    As for optimizing, InterBase is (or can be) one weird mofo where it comes to query plans -- if you have complex databases. I worked on a db with 7 tables and I had to include all of them in one query (the relations were like that). The plan it came up with was seriously fscked (despite the fact that AFAIK the database structure was _pretty_ carefully thought out).

  87. Re:SLASHDOT MYTH #3 VS. REALITY by Anonymous Coward · · Score: 0

    The point is my naive friend, when the economy comes back, the hiring will be done over seas not in America.

    Someday your job goes away and you start to cry I will laugh at you.

  88. talk to linda by mindserfer · · Score: 1

    Linda tuples are not new. They are supported in many languages... If I'm not missing something. Is this not what tuple-space was invented for?
    Has no-one slathered linda onto some mysql backend.

    my $0.05

  89. Mainframe DB2-OS/390 Parallel Sysplex is the model by emes · · Score: 1

    I remain amazed at the lack of awareness of what IBM has been doing with mainframes for over 20 years when it comes to database clustering. Parallel Sysplex is what IBM calls its clustering framework for mainframes. Critical core system environment applications like DB2, IMS, and CICS are written to take advantage of services from OS/390 and a special CPU known as a coupling facility that enable extremely high reliability, recovery and failover. Since most involved in open-source seem seriously mainframe clueless, I am not surprised that both the open source and closed source offerings on Linux still do not even come close to what IBM has achieved.

    Of course it doesn't help that computer science arrogance continues to make people look at mainframes as if they were dinosaurs, and as a result important lessons go unlearned. If open source database developers want to get a clue, it's high time they learned about Parallel Sysplex. By the way- Linux on the mainframe doesn't even take advantage of this, but it probably doesn't help that IBM keeps the whole approach of the underlying supporting hardware proprietary. It costs alot of money to learn how to use the coupling facility, for example.

  90. Arguably, there's no such thing as a replicated DB by Anonymous Coward · · Score: 1, Insightful

    If you have a widely replicated, multi-master, database and the replication network becomes partitioned, it's impossible in general to fully resolve the inconsistencies when connectivity is restored.

    There are particular instances of applications where a specific replication solution might be adequate. But it really does depend on the application requirements.

  91. RAC performance by Anonymous Coward · · Score: 0

    just curious but can anyone out there actually verify that RAC performs better than other clustered dbms's? my company was "convinced" by oracle consultants that RAC was the wave of the future (and it is, i'm sure) but the cluster performance is less than optimal...we've done a ridiculous amt of trouble shooting trying to figure out where the problems are, w/little to no luck. we're about a week away from scrapping the whole thing and moving to a more conventional architecture.

  92. Replication is not the same as Clustering by Anonymous Coward · · Score: 1, Informative

    It would seem that the mySQL zealots have failed to address the real issue: clustering.

    Sure, mySQL supports transactional tables but it does not support two-phase commits. Without distributed transactions, how can you build an ACID cluster? Replication is not the same as clustering!

    And, I'm sorry, but mySQL does not come close to Oracle. I manage a large, distributed J2EE application running on mySQL (4.0.x). mySQL *still* does not support sub-selects, stored procedures, views, or dynamic tables (SELECT a, b, c FROM foo) AS bar(a, b, c). It does not support MINUS. It does not support constraints (e.g. column x must be greater than 15). It is not as fault tolerant as the commerical solutions. I could keep going if you would like? Yes, 4.1.x supports some of these things. Yes, PostgreSQL support some of these things too -- but it has other flaws.

    There is a reason Oracle (and SQL Server, and DB2, etc.) costs a lot of money. As much as I love open source, if my project could afford it we would be running Oracle. Not because I like Oracle (it has its faults too, it is a resource hog), but because none of the open source solutions meet my needs as completely as it does.

  93. replication by juraj · · Score: 2, Informative
    Actually, I've been looking for replication in a Free database for months. The things I don't get with the various "proxy" solutions:

    • if I do insert delayed to a database (which is replicated to two databases), or a simple insert. I have an autoincrement field. Who guarantees, that they will have the same value. If insert delayed is performed, how does the "proxy" guarantee they are actually issued in the same order. i don't care what's the order, but I want the order to be the same on both databases.
    • if a database falls down, how do I get instant resynchronization? I don't want to copy the full *GB database back and forth, while having a read lock. i want instant resynchronization from the point where the database fell down.
    • I want to write to both databases and the changes replicated to the others. I want peer to peer database, not master-server, because that involves always knowing who is the master on the application level (okay, I can grab an IP address, but much easier and nicer solution is having a cloud of servers and these things solved in the cluster, not by my hacks).


    Another thing -- did anyone had a look at SAPdb and Interbase? They are Free too and there's not much talk about them. Are they useable? Do they provide replication?
  94. MySQL Replication + InnoDB = Good Stuff by Anonymous Coward · · Score: 0

    I run several MySQL clusters built on fairly low-end hardware (dual PIII700's, 1GB RAM, two 73GB SCSI-U160 disks in a software RAID1 config).

    These are set with two master->master replication (write) nodes (high availability handled by HA-LVS, failover time is approx. 5 seconds) and a number of slaves (read-only). The configuration took, at most, an hour. The InnoDB table types work fine, autoincrements work fine. Referential Integrity works fine. Rarely does it take over a second for the replication across a 100Mbit network.. except where we're storing sizable LOBs or LONGs (normal transit time over network applies, very little latency induced by the database).

    There's an appropriate use for Oracle, MySQL, and even SQL Server. Match the database to your requirements. If you need the more powerful features found under the hood of Oracle, then use it. If you're only worried about clustering but require referential integrity and other ACID components, then MySQL will fit the bill just fine.

    -AC

  95. SQL Anywhere??? by Anonymous Coward · · Score: 0

    I agree that the Open Source solutions don't compare. As a cheaper alternative to Oracle, MS or DB2, I hear you can cluster SQL Anywhere from iAnywhere. It runs on Linux, etc and apparently is good for over 100 GB and a couple hundred users.

  96. Working on it right now by Magus · · Score: 1

    http://dbd-hard.sourceforge.net/

    I've been working on this project for some time now. HARD stands for High Availability and Replication Driver (although server would be a better term than driver). It is exactly what the article is asking for. The problem is that I can't do it all myself and there is alot to do, but the alpha version I originally tested worked perfectly, just stability problems (hence the re-write)

  97. Re:-1:Troll, The Antecdotal Evidence Chronicles by Schwartzboy · · Score: 1

    I've been primarily using SQL Server [release of the moment] throughout my working life, with a little teeny bit of Oracle and (very recently) turned MySQL from something I play with as a hobby into part of a project my boss has me working on. Of those three, only MySQL (arguably because I haven't been working with it in a business/production environment for even a year yet, but still) has never let me down. I've had all manner of glitches, inexplicable crashes, and entire tables changing their datatypes and horking the applications I had hooked up to them from Oracle & SQL Server, which I'll admit can probably be blamed on the clueless admins (whose most common response was "uh...data type?"), but I was exceptionally clueless when I started playing with MySQL a couple of years ago and I have yet to see any signs of horkage. Heck, I'm nowhere near MySQL-certifiable, and like the parent says I see no reason that it's not a good setup within certain conditions. I wouldn't use it for everything, mind you, but I also don't eat my soup with a pitchfork.
    ...though from the environments I've worked in, it appears that "the right tool for the job" is Phil, that creepy guy in Marketing. Ew.

    --
    "Linux doesn't exist. Everyone knows Linux is an unlicensed version of Unix"- Kieren O'Shaughnessy
  98. What does Google do? by yorkrj · · Score: 1

    From what I've heard, Google uses open source technology for their search engine. They surely have an open source clustered database for their search terms. Or have they come up with some sort of custom dbms?

  99. commercial support for postgresql by gimpboy · · Score: 1

    really it's not that hard to find this information. goto postgresql.com, click on support and then read:

    http://www.pgsql.com/support/

    postgresql, like alot of opensource software, has a free component as well as a commercial one. statements like:

    I don't doubt the capabilities of PostGres and MySql, but when your DBMS is doing memory dumps, you don't want to have to scour google for an answer. You want the vendor on the phone as fast as possible and MS, in my exeperience, is very good at phone support when it comes to SQL Server.


    would indicate that very little research was done when determining how to build your cluster. this is fine if you are familar with ms, but implying that the only way to get support for opensource solutions is through google is very misleading.

    --
    -- john
  100. Replication doesn't mean no data-loss by Stone316 · · Score: 1
    The first thing that pops into my head when someone mentions replication for a high-available environment is data-loss. Depending on the replication stategy your going to lose a little bit of data.

    I haven't read up on mysql lately but i'd double check and make sure that isn't an issue for you before you make a jump.

    IMO OSS is fine for internal low-key applications. But for production financial systems I wouldn't bet my career on an OSS solution. The biggest reason is support. While OSS databases may have more 'developers' they don't have the same level of support as say Oracle. If I have a P1 problem, I can get a technician working on it 24x7 until its resolved. Sure, i'm sure there are companies that may offer support for OSS databases but like I said, i'm not going to bet my career on them.

    There could also be financial arrangements within the level of service contracts.. I doubt you'll see anyone in the OSS community providing that.

    Besides slashdot, I haven't run into many people pushing OSS software (including linux) and i've worked for a couple of large IT organizations. Unfortunately, there seems to be too many 'zealots' out there ready to push their own preference of software without actually looking at the issues objectively. I see it everyday here at work. The oracle people push Oracle, the microsoft people push SQL Server, the OSS people (haven't run into any of those yet..).

    --
    "Thanks to the remote control I have the attention span of a gerbil."
  101. The problems with DB clustering by merlin_jim · · Score: 1

    Basically the big problem is that DB servers are IO-bound, not CPU-bound. Clustering is a way to increase throughput for CPU-bound application, up to the point where they become IO-bound.

    Clustering DB servers only works in situations where you have a write-seldom, read-often application. Otherwise, the bandwidth traffic of write replications starts to overwhelm the useful bandwidth.

    All that said I have seen a MS SQL Server setup that may be interesting to the Free/OSS database folks; basically you have a SAN and connect multiple database servers to it. It requires specialized hardware, but you could probably fake it with a computer with a big raid array and gigabit networking.

    The idea was that the SAN manages disk accesses such that multiple computers can read the same disks at the same time, and therefore everyone always has a fresh copy of the data.

    --
    I am disrespectful to dirt! Can you see that I am serious?!
  102. Re:ACID Compliance in MySQL? You jest? by Anonymous Coward · · Score: 0

    YES .. with innodb transaction

  103. Interbase/Firebird by daniel-kun · · Score: 1

    Hi,

    I wonder why nobody mentions Interbase and/or Firebird, since it is a reliable and fast database. Firebird is a fork of Interbase 6, because Borland stopped distributing it under an open-source licence since they released Interbase 7. So the Firebird guys startet to work on and enhance IB 6, and they're doing a great job.
    Is there any cluster-like support for Firebird at all?
    cu, Daniel

    --
    eat(this); // delicious suicide
  104. triggers and stored procedures are overrated :-) by dcocos · · Score: 1
    MySQL may be able to handle subselects, but it's still struggling with triggers and stored procedures.

    My take has a developer taint (i'm not a DBA nor do I pretend to be one), but it is two fold, the first is that all though they may be faster trigger and stored procs for the most part only serve to confuse and frighten the developer, because you've moved events outside of the programmer's control, (hmmm that table got updated but none of my code touches it) By avoiding triggers and stored procs you also make portability easier. The first rev of a product I worked with had code that said
    if (DATABASE = ORACLE) {
    // do oracle specific stuff
    } else if (DATABASE = MSSQL) {
    // do MSSQL specific stuff
    }
    I took all of that out and made non-db specific code this certainly made it easier when our company was aquired and I now had to support MSSQL, Oracle, Sybase and DB/2 on four different platforms non the less!

    The second comment I want to make is that ObJectRelationalBridge (OJB) from Apache will eliminate 99% of your SQL from your Java code, but letting you treat your objects as object and you no longer have to worry aboout SQL for the simple things.
  105. Re:DB2 ICE sets TPC-H performance standard on Linu by kpharmer · · Score: 1

    Right now all of my servers are running db2 8.1.0 & 8.1.2 on aix 5.1. Reliability is great, performance is fine. These are all stand-alone 2-6 CPU boxes without extended memory, interpartition-parallelism, etc - so they have minimal OS dependencies.

    I'm setting up our RH7.3 servers next week. I think we'll be fine - by avoiding certain features, and having the flexibility of designing our own application.

    Good luck

  106. this worked well by ahree · · Score: 1

    back in the day i ran IT for a company that, sadly, has gone the way of many many software companies. so it goes.

    anyway, we used something by these folks:

    http://www.missioncriticallinux.com/

    called 'convolo' but the technology is called 'kimberlite'. it worked quite well - as we were working for some big name movie houses we needed full availability. our testing was quite intense and it worked fantastic.

    its GPL - mission critical provides service, if i'm not mistaken.

    you'll need to go a bit nuts with the hardware, of course...

    enjoy!

  107. Re:Anonymous Coward is an idiot by Sxooter · · Score: 1

    I wouldn't be surprised to see OSDL submit a TPC test in the next couple years.

    --

    --- It is not the things we do which we regret the most, but the things which we don't do.
  108. Arithmetic Nazi by xsbellx · · Score: 1

    It could be me but I think your math is a little off.

    2 Nines
    87.6 hours per year down time
    438 minutes per month down time
    101 minutes per week down time

    3 Nines
    8.76 hours per year down time
    43.8 minutes per month down time
    10.1 minutes per week down time

    4 Nines
    52.6 minutes per year down time
    4.38 minutes per month down time
    1.01 minutes per week down time

    5 Nines
    5.26 minutes per year down time
    26.2 seconds per month down time
    6.06 seconds per week down time

    Based on one average year having 31536000 seconds (365*24*3600), one average month having 2628000 seconds (average year / 12) and one average week having 606462 seconds (average year / 52).

    Yes I know we can and should define the extact number of seconds in each month (28, 29, 30 or 31 days) and a week should be 604800 seconds (7 * 24 * 3600) but for the sake of arguement, it's close enough.

    --
    If VISTA is the answer, you didn't understand the question
    1. Re:Arithmetic Nazi by crmartin · · Score: 1

      Cripes, what time was it when I was posting that?

      86,400 sec/day x 0.01 -- 14.4 minutes. ... and on from there. Gack.

      But the point is still a good one: you have to ask the question, even if you should do your own division.

  109. Re:DB2 ICE sets TPC-H performance standard on Linu by Sxooter · · Score: 1

    If it's been more than a year since you benchmarked Postgresql, you should really try one more time. there have been massive performance improvements in the 7.2/7.3 and 7.4 beta (soon to be rc) releases there.

    Note that, much like db2, you have to tune it to your load to get best performance, but you could afford to buy a nice used E10k for less than the cost of the db2 licenses you had and run Postgresql there.

    For tuning, go here:
    http://www.varlena.com/varlena/GeneralBits/ Tidbits /perf.html

    --

    --- It is not the things we do which we regret the most, but the things which we don't do.
  110. Clustering or Load-balancing? by FrankieBoy · · Score: 1

    It may be petty but are which technology are we talking about? From my understanding load-balancing is having whole requests sent to different machines based on the individual machines availability while clustering is sending one request to a group of machines who work on that request together. I think we're talking about load-balancing.

  111. Don't forget the pop-rivets by n1ywb · · Score: 1

    and the triple-expanding spray foam
    and the vice-grips
    and the spork^H^H^H^H^H

    --
    -73, de n1ywb
    www.n1ywb.com
  112. Re:I LIEK DONALD DUCK. HOW BOUT YOU by Anonymous Coward · · Score: 0

    I more liek teh mikky mouse

  113. Re:transactionality is hard -- Question by ba_hiker · · Score: 1

    multi master data-bases do updates on all the masters at the same time using a two-phase (or other exotic) commit mechinism. All of the masters are updates in the same transaction. Kinda ups the bandwidth requirements on the backbone between the masters. To avoid the reliablity problem, down servers are noted and updated later (before they come back online) by log sniffing or something.

  114. Ask MySQL by Britz · · Score: 1

    This seems to be a large operation with much money involved. So you it is very likely that You need professional support. There is a company behind MySQL that maintains the server and sells support contracts (at least I think so). So MySQL might be much better acceptable to Your boss than Postgres. Postgres might be better (can do more) but since You would be buying a large contract from MySQL or Oracle, just give both a call, ask Your question very specifically and then compare the answers.

    You might even find a company to offer You Postgres support, but I dunno.

    Btw. if You want 24/7 with low response times You have to buy the whole system somewhere. The only company that offers such kind of contracts (worldwide under 4 hours for anything) on x86 (cheaper than the other stuff, like Sun or IBM) is Red Hat with their Server for Linux afaik. You might then be able to get them to support Your MySQL system as well, because if You have two vendors, like Red Hat and Oracle then they might blame each other in case of a failure.

    But those contracts at Red Hat cost six figures.

    Just go for Debian GNU/Linux!!!! I love it and it runs and runs and runs....

  115. Re:transactionality is hard -- Question by sjames · · Score: 1

    If it's an INSERT, it certainly can't commit, no?

    In those cases, one of the masters MUST be shot in the head (That is, it should shut down). The other may then journal any commits it does until the other master can contact it and play the journal back. Of course, while doing that, it must not accept any queries.

  116. Re:SLASHDOT MYTH #3 VS. REALITY by Anonymous Coward · · Score: 0

    And this is based upon the fact that the third world has begun marching towards the first world. Whatever my clown. I think you're too use to zero-sum games, and believe that their gain is our loss. Personally I think 1 billion wealthy Chinese is a pretty lucrative potential market.

  117. No. by jclarke · · Score: 1

    $20,000

    http://oraclestore.oracle.com/OA_HTML/ibeCCtpSct Ds pRte.jsp?section=11221

    Or $400/named user.

  118. I opened fire on Tupac by Spooge+Knight · · Score: 1

    The source was my gun. I got some good clusters on him!

  119. You can't organize data on disks any more by Macka · · Score: 1

    frequently-used tables spaced far apart on disk

    Actually, with modern disks this is something you don't have much control over any more. There's a good description of disk organisation here where you can find the following explanation:

    Unfortunately, the BIOS has a design limitation, which makes it impossible to specify a track number that is larger than 1024 in the CMOS RAM, which is too little for a large hard disk. To overcome this, the hard disk controller lies about the geometry, and translates the addresses given by the computer into something that fits reality. For example, a hard disk might have 8 heads, 2048 tracks, and 35 sectors per track. Its controller could lie to the computer and claim that it has 16 heads, 1024 tracks, and 35 sectors per track, thus not exceeding the limit on tracks, and translates the address that the computer gives it by halving the head number, and doubling the track number. The mathematics can be more complicated in reality, because the numbers are not as nice as here (but again, the details are not relevant for understanding the principle). This translation distorts the operating system's view of how the disk is organised, thus making it impractical to use the all-data-on-one-cylinder trick to boost performance.

    The translation is only a problem for IDE disks. SCSI disks use a sequential sector number (i.e., the controller translates a sequential sector number to a head, cylinder, and sector triplet), and a completely different method for the CPU to talk with the controller, so they are insulated from the problem. Note, however, that the computer might not know the real geometry of an SCSI disk either.

    Since Linux often will not know the real geometry of a disk, its filesystems don't even try to keep files within a single cylinder. Instead, it tries to assign sequentially numbered sectors to files, which almost always gives similar performance. The issue is further complicated by on-controller caches, and automatic prefetches done by the controller.

    And disk geometry becomes even more of an abstract when you bring intelligent RAID controllers into the picture (like HP's HSG and EVA controllers).
  120. Re:triggers and stored procedures are overrated :- by stanwirth · · Score: 1

    though they may be faster trigger and stored procs for the most part only serve to confuse and frighten the developer, because you've moved events outside of the programmer's control

    It will only "confuse and frighten" developers that don't know SQL, and who, by definition, shouldn't be programming database apps in the first place

    , (hmmm that table got updated but none of my code touches it)

    So dump out the metadata table where the trigger and SP code is stored. It's not exactly rocket science

    By avoiding triggers and stored procs you also make portability easier.

    But not more reliable. What happens when hundreds of connections are being made to a transactional database, where updates are being made that need to cascade down several tables in order to ensure referential integrity. You really want to do that without triggers? With your "portable" apps, either you'll have to lock *all* the tables (if the database does not support row locking) involved in an update until *all* of the updates associated with a single transaction, or you'll just have to tolerate concurrent updates destroying your referential integrity.

    I've seen programmers do the latter, and then refuse to disclose their source code on the basis that they're the big expert. What a larf.

    Furthermore, you must be only acquainted with toy databases, if the order of magnitude difference in the execution speed of a stored procedure is not a *major* consideration. Databases are operating all the time, and there are some operations which must be completed before others. If your little java app falls over in the middle of taking ten times longer to complete a critical transaction with all of its tables locked when the same transaction could have been completed with a single stored procedure that fires off half a dozen triggers, and automatically rolls back if it doesn't complete -- I know which one I'd prefer. If it was my money being transferred from one account to another in a bank, for example.

  121. Re:triggers and stored procedures are overrated :- by dcocos · · Score: 1

    It will only "confuse and frighten" developers that don't know SQL, and who, by definition, shouldn't be programming database apps in the first place

    >, (hmmm that table got updated but none of my code >touches it)

    So dump out the metadata table where the trigger and SP code is stored. It's not exactly rocket science


    I agree, but the point I'm trying make is that the developers should be abstracted from the DBs all together, the only thing worse than developers acting as DBAs is DBAs acting as developers.


    But not more reliable. What happens when hundreds of connections are being made to a transactional database, where updates are being made that need to cascade down several tables in order to ensure referential integrity. You really want to do that without triggers? With your "portable" apps, either you'll have to lock *all* the tables (if the database does not support row locking) involved in an update until *all* of the updates associated with a single transaction, or you'll just have to tolerate concurrent updates destroying your referential integrity.

    If you are developing a "real" app you should be using a database that supports row level locking. What is less reliable is when the app doesn't work as the code shows because activities are going on behind the scenes.


    I've seen programmers do the latter, and then refuse to disclose their source code on the basis that they're the big expert. What a larf.

    I've seen DBA explain that it must be the code causing the problem only to have write a very simple app (so the DBA can understand) to prove that the problem is on the DB side. What a larf.

    Furthermore, you must be only acquainted with toy databases, if the order of magnitude difference in the execution speed of a stored procedure is not a *major* consideration. Databases are operating all the time, and there are some operations which must be completed before others. If your little java app falls over in the middle of taking ten times longer to complete a critical transaction with all of its tables locked when the same transaction could have been completed with a single stored procedure that fires off half a dozen triggers, and automatically rolls back if it doesn't complete -- I know which one I'd prefer. If it was my money being transferred from one account to another in a bank, for example.

    I suppose if you consider DB2 or Oracle toy databases then I guess you are correct, you are out of my league. The comment "If your little Java app falls[sic]" is typical of the ego you have to deal with when working with the typical DBA and the need to have developers dealing with DB side of thing and more with the business logic. Usually the "little Java app" is a J2EE application and not the "toy" that most DBA seem to think anything not written in PL/SQL (or the like) is.

  122. Re:triggers and stored procedures are overrated :- by stanwirth · · Score: 1

    I agree, but the point I'm trying make is that the developers should be abstracted from the DBs all together, the only thing worse than developers acting as DBAs is DBAs acting as developers.

    In a perfect world, both would understand the requirements of the system and work together.

    But not more reliable. What happens when hundreds of connections are being made to a transactional database, where updates are being made that need to cascade down several tables in order to ensure referential integrity. You really want to do that without triggers? With your "portable" apps, either you'll have to lock *all* the tables (if the database does not support row locking) involved in an update until *all* of the updates associated with a single transaction, or you'll just have to tolerate concurrent updates destroying your referential integrity.

    If you are developing a "real" app you should be using a database that supports row level locking. What is less reliable is when the app doesn't work as the code shows because activities are going on behind the scenes.

    You mean like when several different apps are all doing transactions at once, and one app has just done a query and is getting ready to lock a row in response to it -- and another app has already gone and changed it? Because the updates were implemented with query-lock-update-query-lock-update rather than all in one go with um, you know those overrated things called triggers and stored procedures ?

    I've seen programmers do the latter, and then refuse to disclose their source code on the basis that they're the big expert. What a larf. I've seen DBA explain that it must be the code causing the problem only to have write a very simple app (so the DBA can understand) to prove that the problem is on the DB side. What a larf.

    Sound like both sides have to stop larfing at each other and start cooperating . A J2EE update that does query-lock-update-query-lock-update and hence gets trod on by other concurrent J2EE apps, and hence falls over is not the DBA's arrogance getting in the way, it's the developer's lack of database development experience. Java is a wonderful language, and does wonderful things for portability -- but certain transactions do have to be executed all in one go, and without triggers and stored procedures, the system will not scale nearly as far, so a balance needs to be struck between portability and specificity to that database, if it's going to work --and scale.

    This is a big challenge, and I wish you well. Good luck getting your DBAs with the program. Involving them in the requirements review, development cycle and conformance testing might be a big help here, as well as asking them to give you some advice on DB specific trigger and stored procedure implementation -- particularly if you want to avoid things "going on behind the scenes" foiling your beautiful portable code. Java is more readable, and IMHO, because Java developers have typically a much more intensive training in how to program, typically much better designed than some of the piles of triggers, queries, snippets and pl/sql I've seen out there. But some of the database functionality you can get virtually for free with those big expensive RDBMS engines are just...merely necessary, if you don't like them, wonderful if you need the thing to be scalable on the same hardware.