Slashdot Mirror


MySQL Clustering Software Launched

lawrencekhoo writes "MySQL AB announced yesterday that software for building a MySQL Cluster will be available for download by the end of April. Articles available from Computerworld, Internetnews, Linux Electrons, and PHP Architect. Great! Now my website can finally have 99.99% availability ..."

48 comments

  1. More info... by Apiakun · · Score: 5, Informative

    Here are some direct links to more information:

    Oh, and they say availability is 99.999%, not just 99.99% :)
    1. Re:More info... by dacarr · · Score: 1
      Oh, and they say availability is 99.999%, not just 99.99% :)

      Yes, but they're rounding it down a bit. =^^=

      --
      This sig no verb.
    2. Re:More info... by aminorex · · Score: 1

      I think he misspelled "six-sigma".

      --
      -I like my women like I like my tea: green-
    3. Re:More info... by Wycliffe · · Score: 1

      Yes, more information is great, but
      all that info is just PR info.
      Does anyone know where I can get some
      documentation, or better yet a HOWTO.
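
To put the 99.99% versus 99.999% exchange above into numbers, here is a minimal sketch that converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration; nothing here comes from MySQL AB's material.

```python
# Convert an availability percentage into the downtime it allows per year.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% availability allows about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, "four nines" allows a little under an hour of downtime per year and "five nines" a little over five minutes, so the extra nine is a tenfold difference.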

  2. set nitpicking = on by InsaneCreator · · Score: 1, Interesting

    MySQL Cluster combines the world's most popular open source database with a fault tolerant database

    It's nice to start out a press release with a lie, isn't it? As far as I know, the title of the world's most popular open source database (meaning it has the most installs around the world) belongs to Berkeley DB.

    1. Re:set nitpicking = on by arcanumas · · Score: 1

      Berkeley is a relational database? Does it have anything to do with MySQL apart from the fact that they both have something to do with data?

      --
      Slashdot Sig. version 0.1alpha. Use at your own risk.
    2. Re:set nitpicking = on by dacarr · · Score: 4, Insightful

      It's PR. Remember, The SCO Group is "a leading provider of UNIX-based solutions", per many of their press releases. It doesn't make it any more acceptable, it's just a tactic. Chill.

      --
      This sig no verb.
    3. Re:set nitpicking = on by Apiakun · · Score: 2, Interesting

      It really depends on what the meaning of is is. Does popularity mean that it is the most used, or the most liked? I would think that popularity and usage are a different metrics.

    4. Re:set nitpicking = on by Apiakun · · Score: 1

      -a

    5. Re:set nitpicking = on by tzanger · · Score: 1

      The quoted text says nothing about relational databases.

    6. Re:set nitpicking = on by arcanumas · · Score: 0
      Since it is blindingly obvious that the article meant the kind of database that MySQL is (client/server, SQL syntax to store/retrieve data, relational, etc.), there is no need to mention that.

      However, if you believe that by not mentioning this it is open to any interpretation possible, I would suggest that neither Berkeley DB nor MySQL is the most popular. I am sure ext2fs is installed on more machines than MySQL, so the article is lying. It's not MySQL, but ext2fs, that is the most popular database.
      Hell, why not compare it to DNS? Or anything that would qualify as a "database".

      --
      Slashdot Sig. version 0.1alpha. Use at your own risk.
    7. Re:set nitpicking = on by tzanger · · Score: 1

      reductio ad absurdum. 'nuff said.

    8. Re:set nitpicking = on by jonadab · · Score: 2, Insightful

      I think they're using "database" here to mean RDBMS. Technically a database is
      just anything that organises data, so a filesystem would count, but that's not
      how the term is generally used. Usually these days when people say database
      they mean RDBMS.

      The other thing is, most installs is not the only reasonable measure of
      popularity. I'm pretty sure more people have daily interaction with MySQL
      than with Berkeley DB directly. Berkeley DB is installed so widely because
      it's been around longer and because certain key pieces of software depend
      on or use it for historical reasons, not because people like it better.

      Note that I'm not trying to say Berkeley DB is bad or anything, or that MySQL
      should replace it; they're really quite different things, and they exist for
      different purposes and fill different niches. I wouldn't consider them to be
      direct competition really -- well, not mostly. MySQL is in competition with
      PostgreSQL mainly, and to a lesser extent the major commercial database
      offerings (Oracle, MS SQL Server) and various lesser-known projects (e.g.
      Firebird SQL). Berkeley DB competes with I think certain Gnu libraries and
      maybe some other things I'm even less aware of. Not that MySQL and Berkeley
      DB are in _completely_ different worlds; they both might reasonably be said
      to compete on some level with SQLite for example, so there is some overlap
      between their areas of application. But still, they're mostly not really in
      the same category.

      Sure, they're both databases. But to say one is more popular than the other
      is like arguing whether traceroute is more popular than Mozilla. They are,
      after all, both internet software.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    9. Re:set nitpicking = on by kwoff · · Score: 1

      I don't think it means the most installs. For example, if MySQL had scantily-clad babes advertising it, then it could be really popular even if it wasn't installed a lot.

    10. Re:set nitpicking = on by Drey · · Score: 1

      Thanks to your post, I read the next thread below this as "Nude requirements..."

    11. Re:set nitpicking = on by Anonymous Coward · · Score: 0

      To pick a nit, you should have made your subject line:

      set nitpicking = 'on'

      (ducks)

  3. What about PG? by Anonymous Coward · · Score: 5, Interesting

    I remember someone developing a rather advanced multi-master replication and clustering system for PostgreSQL. Does anyone know how far along that project is? Has it entered the testing phase yet?
    From what I've read it looked very, very promising, but it doesn't do much good if it's on paper only...

    1. Re:What about PG? by Anonymous Coward · · Score: 0

      It's been available for years, the difference is that it was opensourced about 6 months ago. I don't recall the name at the moment, but you could check the PG homepage for links, I'm sure it's there.

  4. set error_detection = on by jtheory · · Score: 4, Informative

    Apples to oranges. The press release should have been more specific than just "database", but still... Berkeley DB is not a "database" as most developers think of the term (relational, accessible using SQL, etc.).

    Berkeley DB is code that manages a data store, and you access the data using method calls within your app (you compile their code with your project), NOT using SQL, and NOT connecting to an independent application. Remote access n/a, no ODBC or JDBC, etc. Great product, but a completely different animal from MySQL and other relational databases.

    In fact, MySQL used to offer Berkeley DB (as opposed to InnoDB, etc.) as a data storage option WITHIN the MySQL product.

    --
    There are only 10 types of people: those who understand decimal, those who don't, and, uh, 8 other types I forget.
    1. Re:set error_detection = on by Anonymous Coward · · Score: 0
      Berkeley DB is not a "database" as most developers think of the term (relational, accessible using SQL, etc.).
      OTOH, MySQL is also not a "database" as most "real" (DB2, Oracle, PostgreSQL, even MSSQL) database developers think of the term. It lacks most of the ACID properties, at least in its "most widely used" form.
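
The distinction drawn in this thread (data reached through method calls inside your own process versus data reached through SQL) can be sketched roughly as follows. This is only an illustration under loud assumptions: Python's standard-library `dbm.dumb` stands in for an embedded key/value library and is NOT Berkeley DB, and `sqlite3` stands in for "a database you talk to in SQL" even though SQLite is itself embedded rather than client/server. The function names are made up for the example.

```python
import dbm.dumb
import os
import sqlite3
import tempfile


def embedded_kv_demo(directory: str) -> bytes:
    """Store and fetch a value through plain method calls: no SQL, no server."""
    path = os.path.join(directory, "kv")
    # dbm.dumb is a stand-in for an embedded key/value store (not Berkeley DB).
    with dbm.dumb.open(path, "c") as store:
        store[b"user:1"] = b"alice"
        return store[b"user:1"]


def sql_demo() -> str:
    """Store and fetch the same value by sending SQL statements to an engine."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))
        row = conn.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(embedded_kv_demo(tmp))
    print(sql_demo())
```

The key/value version has no schema and no query language; the application decides what the keys and values mean. That is the sense in which the two kinds of product fill different niches.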
  5. Wow by Anonymous Coward · · Score: 0

    How much can databases improve over time, and how much improvement can be achieved? Sooner or later MySQL will be enterprise caliber and Oracle will have bigger things to worry about than PeopleSoft.

  6. In memory only? by diegomontoya · · Score: 4, Insightful

    If this is the requirement deployment then for people like us were db size at over 20GB, and yes the big blogs are already stored in compressed using compression, this would not be economically pratically to use. Factoring OS, caching, I need to get 22GB memory for each node? Last I checked, the 2GB cheaps are still nasty expensive.

    1. Re:In memory only? by diegomontoya · · Score: 1

      Wow...I have to apologize for my atrocious spelling and grammar in the post. I'm usually not this bad. =)

    2. Re:In memory only? by Unknown+Relic · · Score: 4, Informative

      I was wondering this as well. Also the FAQ mentions that "Data that needs to be highly-available must reside in the MySQL Cluster storage engine. If existing MyISAM and/or InnoDB data needs to be made highly-available then it has to be migrated to the MySQL Cluster storage engine." I'd assume that the clustered table types have support for transactions like InnoDB tables do, but there's nothing here to confirm this.

      From the way I'm reading it, this type of cluster would be most economically used in conjunction with a traditional replicated MySQL database. You would use the clustered engine for transactional tables where a large number of inserts or updates occur, and for tables where you have a lot of historical or read-only data, you would use standard replication, where you could tolerate a few minutes without the ability to insert or update should the master fail. In order to reduce the memory requirements for the cluster you could also move old transactions from the transactional tables to historical tables which use InnoDB/MyISAM.

      That being said, there must be SOME use of the disk on the cluster, because their recommended node system has raid + four 73GB SCSI hard drives... major overkill if everything except for OS/Software is stored in memory!

    3. Re:In memory only? by Anonymous Coward · · Score: 0

      The basic idea is you need twice as much memory as you have data since you're obviously going to have two copies of the data. So 20 GB of data does require 40 GB of RAM. That's not to say you need 2 machines with 20 GB of RAM each. MySQL will partition the data for a single table out to different cluster pairs. If you have a 4-node cluster, each one would only need about 12 gigs of RAM to support your 20 gb clustered database.

      The clustering will scale outward for whatever you need. It just becomes an issue finding the sweet spot, where buying too many low-RAM machines doesn't end up costing you more in administration efforts than just buying a few beefier servers.
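
The sizing arithmetic in the reply above (two copies of everything, spread across the nodes) can be sketched as below. The function and its `overhead` parameter are hypothetical; the 20% overhead used in the last call is only an assumption chosen to reproduce the poster's "about 12 gigs" figure, not a number from MySQL's documentation.

```python
def ram_per_node_gb(data_gb: float, replicas: int, nodes: int, overhead: float = 0.0) -> float:
    """Estimate the RAM each data node needs for a fully in-memory, replicated data set.

    data_gb:  size of one copy of the data
    replicas: number of copies kept in the cluster
    nodes:    number of data nodes (assumed to be a multiple of replicas)
    overhead: fractional allowance for indexes, OS, etc. (an assumption)
    """
    if nodes % replicas != 0:
        raise ValueError("nodes must be a multiple of replicas")
    total_gb = data_gb * replicas          # every byte is stored `replicas` times
    return total_gb / nodes * (1 + overhead)


if __name__ == "__main__":
    print(ram_per_node_gb(20, replicas=2, nodes=2))                # one mirrored pair
    print(ram_per_node_gb(20, replicas=2, nodes=4))                # two mirrored pairs
    print(ram_per_node_gb(20, replicas=2, nodes=4, overhead=0.2))  # with assumed headroom
```

With two copies of a 20 GB data set, a single pair needs 20 GB per node, while four nodes need 10 GB each before any headroom is added.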

  7. You know you're a database geek when: by denubis · · Score: 4, Funny

    You know you're a Database geek when you see the headline and immediately think: "Ah hah! Clustered indexes! That'll save some time during joins! Oh. Wait. They're talking about boxes. Drat."

  8. drive usage and thoughts... by diegomontoya · · Score: 5, Informative

    Nowhere did they mention battery-backed RAM modules as a recommended config, so I believe you're correct to assume that disk not only has to be used, but MUST be used.

    Without ramsan-style battery-backed RAM, there is no way any enterprise would trust clusters of any kind to RAM-only storage for write commits.

    Looks like each write transaction will be synchronized across all nodes, which would explain the gigabit and lower-latency interconnects. Still, this is crazy complex to make fast and reliable.

    So to make it truly synchronized, they have to write to disk, for backup/log, before committing the data to RAM. So regardless, writes are slow and I'm waiting to see how they bypass this disk write commit latency. Add on that they have to do this for all nodes before responding to the app, and writes are crazy slow, relatively, since they can influence indices, force flushes of cached/RAM-resident data, etc. Would be interesting to see how they handle this.

    Also, I'm interested to see what type of check code/algorithm they use to tell which NODE is healthy and which ones are corrupt (not dead, since dead servers are the easiest to detect). From their diagrams, it looks like N-type replication, so each node is an exact synchronized duplicate of all others. But how to know for sure which one is the "safe" one when corruption happens?

    Also, I wonder how they tackle gigantic inserts/update like "replace into table2 select * from gigantic_table1". They can't assume or dictate that we only stick to small write transactions right?

    Cheap N-way synchronized replication is my and probably most dbms managers' holy grail so I'm crossing my fingers for Mysql to get this right.

    1. Re:drive usage and thoughts... by nukitsuke · · Score: 1

      Actually, according to MySQL, diskless servers will be supported very soon. I'm at the MySQL con in Orlando and this was one of the first questions that came up.

    2. Re:drive usage and thoughts... by HarrisonFisk · · Score: 1

      MySQL Cluster will only write to the disk in an asynchronous manner. The disk is only needed if there is a total cluster failure (ie. all machines go down at once)

      However the data is written synchronously to more than a single node. So when you insert data, it is inserted into two places (or more, it is configurable) at the same time. That way even if one server goes down, you will still query from the other place.

      The result of this is that it will still scale linearly for writes as well. Keep in mind the data is also partitioned, so each node only keeps a piece of the data present in it.
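
The scheme described in the reply above (partition the data, write each row synchronously to more than one node, keep serving reads if one copy dies) can be modelled with a toy in-memory simulation. Everything here is a hypothetical sketch: the `Cluster` class, the CRC32-based partitioning, and the failure handling are assumptions for illustration and say nothing about how MySQL Cluster is actually implemented. It also ignores the asynchronous disk writes and any resynchronisation of a node that comes back.

```python
import zlib


class Cluster:
    """Toy model: data is split across node groups; each group holds N replicas."""

    def __init__(self, node_groups: int, replicas: int) -> None:
        # groups[g][r] is a dict standing in for the memory of one data node.
        self.groups = [[{} for _ in range(replicas)] for _ in range(node_groups)]
        self.down = set()  # (group, replica) pairs marked as failed

    def _group_for(self, key: str) -> int:
        # Assumed partitioning rule: hash the key to pick a node group.
        return zlib.crc32(key.encode("utf-8")) % len(self.groups)

    def fail(self, group: int, replica: int) -> None:
        self.down.add((group, replica))

    def write(self, key: str, value: str) -> int:
        """Apply the write to every live replica in the key's group before returning."""
        g = self._group_for(key)
        live = [r for r in range(len(self.groups[g])) if (g, r) not in self.down]
        if not live:
            raise RuntimeError("no live replica for this partition")
        for r in live:
            self.groups[g][r][key] = value
        return g

    def read(self, key: str) -> str:
        """Serve the read from the first live replica that owns the key's partition."""
        g = self._group_for(key)
        for r, node in enumerate(self.groups[g]):
            if (g, r) not in self.down:
                return node[key]
        raise RuntimeError("no live replica for this partition")


if __name__ == "__main__":
    cluster = Cluster(node_groups=2, replicas=2)
    group = cluster.write("user:1", "alice")
    cluster.fail(group, 0)          # lose one copy of that partition
    print(cluster.read("user:1"))   # still answered by the surviving replica
```

Because each key lands in exactly one group, a node only ever holds its own share of the data, which is the property the poster credits for write scaling.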

  9. Node requirements by Anonymous Coward · · Score: 2, Insightful

    The standard requirements for the node surprised me.

    It states that you need 16GB of RAM!! Why do they say that? Doesn't the amount of RAM depend on the size of your database? If my InnoDB database file is only 3GB, why would I need more than 4GB of RAM?

    Also, why the hell would you need scsi drives for an in memory database?

    1. Re:Node requirements by Precipitous · · Score: 1

      Not to mention that they don't support the Pentium line of processors. From FAQ - MINIMUM requirements:
      1x Intel Xeon, Intel Itanium, AMD Opteron, Sun SPARC, IBM PowerPC
      I guess they think you won't bother clustering on anything less than a hefty server. I won't be testing this at home.

      --
      My motto: "A cat is no trade for integrity."
    2. Re:Node requirements by Daniel+Dvorkin · · Score: 2, Informative

      16 GB is the "preferred" requirement; the minimum is 512 MB. Quite a difference.

      --
      The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
  10. Does MySQL AB have credibility here? by Anonymous Coward · · Score: 5, Insightful

    I mean, this is an enterprise-scale storage engine from the same engineering team that used to deride ACID transaction isolation and rollback as unimportant, and whose parser still silently ignores any attempt to use integrity constraints that aren't supported. Are these the right people to achieve the robustness that needs to accompany "five nines"?

    1. Re:Does MySQL AB have credibility here? by ldspartan · · Score: 4, Insightful

      No, they're either morons or criminally ignorant of what is considered a standard feature set for RDBMSs. For all the reasons you mentioned and more.

      --
      lds

    2. Re:Does MySQL AB have credibility here? by Anonymous Coward · · Score: 0

      Lessee, four posts down and we have our first +5 anti-MySQL troll. Congrats.

    3. Re:Does MySQL AB have credibility here? by Anonymous Coward · · Score: 0

      And here we have the "Yeah, fuck MySQL!" follow-up. This Slashdot anthropology study/post is brought to you by Predictability (TM).

      It's like a Rorschach test. You fuckers see "MySQL" and have immediate, uncontrollable convulsions.

      Or maybe the CIA is conducting LSD experiments on you. You sure your nick isn't "lsd?"

    4. Re:Does MySQL AB have credibility here? by bunyip · · Score: 1

      I believe the storage engine came from Ericsson, who developed it for phone switches. So, we should be asking if phone-switch manufacturers know anything about the robustness that needs to accompany "five nines"

      Alan.
      .

    5. Re:Does MySQL AB have credibility here? by Anonymous Coward · · Score: 0
      The clustering technology, as well as its engineering team, was acquired from Swedish telecomm manufacturing giant Ericsson, said Vinay Joosery, the product manager for MySQL Cluster. - InformationWeek

      Thanks! I hadn't heard they finally have serious developers. This is no longer laughable.
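
For readers unfamiliar with the complaint at the top of this thread, that a parser can accept an integrity constraint and then silently not enforce it, here is a small demonstration of that class of behaviour. It deliberately uses SQLite through Python's standard `sqlite3` module as a stand-in, NOT MySQL: SQLite parses `REFERENCES` clauses but only enforces them when the `foreign_keys` pragma is on. The table names and the `insert_orphan` helper are made up for the example.

```python
import sqlite3


def insert_orphan(enforce: bool) -> bool:
    """Insert a child row whose parent does not exist; return True if it was accepted."""
    conn = sqlite3.connect(":memory:")
    try:
        # Set enforcement explicitly so the result does not depend on build defaults.
        conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
        conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
        conn.execute(
            "CREATE TABLE child ("
            "id INTEGER PRIMARY KEY, "
            "parent_id INTEGER REFERENCES parent(id))"
        )
        try:
            # No row with id 999 exists in parent, so this violates the constraint.
            conn.execute("INSERT INTO child (id, parent_id) VALUES (1, 999)")
            return True
        except sqlite3.IntegrityError:
            return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("accepted without enforcement:", insert_orphan(enforce=False))
    print("accepted with enforcement:   ", insert_orphan(enforce=True))
```

In both runs the `CREATE TABLE` statement succeeds without warning; only with enforcement on is the orphan row rejected. That gap between what the schema says and what the engine checks is the kind of thing the original poster is objecting to.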

  11. Brillant! by Anonymous Coward · · Score: 0

    Good to see MySQL develops so fast and its press releases are already hyped enough.

    If the MySQL development team did clustering, perhaps now they could consider implementing stuff from numerous wishlists, mainly from here and here...

  12. Anyone can understand marketing speak? by Anonymous Coward · · Score: 0
    Ok, I read the page and I have no clue what it means. What is "main memory database" supposed to mean? Does it mean MySQL will store all the data in memory? Or does it mean inserts/updates will be written in memory to any database in the cluster? If that is the case, then I can see how that would improve performance and allow any node in a cluster to accept a transaction and asynchronously update all other nodes in the network. If it really does store all data in memory, then I think it's gonna fail.

    Also, what do they mean by "share processing"? Do they mean all databases are mirrors of each other, so that a read can be served up by any node in the cluster? If that's the case, it also means any transaction can be served up by any node.

    Last year there was a post about CJDBC which allows you to create a cluster using clustered JDBC driver. It's good that MySql is getting some more advanced features. There's still a long way to go, but it's a step in the right direction.

  13. MySQL Cluster white paper by Vexware · · Score: 2, Informative

    For the lazy among you (and lazy you have to be to find the task of entering a few fields in a form exhilarating), I have uploaded the MySQL Cluster white paper to another FTP site, a mirror of the file which you may access there: mysql-cluster-whitepaper.pdf (the document is a PDF file, so fear the Adobe Acrobat Reader loading time).

    --
    "Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect" -- Linus Torval
    1. Re:MySQL Cluster white paper by Unknown+Relic · · Score: 2, Informative

      Note that they've now posted a technical whitepaper outlining the architecture which wasn't there yesterday. It's worth a read, goes into a lot more detail than what was there previously and talks about replication options, failure scenarios, etc. It mentions that disk storage is used in addition to memory storage, which confirms the speculation made earlier in the discussion, though it still doesn't explain exactly how disk storage is used.

    2. Re:MySQL Cluster white paper by Anonymous Coward · · Score: 0

      I read the white paper and it provides a bit more information, but their approach is only appropriate for mirrored situations. If you need to partition your database because there are multiple offices around the world, this approach probably isn't appropriate. It is also not appropriate in situations where really large databases are used. But for the typical small MySQL databases, it should be fine.

    3. Re:MySQL Cluster white paper by normal_guy · · Score: 1

      You can speed up Acrobat Reader loading time dramatically by copying everything in the "\Reader\plug_ins" folder to the "\Reader\Optional" folder. You may need to put some back if you actually use some plugins...but it worked for everyone in my office.

      --

      Linux: Free if your time is worthless.
    4. Re:MySQL Cluster white paper by jcuervo · · Score: 1
      the document is a PDF file, so fear the Adobe Acrobat Reader loading time
      xpdf(1). Loads wayyyyyy faster.
      --
      Assume I was drunk when I posted this.
  14. MySQL Cluster technical white paper by Vexware · · Score: 1

    Thank you for noticing this new white paper, which I have once again mirrored and which you may find there: mysql-cluster-technical-whitepaper.pdf (181 KB in size).

    --
    "Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect" -- Linus Torval