Slashdot Mirror


High Performance MySQL

swsuehr (Steve Suehring) writes "Knowing that Jeremy Zawodny works with MySQL in a highly demanding environment at Yahoo! Finance, I have to admit that I was somewhat anxious to crack open a copy of the book High Performance MySQL published by O'Reilly and written by Jeremy and Derek J. Balling. With such high expectations, it would be easy to be disappointed with the content or level of the material in the book. Far from disappointing, High Performance MySQL was exactly what I had hoped it would be and more." Read on for the rest of Suehring's review. High Performance MySQL author Zawodny & Balling pages 294 publisher O'Reilly rating Excellent reviewer Steve Suehring ISBN 0596003064 summary Optimization, Backups, Replication, Load Balancing & More

With a title like High Performance MySQL, the reader should not expect assistance with tasks such as installation or the basics of SQL. The authors acknowledge that the book isn't meant for the MySQL beginner. Even so, the book does spend a little time going over a refresher of MySQL. If the reader is looking for a book on the basics of MySQL, though, I'd recommend picking up a different title. I might be so brazen as to suggest looking at the book on MySQL that I wrote a couple years ago, but I'll leave it for the reader to find an appropriate title for their level of knowledge with MySQL.

The book is broken out into ten chapters, beginning with a chapter on the basics of MySQL as seen from someone with some level of MySQL experience. (And since I am someone with some level of MySQL knowledge, this chapter didn't seem to be too painful of a read.) That statement begs for a further explanation. I find it painful to read books that claim to have advanced topics but find it necessary to spend (or waste) time by giving yet another explanation of how to type ./configure; make; make install in the beginning chapter. In this regard, I find High Performance MySQL to be quite good at providing the appropriate level of explanation for the audience.

The first pages alone are a great starting point for those who are wondering about the performance as it relates to the different versions of MySQL. A discussion of the binary versions of MySQL versus the source code versions gives the reader a good understanding of the implications of this important decision. Choosing correctly between the versions of MySQL can help to maximize performance.

Though only the first chapter is titled "Back to Basics," most chapters include some basic or background information pertinent to that chapter. For example, both the query and index performance-related chapters include sections with the word 'Basic' in their titles. These two chapters, "Index Performance" and "Query Performance," cover topics often misunderstood when considering performance on a database. Naturally, any database administrator worth their pay (insert your own joke here) will tell you that an index can help to improve performance. But High Performance MySQL takes it a step further by diving into why you might use a particular index given your database needs. This type of practical information is common throughout the book. Rather than just give the reader a listing of the indexes (which are mostly common among RDBMS anyway), the authors choose to tell you why you might choose one method over another.

Other chapters within High Performance MySQL include chapters on server performance tuning, replication, load balancing, and high-availability, backup and recovery, and security. I've followed MySQL forums and mailing lists over the years and some of the more common questions relate to MySQL security. The chapter on security gives the reader a great explanation of the MySQL GRANT tables and includes a sidebar on the significance of 'localhost' in MySQL which nearly always seems to be a source of confusion, though maybe not for the reader of this particular book. Like other chapters, the security chapter includes tips for the underlying environment upon which MySQL operates. For example, regardless of how good the MySQL database permissions are configured, if another piece of software on the server isn't kept up to date then the entire server, including the MySQL data, could be compromised.

Looking back at the book, I might have laid the chapters out in a different order or provided more structure through sections of the book. However, this can be dismissed as personal opinion and is non-substantive anyway. I believe High Performance MySQL is an excellent choice for the MySQL administrator or developer not only looking to improve their database performance but also to improve his knowledge of MySQL.

You can purchase High Performance MySQL from bn.com. (And the book that Suehring alludes to is the MySQL Bible . Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

307 comments

  1. No Thanks by AKAImBatman · · Score: 4, Funny

    You can keep "High Performance MySQL". I'm holding out for "Incredible Data Integrity Management with MySQL". :-D

    1. Re:No Thanks by turboflux · · Score: 3, Insightful

      You would almost think that data inegrity would come before tuning for high performance wouldn't you?

    2. Re:No Thanks by gnuLNX · · Score: 1, Flamebait

      Hum....I manage our companies 4.5 million compounds database which contains ~4.5 million rows, and about 250 collumns in the laregest table. The entire database has roughly 40 tables....I have never had an issue with data integrity...granted my experience certainly can not be extrapolated to all environemets, but I can say that if I was given the authority...our $100,000 oracle system would be gone in a heart beat....waste of money with something as nice as MySQL out there. IMO

      --
      what?
    3. Re:No Thanks by turboflux · · Score: 1

      I guess there is a reason you don't have the authority to do that then.

    4. Re:No Thanks by gnuLNX · · Score: 2, Funny

      guess so. All us stupid mysql people...like the guy's running yahoo finance...must not have any clue what we are doing.

      say what you will but for our companies needs I guarantee you that MySQL is more than suffecient...we don't have a lot of changes on the db....plain and simple...a waste of money!..

      But then you propbably know more about my companies db than I do so I will shut up now.

      --
      what?
    5. Re:No Thanks by Anonymous Coward · · Score: 0

      If some data is lost at yahoo finance, it's ok -- nobody's paying them for it, no orders are being processed, etc. MySQL is great for that sort of thing.
      If you need some real integrity, it just doesn't fit the bill.

    6. Re:No Thanks by AKAImBatman · · Score: 2, Insightful

      For a moment there, I thought you meant that Yahoo was processing company financial data with MySQL. Then I realized that they use it just for displaying market data and the like.

      Give me a heart attack, why don't you?!

    7. Re:No Thanks by fitten · · Score: 2, Insightful

      Unfortunately, a lot of folks who think they know a lot about databases really don't. They don't know when MySQL is sufficient or not because they don't understand what databases need to do and/or provide. Most wouldn't know the difference between MySQL, a big honkin text file, and a real RDBMS like Oracle. To them, they are jsut all places to fling a bunch of data to query back later. Most wouldn't know what transactions really are or when you should use them, although if you ask them, they'll assure you that they don't need that "overhead" in their problem.

    8. Re:No Thanks by gnuLNX · · Score: 1

      Well I admit to not knowing "that" much about databases. I do however know enough to get by...also I think that the problem should determine the solution. If you don't need xyz then we pay for xyz? I guess there are situations were people don't really know what they need, but then I would guess that most people do.

      just my last two cents!

      --
      what?
    9. Re:No Thanks by gnuLNX · · Score: 1

      True...but they do have a reputation on the line...I am pretty sure that their rep is worth some really big dollar amount to them!

      I can't really argue DB's because I honestly don't know that much about the internals..I am a developer who just tries to find the best tool for the job...in many situation...actually all situations were I have needed a db, mysql has fit...I am sure that others would have worked...perhpas better, but the overhead to learn and study would certainly not have been worth the time...for my apps.

      Cheers

      --
      what?
    10. Re:No Thanks by DrSkwid · · Score: 1


      for a developer your typing accuracy is appalling, you must spend 90% of your time chasing down typos !

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    11. Re:No Thanks by Anonymous Coward · · Score: 0

      Heh, then you'll also be interested in:

      High SQL Conformance with MySQL!

      High Performance MySQL? What an oxymoron - that's like High Performance Visual Basic, High Performance Logo, High Performance Windows 95.

    12. Re:No Thanks by fitten · · Score: 1

      I guess there are situations were people don't really know what they need, but then I would guess that most people do.


      In my experience, I've found the opposite to be true. I find it much more common that people who "know" what they need and don't need in actuality don't understand their own problem or the technologies they can/should use to solve the problem. I've seen folks describing a problem that is clearly a transactional problem but then refuse to use transactions because of the "overhead" and they've had systems that "worked for years without using transactions so why do they need them now". I also have seen a number of folks who can't think of a single example of where one would use a transaction conclude that 'therefore, no one needs transactions'.

      Most of the time, it is just ignorance combined with defensive pride. Folks who are "self taught" tend to arrive at a conclusion early in their programming career and doggedly stick to it because that's all they know, regardless of what is "correct".

    13. Re:No Thanks by gnuLNX · · Score: 1

      Probably closer to 95%! The other 5% is spent on slashdot.

      --
      what?
    14. Re:No Thanks by Anonymous Coward · · Score: 0

      I *DO* know an awful lot about database theory and internals, and yes, MySQL is adequate, if not perfect, for about 70% of what it is used for. Furthermore, it has that "Microsoft" thing going for it as well: easy to use. The other 30% of situations, the user has moved on to another solution anyway (PostgreSQL, Oracle, etc).

      Most of you just need to get over yourselves.

    15. Re:No Thanks by Hezaurus · · Score: 1

      It's funny that the InnoDB tables have been available to MySQL users since the last millennium - but you people still don't have a clue.

      InnoDB
      fast and ACID compliant.

      --
      No matter how fast light travels it finds the darkness has always got there first, and is waiting for it. (T. Pratchett)
    16. Re:No Thanks by Turmio · · Score: 1

      I guess you missed the joke. There's no such thing as data integrity in MySQL world.

    17. Re:No Thanks by budgenator · · Score: 1

      Seems you've really hit a nerve there, makes me wonder what would have happened if you said you actualy ran a MySQL testbed with that database and it beat the oracle with your real-world queries.

      Odd how we see more religious zealotry in datbase issues than we do in OS issues now.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    18. Re:No Thanks by Anonymous Coward · · Score: 0

      No, but I'd expect data integrity would come before anybody called it a "database".

      Personally I call it "disk-based storage". Until I can use MySQL to guarantee data integrity, I'll continue calling it that.

      Let me know when I can specify that table1's primary key appears as a foreign key in table2 no more than 3 times.

      Let me know when I can normalize all my data so that it contains no NULLable columns and create a view to generate rows with NULL-like objects in them.

      Let me know when I can *really* trust the database to enforce my data model.

      Let me know if MySQL ever implements all the stuff that people figured out, oh 20 years ago about databases. Or even just half of it.

    19. Re:No Thanks by Anonymous Coward · · Score: 0

      I don't see why this is flamebait..just uninformed.

      Look, I can put 4.5 million rows of data in TEXT FILES and make the same claims.

      The point is, does MySQL guarantee CONSTRAINTS on your data?

      Tell us about your "business rules". Are there any combinations of data that 1) can be inserted into your database and 2) aren't valid combinations of data according to the business rules?

      If there are, you DO have an issue with data integrity. That's what data integrity means: no meaningless data combinations, at any time, including before, after, and during multi-step transactions.

      I.e. you should be able to hire a summer temp worker to write half-ass Perl scripts to insert data and be guaranteed that he can't screw up your data integrity.

      Like in a shopping cart app having invoices without line items or stuff like that.

    20. Re:No Thanks by scribblej · · Score: 1

      You clearly never tried ReiserFS.

  2. High Performance? by Anonymous Coward · · Score: 1, Funny

    High Performance MySQL, published by Oxymoron Press, Inc.

    1. Re:High Performance? by JAgostoni · · Score: 4, Insightful

      I find MySql to be very high in performance when queries are written concisely and correctly. Heck, I have seen (and written) some Oralce PL/SQL that has horrific performance just to rewrite it concisely and correctly and have it perform very well indeed.

      The engine is only going to run as good as the crap you (developers) put into it. I would go so far as to say the simply due to the pervasiveness of MySql there are for more horrible script examples out there than for PostgreSQL.

      Personally, I use to correct tool for the job. BLOG? Hell ya use MySql. OLAP? Hell no. But you could make it work and work well if you wanted to.

    2. Re:High Performance? by JAgostoni · · Score: 1

      -1 Overrated? You, Mr. Moderator, must have been offended by "writing bad SQL" remark...

    3. Re:High Performance? by Ulic · · Score: 0

      Yes, but I'm an old, third post troll. Check out that user number!

    4. Re:High Performance? by JAgostoni · · Score: 2, Funny

      In this case you're a third reply troll. Ha ... ha ... ha... but seriously folks, I'm here all night. Try the veal.

  3. I would never use MySQL for anything by Anonymous Coward · · Score: 0, Flamebait

    Except maybe for throwaway blog sites. If my blog site crashes, so what?

  4. MySQL - I smell flames? by strictfoo · · Score: 5, Funny

    I'm going all out here.

    MySQL is better than PostgreSQL
    Redhat is better than Debian
    Windows XP is better than MAC OS X
    Bush is better than Kerry
    Conservatives are beter than Liberals
    Americans are better than Europeans

    etc. etc. etc.

    --
    I've just signed legislation that'll outlaw Russia forever. We'll begin bombing in five minutes.
    1. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 1, Interesting

      Really, either the fourth or fifth item on the list would, with VERY high probability, imply all the rest but the first.

      Your list has an actual information content of roughly 2.4 items, as opposed to 6.

    2. Re:MySQL - I smell flames? by LoveMuscle · · Score: 4, Funny

      Bah.. You forgot..

      Emacs is better than vi
      Abortion is a woman's right
      Gun control means using both hands
      Christianity sucks, the bible is just a book

    3. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 0

      The last three items are 100% correct (especially the last one), the first three are 100% wrong.

    4. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 0

      Atheism is better than agnosticism. Come on in, the water's fine.

    5. Re:MySQL - I smell flames? by curious.corn · · Score: 1

      Abortion is a woman's right Gun control means using both hands

      Those two statements don't fit the initial set while the other do; they're conservative/reactionary while the pattern is "cool lib/alt vs. ignorant blind conservatism."

      P.S. I'm all for the initial set... actually that's why I noticed what's out of tune ;-) ... and I own a mac... therefore I'm kewel ;-) and know better... ;-)

      --
      Mi domando chi à il mandante di tutte le cazzate che faccio - Altan
    6. Re:MySQL - I smell flames? by stiggystiggy · · Score: 1

      Does is make me a "moderate" if I agree with half of these?

    7. Re:MySQL - I smell flames? by B747SP · · Score: 1
      You forgot:

      BSD Is Dying

      You insensitive clod!

      --
      I find your ideas intriguing and I wish to subscribe to your newsletter.
    8. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 0

      Sweet, I love true/false tests!

      False
      True
      False
      True
      True
      True

    9. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 0

      Even the Nazis were efficient, but it doesn't mean they had a cure all for what ails the world.

    10. Re:MySQL - I smell flames? by curious.corn · · Score: 1

      I must be drunk... I got it the other way around; please forgive me and move along. Good night e

      --
      Mi domando chi à il mandante di tutte le cazzate che faccio - Altan
    11. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 0

      does netcraft confirm it ;)

    12. Re:MySQL - I smell flames? by worf_mo · · Score: 1

      You probably wanted to write

      vi is better than Emacs

      as the grandparent poster put all the cool stuff on the right side.

    13. Re:MySQL - I smell flames? by ignavus · · Score: 1

      MySQL is better than PostgreSQL ... as a flat file system
      Redhat is better than Debian ... as a door stop
      Windows XP is better than MAC OS X ... at wasting your cash
      Bush is better than Kerry ... for plugging up open manholes
      Conservatives are beter than Liberals ... at messing things up
      Americans are better than Europeans ... at sticking their noses into foreign countries.

      See no disagreement here at all.

      --
      I am anarch of all I survey.
    14. Re:MySQL - I smell flames? by Anonymous Coward · · Score: 0

      MySQL is better than PostgreSQL ... as a flat file system
      I really don't care - I use neither

      Redhat is better than Debian ... as a door stop
      I didn't know bits could be used as a door stop.

      Windows XP is better than MAC OS X ... at wasting your cash
      But I get more with Windows XP?!!?

      Bush is better than Kerry ... for plugging up open manholes
      Actually, Kerry is larger, so he would probably do a better job.

      Conservatives are beter than Liberals ... at messing things up
      Yes, the 1960's and 1970's were a great time in America's history. The Vietnam war was started by conservatives!

      Americans are better than Europeans ... at sticking their noses into foreign countries.
      True, most Europeans have their nose stuck up each other's asses. "Oh you're so clever and intellectually superior!" - "No, you're so clever and intellectually superior!" - "Oh we're all so much better than the rest of the world!" - "Let's quote people who were intelligent! That makes us intelligent!"

  5. Oh I've heard of this by Exmet+Paff+Daxx · · Score: 4, Funny

    Indeed, PostgreSQL is awesome.

    --
    If guns kill people, then CmdrTaco's keyboard misspells words.
  6. Sometimes MySQL indexes are bad... by xxxJonBoyxxx · · Score: 2, Interesting

    Sometimes MySQL indexes are bad. For example, if you infrequently do lookups based on an index but you often add or remove many records from a database, an MySQL index can actually hurt performance.

    I hope there's something about that in the book...but then again, this is 2004. Who still reads (non-electronic) books about technical topics?

    1. Re:Sometimes MySQL indexes are bad... by Atzanteol · · Score: 5, Insightful

      That's true of all databases (not just MySQL). Inserting to a table that has an index requires a cost to index the new data.

      --
      "Ignorance more frequently begets confidence than does knowledge"

      - Charles Darwin
    2. Re:Sometimes MySQL indexes are bad... by Anonymous Coward · · Score: 0

      "if you infrequently do lookups based on an index"

      Umm, why would you do this? That's what indexes are FOR, goofball.

    3. Re:Sometimes MySQL indexes are bad... by smclean · · Score: 1
      If the book addresses topics so severely obvious (and covered in the MySQL documentation) as this, I would be pretty disappointed.

      Of course indexes can hurt performance. If not, you would just make every field in a table an index.

      Hopefully this book is made for people who know this already...

      --

      "'Yrch!' said Legolas, falling into his own tongue."

    4. Re:Sometimes MySQL indexes are bad... by KenBot_314 · · Score: 1

      This doesn't just apply to MySQL. This is actually how indexes work. More work for adds/deletes, but less work for querys using the indexes.

      Your scenario would cause performance problems in any database.

    5. Re:Sometimes MySQL indexes are bad... by Anonymous Coward · · Score: 0

      congrats, you just covered day 3 of a begining database design class.

      that is true of all databases, if you tell it to index something you don't use, it is just unwanted overhead.

      much like how Windows indexes your files for you, but if you don't use their search feature much, it's just something your computer wastes time doing (usually during disk defrag or scan disk if your like me)
      or Linux/Unix's locate
      i imaging MacOSX does something like it too, but i don't know.

    6. Re:Sometimes MySQL indexes are bad... by InsaneCreator · · Score: 2, Informative

      Umm, why would you do this?

      Because using an index requires random data access which is more expensive than sequential access. There is a point where doing an index scan takes more time than just going over all the records in a table (example: small tables, condition matches many rows, etc). Any sane query planner should/will choose the faster method.

    7. Re:Sometimes MySQL indexes are bad... by kylector · · Score: 1

      MacOS X has locate just like Unix. And a content-indexer for their content searching, but I've never used it since they introduced it about 5?? years ago in OS 8, I think.

    8. Re:Sometimes MySQL indexes are bad... by samjam · · Score: 1

      Table scans are generally ONLY quick if ONE PROCESS is trying to process queries at once.

      I had this beef with postgres a while back when I was doing the Ananova TV listings; once 2 queries are going at once table scans are deadly as well as trashing OS bufferred blocks from disk.

      I had to increase the artificial "cost" of table scans to force postgres to prefer indexes and stop the system falling over.

      My lesson is this:
      Where table scans are quicker than indexes, the extra speed isn't important because the only running query has the box to itself anyway; so always use indexes, because it gives the speed when you really do need it.

      I may come across a case where it isn't true, but this was working with 5GB tables and 20 simultaneous queries. Indexes were a darn site faster than the table scans which postgres tried to do.

      Sam

    9. Re:Sometimes MySQL indexes are bad... by jadavis · · Score: 1

      I would just like to add:

      Indexes are also not the best plan unless only a small percentage of the pages contain matching tuples.

      For instance, it's a good idea to use an index on a query like:
      SELECT * FROM slashdot_users WHERE name='jadavis';

      but a bad idea for:
      SELECT * FROM nra_members WHERE sex='M';

      In the former, it reads a couple pages for the index, and then it reads just one page from the heap; which is much faster than reading the entire heap storage for that table and looking at each tuple. In the latter, it would have to read many index pages, then many heap pages, all non-sequentially; which is much slower than sequentially reading all the heap pages.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    10. Re:Sometimes MySQL indexes are bad... by ahdeoz · · Score: 1

      The whole point of indexes is using multiple fields

  7. It Sounds Pretty Basic by cowgoesmoo2004 · · Score: 3, Insightful
    Okay, as an industry professional, I tire of books calling themselves "for professionals". Don't most professionals already know about index types and choosing their usage? If they don't, why are they calling themselves professionals.

    The important thing is, does it tell you how to set up and manage highly distributed system managing gigabytes of data under high transaction volumes?

    If it doesn't, and just discusses the general issues to consider under such conditions, it is worthless for those that are already professionals. Professionals know how to consider these things, they just want concrete examples so they don't have to spend months of reading and tinkering to figure it out on their own.

    Unfortunately, reading the review above doesn't even let me know whether or not the book has anything in it at a level that I might want. However, if I was writing the review I'd probably order the paragraphs differently and use some different words. That might just be personal preference.... :p

    1. Re:It Sounds Pretty Basic by smclean · · Score: 2, Interesting
      The important thing is, does it tell you how to set up and manage highly distributed system managing gigabytes of data under high transaction volumes?
      The article did mention that it covers some related issues: "Other chapters within High Performance MySQL include chapters on server performance tuning, replication, load balancing, and high-availability, backup and recovery, and security."

      But I agree, the review doesn't give any indication whether the book spends all its time repeating whats already available in the MySQL manual, or giving the reader new ideas and insight into 'professional' MySQL optimization.

      --

      "'Yrch!' said Legolas, falling into his own tongue."

    2. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 5, Interesting

      Maybe, but understand that some of us who are MySQL admins may have been thrust into that position as a result of our jobs and not necessarily because we were trained on databases.

      My background is in Electrical Engineering, but as part of my duty I am in charge of maintaining our company's data base. I'm probably the most qualified person here to do it too; but my knowledge of databases doesn't extend much farther than select/insert/update. Sound to me like this book could help me learn a thing or two about what I'm "administrating".

    3. Re:It Sounds Pretty Basic by amorsen · · Score: 1, Funny
      The important thing is, does it tell you how to set up and manage highly distributed system managing gigabytes of data under high transaction volumes?

      I believe you missed something. The book is about MySQL.

      --
      Finally! A year of moderation! Ready for 2019?
    4. Re:It Sounds Pretty Basic by AKAImBatman · · Score: 1, Interesting

      Okay, as an industry professional, I tire of books calling themselves "for professionals". Don't most professionals already know about index types and choosing their usage?

      More importantly, what self-respecting professional would be caught dead using MySQL?

      No data integrity
      Completely non-standard SQL
      No extensibility in the engine (functions, stored procedures, etc.)
      No subselects
      Weird handling of '0' vs. null
      etc.

      Back when MySQL was created, it was a great option for when you needed a simple database to store non-critical data. But then everyone decided that MySQL was super-1337 and that all other databases were simply part of "the evil empire".

      As a result, REAL alternatives like PostgreSQL were just ignored when they came on the scene. After all, you have to perform *gasp* maintenance on Postgres! *sigh*

    5. Re:It Sounds Pretty Basic by shirai · · Score: 4, Informative

      The answer is yes it does show you how to setup massive high transaction systems.

      Up until I picked up this book, I was not comfortable with using MySQL for the extremely high performance apps I have to write. Yes, they are web-based (meaning they fit into the MySQL appropriateness factor) but things like backups and scalability really scared me off of it.

      This book talks about indexes but doesn't focus on them as the magic bullet. Rather it is very specific with lots of need to know items of importance.

      Some things that it covers are:

      1. How to setup replication with different architectures including master/slave, master/slave/slave (another level), replication rings, dual master replication with slaves and more.

      2. RAID (software vs. hardware), RAID types, IDE vs. SCSI. I know. I'm sure you've thought of these things already but most people will find something new here because there is a lot on it. It is quite dense.

      3. Selection of an OS. Seriously. It actually compares threading models on OSes compared to performance with MySQL. This is great stuff.

      4. Selection of a filesystem on an OS. Yes, nitty gritty. Includes talk about journaling vs non-journaling benefits and warnings.

      5. Load balancing mySQL. How to do this through a load balancer. Why and how it's different than load balancing web servers.

      6. How to backup live using replication.

      7. I love this: Detailed information on how the replication system actually works. I love understanding the inner workings because then I know how/why things went wrong.

      8. The benefits of myISAM/InnoDB table types including how they store/retrieve data, the features available, etc.

      I've been optimizing databases for web use for some time and this isn't a book you should be tired of because it claims "high performance." Okay, maybe I'm a geek, but this book kept me up at night figuring out how I should be architecting a system.

      --
      Sunny

      Be my Friend

    6. Re:It Sounds Pretty Basic by kfg · · Score: 0, Flamebait

      If they don't, why are they calling themselves professionals.

      Because they get paid for their misinformation.

      KFG

    7. Re:It Sounds Pretty Basic by Godeke · · Score: 1

      As someone who hires "industry professionals" I have to say that I'm not surprised that there are people out there claiming to be such and not knowing about index types. Heck, a good number of programmers think data storage equals flat files or if they are trendy XML DOM dumps. I have a friend graduating (BS) this year, with a whopping single database class under his belt, so the schools aren't teaching databases. I have had interviews with people who answer "how would you implement a many to many join in ANSI-SQL" with a blank stare (for a DATABASE position, mind you).

      No, I don't feel these people are professionals, but they insist upon calling themselves such and prefer to purchase books that say "for professionals" rather than "for dummies".

      --
      Sig under construction since 1998.
    8. Re:It Sounds Pretty Basic by coene · · Score: 1

      I don't use MySQL for valuable data. But I do consider myself a professional, and I do use MySQL for "convenience records" - data that is merely a cache, or set of aggregate data that can be regenerated easily, and has to be queried a LOT (thousands of times per second). MySQL fits the bill here. Postgres, the database we use for the rest of our needs, couldn't handle so many queries without having locking issues.

      Sometimes the lack of caring about data integrity (and therefore using less locks on a high-UPDATE table) can be good. To each their own.

    9. Re:It Sounds Pretty Basic by ViolentGreen · · Score: 3, Informative

      No subselects

      That is huge. I used MySQL about 6 months ago (though it wasn't the most recent version.) I was very disappointed. I use subqueries extensively. Since these were not available, I ended up with messy and slow joins.

      The non-standard SQL isn't too bad to figure out as long as you have google. There needs to be a SQL translator java applet or something.

      In addition, the version that I used didn't support views. Is this fixed now?

      Regardless, I gave it a chance. The next time I need database services for personal use I'm going with PostgreSQL.

      --
      Not everything is analogous to cars. Car analogies rarely work.
    10. Re:It Sounds Pretty Basic by cowgoesmoo2004 · · Score: 1
      Perhaps you should consider writing a book review... especially if you could whip this reply off in a matter of minutes.

      I can only imagine if you spent the time to fill out some detail for each of the "bullets" you raised. Then it would be easy for a prospective purchaser to know if the book had something to offer to them.

    11. Re:It Sounds Pretty Basic by dprust · · Score: 2, Informative

      Let's be fair -- technology is a very broad subject and nobody can know everything. A lot has to do with how a professional programs with a database. For example, a person may write applications that use an existing database for years, learning all kinds of complex queries and techniques without ever having the access to create an index. A professional is really one who gets paid to do something. I think a person who knows at least 80% of everything there is to know about a specific topic like MySQL would be called an "expert" instead.

      I totally agree on those books. When I'm going to look for a book and see the word "For Dummies", I think, "Hey, I'd be a dummy if I /wasn't/ looking to learn!"

    12. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 0

      Stored Procedures and subselects are both addressed in later versions.

    13. Re:It Sounds Pretty Basic by ca1v1n · · Score: 4, Informative

      Sounds like you've been using PostgreSQL for quite a while, since your criticisms generally apply to rather old versions of MySQL. I hate all databases with equal passion, so I figured I ought to set the record straight.

      No data integrity

      MySQL has a modular design, allowing you to use any of several database backends, each with different design priorities. If data integrity is a priority for you, RTFM and pick the right backend.

      Completely non-standard SQL

      This is true of pretty much all DBMS's. MySQL's development cycle is quite delightfully quick, meaning that if people want certain features added, it doesn't take long to get them added.

      No extensibility in the engine (functions, stored procedures, etc.)

      Ummm... MySQL has those.

      No subselects

      Yes it does, as of version 4.1. It's not their fault if you're more than a full release behind.

      Weird handling of '0' vs. null

      The special handling of null forces you to write slightly less braindead code. This kind of handholding is becoming popular in the design of modern programming languages like Java and C#, with the recognition that the later in your development cycle you find a bug, the more it costs to fix it. Of course, there are still times when the programmer needs very explicit control, like when writing an OS kernel or a device driver. If you're writing an OS kernel or a device driver in MySQL, you've got bigger problems.

    14. Re:It Sounds Pretty Basic by cft_128 · · Score: 2, Insightful
      Those who have no choice and use open source projects that use it (e.g. www.ensembl.org) and/or those who deal with mostly read-only databases. Its not as bad as you make it out to be, and is good for some projects.

      It is generally faster than Postgres,

      It handles large datatypes more easily than Postgres (although Postgres, while odd, has some more flexibility)

      Postgres has an odd 8k limit per row (probably fixed by now though)

      Supports selects across databases.

      It has good replication support (with 4.x and later)

      Comes with some good admin tools in the normal distro.

      Decent full text indexing out of the box

      It is really about choosing the right tool for the job, each has their advantages and disadvantages.

      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    15. Re:It Sounds Pretty Basic by shirai · · Score: 1

      Well since you asked for it and it was a nice comment, here are some more bits (but regrettably no full review).

      1. Comparison of storage engine features (e.g. myISAM, InnoDB, BDB and Heap) including lock granularity, file split-up, referential integrity, caching, compression, etc.

      2. RAID MyIsam Table for overcoming OS file size limitations.

      3. Benchmarking strategies including how to write custom benchmarks useful to you.

      4. Indexing structures: B-tree, vs. hash vs. the geographic R-tree indexes. Of course info on the MyISAM only Full-text indexes and its limitations (e.g. no stemming). Talks about how indexes are selected for use and how statistics comes into play for how they are selected and what you have to watch out for. Also how to reword queries to take advantage of indexes that might not otherwise be used.

      5. Some great stuff like how the Query Cache works. For example, it only looks at the first characters first. So if you have lots of whitespace before your "SELECT", it won't use the query cache. Good stuff to know.

      6. How to use hinting for indexes.

      7. How to buy hardware. CPU, memory, RAID, etc. ... Ah crud. I have to go but there is actually more. Anyways, worth picking up.

      --
      Sunny

      Be my Friend

    16. Re:It Sounds Pretty Basic by fatboy · · Score: 2, Insightful

      As a result, REAL alternatives like PostgreSQL were just ignored when they came on the scene. After all, you have to perform *gasp* maintenance on Postgres! *sigh*

      When PostreSQL came on the scene I attempted to use it, but there was little in the way of documentation. I went the MySQL route for my little database projects because it had better docs at the time. I never decided MySQL was "1337" and that all other databases were "evil".

      It had good docs, was cheap, and was good enough to get the job done.

      MySQL, if you can read this, your using it right now.

      --
      --fatboy
    17. Re:It Sounds Pretty Basic by boaworm · · Score: 1

      1. How to setup replication with different architectures including master/slave, master/slave/slave (another level), replication rings, dual master replication with slaves and more.

      Simple, just read a few pages on the MySQL documentation.

      2. RAID (software vs. hardware), RAID types, IDE vs. SCSI. I know. I'm sure you've thought of these things already but most people will find something new here because there is a lot on it. It is quite dense.

      What on earth does this have to do with MySQL specifically? And furthermore, if you dont know about _RAID_, and need to be told the benefits of SCSI/Raid, do you really think you need MySQL clustering ? Any PC/Server with enough ram will push queries like a maniac. Also, check #4...

      3. Selection of an OS. Seriously. It actually compares threading models on OSes compared to performance with MySQL. This is great stuff.

      Great, now that i've bought this book, i'm supposed to _CHANGE_ OS in my server farm, just to gain a few more queries/sec ?

      4. Selection of a filesystem on an OS. Yes, nitty gritty. Includes talk about journaling vs non-journaling benefits and warnings.

      Filesystems? Oboy, anyone who knows about SQL performance knows that if you access _DISK_ during execution, you are screwed anyway.

      5. Load balancing mySQL. How to do this through a load balancer. Why and how it's different than load balancing web servers.

      This seems like a decent chapter.

      6. How to backup live using replication.

      DOH! We are talking backup here, right? Backup means archiving a copy of the database. MySQL allows you to use the magic command "cp" (or scp, or rsync, or..) Best # so far ...

      7. I love this: Detailed information on how the replication system actually works. I love understanding the inner workings because then I know how/why things went wrong.

      The updated node sends commands to the other machines to update as well. If something goes wrong, you could rollback the transaction.

      8. The benefits of myISAM/InnoDB table types including how they store/retrieve data, the features available, etc.

      This one could be interesting as well.

      I dont mean to troll here, but most of those #s are either offtopic or irrelevant. To create a high-performance *SQL database, just have enough ram, ie, _NO_ disc access except for sporadic logging etc. Furthermore, it's title is "High Performance MySQL", so what we are interested in are what bottlenecks we might run into with MySQL specifically, not stuff like disc/raid. Tell me about some of the quirks with indexing, tell me how to tune threads, how to determine good from bad queries, how to monitor performance.

      If you need clustering for some reason, just create a low latency backbone for master/slave connections. It's easy to configure, just read the docs.

      --
      Probable impossibilities are to be preferred to improbable possibilities.
      Aristotele
    18. Re:It Sounds Pretty Basic by tijnbraun · · Score: 1

      Just curious. What kind of maintenance do you on your Postgresql database?

      I have (I think) put the following things in cron:
      Vacuum full in the weekend.
      Analyze every night.
      Reindex once a week.
      Cluster every night.
      And I have the vacuum daemon Running.
      (And a backup every night)

      Are there any things I should do manually?

      I can't find any good postgres books about these issues. Ok, the newsgroups are ok, but a book would be still welcomed.

      (I have two postgresql books, but both of them don't really tell things I'm think are interesting (specially the oreilly really stinks)

    19. Re:It Sounds Pretty Basic by AKAImBatman · · Score: 3, Informative
      MySQL has a modular design, allowing you to use any of several database backends, each with different design priorities. If data integrity is a priority for you, RTFM and pick the right backend.

      As I understand it, InnoDB (which I assume is what you're referring to) costs money to acquire and as such is NOT part of the standard database engine. It's really just another database engine using MySQL as a facade.

      This is true of pretty much all DBMS's. MySQL's development cycle is quite delightfully quick, meaning that if people want certain features added, it doesn't take long to get them added.

      MySQL is the ONLY database that won't execute:
      select * from "My Table"
      Instead, you have to execute:
      select * from `My Table`
      Do you have ANY idea how much that screws up those of us in the database tools business?

      No extensibility in the engine (functions, stored procedures, etc.)

      Ummm... MySQL has those.


      In 5.0 Alpha. i.e. Not here yet.

      No subselects

      Yes it does, as of version 4.1. It's not their fault if you're more than a full release behind


      Again, 4.1 is a non-production release. The latest stable is 4.0.

      The special handling of null forces you to write slightly less braindead code.

      So you're saying that inserting zeros instead of nulls (or no value at all) for auto-increment columns leads to less brain-dead code? All it does for me is make my database tools that much more complicated to code.
    20. Re:It Sounds Pretty Basic by AKAImBatman · · Score: 1

      No, that's pretty much it. But MySQL supporters often criticize Postgres for Vacuums, so that's why I made my comment about *gasp* maintenance. After all, that cron job just kills them. :-)

      Of course, some of us never need to vacuum because we never delete any data. :-)

    21. Re:It Sounds Pretty Basic by smurfi · · Score: 2, Informative

      InnoDB is free, it comes with the basic MySQL distribution and is GPLv2'd.

      You can change compatibilit settings, so you're mistaken on the ` vs. " problem.

      4.1 is in "gamma". Good enough if you ask me.

      Autoincrement is non-standard SQL anyway, so what do you expect?

      True -- if you absolutely cannot live without stored procedures, triggers, views and all the rest then MySQL may not (yet) be the right database for you. But nobody suggested otherwise.

    22. Re:It Sounds Pretty Basic by ajs · · Score: 2, Informative

      The non-standard SQL isn't too bad

      MySQL's SQL is quite standard, and while there are differences, they are either the lack of a small number of features (like sub-selects, which you go into below) or very baroque trade-offs, which most users will not encounter.

      I suspect you're instead running into assumptions about what is and is not "standard" based on what you've used in the past. MySQL follows the ANSI SQL standard as closely as all of the other databases I've used, having its own small quirks and LOTS of extensions, but basically doing as good a job of standards compliance as, say, gcc does with the ANSI C standard.

      I understand your frustration, though. I'm an old Sybase user, and I miss the Sybase extensions in many places.

      No subselects [...] That is huge

      It is?! Why? All a sub-select does is instantiate a temporary table on the fly. You can, of course, instantiate a temporary table on the fly, so what is a sub-select buying you?

      The only possible advantage would be the possibility of lazily evaluating a very large subselect, but I've come across that need in perhaps 2 or 3 situations in the course of 12 years of working with databases in production environments... how is this "huge"?

      In addition, the version that I used didn't support views.

      You're showing your lack of depth here. By definition a system that lacks sub-selects lacks views (as views are just named sub-selects).

      All that said, sub-selects are being added to MySQL, but people who pay for work on the core server don't ask for these features as much as certain others, and that's how the folks at MySQL decide what to put effort into.

    23. Re:It Sounds Pretty Basic by AKAImBatman · · Score: 1

      InnoDB is free, it comes with the basic MySQL distribution and is GPLv2'd.

      Seems you're right. I misread the InnoDB page.

      You can change compatibilit settings, so you're mistaken on the ` vs. " problem.

      How does that help tools vendors or anyone attempting to write in support for MySQL? Most users can't or won't run it in ANSI compliant mode because they already have non-compliant code running on the same server.

      4.1 is in "gamma". Good enough if you ask me.

      NOT good enough for business. Unless it's marked stable, it doesn't go into a production environment.

      Autoincrement is non-standard SQL anyway, so what do you expect?

      I expected sensible handling of nulls vs. zeros. Again, no other database thinks this way, so MySQL is simply making things more complicated for tools and software vendors.

    24. Re:It Sounds Pretty Basic by Dan+Ost · · Score: 1

      I'd like to address two of you points that appear to be dated:

      It is generally faster than Postgres
      It is my understanding that this is no longer true in the general case, and
      is not true in any case once you start hitting the db with a moderate amount
      of concurrent requests. MySQL simply doesn't scale as well as PostgreSQL.

      Postgres has an odd 8k limit per row (probably fixed by now though)
      Indeed, fixed some time ago.

      --

      *sigh* back to work...
    25. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 0

      > You're showing your lack of depth here. By definition a system that lacks sub-selects lacks views (as views are just named sub-selects).

      Wow, it's amazing what I can be taught on slashdot. Tell me again how sheeps bladders may be employed to prevent earthquakes?

    26. Re:It Sounds Pretty Basic by daviddennis · · Score: 1

      cp or scp will not successfully back up a running mySQL database because data is stored in cache and rarely flushed. For applications where the database must run continuously, you would in fact need to back up through replication.

      It would also help with load balancing at the same time.

      D

    27. Re:It Sounds Pretty Basic by ViolentGreen · · Score: 1

      MySQL's SQL is quite standard

      I was just using the terminology of the post I replied to. I'm not exactly sure what the standard is though I know MySQL has differences then the other implimentations that I have used.

      All a sub-select does is instantiate a temporary table on the fly. You can, of course, instantiate a temporary table on the fly, so what is a sub-select buying you?

      Can you? I seem to remember not being able to do that. It could just be the version I was using though. I meant that it is a huge thing for me because I use it very often.

      You're showing your lack of depth here. By definition a system that lacks sub-selects lacks views (as views are just named sub-selects).

      Perhaps so. I don't know the inner workings of it. That does make sense though.

      --
      Not everything is analogous to cars. Car analogies rarely work.
    28. Re:It Sounds Pretty Basic by DrSkwid · · Score: 1


      note it's "High Performance MySQL" not "MySQL for Dummies"

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    29. Re:It Sounds Pretty Basic by samjam · · Score: 1

      4.1 is in "gamma". Good enough if you ask me.

      NOT good enough for business. Unless it's marked stable, it doesn't go into a production environment.

      Thats a very broad statement,and not true of many business.

      Also, in my experience mysql gamma releases are more stable than the "stable" releases of some very expensive (unless you can get an 85% discount) commercial databases. But then its a simpler product.

      Sam

    30. Re:It Sounds Pretty Basic by cft_128 · · Score: 1
      Can you? I seem to remember not being able to do that. It could just be the version I was using though. I meant that it is a huge thing for me because I use it very often.

      create temporary table temp1 select ....
      Works quite well and is fast.
      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    31. Re:It Sounds Pretty Basic by ajs · · Score: 2, Informative
      Yeah, sub-selects are mostly just temporary tables.

      That is to say, if you:
      create temporary table foo select f.id from fiz f, fam m where f.name = m.name;
      select b.stuff from bar b, foo f where b.id = f.id;
      Then that's the same thing as:
      select b.stuff from bar b where b.id in (select f.id from fiz f, fam m where f.name = m.name)
      The only difference being that for sub-selects, since it's done server-side, you can optimize by evlauating the sub-select lazily. Some databases that support sub-selects do not do this optimization, but most do and in some cases it's a big win. It's just a rare enough case that it's not that big a deal for the general discussion of the value of sub-selects vs other features that people want.
    32. Re:It Sounds Pretty Basic by pizza_milkshake · · Score: 1

      your off-the-cuff review was better than the "real thing"

    33. Re:It Sounds Pretty Basic by fitten · · Score: 1

      There are so many things wrong with your post I won't even start. Some other folks have commented on them already. I'd guess that you probably are one of the people who should read the book. Many of your statements are just wrong out of ignorance.

    34. Re:It Sounds Pretty Basic by LogicX · · Score: 1

      I agree with you sam --
      If people just go around blindly allowing anything with a 'stable' stamp on it to make it vs. something said to be in 'beta' -- you're allowing someone else to determine just how 'stable' it is -- instead of taking a clear look at it yourself. I've seen plenty of 'stable' 1.0 releases that totally sucked compared with all the time and energy thats gone into 4.1 - now gamma. I'll take mysql 4.1 anyday -- We've been running it on our 'production' mysql webservers for months -- albeit we allow a few weeks of testing on the dev ones each time a new revision comes out -- but thats something anyone should do, with a stable version of a program too.

      Go mysql team!

      --
      May this post be indexed by spiders, and archived for all to see as my Internet epitaph.
    35. Re:It Sounds Pretty Basic by harlows_monkeys · · Score: 1
      Completely non-standard SQL

      just like PostgreSQL and Oracle. Your point?

    36. Re:It Sounds Pretty Basic by LurkerXXX · · Score: 1

      Maybe, but understand that some of us who are MySQL admins may have been thrust into that position as a result of our jobs and not necessarily because we were trained on databases.

      That's obvious. If you were trained in databases, you wouldn't be using MySQL :P

    37. Re:It Sounds Pretty Basic by budgenator · · Score: 1

      I thought slashdot ran on a DB2 backend, and assumed it was kind of a sponcered deal.

      --
      Apocalypse Cancelled, Sorry, No Ticket Refunds
    38. Re:It Sounds Pretty Basic by AKAImBatman · · Score: 1
      select * from "My Table"
      The above works on Oracle and PostgreSQL. It DOES NOT work on MySQL. That's my point.
    39. Re:It Sounds Pretty Basic by drew · · Score: 1

      MySQL does have it's uses. I worked for a now defunct internet advertising company while i was in college. They used MySQL extensively in the ad serving process- they would create a queue of ads for a given visitor to a site, and store the queue and the history of ads served to that visitor in MySQL. They also used it to do various queries for targetting purposes, such as geographic targetting. Oracle was used to keep track of all of the important data (numbers of impressions, click-thrus, etc.- the stuff that made them money) but it was nowhere near fast enough to do all of the work that MySQL did. So there were a couple of dedicated Oracle database that did all of the tracking and logging, while each pool of webservers had it's own MySQL database to do targetting and such. If a MySQL database got corrupted (it happened pretty frequently- there were at least 40 MySQL databases) we just truncated the user queues and histories and copied a backup of the tables used for targetting and they were up and running again. the worst that happened was that somebody who had been served an ad by that server might see the same ad again before the other ads it was supposed to serve. They were using this setup to serve over 6 billion banners a month when doubleclick bought them out.

      I can't say whether this is still true now, but at that time ('99 or so) for doing simple queries MySQL was so far beyond anything else out there in terms of speed that it wasn't even funny. Of course you took a big hit in data integrity to get that speed, but in some cases it's worth it if that's what you really need. Of course, most people who use MySQL don't really have that need, bt that's not to say it doesn't have its uses.

      --
      If I don't put anything here, will anyone recognize me anymore?
    40. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 0

      MySQL, if you can read this, your [sic] using it right now.

      503 Service Unavailable. Brought to you by Slashdot and MySQL.

    41. Re:It Sounds Pretty Basic by harlows_monkeys · · Score: 1
      So? There are plenty of things in standard SQL that don't work in PostgreSQL. There are plenty of things in standard SQL that don't work in Oracle.

      You made lack of conformance to the SQL standard a line item in the things wrong with MySQL when every other major SQL database suffers from the same problem.

    42. Re:It Sounds Pretty Basic by AKAImBatman · · Score: 1
      I don't think you understand how serious of an issue this is. For every other database the SQL I listed works. EVERY other database. Why? Because it's the minimal part of the ANSI spec that everyone supports. Now thanks to MySQL, tool vendors like myself have to write special SQL just to select from a table. What that means is that either you code for MySQL, you code for the standard, or you write ugly "if(mysql())" statements everywhere.

      BTW, the equivalent code is the ugliness you see below:
      select * from `My Table`
    43. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 0

      "...my knowledge of databases doesn't extend much farther than select/insert/update."

      Dude, you should totally look into DELETE next. ;-) Sorry, couldn't resist.

    44. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 0

      MySQL has a modular design, allowing you to use any of several database backends, each with different design priorities. If data integrity is a priority for you, RTFM and pick the right backend.

      Is there any backend that allows you to specify arbitrary constraints? For instance, I would like to guarantee that a given table is referenced by a foreign key in table 1 OR table 2 but not both. Is that possible?

      The trouble with all these databases is that they don't allow for complete specifications of business rules and fully normalized logical designs. I would love to be able to guarantee data integrity at the database level but no product allows it. Very frustrating.

    45. Re:It Sounds Pretty Basic by Anonymous Coward · · Score: 0

      You're showing your lack of depth here. By definition a system that lacks sub-selects lacks views (as views are just named sub-selects).

      Absolute horseshit. If subselects are usually implemented as temporary tables (not actually true, fuckwad), then you are saying that views are also implemented as temporary tables. This is not at all the case; views are basically stored queries, though you can assign arbitrary permissions to them, and you can change them without changing all your frontend code, making them quite powerful.

      You may be thinking of Oracle's materialized views/snapshots, which are implemented in disk-based storage, similar to temporary tables. But ordinary views are not subselects and never will be. Please check out some more sophisticated databases before you start running your mouth again.

    46. Re:It Sounds Pretty Basic by carnivore302 · · Score: 1

      Although I understand your post was meant to be funny, I feel an urge to point out that mysql can very well be used in enterprise environments. Me, I administer a mysql database of over 2TB, with one table holding 3.5 billion records, growing some 20 million every day. The thing is still blazingly fast.

      And look at the mysql home page. There are some really large companies out there using mysql.

      Support is excellent too, and available in all sorts. From the do-it-yourself lookups via google to payed support that will have experts log in to your machine and fix any problem you have.

      I say mysql is great. A friend of mine always says "know your database". What he means is: look at the problem you have and select a database for it. For us, that definitly is mysql.

      --
      Please login to access my lawn
    47. Re:It Sounds Pretty Basic by jadavis · · Score: 2, Interesting

      MySQL has a modular design, allowing you to use any of several database backends

      Do they actually have constraints yet? For all the RDBMS apps that I can think of, I wouldn't go without constraints. Last I heard, MySQL thinks 2004-02-31 is a date.

      Also, I'm a little wary of the mix-and-match feature policy, perhaps because I don't understand it. Are there any features not available in the InnoDB tables? Can I do master-master replication and everything, all with transactions? I would be really troubled if each feature is it's own table type or something crazy like that, I want to be able to use all the features at once.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    48. Re:It Sounds Pretty Basic by JamieF · · Score: 2, Insightful

      >>Weird handling of '0' vs. null
      >The special handling of null forces you to write slightly less braindead code.

      Way to spin it, dude. Here's another way to look at it: The retardo handling of null in MySQL makes your work as a programmer harder. It's just a pissing match argument: only a wuss needs those training-wheels features. Real Programmers never make mistakes.

      create table foo (bar integer not null, biz integer not null);
      insert into foo (bar) values (1);
      select * from foo;
      +-----+-----+
      | bar | biz |
      +-----+-----+
      | 1 | 0 |
      +-----+-----+

      That's just the MySQL dudes deciding that silently doing the wrong thing is better than giving you an error message. Hmm, 0 isn't null, so technically the NOT NULL is not violated, but that's not what it does on other databases, and in SQL books, etc.

      This sort of thing is consistent with MySQL AB's attitude toward product design and standards compliance: we don't feel like implementing it because it's hard, and/or the people who are asking for it don't really need it, and/or we don't really understand the issue, so we'll just document it and that makes the problem go away.

      So, you have to learn all sorts of oddities about the way that MySQL differs from everything else, and that makes your job as a programmer harder, which means more bugs and slower coding.

    49. Re:It Sounds Pretty Basic by ajs · · Score: 1

      I won't justify the parent flamer's choice of language by quoting directly, but I will take issue with the point that views are not sub-selects. Views most certainly are sub-selects, and as evidence for that look at the fact that you can either implement sub-selects using temporary views or (more naturally) that you can implement views using sub-selects trivially. Logically, a stored query that acts as a table *is* a sub-select.

      That does not mean that an implementation is tied to directly translating any use of a view to the equivalent sub-select query. Far from it: there are many optimizations to be had in terms of caching and applying indicies to the view itself. But again, even optimizations of the implementation don't change the basic nature of the construct, which is a stored sub-select.

      As far as sub-selects being implemented as temporary tables... they most certainly are, though there are relational optimizations to be had and the user-concept of a "temporary table" might not map to the internal implementation at all. Again, optimization does not change the fundamental nature of the beast. Please feel free to respond using something other than curses and vague claims. I'll respond in kind.

    50. Re:It Sounds Pretty Basic by DittoHead · · Score: 1

      "... you're allowing someone else to determine just how 'stable' it is ..."

      Yeah, you're allowing the project developers to determine the stability of their project code. Beta? Gamma? What do they know anyway?

    51. Re:It Sounds Pretty Basic by ttfkam · · Score: 1

      No data integrity

      MySQL has a modular design, allowing you to use any of several database backends, each with different design priorities. If data integrity is a priority for you, RTFM and pick the right backend.

      And all of those backends fail on data integrity. How? Is the following logically correct?

      'My new purple crayon' == 0

      Show me a single programming language where this evaluates to true. Show me a reasonable human being who say that this seems reasonable. And yet if I try to insert that string into a column marked integer, 0 is what I get. No error. No warning. No notification. No equivalent of a compiler error. Nothing except "query successful".

      That my friend is an example of data integrity problems. And if you feel the desire to cite programmer error, you are right. Inserting a string into an integer column is programmer error. In the real world, programmer error happens. The issue isn't the programmer error. The issue is that your datastore never lets the programmer know there was an error. That is fundamentally wrong for a datastore. Hell, that's fundamentally wrong for any software at all!

      Completely non-standard SQL

      This is true of pretty much all DBMS's. MySQL's development cycle is quite delightfully quick, meaning that if people want certain features added, it doesn't take long to get them added.

      Try comparing PostgreSQL's SQL syntax to SQL92. Do the same with MS SQL, Sybase, Firebird, Oracle, SQLite, etc. Yes there is deviation, but the lionshare is standard.

      Now look at MySQL. No, I don't mean unique features of MySQL. I mean the features that are common with all other RDBMS software. It's not even close. Quoted integers are acceptable?

      And don't get me started on the CHECK keyword. CHECK constraints are in my opinion a bread and butter feature of any database. MySQL used to throw an error when a CHECK statement was used. What was their solution? Implement CHECK constraints? No! Their solution was to have the parser accept the CHECK syntax and simply not do anything about it. No error. No warning. Hmmm... I could see the conversation like it was yesterday...

      "Hey! Why didn't my CHECK work?"

      "Didn't you read the docs? The documentation clearly states that the CHECK syntax is ignored."

      "Why would I? Every other piece of software I use, if it supports the syntax, it supports the functionality. Do I need to make sure that UPDATE is supported too?"

      "Of course not. Stop bashing MySQL."


      It's like anytime anyone mentions any of MySQL's proverbial elephants in the corner, the response is to claim that grossly inappropriate behavior is justified by its adequate documentation.

      No extensibility in the engine (functions, stored procedures, etc.)

      Ummm... MySQL has those.

      In the alpha branch, yes. Until that is in code that is recommended for production, I'm sorry. You don't get to use it for a feature comparison.

      Weird handling of '0' vs. null

      The special handling of null forces you to write slightly less braindead code.

      I don't see this at all. If you write "braindead code", any other RDBMS would cry foul and spit out an error. So what does the programmer do? S/he fixes the code and moves on.

      With MySQL, submarine bugs are easier to write since there is no real validity check being made. Your argument is like someone saying you should write C code with nothing but void pointers. Why not! If you know what you're working with and you aren't making braindead decisions, everything should work! People who can't handle simple offsets are just a bunch of stupid crybabies who need extra hand holding from the compiler.

      In the real world, "NOT NULL" means "I need actual data there." "DEFAULT 0" means "if you don't supply data, put in this defau

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
  8. Check your SQL.... by tcopeland · · Score: 4, Informative

    ...before blaming MySQL! Ha ha!

    Er, anyhow, as an apology, here's an open source SQL analyzer.

    1. Re:Check your SQL.... by JAgostoni · · Score: 1

      I actually just submitted a comment in total agreement. Same goes for PHP or any other script/programming language out there that is widespread. People can write some scary code and then just blame it on the engine.

    2. Re:Check your SQL.... by tcopeland · · Score: 4, Informative

      > People can write some scary code and then
      > just blame it on the engine.

      Yup. There's a really good article by Stephane Faroult on OnLamp about writing better SQL... it's right here.

      After reading that article, I went through some code I'd written and found some places where I was using DISTINCT incorrectly in exactly the way he described.

    3. Re:Check your SQL.... by JAgostoni · · Score: 1

      I cannot count the number of times a DISTINCT has burried a flaw in some SQL of mine and developers on my TEAM. Now I give people a slap on the wrist during code reviews for using DISTINCT when not absolutely necessary.

    4. Re:Check your SQL.... by LordLucless · · Score: 1

      I had a look through that article - and the replacement query he describes uses subqueries. I think MySQL 4.1 works with subqueries, but most hosting companies I have seen still only offer 4.0. If MySQL only supports bad queries, then it is then engines fault, because the poor developer has to hack around its incomplete implementation.

      --
      Just because you're paranoid doesn't mean there isn't an invisible demon about to eat your face
  9. Nearly worthless review by stevewz · · Score: 2, Informative

    This review didn't really tell me anything that I couldn't have gleened by simply reading the Table of Contents. I get the impression the author either didn't actually read the book or he was too lazy to write more detail.

    Having said that, I'm intrigued by the title and will probably investigate the book anyway simply because it's a topic that directly pertains to some upcoming projects on my calendar.

  10. mysql backup by InsaneCreator · · Score: 4, Interesting

    I see one of the topics covered is backing up data. Would somebody care to explain how to make a consistent backup of a mysql database?

    Plain mysqldump does not export consistent data across tables. While you could lock all tables for the duration of backup, that's not likely going to be possible if you're aiming for "high performance". Mysqlhotcopy only works for myisam tables and simply tar-ing up data files is just asking for a disaster to happen. So how do you make a consistent backup without setting up a dedicated replication slave for this purpose?

    (No, this is not a troll; I am in fact looking for a good way to back up mysql data on my server)

    1. Re:mysql backup by smclean · · Score: 2, Informative
      I'm stepping out on a limb here, but I think you will end up having to lock all tables to make a real consistent backup of your database where all tables relational data is gauranteed intact.

      I've never had to do this but your problem interested me. Think about it, how could you possibly make the database spit out a perfectly frozen-in-time set of tables without actually freezing in order to make the set?

      It's not like you'd be able to tell the database "Make me a backup of the database as it was at is right now, at 12:45:23.183848!" and the database could continue modifying the content of the tables while still producing for you a perfect backup. Even things programmed into the database to perform an action like this would take as much overhead as locking the tables first anyway, right?

      The replication slave would work, though.

      --

      "'Yrch!' said Legolas, falling into his own tongue."

    2. Re:mysql backup by phobonetik · · Score: 1

      Know what you mean. We just do the 'fastest' options for mysqldump for a database, and given our systems (content managed sites) are (almost exclusively) used by businesses within a single country, this happening at 5am has proved to work, without hiccups, for several years. Given the backups occur on a daily basis, it doesn't seem to be much different that you've come up with a bit of data that is inconsistent amoung tables to where we've plain lost a several hours data,it now being half way through the day. Our "important" data (bookings, payments, sign-ups) etc are emailed as a backup to general disaster, and are useful in this case. Obviously this doesn't scale, but we don't need it to currently.

    3. Re:mysql backup by Anonymous Coward · · Score: 1, Informative

      It's called Multi Version Concurrency Control. It's what databases like PostgreSQL and Oracle use to do just this.

      Most databases do this one way or another. Those that don't have MVCC use the transaction log to achieve the same result, though you can (AFAIK) onlyl do a binary dump of the pages in the database in that case. Wherease MVCC lets you do a logiacal dump.

    4. Re:mysql backup by ArsonSmith · · Score: 1

      What we have started doing was using netfilers and putting the dbs in hotbackup mode and snap shotting them. then do what ever kind of backups we need on the snapshot. Linux LVM now supports snapshot so I am looking into doing this locally as well.

      --
      Paying taxes to buy civilization is like paying a hooker to buy love.
    5. Re:mysql backup by fingon · · Score: 4, Informative

      Check out LVM2 and snapshots.

      In principle, using LVM 2 system it's easy. Get database to consistent state on disk, take LVM snapshot of the logical volume involved, resume database, run dump on the snapshot, take out the snapshot when no longer needed.

      Beauty is that the size of snapshot is size of delta that is caused during the backup; therefore, this is practical for almost any sized database.

      --
      -- pending
    6. Re:mysql backup by shirai · · Score: 5, Informative

      Actually, lack of live backups was one of my biggest fears with using MySQL as well and the book actually does address this.

      The caveat: You need to set up replication. This is a good idea anyways for a high performance system in case the master crashes and you need another system available quickly.

      It works like this:

      1. Set up replication.
      2. Do a locked table backup on the slave.

      The nice things are the details like how the new version of MySQL uses two threads on the slave. One to copy the transactions to the local drive and the other to apply the transactions. Because of this, your local transaction log is up to date even if the transactions haven't been applied which subsequently means you don't have missing transactions if there is a crash on the master during the backup.

      This is seriously a good book. Okay, you might want to use PostgreSQL for its feature set but if you are doing write few read mostly with the web as the application, this book seriously gives you the confidence to use MySQL. Up until I read this book, I didn't know enough about MySQL performance implications to trust it for scalability. Now I know what I need to do to get it.

      --
      Sunny

      Be my Friend

    7. Re:mysql backup by smclean · · Score: 1
      *limb breaks*

      Yeah, you are right, I've never heard of MVCC, but the transaction thing makes perfect sense to me.

      I should have realized we'd left my area of expertise when the guy said he wasn't using MyISAM tables.

      --

      "'Yrch!' said Legolas, falling into his own tongue."

    8. Re:mysql backup by itchomatic · · Score: 1

      For our InnoDB stuff, InnoDB Hot Backup (http://www.innodb.com/order.php) was the only thing I could find that seemed to fit the bill.

      It's not perfect (it does lock the tables for a few seconds at the tail end of the backup), and it costs money... but I guess it all depends on how important that consistent backup is for you.

      We actually ended up doing both a hot backup and a replicated copy to another server. We use mysqldump on the replicated copy to produce a consistent dump file that can be used in the event that anything happens to our binary backup. It's also nice having a server we can quickly failover to in case the core one blows up.

    9. Re:mysql backup by perler · · Score: 1

      i do it with replication. i have another server connected to the main server and changed 3 lines in each servers my.cnf according to http://dev.mysql.com/doc/mysql/en/Replication.html - it takes minutes and afterward you can easily shut down and restart the backup mysql server to backup things the way you like..

      PAT

    10. Re:mysql backup by Anonymous Coward · · Score: 0

      I used LVM2 and snapshots. stop mysql, snapshot, start it again and tar -czvf the data

    11. Re:mysql backup by tzanger · · Score: 1

      I've never had to do this but your problem interested me. Think about it, how could you possibly make the database spit out a perfectly frozen-in-time set of tables without actually freezing in order to make the set?

      I dunno, PostgreSQL does a good job. :-)

    12. Re:mysql backup by pHDNgell · · Score: 2, Informative

      It's not like you'd be able to tell the database "Make me a backup of the database as it was at is right now, at 12:45:23.183848!" and the database could continue modifying the content of the tables while still producing for you a perfect backup.

      This is an example of why people who have used real databases are annoyed by mysql's undeserved popularity. People who have never seen a DB before build applications and processes around mysql's limitations and think that's the way things are.

      Why would you store data in something that can't do this?

      Sure Oracle (which does this among other things) is expensive, but people are pretty sure that if they put something in it, they can get it back out.

      Postgres (which does this among other things) is indisputably more free than mysql and outscales it without making it harder on the low end. (And I don't buy the embedded mysql thing...if you're embedding, embed sqlite which is still yet more free, smaller, and easier).

      --
      -- The world is watching America, and America is watching TV.
    13. Re:mysql backup by ianster · · Score: 1

      Are you using InnoDB tables? If you are, then adding the mysqldump option `--single-transaction` ought to give you a snapshot of the DB. This only works if the whole DB is using InnoDB tables (which if you care about data integrity as many here have posted on, you would be)

    14. Re:mysql backup by Anonymous Coward · · Score: 2, Informative

      % man pg_dump
      ...
      Description

      pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).

      ...

    15. Re:mysql backup by ahodgson · · Score: 1

      PostgreSQL has real transactions and multi-version concurrency :)

    16. Re:mysql backup by generalpf · · Score: 1
      It's not like you'd be able to tell the database "Make me a backup of the database as it was at is right now, at 12:45:23.183848!" and the database could continue modifying the content of the tables while still producing for you a perfect backup. Even things programmed into the database to perform an action like this would take as much overhead as locking the tables first anyway, right?


      It's easier than you think. It will just copy everything from the data file PLUS everything from the transaction log BEFORE the time the backup started.

      People who think MySQL is a Real Database don't know what a Real Database is/does.
    17. Re:mysql backup by skiman1979 · · Score: 1
      I haven't had an opportunity to look into it yet, but does InnoDB support foreign keys? I'm currently using MyISAM tables in MySQL, and although there is syntax in the SQL language for foreign keys, I read that the key is not actually stored, so it's not supported in MyISAM. That would make it a bit difficult to use something like referential integrity.

      As far as backups go, I have a cron job set up on my server to do a mysqldump --opt --all-databases > /path/to/file.sql. Of course, as others have pointed out, this would be better to perform on a slave mysql server, but I don't have a spare box for that. :-P

      --
      Having a smoking section in a public restaurant is like having a peeing section in a public swimming pool.
    18. Re:mysql backup by ahmusch · · Score: 1

      I'm stepping out on a limb here, but I think you will end up having to lock all tables to make a real consistent backup of your database where all tables relational data is gauranteed intact.

      A thought experiment to demonstrate why this is incorrect, with the postulate we are not working at a quantum level:

      Imagine a particle x.

      If you know the exact location of this particle at some time t(0), and you know completely every vector constituting its movement until time tn....

      Can you describe the exact location of particle X at time t(n), or at any time in between?

      Modern, robust, ACID-compliant (and expensive) database systems use exactly this type of system to take read-consistent backups. All Oracle does is defer writes to a tablespace in backup mode to ensure all changes have been written as of the System Change Number (time t(0) in the above example) in the header file for every data file in the tablespace.

      Heck, with those change vectors (as well as the rollback information), Oracle can tell allow you to *query* the database as of a given time. (within limitations, of course).
    19. Re:mysql backup by ianster · · Score: 1

      InnoDB supports foreign keys. To my knowledge its the only table engine in MySQL that does. http://dev.mysql.com/doc/mysql/en/ANSI_diff_Foreig n_Keys.html

    20. Re:mysql backup by kuhneng · · Score: 1

      (limb breaks)

      That's exactly the idea of a consistent backup, or of any non-blocking read consistent query.

      The implementation varies from DB to DB (Postgres keeps Multiple Versions of Concurrently updated data for as long as necessary, MVCC).

      Oracle has it. IBM has it. Postgres has it. MySQL doesn't.

    21. Re:mysql backup by Anonymous Coward · · Score: 0

      >Get database to consistent state on disk

      Well that rules out MySQL, since consistency is for pussies who don't like writing their own buggy ACID code at the application level.

  11. How does Yahoo! Finance use MySQL? by HerculesMO · · Score: 5, Interesting

    In what type of application do they use it? I would not imagine as a primary database, do they?

    --
    The price is always right if someone else is paying.
    1. Re:How does Yahoo! Finance use MySQL? by Anonymous Coward · · Score: 5, Informative

      Why not? Yahoo finance has been using MySQL for the last four years.

      "Yahoo! uses the MySQL database to power many of the services on Yahoo! Finance (finance.yahoo.com), a popular web site that provides the full spectrum of financial services and information. MySQL is behind Yahoo! Finance's high-traffic area of news headlines, which is aggregated from over 100 sources worldwide, as well as its stock charts, insider trading, SEC filings, conference calls and earnings reports, among others. "

      "Prior to its switch to MySQL in 2000, Yahoo! Finance managed its database needs through homegrown flat files and Berkeley DB databases. But those solutions proved to be inflexible and not scalable enough for their needs, with the volume of data nearly doubling every few months. "Our needs vary considerably, and that is one reason that MySQL works well with us. It handles the high-volume, 260 million record tables just as well as the small, low-volume ones. One of our servers handled over a quarter of a billion queries in a month-and-a-half, and it still has capacity to spare.""

      "Since Yahoo! Finance started using MySQL, the database's popularity has grown within the company, and many of Yahoo's other world-wide properties are now using MySQL too. "We have used MySQL far more than anyone expected," stated Zawodny. "We went from experimental to mission-critical in a couple of months. Once others saw it, they jumped on board.""
      ---

      I wonder why many people here hate MySQL even when it has been proven to be a very competitive database. Is it because MySQL is a foreign product?

    2. Re:How does Yahoo! Finance use MySQL? by Anonymous Coward · · Score: 0

      In what type of application do they use it? I would not imagine as a primary database, do they?

      Translation: "OMGZOR!@!* PoStGrESQL is SOOOOOOO mUcH BeTtEr!@(*) MySqL SuXXorS!!!! I 4m l337!"

    3. Re:How does Yahoo! Finance use MySQL? by phobonetik · · Score: 1

      I think a new segment should form on Slashdot, formed by a band of people whose only relation is the fact they oppose MySQL. They can then limit "feedback" by appointing a single person from the segment, as a spokesperson, so that the rest of us can divulge in useful conversations whenever MySQL hits the press on here, and continue to use MySQL happily where it is appropriate (which, is "most of the time") :P

    4. Re:How does Yahoo! Finance use MySQL? by HerculesMO · · Score: 1

      Hey, I was just curious :)

      I am not one to judge whether MySQL is inferior or superior to any given database.. all I know is that for my needs (my website) it's free, it works, and doesn't seem slow but then again, I only get ~5-7 million hits a month... not daily :)

      --
      The price is always right if someone else is paying.
    5. Re:How does Yahoo! Finance use MySQL? by Anonymous Coward · · Score: 1, Insightful

      Its because it has proven to be very competitive compared to text files and berkeley DB. Mysql is great at exactly this, replacing simple text based data management with basically the same thing only with an SQL interface. It doesn't compare to actual databases though, like oracle, db2, mssql and pgsql.

    6. Re:How does Yahoo! Finance use MySQL? by Anonymous Coward · · Score: 0

      I've read somewhere that they had all but eliminated the use of mysql and went back to using in house developed apps, because it didn't fit their needs.

      They're still using it, mind you, but only enough for that press release somebody else posted below to stay on mysql's site.

      Unfortunately I can't remember where I've read it or even if that source is to be trusted. :(

    7. Re:How does Yahoo! Finance use MySQL? by gnuLNX · · Score: 1

      I agree totally....MySQl certinaly works well in our small pharmaceutical company...manages a 4.5 millionm compound database flawlessly.

      --
      what?
    8. Re:How does Yahoo! Finance use MySQL? by Lazy+Jones · · Score: 3, Insightful
      One of our servers handled over a quarter of a billion queries in a month-and-a-half, and it still has capacity to spare.

      OK, that explains a lot. 250 mil. queries in 45 days is normal for a small company like us and definetely peanuts for Yahoo - i.e. I'd say that this allows the conclusion that nothing particularly important or demanding (i.e. heavily used compared to Yahoo's overall traffic) is built using MySQL at Yahoo. Just to put this into perspective, a little dual Xeon box here does more than 5 million *INSERTs* every day and a large number of SELECTs (we update each of our currently 2.4 million product prices at between 1 and 20 times per day - the exact number of INSERTs per day is currently unknown to me). We use PostgreSQL 7.4.5 though, so we don't consider this a particularly impressive feat ;-)

      --
      "I love my job, but I hate talking to people like you" (Freddie Mercury)
    9. Re:How does Yahoo! Finance use MySQL? by Anonymous Coward · · Score: 0
      I wonder why many people here hate MySQL even when it has been proven to be a very competitive database. Is it because MySQL is a foreign product?

      Are you f*cking kidding me? You posted as AC, so I suspect so, but I'll bite...

      Of course, we all know that Linus is an red-blooded gun-totin' pure-bred Southerner. Yes indeed... Linux was coded from scratch in the back woods of Alabammer.

    10. Re:How does Yahoo! Finance use MySQL? by -noefordeg- · · Score: 1

      Agree with you.

      We run a few online webshops, and average 50 queries per second. That makes almost 200mill queriers in 45 days, on a singel processor system (AMD 1.2ghz, 2gb memory, fast IDE disks).
      And we still have 'room to spare'. :)

    11. Re:How does Yahoo! Finance use MySQL? by cft_128 · · Score: 1
      Just to put this into perspective, a little dual Xeon box here does more than 5 million *INSERTs* every day and a large number of SELECTs (we update each of our currently 2.4 million product prices at between 1 and 20 times per day - the exact number of INSERTs per day is currently unknown to me).

      Just to put that into perspective, I know at least one of our MySQL databases has done over 100 million queries in one day while loading genomic mapping data into our system. That equated to about 7-10 million inserts, the rest selects. That was also running on a dual Xeon (~2.4Ghz, a Dell PowerEdge 2600). It was averaging about 1400 queries a second, topping out at over 3000 at times.

      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    12. Re:How does Yahoo! Finance use MySQL? by MythMoth · · Score: 1, Offtopic

      While I'm sure there are some idiots who "hate" MySQL, that's really missing the point.

      MySQL doesn't do a lot of things that the big commercial databases do.

      If you don't need those things, then MySQL is an outstanding product. If you do need those things, MySQL is useless - worse, if you need those things but assume that MySQL has them it becomes dangerous. Specifically I'm thinking of referential integrity and invalid value behaviour here.

      So MySQL gets a bad name with (some) people who use "real" databases, and a good name with (some) people who use "toy" databases - even though it's perfectly suited to some very "real" applications, and some of the competitors would be wild overkill for "toy" applications.

      Your requirements sound like an excellent fit for MySQL. Good choice.

      --
      --- These are not words: wierd, genious, rediculous
    13. Re:How does Yahoo! Finance use MySQL? by oneishy · · Score: 1

      I think you might have missed the denomination on yahoos query numbers. They said 250 Billion queries every 45 days... That is a bit different than comparing with 250 million.

      So by what you said, it would take a Thousand of your dual Xeons to handle the same load with PostgreSQL (5M * 45days * 1000servers) = 225B compared with your 225M now. This of course assumes your servers are already at capacity and a few other things, and it just goes by your insert numbers... but anyway you get the idea.

    14. Re:How does Yahoo! Finance use MySQL? by Lazy+Jones · · Score: 1
      I think you might have missed the denomination on yahoos query numbers. They said 250 Billion queries every 45 days... That is a bit different than comparing with 250 million.

      I think you'll find that the Anonymous Cowards posting I replied to says: "One of our servers handled over a quarter of a billion queries in a month-and-a-half, and it still has capacity to spare". A "quarter of a billion" is still 250 millions around here.

      --
      "I love my job, but I hate talking to people like you" (Freddie Mercury)
    15. Re:How does Yahoo! Finance use MySQL? by drew · · Score: 1

      I guess I can't speak for everyone else, but for my part, I wouldn't say that I hate it, just that I see very few uses where it is appropriate. In this particular case, it seems very appropriate- They need extremely high nearly read-only performance, and all of the data being stored in the system is coming from outside providers, so they are not losing mission critical data when it ineveitably takes a dump. It looks as though they are using it as a sort of data cache/aggregator rather than as a store of business-critical data. The first is something MySQL excels at, the second is not.

      --
      If I don't put anything here, will anyone recognize me anymore?
    16. Re:How does Yahoo! Finance use MySQL? by JamieF · · Score: 2, Insightful

      >Prior to its switch to MySQL in 2000, Yahoo! Finance managed its database
      > needs through homegrown flat files and Berkeley DB databases. But those solutions
      > proved to be inflexible and not scalable enough for their needs, with the
      >volume of data nearly doubling every few months

      So, a SQL database is better than flat files. Ergo, MySQL rules...?

      I'm glad that Yahoo has found a solution that works (or at least that the person who said that thinks it does) but that doesn't say anything about why MySQL is better or worse than another SQL database, or another kind of DBMS for that matter. Did they even do an evaluation, or did they just upgrade from DBM to MySQL and see no reason to consider other options?

      If Yahoo Finance loses a story, or a stock chart, or a user account, or an email, big whoop, it's a huge free service. If they can run 30% fewer database servers, that's more important. So it probably is a good choice for them. But if there were actual money in transit they'd probably want to use something else for that (there's no reason the content management couldn't stay in MySQL, of course).

    17. Re:How does Yahoo! Finance use MySQL? by ahdeoz · · Score: 1

      I'd consider using PostgreSQL, but it seems the only people that use it are dorks that spend most of their time looking for MySQL postings on the internet and trolling about how Klingon-cool their database is. Show me a real user testimonial or even an article about using it that doesn't mention MySQL and I might give it a second look.

      (No offense to star trek fans, but you've got to admit that there are some fans that make the rest of you look like dorks.)

    18. Re:How does Yahoo! Finance use MySQL? by Lazy+Jones · · Score: 1
      I'd consider using PostgreSQL, but it seems the only people that use it are dorks that spend most of their time looking for MySQL postings on the internet and trolling about how Klingon-cool their database is

      So, that makes it a very cool-headed technical and competent decision on your behalf - congratulations ;-) (and it probably makes you a typical user of *that* particular database).

      --
      "I love my job, but I hate talking to people like you" (Freddie Mercury)
  12. I bought it 3 weeks ago... by ttys00 · · Score: 4, Informative

    ...and was surprised at a two things:

    1) The number of typos in the book. Didn't anyone proofread it before it went to print?
    2) How many features MySQL lacks. Don't get me wrong, I like it as its fast and easy to work with for web stuff, but I was surprised at how often I read "MySQL doesn't do x yet, but will in version y".

    The chapter on replication was worth the cost of the book for me, but I suggest people considering buying the book should wait until the 2nd edition.

    1. Re:I bought it 3 weeks ago... by kannibal_klown · · Score: 1

      Yeh,

      I've been waiting for VIEWS since forever. It's supposedly in the 5.0.1 Alpha, but I don't install MySQL on my development box unless it's Gamma or higher.

      MySQL may lack a lot, but just about the only thing I need that it doesn't have is views.

    2. Re:I bought it 3 weeks ago... by mslinux · · Score: 1

      A lack of features is a very common complaint about MySQL. There is a reason for this. Initially, MySQL concentrated on speed, not features. Other DBs such as Postgresql concentrated on features. Today, you see the two ideas coming together... MySQL is adding features (as you have found from reading the book) while Postgresql is getting faster. They started at different points, but that matters not in the end.

    3. Re:I bought it 3 weeks ago... by ergo98 · · Score: 1

      I'm not so sure if they concentrated on speed, or rather if speed naturally results from the lack of basic features like transactional integrity. dBase is tremendously faster than a real RDBMS in some scenarios (for instance mass bulk inserts), but this doesn't indicate that the dbase team focused on speed.

    4. Re:I bought it 3 weeks ago... by Bozdune · · Score: 1

      This is a really important point. Our app has to choose between record-by-record flutter-fill and dump/load functionality. God help us if dump load was slow. Thank God for MySQL.

      Sometimes you have to be a Neanderthal. If I want my hand held, I'll ask. Or I'll hold it myself. There are plenty of ways to implement transactional integrity with MySQL, you just have to do it yourself with code. Which means it is enormously faster, usually, than someone else's best guess at what you wanted.

      Finally, with all the wailing about transactional integrity and how much better this or that DB is than My, I wonder how many of the wailers have actually tried to RECOVER from a rollback? It ain't duck soup, people. Scary shit, replaying a 6-hour log file and holding onto your ass, praying. And sometimes, guess what? Some weird error occurs halfway through playback. Like the old joke, "Now you're fucked." I've seen it, lived it.

  13. Apples and oranges by Pan+T.+Hose · · Score: 4, Insightful

    You can keep "High Performance MySQL". I'm holding out for "Incredible Data Integrity Management with MySQL". :-D

    High Performance and Incredible Data Integrity? You are comparing apples and oranges. Make up your mind. It's either "High Performance with MySQL" or "Incredible Data Integrity with PostgreSQL." You can't have your cake and eat it too. If you need real ACID (Atomicity, Consistency, Isolation, and Durability) then by all means have it, but realize that it isn't free. Please read about the relational model and ACID. Read about the set theory and predicate calculus, about tuple calculus and relational algebra. This is complicated stuff which it is not cheap algorithmically. Most of people don't need ACID and should not be penalized by the algorithms that implement it. Those who need it will have to pay with performance, because there is only this much you can do in one cycle. A pure ACID relational model is not for everyone, some people will only need a persistent object store instead of a relational database. The most important thing to realize is that databases are not magic. They implement complicated algorithms in code that have to be run by the CPU and by definition cannot be a one-size-fits-all solution. So please don't oversimplify it by saying "I want everything with no paying for anything, and also a pony." Use the right tool for the right job.

    --
    Sincerely,
    Pan Tarhei Hosé, PhD.
    "Homo sum et cogito ergo odi profanum vulgus et libido."
    1. Re:Apples and oranges by AKAImBatman · · Score: 4, Insightful

      Touchy, aren't we? I think that most of us here understand core database concepts (including ACID) just fine. (At least, I hope most of us do.) The problem is that it has become all to common to use MySQL in situations were data-loss and data-integrity can be measured in real dollars. MySQL is completely the wrong tool for the job, and as such has been heavily criticized.

      In fact, it is very difficult to understand where MySQL fits in today's technological environment. On one hand you have truly powerful database servers that guarantee data safety. On the other hand you have simple embedded databases that run much faster than MySQL. Where does this leave MySQL? With nothing more than a lot of "it's Open Source!" momentum.

      High Performance and Incredible Data Integrity?

      I never said that. Please don't act like I did.

    2. Re:Apples and oranges by smclean · · Score: 1, Interesting
      I think that most of us here understand core database concepts (including ACID) just fine.
      Oh? Read some of the other comments on this article and see :)

      I agree that MySQL is overused. It's the M in LAMP, and so people assume its what they have to use for everything.

      But what are these alternatives you mention that are so much better suited for every possible scenario? For me, MySQL is the perfect mix of speed and features. Give me some leads on other databases I should consider replacing it with, I'll check them out.

      --

      "'Yrch!' said Legolas, falling into his own tongue."

    3. Re:Apples and oranges by yamla · · Score: 1

      The reason I use it is that MySQL allows for easy master-master replication. PostGreSQL doesn't, though they are working on it. Looks like it may be a while before they get it working, though.

      --

      Oceania has always been at war with Eastasia.
    4. Re:Apples and oranges by tzanger · · Score: 2, Informative

      I guess it's kind of humorous then that PostgreSQL beats the pants off of MySQL in any kind of moderate DB usage. MySQL can do a damn fine job for simple INSERTs and SELECTs but throw a hundred users at it or a few WHERE or ORDER BY clauses (or all of the above) and MySQL shows its true colours: Made by Fischer-Price. Throw in complex queries including subselects or try using views and... oh right, I forgot... It's not got any kind of real relational power behind it... my mistake!

    5. Re:Apples and oranges by Anonymous Coward · · Score: 0

      Really now, MUST your sig say, PhD?

    6. Re:Apples and oranges by AKAImBatman · · Score: 3, Informative

      But what are these alternatives you mention that are so much better suited for every possible scenario?

      There is, of course, no tool that meets the needs of every possible scenario. Instead, there are tools that meet the needs of common scenarios. Most of what MySQL is being (improperly) used for could be easily replaced with PostgreSQL. Postgres offer similar performance, but takes a minor hit for data integrity. As a bonus, you gain sub-selects, stored procedures, SQL-92 compliance, and other features of a "real" SQL database.

      MySQL is quite decent for something like a blog, but why even bother with a complete server in those cases? An embedded database such as Berkeley DB or HSQL (formerly Hypersonic) would provide better performance and would get rid of security issues inherent in running a complete database server.

      In still other instances, SQL databases are misused for large object data storage. In many of these instances, an Object Database such as ObjectStore (or your favorite open source choice of the 1000+ options) will provide better performance, without sacrificing much in the way of database management. (Standard database management tools are usually insufficient for dealing with databases containing large amounts of LOBs.)

      Basically, the choice in database and database technology should be carefully weighed against the application instead of saying "I know SQL and MySQL is 1337!"

      Some other database options include:

      DaffodilDB
      SAP DB
      FireBird DB
      Cloudscape (Soon to be open source)
      Xindice (XML Database)
      ObjectDB
      DB4O
      Prevayler

    7. Re:Apples and oranges by aled · · Score: 1

      LAMP sounds LAME to me. I prefer to have the posibility of changing one of the tiers. For example I can deploy on Windows or Solaris or whatever instead of Linux. After all there is Apache, MySql and PHP for most platforms. May be I want other web server. But it seems that PHP and MySql have some mutual atraction (API) that becomes difficult to change the database.
      Plus MySQL isn't free and Postgresql is :-)

      --

      "I think this line is mostly filler"
    8. Re:Apples and oranges by borg1238 · · Score: 2, Insightful

      Damn it, where's the "no sense of humor" mod option?

    9. Re:Apples and oranges by RogerWilco · · Score: 1

      I have a shortlist of stuff I need a database to do, and AFAIK MySQL is the best for this job, please suggest any replacements:
      - It needs to run on windows (95-XP)
      - It must be allowed to be distributed with closed source apps.
      - It should be automatically installed and running when the user
      clicks 'OK' a few times in the installer that also installs the closed source app.

      For the commercial licence fee you can do this with MySQL.
      We used Paradox and Access before MySQL, and compared to those MySQL is a blessing.

      Our clients are the equivalent of WinME laptop-carrying traveling salesmen, assuming any knowlegde is to much, but they do but the 5,000 Euro highly specialized applications we make.
      (one was running Windows95 inside VMware on WindowsXP, thinking he was running only win95, and wondering why our application could not find the serial port, I'll not explain any further, it was a mess)

      --
      RogerWilco the Adventurous Janitor
    10. Re:Apples and oranges by AKAImBatman · · Score: 1

      Based only on your list, I would have looked at an embedded database like HSQL or Berkeley DB. These would have been less intensive on your target machines, faster performing, more secure (no network connection), and easier to install. Unfortunately, I can't give a realistic recommendation without knowing more about the application.

      Your specs could even be met with databases like PostgreSQL, Firebird, Xindice, and quite a few others.

    11. Re:Apples and oranges by Gorath99 · · Score: 3, Insightful
      There is, of course, no tool that meets the needs of every possible scenario. Instead, there are tools that meet the needs of common scenarios. Most of what MySQL is being (improperly) used for could be easily replaced with PostgreSQL. Postgres offer similar performance, but takes a minor hit for data integrity. As a bonus, you gain sub-selects, stored procedures, SQL-92 compliance, and other features of a "real" SQL database.

      Actually, MySQL has been moving in that direction for quite some time now. MySQL 4.1 has subqueries and MySQL 5.0 has stored procedures.

      Of course, MySQL still doesn't quite have the functionality of Postgress, especially in the data integrity department, so there's certainly truth to your claims. However, my point is that it's not standing still. It's certainly improving, so we can't just keep pointing out its past mistakes.
    12. Re:Apples and oranges by jadavis · · Score: 2, Insightful

      I encourage you to examine PostgreSQL 8.0 upon it's release, which will most likely be November or December of this year.

      It fits your requirements perfectly without requiring the commercial license. Of course the installer bit depends also on your application, but it shouldn't be too hard to do. You may want to check out the pginstaller project here: http://pgfoundry.org/projects/pginstaller/.

      --
      Social scientists are inspired by theories; scientists are humbled by facts.
    13. Re:Apples and oranges by SpamapS · · Score: 4, Informative

      You PgSQL nazis are just being ignorant. Do you think because MyISAM isn't transactional that the whole database isn't?

      MySQL's InnoDB offers the same level of data integrity as PgSQL does, and about the same performance. Where it wins is in the flexibility to choose table-by-table whether you want InnoDB, MyISAM, or the new NDBCLUSTER. There's also the universal availability and familiarity when it comes to the mysql protocol and libraries.

      As far as performance.. you might want to look at numbers on well designed threading systems such as Solaris or Linux 2.6. Most of PostgreSQL's gains are erased by MySQL 4.1's excellent code when it is combined with those OS's proper threading.

      The only place I see PgSQL winning the performance war is the query optimizer.

      PgSQL will be doomed to a BSDesque life. Touted by too many people with no social skills, nobody will ever be able to see how good it is, because they just get shouted down when asking questions.

      --
      SpamapS -- Undernet #Linuxhelp
    14. Re:Apples and oranges by Not+The+Real+Me · · Score: 4, Insightful

      Where does MySQL fit? Storing discussion forums, news sites, caching mailing lists, storing information, etc.

      I would never use MySQL for financial transactions but for non-monetary usage it's a good general purpose database as long as your don't require complex business rules filled with lots of cascading actions.

    15. Re:Apples and oranges by AKAImBatman · · Score: 1

      However, my point is that it's not standing still. It's certainly improving, so we can't just keep pointing out its past mistakes.

      A fair enough point. When MySQL "arrives" I will be the happiest person on the block. Unfortunately, I am very worried that MySQL won't so much as fix the issues as cover them over with incomplete, incorrect, or incompatible features as they have in the past. :-(

    16. Re:Apples and oranges by DrSkwid · · Score: 3, Interesting


      I prefer to FAPP

      (FreeBSD, Apache, PostgreSQL & PHP)

      but that makes me sound a bit of a wanker

      The thing about MYSQL and speed is that "it's really quick at returning rows" isn't the only metric of speed.

      Let's see how fast you can work out how to do an INTERSECT!

      or how about a huge join with LIMIT 5, mysql (and postgres) do the whole join and then return the 5 records. With Postgres you can utilize stored procedures to do the join *after* the required records have been worked out and return the rows via a stored procedure. I got a 3 minute query down to 3 seconds like that !

      You may as well turn your LAMPs off, they are dim.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    17. Re:Apples and oranges by gmack · · Score: 1

      Easy replication provided you don't *breathe* on it.

      They seem to have fixed most of the glitches in the latest version however GET DATA FROM MASTER still doesn't quite work all of the time.. and the slave logs will all change names if you happen to change the hostname on the server.

      What kind of an idiot came up with naming the logs after the hostname anyway?

      Were now looing into CA's Ingres now that it's Open source.

    18. Re:Apples and oranges by Anonymous Coward · · Score: 0

      PostgreSQL performs better than MySQL as well. Those MySQL developers must be truly lousy.

    19. Re:Apples and oranges by Anonymous Coward · · Score: 0

      Uhm, apparently "Incredible Data Integrity" in PostgreSQL not only comes at the price of performance (almost all of it), but also random corruptions of indices. Damn that day the programmers figured "We don't need to do all the work, let's just use PostgreSQL."

    20. Re:Apples and oranges by tzanger · · Score: 2, Insightful

      You PgSQL nazis are just being ignorant. Do you think because MyISAM isn't transactional that the whole database isn't?

      I never said that; I have not seen a decent benchmark showing InnoDB having *any* kind of decent performance though. And it still does nothing to address the scalability issues of MySQL with hundreds of users, nor does it address the speed issues of MySQL in general when performing complex queries.

      MySQL's InnoDB offers the same level of data integrity as PgSQL does, and about the same performance.

      Show me. And I'm not talking "how fast can I do a thousand SELECT * FROM mytable;" -- good, real-life queries with WHERE clauses, maybe cursors and ordering.

      Most of PostgreSQL's gains are erased by MySQL 4.1's excellent code when it is combined with those OS's proper threading.

      Again, show me. Postgres' performance increases in 7.4.x. are nothing to sneeze at, and certainly have nothing to do with the threading changes between 2.4 and 2.6.

      PgSQL will be doomed to a BSDesque life. Touted by too many people with no social skills, nobody will ever be able to see how good it is, because they just get shouted down when asking questions.

      That's funny; I see something similar with the general MySQL crowd; it's like watching a bunch of pizza-faced 13 year olds drool over the latest and greatest nVidia card and trying to convince their parents to buy the $500 video card to replace the $250 they bought only 6 months ago so they can play Doom3, when all the parents want to be able to do is balance the checkbook and get some real work done.

    21. Re:Apples and oranges by ahdeoz · · Score: 1

      I've got customers very similar to your traveling salesman (my parents) and was wondering about a good way for them to install MySQL + Webserver + custom apps for them. Willing to share any ideas?

    22. Re:Apples and oranges by RevAaron · · Score: 1

      MySQL and PostreSQL are overkill for a lot of the projects people use them for. More power to them, but it seems a bit much to use MySQL to keep track of what CDs or DVDs you own.

      But screw j00 all. I just use a OODBMS and let whomever wrote the thing worry about this crap. All I want to do is store data and pull it back to work with when I need it again. Yeah, it's slower, but if I'm not writing some 1000 user web app it takes a tenth of the time to write code for and get the work done.

      Not to say that OODBMS are inherently slower or that everyone should switch to them- it's just what I use because it does what I want and need with minimal fuss.

      --

      Working toward a usable PDA environment in the spirit of Newton OS: Dynapad
    23. Re:Apples and oranges by yamla · · Score: 1

      Does CA's ingres allow for master-master replication? Do you know of any other databases (at all) that do? It's not totally transparent with MySQL, but the only change I need to make is breaking up my primary keys to multipart primaries. Not much work, but it would be MUCH nicer if auto increment fields just worked properly even when the two databases were disconnected.

      --

      Oceania has always been at war with Eastasia.
    24. Re:Apples and oranges by fitten · · Score: 1

      Do you think because MyISAM isn't transactional that the whole database isn't?

      Unfortunately, this is the default state of the system. Since most folks who jump on the MySQL bandwagon don't understand why you'd actually need data integrity or when to use it, they never both to learn more about databases and continue down their MyISAM path for everything. Just like most things, if someone doesn't understand it, they don't know what to use it for so they just keep going in the same way for all the problems they encounter. I guess ignorance truly is bliss.

    25. Re:Apples and oranges by ahdeoz · · Score: 1

      Well, HSQLDB is embedded, but only inside a java application. It persists data in a flat file that it loads on start. If you've got more than 1000 records, you might want to go with something more robust, like MS Access. With everything in memory, it can do some things faster than MySQL. It's really only meant for simple testing, or small databases like a you'd have in a POS catalog. But it is a memory hog, requires a JVM running (which is also a memory and cpu hog.) A well tuned MySQL database, even reading and writing to files a lot will often outperform HSQLDB.

      Berkeley DB isn't a "database" in the sense most people think. It's not relational, and there's no SQL (there may be a SQL driver, but it'd be totally inefficient. It's better to think of it as a big random access collection, since everything is a name-value pair. It's really designed for caching, which it excels at.

    26. Re:Apples and oranges by gmack · · Score: 1

      Were very much in the beginning stages of looking into it so I cant say for sure. However they do Advertise it

    27. Re:Apples and oranges by AKAImBatman · · Score: 2, Informative

      Close, but not quite.

      Well, HSQLDB is embedded, but only inside a java application

      True, but that's why I said I'd need to know more about his application.

      It persists data in a flat file that it loads on start.

      Not quite. HSQLDB uses an SQL script of the transactions to recreate the database. However, this script is significantly reduced in size if you use CACHED tables.

      If you've got more than 1000 records

      Again, depends on the data and the application. 1000 records is usually not a whole lot to cache in memory. And with CACHED table types, you could easily go up to millions of records without stressing the user's machine.

      you might want to go with something more robust, like MS Access.

      That's what he was trying to get away from.

      Berkeley DB isn't a "database" in the sense most people think.

      It's not an "SQL" database, but it is a database. (Technically, even a filesystem is a type of database, but we won't go there.) Again, it depends on his application. He gave no requirement that the database support SQL, so I went with the most natural assumption: It doesn't need it.

      It's not relational, and there's no SQL (there may be a SQL driver, but it'd be totally inefficient.

      That's an amusing statement, because MySQL does exactly that. :-)

    28. Re:Apples and oranges by johnnyb · · Score: 1

      "Do you think because MyISAM isn't transactional that the whole database isn't?"

      Transactionality isn't the only part of data integrity. One of the most glaring errors of MySQL is that inserting an integer value too large into a column gets TRUNCATED instead of getting an error.

      I've heard of other problems relating to foreign key handling, but do not remember them.

      "The only place I see PgSQL winning the performance war is the query optimizer."

      For complicated programs, this is the kicker, too. And the optimizer has only gotten better. You can now do UNIONs and GROUP BYs in views, and they even can get optimized based on the rest of the query.

      In addition, write-ahead logging can give dramatic speed improvements for insert/update-heavy databases. Also, PG's MVCC also adds performance to insert/update-heavy databases.

      So, you have better performance by the query optimizer for complex queries, and better performance for update/inserts through WAL and MVCC. It looks like MySQL's performance gain is minimal, and only on small applications. I wonder if those applications might be sped up if they just used more advanced queries?

      "There's also the universal availability and familiarity when it comes to the mysql protocol and libraries."

      I don't see anything special there. In Perl, it's the same no matter what DB you use. In PHP, it's the same no matter what DB you use. The only difference is that the query language under PGSQL is much more comprehensive than MySQL.

      The protocol doesn't really matter -- under Windows you can install an adapter for either, in Perl you can install an adapter for either, same w/ PHP.

    29. Re:Apples and oranges by johnnyb · · Score: 1

      Looks like PGSQL already fits your bill. For the windows installer, you'd have to contact a company like CommandPrompt.com until PG releases version 8.0. (8.0 beta 3 was recently released).

    30. Re:Apples and oranges by Ryosen · · Score: 1

      Then use WinAMP. Not everything needs to be done on Linux (heresay, I know, but, hey, I'm in a generous mood today).

      >>MySQL isn't free

      I'm sorry, did I miss something? Or did you? A license for MySql only needs to be purchased if you are developing proprietary software and are not releasing it under the GPL. That is, you are not releasing your source code with the product. This is in keeping with the GPL and the open-source foundation under which MySql was created.

      MySql has introduced a second licensing option in order to address the concerns that the GPL can create for persons that fall in this category. In this case, you are required to purchase a license. You are still giving "something for something". It's just that, rather than giving your creation back to the Open Source community from whose efforts your are directly benefiting, you are giving cash.

      This seems to me to be a very fair arrangement and is an excellent example of how OSS projects can thrive while still maintaining the basic underlying principles of OSS.

      --

      Ryosen
      One man's "Troll, +1" is another man's "Insightful, +1".
    31. Re:Apples and oranges by caluml · · Score: 1

      I got annoyed with not being able to do a WHERE item NOT IN ( SELECT * FROM stuff ) in MySQL 4.0, and spent 20 minutes s/mysql_/pg_/g the whole site, and recreating the tables, and reimporting to Postgres. (Oh, and a few mysql_error() -> pg_last_error($res) and mysql_affected_rows() ->pg_num_rows() extras too. )

      One query that took 25 seconds in MySQL took 0.8 in Postgres - with no changes at all. I had already done everything to get the query as fast as possible in MySQL and I was quite chuffed with 25 seconds - I think I got it down from about 130. MySQL seems to be faster for very simple stuff, but throw some nasty long LEFT JOINs etc and it soon bogs down.

    32. Re:Apples and oranges by Cajal · · Score: 2, Informative
      MySQL's InnoDB offers the same level of data integrity as PgSQL...
      No, it doesn't.
    33. Re:Apples and oranges by DrSkwid · · Score: 1

      lol wtf are *you* on about

      I guess this must be a typo in my kernel config

      # SMP OPTIONS:
      #
      # SMP enables building of a Symmetric MultiProcessor Kernel.
      # APIC_IO enables the use of the IO APIC for Symmetric I/O.

      options SMP # Symmetric MultiProcessor Kernel
      options APIC_IO # Symmetric (APIC) I/O

      FYI information I serve 3Million hits / 1 mill page impressions on 100k uniques per month on my dual P3 IBM Netfinity with 1Gb RAM, a bit over-specced but I thought best to get some headroom !

      That comes in at around 15 million SQL queries per month.

      --
      There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    34. Re:Apples and oranges by CoolVibe · · Score: 1

      Call me again when PostgreSQL does replication like MySQL does. MySQL may be a glorified cardfile, but some things (like master/slave replication) are stuff that I _need_ for work, and it works well.

    35. Re:Apples and oranges by killjoe · · Score: 1

      Firebird is what you are looking for. Check it out

      --
      evil is as evil does
    36. Re:Apples and oranges by harlows_monkeys · · Score: 4, Insightful
      Postgres offer similar performance, but takes a minor hit for data integrity

      I've got a few million reports that basically look like this: some header information of the form "variable=value" (about a dozen possible values of variable), and then one or more sections. Each section has a header in the same format as the report header, and from 0 to 5000 entries of the form "item:status:file", where name is one of about 5000 items, status is one of 5 status values, and file is the full path of a file containing information about the item.

      I want these in a database. So, I've got a table that represents reports, which gets the information from the header. Another table represents report sections, and has the information from the section headers, and references the report table. A third table represents the "item:status:file" lines. I've actually got a bunch more tables, because the database is in third normal form, but the above is the gist of it.

      With MySQL, I write the straightforward code to parse reports and insert into the database. It takes about a day to import all my reports, and querying them runs at decent speed.

      With PostgreSQL, it starts out about an order of magnitude slower at importing, and after four days, when it is isn't anywhere near done, it is running even slower.

      PostgreSQL does seem to query a bit faster than MySQL (but see below), but that slowness in getting the data in is a killer.

      As far as queries go, PostgreSQL seems fast, if you've done "VACUUM ANALYZE" on the database. If you don't do that, it is incredibly slow. For example, if I have a simple table, indexed on an integer field, and do a select that was "WHERE that_field = 123", it is fast, and uses the index, but if I do "WHERE that_field IN (123, 456)", it doesn't use the index, unless you analyzed the table--it scans the whole freaking table. MySQL is smart enough to actually use an index without you having to run a special command first. Geesh!

      So, here's what it comes down to, and why sooooo many people choose MySQL over PostgreSQL: PostgreSQL seems to require that you know a fair amount about database stuff to actually make it work well, even if you aren't trying to do fancy stuff. MySQL, when you are doing simple stuff, just works, and works well.

      Oh, one more thing. When I was importing my few million records into MySQL, it barely loaded down my machine. With PostgreSQL, pretty early on, the disk light came on and stayed on, and doing things on the machine became painful. If that were a shared machine, say, a web server with shared hosting, replacing the MySQL in LAMP with PostgreSQL would be a disaster.

    37. Re:Apples and oranges by harlows_monkeys · · Score: 2, Informative
      MySQL can do a damn fine job for simple INSERTs and SELECTs but throw a hundred users at it or a few WHERE or ORDER BY clauses (or all of the above) and MySQL shows its true colours: Made by Fischer-Price

      Last time I checked, Dark Age of Camelot usually has more than a few hundred users, and they are using MySQL. So much for your amusing theory.

    38. Re:Apples and oranges by wximagery · · Score: 1, Insightful


      "You can't have your cake and eat it too"

      Sure you can. It's called Oracle ... but that's a damn expensive cake.

    39. Re:Apples and oranges by drew · · Score: 1

      I know MySQL has improved dramatically since the last time I used it (early 2001), both in speed and features. However, I don't believe I will ever use it again, not because I don't like the features or the database, but because I no longer trust the developers to know what they are doing. I have read too many announcements on their website, and too many mailing list discussions that indicate that they don't really understand the theory behind what they are trying to write, why people ask for certain features, and what those features really mean.

      I'm not database nazi- over the last year or so i have used pgSQL, MS SQL Server, and Oracle for various projects and reasons. All of them are solid applications and I feel that I can trust them to do what they are supposed to do. (say what you will about Microsoft, SQL Server is a pretty solid product. Besides, they didn't write it, they lifted it from Sybase)

      I continue to follow MySQL's progress from time to time, and maybe in time I will use it again, but it willtake a long time for the MySQL developers to prve to me why I should trust them with my data.

      --
      If I don't put anything here, will anyone recognize me anymore?
    40. Re:Apples and oranges by Anonymous Coward · · Score: 0

      Um... it has several commercial replication solutions and one released ~ 6months ago (SlonyI) is fully open-sourced and very robust. They will have multi-master in ~12-18months.

      I'm sure this book is decent and all and MySQL can certainly be used in a high performance manner but those of us that dont want to use toys (with minimal features) use more powerful DB's like Oracle, PostgreSQL, MS SQL Server etc.. why do ppl have this crazy idea that most statements per second == whinnah? I can assure u that for multiple connections MySQL's scalability is a 1/3 of the top three I mentioned above.

    41. Re:Apples and oranges by boneshintai · · Score: 4, Informative

      I'm going to forgo the opportunity to mod this fascinating debate to discuss this. You mentionned that PostgreSQL was "an order of magnitude" slower on imports -- my guess is that your import tool simply connected and started running INSERTs? It's a common-ish trap for new users. If there is no transaction running, each statement is its own transaction, with all the BEGIN and COMMIT overhead that goes with it. Wrapping your import in a BEGIN; ... COMMIT; pair would probably have sped it up considerably.

      There is an option to change the transactionality of standalone commands, too, so that they implicitly start a transaction that doesn't end until you COMMIT it.

      VACUUM is something of a contentious point among PostgreSQL users, certainly. On the one hand, I can't think of an algorithmic way to automatically schedule statistic updates, and apparently tracking them in real time causes more of a performance hit than is needed. On the other, having to set up an external task (either a cron task or vacuumd) rather than being able to configure postmaster to vacuum itself on a schedule or other parameters is kind of a pain.

      PostgreSQL's default memory cache size is exceedingly small -- this is an issue with the default configuration that can lead to excessive disk reads and writes.

      In short, pgsql does require somewhat more effort to initially configure and maintain. On the other hand, for a lot of people it's very much worth that effort.

    42. Re:Apples and oranges by Anonymous Coward · · Score: 0

      You PgSQL nazis are just being ignorant.

      So wait, acknowledging that MySQL has severe faults that PostgreSQL doesn't now qualifies somebody as a nazi?

      MySQL's InnoDB offers the same level of data integrity as PgSQL does

      No it doesn't. For a start, if you ask for InnoDB and it isn't possible, you get MyISAM instead without warning. That alone means that all the data integrity that InnoDB supposedly gets you just doesn't happen. And you don't get to hear about it.

      Somebody else has mentioned large numbers being silently truncated to smaller values, and somebody else has posted a link to a list of gotchas. The problems with MySQL extend far beyond mere non-ACID compliance. It's the cavalier attitude the developers have towards data integrity that matters. Even if somebody sent them a patch to fix all the issues without any speed loss, you wouldn't be able to trust MySQL because equally poor decisions could be made before the next version is released.

      There's also the universal availability and familiarity when it comes to the mysql protocol and libraries.

      Remind me which of the two databases has license issues?

      Most of PostgreSQL's gains are erased by MySQL 4.1's excellent code when it is combined with those OS's proper threading.

      So you mean, on certain platforms, MySQL are equally fast, but PostgreSQL has far superior data integrity? And on other platforms, MySQL can't catch up?

      MySQL is like Windows - there are much better options, but MySQL has the overwhelmingly large market share, which means all the books, courses, etc, are focused on MySQL rather than other options.

    43. Re:Apples and oranges by harlows_monkeys · · Score: 3, Insightful
      I'm going to forgo the opportunity to mod this fascinating debate to discuss this. You mentionned that PostgreSQL was "an order of magnitude" slower on imports -- my guess is that your import tool simply connected and started running INSERTs? It's a common-ish trap for new users. If there is no transaction running, each statement is its own transaction, with all the BEGIN and COMMIT overhead that goes with it. Wrapping your import in a BEGIN; ... COMMIT; pair would probably have sped it up considerably.

      Actually, I did have the complete processing for each report in a begin/commit pair. That was one of the reasons I was experimenting with PostgreSQL...with MySQL there was the annoyance of having to make sure the whole report was going to be OK before inserting any of it, so I couldn't just add to each table as I processed the report, because I might find out near the end that the report is bogus and want to discard it.

      In short, pgsql does require somewhat more effort to initially configure and maintain. On the other hand, for a lot of people it's very much worth that effort

      Right...and for a lot of people, they need something the works in the common case without that effort, and so there's always going to be an important place for MySQL.

      What I'd really like to see is a good document included with PostgreSQL that says "So..you've got a big application that just works fine in MySQL, but you want to convert? Here's what you do" that covers all these things.

    44. Re:Apples and oranges by Dirtside · · Score: 2, Informative

      My company uses MySQL for most of its databases (we're still in the lengthy process of converting everything over from Oracle, but, ah, we have a handful of critical tables with over 100 million rows each). We do several thousand queries per second, non-stop, and have minimal problems with MySQL. (We have fewer problems with MySQL than we did with Oracle!)

      I'm sure Postgre is great and all, but the evidence does seem to indicate that MySQL is suited for at least some intense applications.

      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
    45. Re:Apples and oranges by aled · · Score: 1

      Good for them but I mean I get better quality from other products and it doesn't hurt that they are free.

      --

      "I think this line is mostly filler"
    46. Re:Apples and oranges by _xeno_ · · Score: 3, Informative
      You're forgetting the other great thing about MySQL: it doesn't bother reporting data errors!

      A recent example involves BLOBs and MySQL. The basic feature request is to be able to attach various binary files to other data. One of the ways to do that is with a BLOB. So I create a simple test so we can compare this approach with various other approaches.

      Code up the test, and attach a file. Pull the file back out, and try and read it back in. Doesn't work, the file is corrupted. At first I figure I'm interfacing with the database wrong, since I've never used BLOBs before. So I muck around with the API some, still no go.

      Then I realize I should check how much data is going in and how much is going out. Turns out I'm trying to attach a 70KB file, and I'm getting 64KB back. A little bit of research tells me that MySQL BLOB types only support up to 64KB. Oops. Change to a MEDIUMBLOB (I think) and then it works.

      Turns out instead of flagging it as an error to insert more data into the field than it can contain, MySQL just trunucates it. (So if you try and do something like insert "MYSQL SUCKS" into a CHAR(5) column, it'll say it successfully inserted one row. A select will then get "MYSQL" back.)

      Some page out there has a really nice list of things that MySQL will do with bad data. Besides trunucating values, it has some interesting ways of handling bad numeric values.

      --
      You are in a maze of twisty little relative jumps, all alike.
    47. Re:Apples and oranges by tzanger · · Score: 1

      So you are saying you trust your data to a system that has an unbelievable list of gotchas -- you're saying that you prefer to try and keep all of these in mind when you write apps that use MySQL, and you are saying that you're not concerned about the data integrity issues brought up in this thread?

      What company do you own, again?

    48. Re:Apples and oranges by Anonymous Coward · · Score: 0

      MySQL's InnoDB offers [..] data integrity

      You lost me right around here.

      Seriously, database theory is a "solved problem". We know what databases should do. When is somebody going write a real database FIRST and optimize for performance SECOND?

    49. Re:Apples and oranges by Anonymous Coward · · Score: 0
      LIMIT 5,...do the join *after* the required records have been worked out

      This idea is quite useless (remember ORDER BY):
      you'll get back random 5 records always.
      It's how Oracle does the trick with "rownum < 6".

    50. Re:Apples and oranges by Dirtside · · Score: 2, Informative
      you're saying that you prefer to try and keep all of these in mind when you write apps that use MySQL,
      It probably says more about you than about me that you think that that list of "gotchas" (many of which are no longer valid, or are otherwise insignificant) is somehow difficult to keep in mind. None of the professional programmers I work with have any difficulty dealing with the quirks of the MySQL database environment, especially considering that that list of gotchas is tiny compared to the list of design quirks our site code has built up over the years. Hell, it's tiny compared to the quirks in any desktop GUI environment.
      What company do you own, again?
      I work at Neopets.com. 250 million pages a day, mean of 9 queries per page, which averages out to about 26,000 queries per second. And we're having many fewer DB integrity problems since we started switching back to MySQL from Oracle. What problems we do have arise when people do things like enable code that lets users do fulltext field searches 20 times a second on tables that have 200,000 rows. Our biggest tables have over 100 million rows.

      What company do you own?

      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
    51. Re:Apples and oranges by wiwo · · Score: 1

      I constantly import databases from our (postgres) production systems to our test systems. The largest of this databases is about 1GB without BLOBS, most of the data is just numbers, so we have a lot of rows to insert. On the test system where we check out performance problems (Celeron 466, so slow, you notice every performance problem weeks before it could harm the production system) the import takes around 15 to 30 minutes. So this is far away from your 4 days.

      >> For example, if I have a simple table, indexed
      >>on an integer field, and do a select that was >>"WHERE that_field = 123", it is fast, and uses
      >>the index, but if I do "WHERE that_field IN
      >>(123, 456)", it doesn't use the index, unless
      >>you analyzed the table

      I guess you were using an old postgres version? It was a problem in the pre 7.3 versions, but since then IN statements are as fast or faster than joins.

    52. Re:Apples and oranges by Anonymous Coward · · Score: 1, Funny

      "So..you've got a big application that just works fine in MySQL, but you want to convert? Here's what you do"

      Just say "NO" ;-)

    53. Re:Apples and oranges by Anonymous Coward · · Score: 1

      He's somewhat right. Even if you have an ORDER BY, taking a large result, doing a join, and returning only some of the rows can be inefficient.

      Of course, in MySQL, you can just create a temporary (in memory) table from the large/ordered/limited table in one statement, then do another select with the join.

      In other words, not all of the query-speeding-up "tools" are available in MySQL yet, but there is frequently another way that's just as good.

    54. Re:Apples and oranges by nconway · · Score: 1
      Actually, I did have the complete processing for each report in a begin/commit pair.


      That will be cheaper than doing each INSERT in its own transaction, but it is still significantly slower than it could be. You should really be using COPY for loading lots of data into PostgreSQL, as it is significantly faster than INSERT. Other tips for improving bulk-loading performance:

      • Bulk load the data into the table first, then create indexes and foreign key constraints. Creating an index once the data has already been loaded is significantly faster than incrementally updating the index as each new row is added. (This advice applies to most RDBMSs, AFAIK.)
      • Consider temporarily disabling fsync.
      • Increase maintainence_work_mem and checkpoint_segments -- the former increases the amount of memory PostgreSQL will use for in-memory sorting (e.g. when creating B+-tree indexes), and the latter increases the number of WAL segments.
    55. Re:Apples and oranges by nconway · · Score: 1
      As far as queries go, PostgreSQL seems fast, if you've done "VACUUM ANALYZE" on the database.
      To nitpick, only ANALYZE is necessary to update the optimizer's statistics (which is the source of the performance problem you mentioned) -- VACUUM garbage-collects expired row versions, and VACUUM ANALYZE does both.
      For example, if I have a simple table, indexed on an integer field, and do a select that was "WHERE that_field = 123", it is fast, and uses the index, but if I do "WHERE that_field IN (123, 456)", it doesn't use the index, unless you analyzed the table--it scans the whole freaking table. MySQL is smart enough to actually use an index without you having to run a special command first. Geesh!
      Simply put, you need to run ANALYZE on any database before running any serious queries on it with PostgreSQL (you really ought to run ANALYZE periodically, say once a day or once a week -- it is a fairly fast operation). Doing any meaningful query optimization without statistics about the data is impossible, so the fact that PostgreSQL happened to do an index scan for one query but not the other is pure dumb luck.

      The problem with "just using an index" even if the table hasn't been ANALYZE'd is that there are plenty of situations in which using an index results in worse performance than doing a sequential scan (consider a query that fetches nearly all the rows in a table, for example: in PostgreSQL, using an index scan to process that query would mean reading in the whole index, then reading in the whole table -- and worse yet, doing both with random I/O, rather than the sequential I/O that is obviously used by a sequential scan). The situation becomes much more complex when you consider more realistic queries (involving joins, subqueries, set operations, etc.) -- here, to have even a hope of getting a good plan you need to gather some statistics about the distribution of data in the tables (which is what ANALYZE does). Gathering that information on-the-fly is possible, but it would add overhead to routine database operation.

      So I'm not disagreeing with you -- the need to run ANALYZE does impose a small burden on new users of PostgreSQL, and of course it would be great if PostgreSQL could pick a better query plan without ANALYZE being run. But actually figuring out how to do that is rather more difficult...

    56. Re:Apples and oranges by CoolVibe · · Score: 1
      "The Postgres-R project is being referred to as "a working model", because it is currently based on PostgreSQL-6.4.2 and needs to be merged with the PostgreSQL-7.2 code base along with a good deal of integration effort."

      In other words: still useless, but here soon(tm)

    57. Re:Apples and oranges by Just+Some+Guy · · Score: 1
      Some page out there has a really nice list of things that MySQL will do with bad data. Besides trunucating values, it has some interesting ways of handling bad numeric values.

      You were thinking of this page. It convinced me to avoid MySQL completely.

      --
      Dewey, what part of this looks like authorities should be involved?
    58. Re:Apples and oranges by _xeno_ · · Score: 1

      Yep - that's the page I was thinking of all right. Thanks!

      --
      You are in a maze of twisty little relative jumps, all alike.
    59. Re:Apples and oranges by Captain_Chaos · · Score: 1

      In other words: "MySQL did something I didn't expect because I didn't read the documentation"?

    60. Re:Apples and oranges by _xeno_ · · Score: 2, Informative
      Oh please. Even if I had read the documentation, I probably wouldn't have known the file was too large at first anyway. I just picked a fairly small binary file to use. It turns out it was over the MySQL BLOB size.

      The proper response when receiving bad data is to throw an error. My program should not have to check its data to make sure it's valid for MySQL. Well, that's not true, it should do some checks, but bugs still exist. Eventually, some wrong data will go through to the database server.

      The DB should never "decide what's best" and do that. It should always raise an error. Yes, the client programs should try and always send valid data. But sometimes, things get messed up, and bad data gets sent. The proper response shouldn't be to just mangle the data, it should be to raise an error.

      Example:

      mysql> create table testusers (name varchar(32), uid integer primary key);
      Query OK, 0 rows affected (0.06 sec)

      mysql> insert into testusers values ( 1, 'user');
      Query OK, 1 row affected (0.00 sec)

      mysql> insert into testusers values ( 2, '2user');
      Query OK, 1 row affected (0.00 sec)

      mysql> select * from testusers;
      +------+-----+
      | name | uid |
      +------+-----+
      | 1 | 0 |
      | 2 | 2 |
      +------+-----+
      2 rows in set (0.00 sec)
      That's just not right!
      --
      You are in a maze of twisty little relative jumps, all alike.
    61. Re:Apples and oranges by tgrigsby · · Score: 3, Insightful

      In short, pgsql does require somewhat more effort to initially configure and maintain.

      That extra effort to configure and maintain is why we went with MySQL. It's hella fast (and yes, that *is* a technical term), isn't plagued by the insecurities of MS SQL Server, and it's a breeze to install and configure -- don't even get me started on PostgreSQL's requirement that it be run from a non-Administrator account under Windows when EVERYONE with a default installation of Windows from NT 4.0 through XP is running as an Administrator.

      At the end of the day, the time and aggravation saved makes MySQL the hands-down winner over PostgreSQL, it's price/performance make it the smack-down winner over Oracle, and the security, stability, and speed make it the no-lie DB God over MS SQL Server.

      Now if it just had Booleans as a native datatype...

      --
      *** *** You're just jealous 'cause the voices talk to me... ***
    62. Re:Apples and oranges by boneshintai · · Score: 1

      Right, and as pgsql is an open-source project, if you feel strongly enough about the default configurations and the installation instructions and so on, you're free to provide changes. I don't feel strongly enough about it; I'm personally very comfortable with PostgreSQL's configrations, and leave very little of it untouched from the defaults.

      To be honest, the biggest issue with PostgreSQL that I see is the weakness of the introductory documentation. Learning how to add users, change authentication mechanisms (both how and why, and importantly why not), create databases, and so on is somewhat loosely explained; the docs have tutorials for users, which assume that the database is already set up and working, but no "cookbook"-style introduction for new server administrators.

    63. Re:Apples and oranges by mikis · · Score: 2, Informative

      Check out Gaia Online: phpBB forum with 190,595,085 articles posted, 1,042,339 registered users -- and 10,386 of them online at this very moment. MySQL db running on one dual Opteron. Here is an interview with guy that manages it.

    64. Re:Apples and oranges by tzanger · · Score: 1

      It probably says more about you than about me that you think that that list of "gotchas" (many of which are no longer valid, or are otherwise insignificant) is somehow difficult to keep in mind.

      Your DB truncating numbers, giving invalid results and allowing invalid data is not an issue for you? I dunno but that is not something I will tolerate from my DB. I expect its output to make sense and I expect it not to take artistic license with what I give to it.

      Hell, it's tiny compared to the quirks in any desktop GUI environment.

      I don't trust my data to a desktop GUI environment. I trust my data to the underlying filesystem and the DBs running on top of that filesystem. I would not accept such quirks from the filesystem or any kind of data archival/backup solution, either, but it seems that you would. You're a more tolerant person than I.

      As far as the list being invalid -- it seems that the guy maintaining it is keeping up with the changes/fixes in MySQL as they occur. Do you have specific examples of where he's exaggerating?

      250 million pages a day, mean of 9 queries per page, which averages out to about 26,000 queries per second.

      I'm calling your bluff. What hardware is this running on? You do no query caching or static pageviews? Every single one of those 250 million pages is dynamically served? Not even the mighty Slashdot does that. You have a table with a hundred million rows in it that's actively manipulated in read and write capacity and it has acceptable performance? How many queries/sec on that one and what kind of queries?

      What problems we do have arise when people do things like enable code that lets users do fulltext field searches 20 times a second on tables that have 200,000 rows.

      Well, duh. :-) I suppose you think it's acceptable to remove reverse from your transmission since some people accidentally slip into reverse on the way to park?

    65. Re:Apples and oranges by Dirtside · · Score: 1
      Your DB truncating numbers, giving invalid results and allowing invalid data is not an issue for you? I dunno but that is not something I will tolerate from my DB. I expect its output to make sense and I expect it not to take artistic license with what I give to it.
      Even if MySQL does a few things counterintuitively, it does them deterministically. We've learned those problems and moved on. "Artistic license"? Please.
      I don't trust my data to a desktop GUI environment. I trust my data to the underlying filesystem and the DBs running on top of that filesystem.
      Yes, and I'm sure your favorite underlying filesystem and the DBs you run never have any quirks or bugs.
      I would not accept such quirks from the filesystem or any kind of data archival/backup solution, either, but it seems that you would. You're a more tolerant person than I.
      Well, one man's quirk is another man's expected behavior. We're all familiar with MySQL and its quirks, which means we don't have to worry about them causing problems, because we've already accounted for them.
      I'm calling your bluff. What hardware is this running on?
      300 web servers (Mandrake 9.2) (we were overloaded with 100 servers, then we brought up a shitload more to reduce per-server load and give us room to grow), twenty MySQL servers in paired clusters that each hold distinct tables (one of them, used for holding gobs of language translation data, is quad-replicated).
      You do no query caching
      We probably do, but most of our tables are getting updated at least ten percent as often as they're getting selected from.
      or static pageviews?
      Essentially zero static pages. Every single page has PHP code in it, and all but a tiny handful of pages have queries.
      Every single one of those 250 million pages is dynamically served?
      Yep (although some of the flat content, like the Neopian Times (weekly newspaper) are served from static include files, but PHP still comes into play for things like picking random news articles to display in the sidebar). Go play with our site (neopets.com) a bit.
      Not even the mighty Slashdot does that.
      I'm not familiar with Slashdot's numbers, but I'm reasonably certain we do quite a lot more pages (and bandwidth) than Slashdot. I think, during peak times, we have an outgoing steady bandwidth of 1.5 gigabits/sec. Two colo facilities, one in Los Angeles and one in San Jose.
      You have a table with a hundred million rows in it that's actively manipulated in read and write capacity and it has acceptable performance?
      We have several of them, actually. Ten big ones with 100mrows or more. Plus a few hundred other, smaller tables ("smaller" meaning "between one and 30-odd million rows").
      How many queries/sec on that one and what kind of queries?
      Our biggest table is, naturally, the account data table, which gets selected on every page and updated pretty frequently (probably about 5% of page loads per user, and we usually have a couple hundred thousand users playing at peak).
      I suppose you think it's acceptable to remove reverse from your transmission since some people accidentally slip into reverse on the way to park?
      You know the old saying: Never let a pig drive a tractor full of fish. That saying makes about as much sense as your transmission analogy, which is to say: What the hell are you talking about?
      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
    66. Re:Apples and oranges by harlows_monkeys · · Score: 1
      That will be cheaper than doing each INSERT in its own transaction, but it is still significantly slower than it could be. You should really be using COPY for loading lots of data into PostgreSQL, as it is significantly faster than INSERT

      That's interesting. There's a problem, though. Say I've got three big tables, T1, T2, and T3. Each report I have to import leads to one entry in T1, and one or more entries in T2 that have an ID field from T1 as a foreign key, and up to a few thouand T3 entries that have an ID from T2 as a foreign key.

      It would appear that COPY would be hard to use, because an entry in T2 can't be made without knowing the ID in T1, and a T3 entry can't be made without knowing the ID from the T2 entry.

      So, two ways around that come to mind.

      First, in my program that parses reports, make it keep track of the IDs, rather than relying on the database to assign them, at least when doing a bulk import, so I can write out files with those all set. A bit inelegant, since it basically means duplicating some database functionality in the report parser.

      Second approach would be to FIRST import all my reports into MySQL, then dump it with mysqldump, write a Perl script to massage that dump into the form COPY wants, and use COPY. There's something perversely amusing about using MySQL to speed up PostgreSQL data importing.

    67. Re:Apples and oranges by ttfkam · · Score: 2, Insightful

      This would be all well and good but PostgreSQL isn't standing still either. By the time MySQL 5.0 stabilizes, PostgreSQL will have a new host of features that MySQL lacks.

      But that's fine. Not everyone needs all of the current features of PostgreSQL let alone all of the potential new ones.

      HOWEVER, MySQL needs to work on data integrity post haste. "my new purple crayon" being logically equal to 0 is not a missing feature; it's a data loss bug. No error. Not even a warning. A bug in software that tries putting a string into an integer field will have no notification that anything went wrong until after the data is retrieved later. It's things like this that keep me from liking MySQL, not its lack of features.

      Until then, when I need SQL and something small, I'll be using SQLite.

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
    68. Re:Apples and oranges by Anonymous Coward · · Score: 0

      Did it have pretty little icons under mysqladmin?
      Awwww, those windoze uzers.

  14. ./Configure by tokenhillbilly · · Score: 3, Interesting

    Actually, I would have thought that a section on how to best configure MySQL for particular performance situations would have been very helpful. I have not read the book, so it may very well have such a section, but this posting implied that configuration was assumed to be a non-advanced topic.

  15. Performance Dropoff by Anonymous Coward · · Score: 1, Interesting

    I'm currently working on a web based intranet app that could most likely contain a lot of data (but only used by a small number of people).

    I'm thinking about buying this book (although a professional book with typos worries me).. but it does me no good if running these performance enhancements mean squat once the DB gets over a gig or something. Anyone know where the performance dropoff is?

    1. Re:Performance Dropoff by Anonymous Coward · · Score: 0
      Anyone know where the performance dropoff is?


      Just after installation.

    2. Re:Performance Dropoff by Anonymous Coward · · Score: 0

      Now how did I know I'd get a poor ass response like this?

    3. Re:Performance Dropoff by jarich · · Score: 1
      I worked at a bio-tech and we used MySql to store bio data because it blew MS-Sql Server and Oracle out of the water in terms of raw performance. (No, we never benched against Postgres)...

      When I left the company, the database was ~80 gigs and had nearly a quarter billion entries.

      You may need tinker with the memory settings after you reach a certain size, but MySql can handle it.

    4. Re:Performance Dropoff by MrPink2U · · Score: 0

      What kind of backup strategy did you use? I'm curious because a database of that size would have to be down for quite some time if you have to do a text dump or a tar backup or something along those lines.

    5. Re:Performance Dropoff by jarich · · Score: 1
      It was an infrequent write scenario... we built it for customers and they (from time to time) added their data, so backups were just copying off the database when it was offline.

      However, if memory serves, (and it's been 2 years), I think that an Innnodb database format let's MySql do backups on the fly.

  16. Read only MySQL, right? by esconsult1 · · Score: 3, Insightful
    Is this high performance read only MySQL installations?

    The authors certainly could not be referring to a system that has medium to high levels of inserts or updates.

    While MySQL is great for reads, it sucks balls with inserts and updates... even with InnoDB, even with gobs of RAM and 15K SCSI hard drives in Raid 0 configurations on Quad Opteron systems.

    And what a sucky review too... Nothing about what the book addresses.

    1. Re:Read only MySQL, right? by lewiscr · · Score: 2, Funny

      You're either overindexing the tables, or you missed a few performance parameters.

      I'll admit that INSERTs/UPDATEs are slower than SELECT, but they shouldn't be that painful. My Database is approximate 50/50 read/write by byte, and 66/33 read/write by query. During peak updates, I 'load data infile' about 10 Gig of data in 2 hours. It uses about 2 of the 4 Sparc 450MHz processors. 'load data infile' is quite a bit more efficient for bulk loads that 1 million INSERT statements.

      I'll assume you have a reason for RAID 0, and have some insurance against the inevitable failure. My system uses a RAID 0 stripe, but only because I know that each "disk" is actually mirrored by the EMC behind the scenes.

    2. Re:Read only MySQL, right? by cft_128 · · Score: 1
      While MySQL is great for reads, it sucks balls with inserts and updates... even with InnoDB, even with gobs of RAM and 15K SCSI hard drives in Raid 0 configurations on Quad Opteron systems.

      The only time I've seen MySQL "suck balls" for inserts was when I was doing single row insert...select... statements. I have no idea why but those are really, really slow and doing an individual select then an insert is much faster (3x faster IIRC).

      --

      Underloved Movies and Pub Quiz: donotquestionme.org

  17. My company implemented mysql clustering and failed by Serveert · · Score: 3, Interesting

    The read only slave got corrupt and in order to create the replica apparantly you have to pretty much disable the master while it creates a new replica so it was unacceptable.

    --
    2 years and no mod points. Join reddit. Because openness is good.
  18. A deal with the devil? by ShatteredDream · · Score: 3, Funny

    In recent news MySQL has defied all industry expectations in its latest performance tests. For the first time, MySQL wrote data faster than a second grader learning to write in cursive.

    Industry insiders have hinted that an organization known as Infernal Industries has become a major investor in MySQL. Headed by one Baal Diablos II, this rising star has come to challenge Microsoft and Oracle for the title of most evil corporate entity.

    When asked for comment, Mr. Diablos replied "we have 6,000 years of investment experience. We created the modern labor market with our first employees, Adam and Eve. Today, we are becoming the biggest player in IT products thanks to our efforts to bring this formidible experience to the table."

    The new, high performance MySQL preemptively versioned at 6.66 comes with a complimentary familiar with no less than 20 years of SQL experience. When asked whether this new round of investment would cause the Roman Catholic Church to divest itself of MySQL products, the Holy See responded that it is strongly looking into PostegreSQL.

  19. Safari Rules by NardofDoom · · Score: 2, Informative
    I was just about to run out and buy this, but then I realized that my safari subscription will get me the book for only $20/month.

    Best $120/year I've ever spent, that. And it's a tax write-off.

    --
    You have two hands and one brain, so always code twice as much as you think!
    1. Re:Safari Rules by Kethinov · · Score: 0

      I can get free books for using Safari in Mac OS X? Cool!

      --
      You're right, I wouldn't steal a car. But if it were possible, I sure as hell would download one!
    2. Re:Safari Rules by Anonymous Coward · · Score: 0

      Yeah, I made my company buy me one. Had it for a couple years. Fuckers let it lapse though, and now I miss it ever so much.

    3. Re:Safari Rules by smellystudent · · Score: 1

      How do you spend $20 per month but only $120 per year?

      --
      Predictive text is shiv!
    4. Re:Safari Rules by NardofDoom · · Score: 1

      I'll tell you how. You spend 16 hours straight coding, that's how.

      --
      You have two hands and one brain, so always code twice as much as you think!
  20. Move the problem away by wowbagger · · Score: 3, Informative

    Just as in maths transforming the problem to a new domain might make it easier to solve, perhaps there is an easier way for you to solve this problem:

    Put your MySQL data onto a Logical Volume Manager volume. Use the LVM snapshot mechanism to insure a consistent view of the data during backup. Here's how:

    Create an LVM physical group with enough space to contain all your database, plus enough overcapacity to store any changes during the backup proceedure. So, let us say your database contains 10G of data, will take 10 hours to back up, and will, during that time, undergo revision to about 1G of that data - you create a PV of 11G or more.

    Next, create a logical volume big enough to store your data - in the case of the numbers above, create a 10G logical volume. In that volume create your file system, and set up your database.

    Now, when you wish to do a backup, lock out access to the database, and do a snapshot of the logical volume it is on. Then restore access to the database. This won't take very long at all.

    Mount the snapshot read-only and back it up. If you need to back up through the MySQL, bind a server onto the read-only data.

    Then release the snapshot.

    During the time the snapshot exists (in this example, the 10 hours to do the backup), the real, live read/write file system may be updated as desired, as long as the total differences between it and the snapshot do not exceed the reserve capacity of the physical group - in the case of the example numbers above that would be 1G of total differences (NOT 1G of writes - change the same 1k record a billion times and it is still a 1K change).

    That's one of the reasons the Big Boys (like Sun) pooh-poohed Linux - it did not have LVM. Now it does.

    1. Re:Move the problem away by InsaneCreator · · Score: 1

      Interesting stuff. I wasn't aware of such functionality being available under Linux. Thanks to everyone who suggested LVM.

    2. Re:Move the problem away by tzanger · · Score: 1

      Now, when you wish to do a backup, lock out access to the database, and do a snapshot of the logical volume it is on. Then restore access to the database. This won't take very long at all.

      Um no. It's not enough to just lock out access and take a snapshot; you need to lock out users, tell the DB to flush to disk and THEN snapshot.

    3. Re:Move the problem away by lewiscr · · Score: 1

      Now, when you wish to do a backup, lock out access to the database, and do a snapshot of the logical volume it is on. Then restore access to the database. This won't take very long at all.

      MySQL has a feature to do this automagically:
      FLUSH TABLES WITH READ LOCK

      Although you may need to consider the transaction running on the machine. I can run the FLUSH & LOCK just fine on the machines that do small I/O, but I have an external locking mechanism for machines that do large SELECTs or UPDATEs. For example, the machine that does a LOAD DATA INFILE on a 1M row data file will be locked until the LOAD and the FLUSH finish. Took me a while to track that one down...

      In my case, all the long running queries are backend, so I can live with the external mutex (managed by MySQL - see GET_LOCK()). If the web site had any long running queries, I'd have to setup replication and offload this process to the slave.

    4. Re:Move the problem away by cft_128 · · Score: 1
      Um no. It's not enough to just lock out access and take a snapshot; you need to lock out users, tell the DB to flush to disk and THEN snapshot.

      Not that hard, here is the procedure: Execute the following mysql command:

      FLUSH TABLES WITH READ LOCK;
      Then (while maintaining the DB connection) take your snapshot. When the snapshot is done, drop the DB connection and the read lock will be released (or issue an UNLOCK TABLES command).
      --

      Underloved Movies and Pub Quiz: donotquestionme.org

    5. Re:Move the problem away by tzanger · · Score: 1

      FLUSH TABLES WITH READ LOCK;

      Didn't know you could do that. Thanks for the tip! :-)

  21. MySQL Performance by rodrigo.avila · · Score: 4, Interesting

    If you need extreme performance in small commercial site, why you need an great and cpu-expensive Referential Integrity? In my opinion, in this case, MySQL is better. Comments?

    --
    Rodrigo
    1. Re:MySQL Performance by fitten · · Score: 1

      If you need extreme performance in small commercial site, why you need an great and cpu-expensive Referential Integrity? In my opinion, in this case, MySQL is better. Comments?


      Yeah, why would you want to link multiple purchases to a single account. Why would you want to want to make sure that the account you are associating with a purchase actually exists in your database and flag non-existance of a primary key as an error. Why would you want an error to be raised if you try to delete an account without dealing with the purchases associated with that account, thus leaving records of purchases that have no account associtated with them. Certainly you wouldn't want these things for accountability to your customers, tax purposes, or making sure the database isn't just garbage.

    2. Re:MySQL Performance by benjamin264 · · Score: 1

      A 'small commercial site' should not have performance issues with a good design. 'CPU expensive' to me is generally more than milliseconds. Forgiving that, it is better to use cycles when saving data than wasting them looking for orphaned/nonexistent data. Additionally, if you do not use RI in the database, that requires the application to do it. Which is more expensive? You can argue it either way, but ultimately, the presence of RI has multiple benefits that outweigh the costs. Either way, a lack of RI often reflects poor design. And that will cost you more than CPU in the long run.

    3. Re:MySQL Performance by ahodgson · · Score: 1

      Ah, but it's trivial to do all that in your application code. Never mind if multiple applications need to access the data, or if programmers don't necessarily like thinking about RI all the time. /sarcasm off

    4. Re:MySQL Performance by Just+Some+Guy · · Score: 3, Interesting
      Here's the deal: you have to put the "integrity layer" somewhere, unless you're OK with every other page load returning a 500 error. So, the real question is whether you want to put that layer close to the data (eg with foreign keys and other restrictions inside the database) or close to the user (eg with a lot of code in PHP or other languages not ideally suited to such things).

      For example, say that you're running a shopping cart and want to guarantee that each available item has a unique identifier. Is it more efficient to make the "itemid" field a unique key, or to fire off two queries every time you want to add an item: the first to attempt to fetch a row with the newly-generated itemid and the second to insert the new data? Even more importantly, how thread-safe is your website code? Can you prove that it's impossible for someone to insert an itemid the instant after you've tried to verify that the exact same itemid hasn't already been used?

      Again, you must and will put some sort of integrity checking into your system, unless you're really curious how well unemployment insurance pays in your city. Do you want to hand-roll your own half-assed solution, or would it be easier to say "here's a list of constraints, Mr. Database! Make sure I don't accidentally break them, would you?"

      --
      Dewey, what part of this looks like authorities should be involved?
    5. Re:MySQL Performance by poot_rootbeer · · Score: 2, Interesting


      Maybe it's just me, but I can't conceive of any relational database scenario where maintaining referential integrity WOULDN'T be a requirement.

      Garbage In, Garbage Out. Who cares how fast your queries run if the data has gotten munged?

    6. Re:MySQL Performance by sql*kitten · · Score: 3, Insightful

      If you need extreme performance in small commercial site, why you need an great and cpu-expensive Referential Integrity? In my opinion, in this case, MySQL is better.

      OK, MySQL can be used as a cache. It's like Squid and Oracle is like Apache. You can use Oracle to handle your actual transactions, integrity constraints, etc etc, then periodically dump out the data into MySQL and generate the web pages from there.

      I have Oracle databases with thousands of connected users, all doing both queries and transactions. Could you read the data quicker from MySQL? Probably. Can MySQL manage tens - sometimes hundreds - of thousands of locks at a time? Not a chance.

    7. Re:MySQL Performance by bushidocoder · · Score: 1

      On the outside chance you're a small business that really doesn't need the power of PostGre, I'd still go PG - why? Because a small business cannot afford to make the mistake of being unable to grow in a new direction technically because their existing system is sitting on a very special-case db. Small businesses die because of mistakes like needing to stop development on new features for 6 months to redo their backend so that they can end up in the same spot they started in.

      Almost all specialized products are for systems whose requirements are immutable, and only then in cases where a more generic solution is dramatically inferior in some regard. From a business case, if you have to spend an extra 3% on your hardware budget to accomodate a slightly slower db, but that db allows you to rapidly grow in a direction you haven't thought up yet, you take that case in a heartbeat.

    8. Re:MySQL Performance by Elwood+P+Dowd · · Score: 1

      If you need extreme performance in small commercial site

      Whu... if you need extreme performance, it isn't a small commercial site, now is it.

      --

      There are no trails. There are no trees out here.
    9. Re:MySQL Performance by gnuLNX · · Score: 1

      how about one that doesn't change very often...is very speed critical on selects, and can be backed up easily everytime it is changed.

      I have never once seen mysql loose data...not once.
      **disclaimer...I haven't actualy done that much with mysql other than a few molecualr databases which are used in a real time production environment...I certainly can't make statements about massive inserts, rollbacks, transactions, etc....

      I honestly believe for most small projects mysql is a good tool for the job.

      --
      what?
    10. Re:MySQL Performance by gnuLNX · · Score: 1

      Yeah but nevermind the $100,000 price tag on the "slower" db.

      For small companies this is just out of the question.

      --
      what?
    11. Re:MySQL Performance by ttfkam · · Score: 1

      It's not just you.

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
    12. Re:MySQL Performance by ttfkam · · Score: 1

      SQLite is faster, more SQL92 compliant, and simpler than MySQL. So if you're just looking for an SQL front end to a simple dataset, there are better choices.

      If you are (as I think) using MySQL has a web data backend and not updating/inserting often (meaning less than every few hours), why use a live database at all? Pregenerate static files for the site and get a massive speed improvement.

      I honestly believe for most small projects mysql is a usable tool for the job. However, it is very rarely the best tool for the job. In terms of learning good database practice, it is almost never a good tool.

      You may never have seen MySQL lose data, but that doesn't mean that MySQL will bother to tell you when a bug in the code tries to insert a string into an integer column. As far as MySQL is concerned, "my new purple crayon" is logically equal to 0. My main beef with MySQL is that you can go extended periods of time with MySQL losing data and not know it.

      Make backups often.

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
  22. Visual Tools and Algorithms by Anonymous Coward · · Score: 0

    I came across the replication chapter online, and was pleased with the author's writing style.
    Although there are now many visual tools available for MySQL (Query Browser etc) the book explains all the concepts with traditional mysql> command line commands.
    This is fine, although I am a big fan of the new visual tools. The authors mention that they will put some chapters in the future versions of this book to show some advanced concepts with these tools.
    As far as database performance goes .. I think there are many developers out there who don't have a fundamental understanding of how data structures work. As much as it was a dreaded computer science course, The Analysis of Algorithms and Data Structures has served me extremely well in my professional career. Unfortunately, (or fortunately for me) I have encountered many database architectures and data access strategies which don't take advantage of the organization of the data itself. Even though it is common sense most of the time such as O(log n) access instead of O(n^2) I think a chapter which covers these simple concepts would be in order...

  23. WhySQL by Anonymous Coward · · Score: 0

    SQL is generally overkill for a small commercial site in the first place, as is the whole EJB thing, but like Y2K, if it keeps people employed, who am I to argue.

  24. His blog by Anonymous Coward · · Score: 2, Interesting

    Jeremy has a blog at http://jeremy.zawodny.com/blog/

  25. The biggest shocker by Stone316 · · Score: 1
    is that a company like Yahoo only switched over to a database in 2000. Thats kinda scary if you ask me...

    I guess its safe to say they aren't on the bleeding edge.

    --
    "Thanks to the remote control I have the attention span of a gerbil."
    1. Re:The biggest shocker by Anonymous Coward · · Score: 0

      Yahoo was bleeding edge ... in 1996.

      You can bet they used databases for their own internal financials (and probably not MySQL).

  26. OLAP? by techsoldaten · · Score: 2, Interesting

    Okay, every time I hear about some new feature in MySQL and start making the argument the database may be finally be sufficent for the backend of an OLAP system, someone immediately responds, 'BUT THERE IS NO ROLLUP PEROGATIVE - HOW CAN YOU HAVE OLAP WITHOUT ROLLUP?' Responses tend to differ from there on, but the basic point is that few people agree MySQL can be used for data analysis.

    One argument that I have been turning to is that not that many people take the time to become great MySQL developers. It is easy to be good with MySQL, it is terribly difficult to be great with MySQL. Indexes alone are a problem for many people to grasp - the whole leftmost column rule is easily overlooked unless someone is having a problem with a specific query.

    Books like this give me hope people will begin looking deeper at MySQL as a development platform and that a wider base of developers will begin to emerge.

    Anyways, OLAP? Can this book be used to help people get there?

    M

    (P.S. I have built an almost-generic OLAP system for my company using a MySQL backend. I really want to know what other people's thoughts are.)

    1. Re:OLAP? by Anonymous Coward · · Score: 1, Informative
  27. read the book by Anonymous Coward · · Score: 0

    k .. This book is definately a good read however it is lacking in some key mysql clustering areas:

    For one thing it shows you some nice clustering scenarios however it doesn't go into any great detail as to actually use mysql 4.1's new in memory clustering capabilities... (At press time they were still beta I guess?)

  28. Why people hate MySQL by ashpool7 · · Score: 2, Informative

    (from what I've read and observed)

    Because it's not a "real" database. Sacrifices too many features to be fast.

    Because of this list: http://sql-info.de/mysql/gotchas.html

    Because it's pawned off as an Oracle or DB2 replacement when it's not. There are instances where a database isn't using all the features of DB2 and in that case, MySQL or Postgres could be used, but that does not make it a "DB2 replacement."

    Because data corruption used to be the nom de guerre until InnoDB. There are still skeptics.

    Because there are no views.

    Somewhere (yes, I know not providing it makes this argument looks stupid. I honestly couldn't find it, since I didn't know which one said it or what precisely it said) there's a good quote about one of the MySQL founders saying something like "PRIMARY KEY is the same as a UNIQUE." This lack of understanding is not appreciated by actual database administrators... See http://www.phpfreaks.com/postgresqlmanual/page/key s.html

    1. Re:Why people hate MySQL by smitty45 · · Score: 1

      Yahoo does well over a BILLION pageviews a day. I'm not so sure that the database that they are using sucks, just because a hundred slashdot trolls says that it does.

    2. Re:Why people hate MySQL by gnuLNX · · Score: 1

      No doubt. In fact you could almost turn it the other way around. if 100 slashdot trolls say x sucks...then you should probably go and by X.

      However I do agree with the original poster that it doesn't have all the featuers of DB2 or Oracle, but it does seem to do quite well for a lot of application.

      --
      what?
  29. Tripping on ACID by yintercept · · Score: 3, Insightful

    Truth is, most people want ACID (and transaction processing) because it saves them from having to think too much. You perform a whole bunch of data operations. If it doesn't work you role back. If you are just dropping things into a MySQL structure, you have to think harder about what will happen if an operation crashes mid way.

    As for the parent post, he's just bought into the illusion that database gurus think at a transcendental level. It is a common ailment. Object gurus, business gurus, religious gurus, political gurus are all treading on the same clouds. It is a common ailment that usually cures itself with time.

    1. Re:Tripping on ACID by AKAImBatman · · Score: 1

      Truth is, most people want ACID (and transaction processing) because it saves them from having to think too much.

      I seriously hope you don't approach real database work this way. ACID is NOT about "not thinking too much". It's about making it impossible for unexpected bugs to corrupt data, or leaving a transaction halfway completed.

      How would you feel if your bank:

      - Deducted the money for an savings to checking transfer, but failed to add the money to the checking account?
      - Accidently associated your account with someone else, thus denying you access to your money?
      - Failed to reschedule an auto-pay, making you delinquent on your car/credit/house payments?

      ACID is about data integrity, pure and simple.

    2. Re:Tripping on ACID by yintercept · · Score: 3, Informative

      Personally, I don't just stop with ACID, I make sure I have an audit trail from start to finish and I make a barage of tests that show that the audit trails add up. Even with ACID I never buy the illusion that there will never be a problem with the transaction.

      ACID may be about data integrity. My point is that people who have an ACID database tend to stop thinking about data integrity.

      It's about making it impossible for unexpected bugs to corrupt data

      Sorry, but I don't buy into the illusion that anyone is creating perfect programs. I have had to deal with people who thought they wrote perfect programs. My experience is that the programs written by buzz word spouting gurus is generally very poor quality.

    3. Re:Tripping on ACID by Omega1045 · · Score: 1
      Sorry, but I don't buy into the illusion that anyone is creating perfect programs. I have had to deal with people who thought they wrote perfect programs. My experience is that the programs written by buzz word spouting gurus is generally very poor quality.

      While I understand your point of view, I don't agree with it. First of all, ACID has been around for so long that I don't think you can call it a buzz word. Maybe it was a buzz word back in the early to mid 90s, but now it is standard industry terminology.

      I work with some very smart people. However, all of these people are writing different modules to a product, and communication can never be perfect especially amongst geeks. We are going to step on each other's toes. I am counting on the fact that some PhDs at Oracle figured out how to do ACID right. We would drop Oracle in a split second if MySQL could prove to do this, and do it well (plus better backups). But Oracle has done it, and done it right for a long time, and their backups are much more reliable than anything with MySQL.

      --

      Great ideas often receive violent opposition from mediocre minds. - Albert Einstein

    4. Re:Tripping on ACID by yintercept · · Score: 1
      I don't think you can call it a buzz word.

      The buzziness of words generally has more to do with context than the word itself. Object Oriented is not by necessity a buzzword. About 80% of the time its used it is a buzzword. As in "I was synergized by my object oriented coffee this morning."

      You are obviously taking data integrity to heart in your project. I think that is fantastic. The sales man who starts talking about ACID probably is just trying to get the buzz.

      I am counting on the fact that some PhDs at Oracle figured out how to do ACID right.

      It probably wasn't a PhD at Oracle that wrote the code. The code was probably hacked out people of various skill levels. The driving force behind a lot of things at Oracle was screaming customers livid about losing data. You should read up on the history of Oracle. The PhDs usually come in after the fact and "explain" things to students.

      BTW, I've had really weird problems with Oracle. Lot's of things have gone wrong that never should have happened. Most stuff works, the application is better than most. It is not perfect.

    5. Re:Tripping on ACID by Anonymous Coward · · Score: 0

      You can bet your ass than Oracle's ACID features were designed by PhDs (or people with equivilant knowledge). Maybe you should read about the product a little bit before you call features like MVCC "hacked out". Sheesh.

    6. Re:Tripping on ACID by Anonymous Coward · · Score: 0

      Perhaps you should spell/grammar check your posts. It's difficult to believe someone who says that they "role back transactions" or that they run a "barage" of tests. The words are "roll" and "barrage", respectively. Based on the general content of your posts, and esp. the use of "role back" (that's not a typo - that's just ignorance), I think you're one of those people who knows some buzz words and read some stuff online, but doesn't really know what he's talking about. You probably don't know this, but people around you do. Stop talking about computers at a higher level than you're qualified to do, at least until you've learned a little more. The people who are impressed will stil be impressed if you stop talking out of your ass, and the gurus won't think you're such an idiot. Trust me - we know when you're talking above your own level.

    7. Re:Tripping on ACID by yintercept · · Score: 1

      I have read several histories of Oracle. My belief that the product was developed by people of varying skill levels with a great deal of input from the customer base is a hell of a lot closer to reality than the bogus myth being pushed that database was derived from the aether by a bunch of stuck up, head in the clouds set theorists who were sitting in an ivory tower contemplating the fragrances of each other farts.

    8. Re:Tripping on ACID by sql*kitten · · Score: 1

      Truth is, most people want ACID (and transaction processing) because it saves them from having to think too much.

      That is not the truth at all. Transactions are needed because the world is an unpredictable place. All your "think harder" advice falls down if a workman puts his pneumatic drill through the fibre under the street midway through a big job - but a transactional database will cope with that just fine.

      you have to think harder about what will happen if an operation crashes mid way.

      Of course, the environment in your parents' basement where you run your Linux box probably doesn't suffer from that risk, but some of us do real database work in the real world.

  30. shameless plug by krjordan · · Score: 1

    With a title like High Performance MySQL, the reader should not expect assistance with tasks such as installation or the basics of SQL.

    Nope. For that the reader might require something simple, like MySQL Basics for Visual Learners, a $2 download which I wrote.

  31. MOD PARENT WAY THE CRAP UP by Anonymous Coward · · Score: 0

    "MySQL, if you can read this, your using it right now."

    DING DING!

    1. Re:MOD PARENT WAY THE CRAP UP by Anonymous Coward · · Score: 0

      "MySQL, if you can read this, your using it right now."

      DING DING, DI...


      503 Error

  32. to you maybe Re:It Sounds Pretty Basic by samjam · · Score: 3, Informative

    By the tone of your post you seem angry about this book, and scared to read it in case you learn something new and doubt what you have done in the past.

    1. How to setup replication with different architectures including master/slave, master/slave/slave (another level), replication rings, dual master replication with slaves and more.

    Simple, just read a few pages on the MySQL documentation. [mysql.com]

    Some people prefer printed book form, as some publishers are aware.

    2. RAID (software vs. hardware), RAID types, IDE vs. SCSI. I know. I'm sure you've thought of these things already but most people will find something new here because there is a lot on it. It is quite dense.

    What on earth does this have to do with MySQL specifically?

    You might benefit from learning about it if you are trying to optimise an SQL server?

    And furthermore, if you dont know about _RAID_, and need to be told the benefits of SCSI/Raid, do you really think you need MySQL clustering ?

    Well, once the reader has read these chapters they can make this judgement. You've manage to make this judgement for EVERYONE without reading the chapter for which I congratulate you.

    Any PC/Server with enough ram will push queries like a maniac. Also, check #4...

    queries... I think this book isn't just a book about getting fast queries.

    3. Selection of an OS. Seriously. It actually compares threading models on OSes compared to performance with MySQL. This is great stuff.

    Great, now that i've bought this book, i'm supposed to _CHANGE_ OS in my server farm, just to gain a few more queries/sec ?

    Steady there! Whoa! This isn't advertised as a book for people who are happy with what they have and know they did it right by instinct. This could just be a book for people who want to learn something before they decide what to do NEXT.

    4. Selection of a filesystem on an OS. Yes, nitty gritty. Includes talk about journaling vs non-journaling benefits and warnings.

    Filesystems? Oboy, anyone who knows about SQL performance knows that if you access _DISK_ during execution, you are screwed anyway.

    And whats going to happen to your data when you power down? I suppose you've been committing all your data to files on the disk? It might just be useful to know about the benefits and dangers of the various filesystems.

    Anyone who knows anything about indexing knows that btrees and rtrees are optimised for pulling data off random access media. There's better index forms if you are going to be ram based. I forget what they are because all my data is disk based.

    5. Load balancing mySQL. How to do this through a load balancer. Why and how it's different than load balancing web servers.

    This seems like a decent chapter.

    6. How to backup live using replication.

    DOH! We are talking backup here, right? Backup means archiving a copy of the database. MySQL allows you to use the magic command "cp" (or scp, or rsync, or..) Best # so far ...

    Totally wrong. If you want to back up a database that is being modified, you need to lock it and that would stop updates. This certainly needs a chapter.

    You could read a few pages on mysql.com about this instead if this book offends you. It sounds like you might need to. Some people prefer web sites to books.

    7. I love this: Detailed information on how the replication system actually works. I love understanding the inner workings because then I know how/why things went wrong.

    The updated node sends commands to the other machines to update as well. If something goes wrong, you could rollback the transaction.

    If there are transactions? I can use mysqlbinlog and convert to ascii logs, edit the logs, and replay the edited logs. If you didn't know about this, maybe you acknowledge the need for a chap

  33. Silly Thread! by Anonymous Coward · · Score: 0

    Really, SQL isn't even relational, it's just a rubber-stamp of DB2 (Yay, ANSI!). If you need relational flexibility, use a relational query language.

  34. No brainer. by DogDude · · Score: 2, Informative

    Uh, MSDE?

    --
    I don't respond to AC's.
  35. So... by Aldric · · Score: 1

    You graduated from college knowing everything about software development and database administration did you? Or just maybe did you actually have to learn some things as you went along like every other mortal on the planet because you didn't know it all?

  36. depends on the application by ashpool7 · · Score: 1

    Serving pageviews (SELECTs) is one thing, the things I listed are another. Considering Yahoo! used to use flatfiles and BDB, I'd say MySQL was a step up in organization without being overpowered.

    It's not about sucking, it's about features. MySQL has the least, so it's at the bottom of the barrel when people go shopping based on features they need.

    1. Re:depends on the application by smitty45 · · Score: 1

      so are you implying that having the most features make the best database ?

  37. You forgot sqlite by ScreamingSlave · · Score: 1

    www.sqlite.org

    1. Re:You forgot sqlite by AKAImBatman · · Score: 1

      Good catch! Thanks for reminding me. :-)

  38. no by ashpool7 · · Score: 2, Insightful

    It seems to go like this.....

    DBAs and those who actually know how to use all the features in the high-end database exploit the features in the database. PostgreSQL supports the most features of the free DBs and it's pretty fast when you use connection pooling compared to MySQL, so they go with it.

    People see this and immediately declare MySQL crap, dredge up all those reasons not to use it, and declare PostgreSQL the winnar! [sic] People love conflict....

    So, you can use whatever you want. I'm sure Yahoo! picked MySQL because of fast SELECTs and that's all that is really important to them. I pick PostgreSQL because I don't want to have to convert from MySQL to PostgreSQL because I need some feature.

    1. Re:no by smitty45 · · Score: 1

      ah, I see. I agree. It's why we use it at work, too. (a certain un-named high-volume social networking site)

  39. Start at the beginning, then proceed. by hfx_ben · · Score: 1

    "The authors acknowledge that the book isn't meant for the MySQL beginner. Even so, the book does spend a little time going over a refresher of MySQL. If the reader is looking for a book on the basics of MySQL, though ..." Speaks well for the authors ... as though they're pro (assuming that section isn't just fluff).
    I'd take that as an indication of quality.

    --
    -- When you look to see how the system works, you usually find that it doesn't.
  40. must be some good bussiness plan... by Maljin+Jolt · · Score: 1

    1. write a 294 pages book about an oxymoron
    2. profit!!!!

    --
    There you are, staring at me again.
  41. Re:Apples, oranges and SAP DB by atherton2 · · Score: 1

    Now if I remember correctly, SAP DB was the ADABAS-D code that MySQL picked up when they partnered with SAP last year. I belive they renamed it MAX DB.