High Performance MySQL
With a title like High Performance MySQL, the reader should not expect assistance with tasks such as installation or the basics of SQL. The authors acknowledge that the book isn't meant for the MySQL beginner. Even so, the book does spend a little time going over a refresher of MySQL. If the reader is looking for a book on the basics of MySQL, though, I'd recommend picking up a different title. I might be so brazen as to suggest looking at the book on MySQL that I wrote a couple years ago, but I'll leave it for the reader to find an appropriate title for their level of knowledge with MySQL.
The book is broken out into ten chapters, beginning with a chapter on the basics of MySQL as seen from someone with some level of MySQL experience. (And since I am someone with some level of MySQL knowledge, this chapter didn't seem to be too painful of a read.) That statement begs for a further explanation. I find it painful to read books that claim to have advanced topics but find it necessary to spend (or waste) time by giving yet another explanation of how to type ./configure; make; make install in the beginning chapter. In this regard, I find High Performance MySQL to be quite good at providing the appropriate level of explanation for the audience.
The first pages alone are a great starting point for those who are wondering about the performance as it relates to the different versions of MySQL. A discussion of the binary versions of MySQL versus the source code versions gives the reader a good understanding of the implications of this important decision. Choosing correctly between the versions of MySQL can help to maximize performance.
Though only the first chapter is titled "Back to Basics," most chapters include some basic or background information pertinent to that chapter. For example, both the query and index performance-related chapters include sections with the word 'Basic' in their titles. These two chapters, "Index Performance" and "Query Performance," cover topics often misunderstood when considering performance on a database. Naturally, any database administrator worth their pay (insert your own joke here) will tell you that an index can help to improve performance. But High Performance MySQL takes it a step further by diving into why you might use a particular index given your database needs. This type of practical information is common throughout the book. Rather than just give the reader a listing of the indexes (which are mostly common among RDBMS anyway), the authors choose to tell you why you might choose one method over another.
Other chapters within High Performance MySQL include chapters on server performance tuning, replication, load balancing, and high-availability, backup and recovery, and security. I've followed MySQL forums and mailing lists over the years and some of the more common questions relate to MySQL security. The chapter on security gives the reader a great explanation of the MySQL GRANT tables and includes a sidebar on the significance of 'localhost' in MySQL which nearly always seems to be a source of confusion, though maybe not for the reader of this particular book. Like other chapters, the security chapter includes tips for the underlying environment upon which MySQL operates. For example, regardless of how good the MySQL database permissions are configured, if another piece of software on the server isn't kept up to date then the entire server, including the MySQL data, could be compromised.
Looking back at the book, I might have laid the chapters out in a different order or provided more structure through sections of the book. However, this can be dismissed as personal opinion and is non-substantive anyway. I believe High Performance MySQL is an excellent choice for the MySQL administrator or developer not only looking to improve their database performance but also to improve his knowledge of MySQL.
You can purchase High Performance MySQL from bn.com. (And the book that Suehring alludes to is the MySQL Bible . Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
You can keep "High Performance MySQL". I'm holding out for "Incredible Data Integrity Management with MySQL". :-D
Javascript + Nintendo DSi = DSiCade
High Performance MySQL, published by Oxymoron Press, Inc.
Except maybe for throwaway blog sites. If my blog site crashes, so what?
I'm going all out here.
MySQL is better than PostgreSQL
Redhat is better than Debian
Windows XP is better than MAC OS X
Bush is better than Kerry
Conservatives are beter than Liberals
Americans are better than Europeans
etc. etc. etc.
I've just signed legislation that'll outlaw Russia forever. We'll begin bombing in five minutes.
Indeed, PostgreSQL is awesome.
If guns kill people, then CmdrTaco's keyboard misspells words.
Sometimes MySQL indexes are bad. For example, if you infrequently do lookups based on an index but you often add or remove many records from a database, an MySQL index can actually hurt performance.
I hope there's something about that in the book...but then again, this is 2004. Who still reads (non-electronic) books about technical topics?
The important thing is, does it tell you how to set up and manage highly distributed system managing gigabytes of data under high transaction volumes?
If it doesn't, and just discusses the general issues to consider under such conditions, it is worthless for those that are already professionals. Professionals know how to consider these things, they just want concrete examples so they don't have to spend months of reading and tinkering to figure it out on their own.
Unfortunately, reading the review above doesn't even let me know whether or not the book has anything in it at a level that I might want. However, if I was writing the review I'd probably order the paragraphs differently and use some different words. That might just be personal preference.... :p
...before blaming MySQL! Ha ha!
Er, anyhow, as an apology, here's an open source SQL analyzer.
The Army reading list
This review didn't really tell me anything that I couldn't have gleened by simply reading the Table of Contents. I get the impression the author either didn't actually read the book or he was too lazy to write more detail.
Having said that, I'm intrigued by the title and will probably investigate the book anyway simply because it's a topic that directly pertains to some upcoming projects on my calendar.
I see one of the topics covered is backing up data. Would somebody care to explain how to make a consistent backup of a mysql database?
Plain mysqldump does not export consistent data across tables. While you could lock all tables for the duration of backup, that's not likely going to be possible if you're aiming for "high performance". Mysqlhotcopy only works for myisam tables and simply tar-ing up data files is just asking for a disaster to happen. So how do you make a consistent backup without setting up a dedicated replication slave for this purpose?
(No, this is not a troll; I am in fact looking for a good way to back up mysql data on my server)
In what type of application do they use it? I would not imagine as a primary database, do they?
The price is always right if someone else is paying.
...and was surprised at a two things:
1) The number of typos in the book. Didn't anyone proofread it before it went to print?
2) How many features MySQL lacks. Don't get me wrong, I like it as its fast and easy to work with for web stuff, but I was surprised at how often I read "MySQL doesn't do x yet, but will in version y".
The chapter on replication was worth the cost of the book for me, but I suggest people considering buying the book should wait until the 2nd edition.
High Performance and Incredible Data Integrity? You are comparing apples and oranges. Make up your mind. It's either "High Performance with MySQL" or "Incredible Data Integrity with PostgreSQL." You can't have your cake and eat it too. If you need real ACID (Atomicity, Consistency, Isolation, and Durability) then by all means have it, but realize that it isn't free. Please read about the relational model and ACID. Read about the set theory and predicate calculus, about tuple calculus and relational algebra. This is complicated stuff which it is not cheap algorithmically. Most of people don't need ACID and should not be penalized by the algorithms that implement it. Those who need it will have to pay with performance, because there is only this much you can do in one cycle. A pure ACID relational model is not for everyone, some people will only need a persistent object store instead of a relational database. The most important thing to realize is that databases are not magic. They implement complicated algorithms in code that have to be run by the CPU and by definition cannot be a one-size-fits-all solution. So please don't oversimplify it by saying "I want everything with no paying for anything, and also a pony." Use the right tool for the right job.
Sincerely,
Pan Tarhei Hosé, PhD.
"Homo sum et cogito ergo odi profanum vulgus et libido."
Actually, I would have thought that a section on how to best configure MySQL for particular performance situations would have been very helpful. I have not read the book, so it may very well have such a section, but this posting implied that configuration was assumed to be a non-advanced topic.
I'm currently working on a web based intranet app that could most likely contain a lot of data (but only used by a small number of people).
I'm thinking about buying this book (although a professional book with typos worries me).. but it does me no good if running these performance enhancements mean squat once the DB gets over a gig or something. Anyone know where the performance dropoff is?
The authors certainly could not be referring to a system that has medium to high levels of inserts or updates.
While MySQL is great for reads, it sucks balls with inserts and updates... even with InnoDB, even with gobs of RAM and 15K SCSI hard drives in Raid 0 configurations on Quad Opteron systems.
And what a sucky review too... Nothing about what the book addresses.
Newsfollow.com
The read only slave got corrupt and in order to create the replica apparantly you have to pretty much disable the master while it creates a new replica so it was unacceptable.
2 years and no mod points. Join reddit. Because openness is good.
In recent news MySQL has defied all industry expectations in its latest performance tests. For the first time, MySQL wrote data faster than a second grader learning to write in cursive.
Industry insiders have hinted that an organization known as Infernal Industries has become a major investor in MySQL. Headed by one Baal Diablos II, this rising star has come to challenge Microsoft and Oracle for the title of most evil corporate entity.
When asked for comment, Mr. Diablos replied "we have 6,000 years of investment experience. We created the modern labor market with our first employees, Adam and Eve. Today, we are becoming the biggest player in IT products thanks to our efforts to bring this formidible experience to the table."
The new, high performance MySQL preemptively versioned at 6.66 comes with a complimentary familiar with no less than 20 years of SQL experience. When asked whether this new round of investment would cause the Roman Catholic Church to divest itself of MySQL products, the Holy See responded that it is strongly looking into PostegreSQL.
Click here or a puppy gets stomped!
Best $120/year I've ever spent, that. And it's a tax write-off.
You have two hands and one brain, so always code twice as much as you think!
Just as in maths transforming the problem to a new domain might make it easier to solve, perhaps there is an easier way for you to solve this problem:
Put your MySQL data onto a Logical Volume Manager volume. Use the LVM snapshot mechanism to insure a consistent view of the data during backup. Here's how:
Create an LVM physical group with enough space to contain all your database, plus enough overcapacity to store any changes during the backup proceedure. So, let us say your database contains 10G of data, will take 10 hours to back up, and will, during that time, undergo revision to about 1G of that data - you create a PV of 11G or more.
Next, create a logical volume big enough to store your data - in the case of the numbers above, create a 10G logical volume. In that volume create your file system, and set up your database.
Now, when you wish to do a backup, lock out access to the database, and do a snapshot of the logical volume it is on. Then restore access to the database. This won't take very long at all.
Mount the snapshot read-only and back it up. If you need to back up through the MySQL, bind a server onto the read-only data.
Then release the snapshot.
During the time the snapshot exists (in this example, the 10 hours to do the backup), the real, live read/write file system may be updated as desired, as long as the total differences between it and the snapshot do not exceed the reserve capacity of the physical group - in the case of the example numbers above that would be 1G of total differences (NOT 1G of writes - change the same 1k record a billion times and it is still a 1K change).
That's one of the reasons the Big Boys (like Sun) pooh-poohed Linux - it did not have LVM. Now it does.
www.eFax.com are spammers
If you need extreme performance in small commercial site, why you need an great and cpu-expensive Referential Integrity? In my opinion, in this case, MySQL is better. Comments?
Rodrigo
I came across the replication chapter online, and was pleased with the author's writing style. .. I think there are many developers out there who don't have a fundamental understanding of how data structures work. As much as it was a dreaded computer science course, The Analysis of Algorithms and Data Structures has served me extremely well in my professional career. Unfortunately, (or fortunately for me) I have encountered many database architectures and data access strategies which don't take advantage of the organization of the data itself. Even though it is common sense most of the time such as O(log n) access instead of O(n^2) I think a chapter which covers these simple concepts would be in order...
Although there are now many visual tools available for MySQL (Query Browser etc) the book explains all the concepts with traditional mysql> command line commands.
This is fine, although I am a big fan of the new visual tools. The authors mention that they will put some chapters in the future versions of this book to show some advanced concepts with these tools.
As far as database performance goes
SQL is generally overkill for a small commercial site in the first place, as is the whole EJB thing, but like Y2K, if it keeps people employed, who am I to argue.
Jeremy has a blog at http://jeremy.zawodny.com/blog/
I guess its safe to say they aren't on the bleeding edge.
"Thanks to the remote control I have the attention span of a gerbil."
Okay, every time I hear about some new feature in MySQL and start making the argument the database may be finally be sufficent for the backend of an OLAP system, someone immediately responds, 'BUT THERE IS NO ROLLUP PEROGATIVE - HOW CAN YOU HAVE OLAP WITHOUT ROLLUP?' Responses tend to differ from there on, but the basic point is that few people agree MySQL can be used for data analysis.
One argument that I have been turning to is that not that many people take the time to become great MySQL developers. It is easy to be good with MySQL, it is terribly difficult to be great with MySQL. Indexes alone are a problem for many people to grasp - the whole leftmost column rule is easily overlooked unless someone is having a problem with a specific query.
Books like this give me hope people will begin looking deeper at MySQL as a development platform and that a wider base of developers will begin to emerge.
Anyways, OLAP? Can this book be used to help people get there?
M
(P.S. I have built an almost-generic OLAP system for my company using a MySQL backend. I really want to know what other people's thoughts are.)
k .. This book is definately a good read however it is lacking in some key mysql clustering areas:
For one thing it shows you some nice clustering scenarios however it doesn't go into any great detail as to actually use mysql 4.1's new in memory clustering capabilities... (At press time they were still beta I guess?)
(from what I've read and observed)
y s.html
Because it's not a "real" database. Sacrifices too many features to be fast.
Because of this list: http://sql-info.de/mysql/gotchas.html
Because it's pawned off as an Oracle or DB2 replacement when it's not. There are instances where a database isn't using all the features of DB2 and in that case, MySQL or Postgres could be used, but that does not make it a "DB2 replacement."
Because data corruption used to be the nom de guerre until InnoDB. There are still skeptics.
Because there are no views.
Somewhere (yes, I know not providing it makes this argument looks stupid. I honestly couldn't find it, since I didn't know which one said it or what precisely it said) there's a good quote about one of the MySQL founders saying something like "PRIMARY KEY is the same as a UNIQUE." This lack of understanding is not appreciated by actual database administrators... See http://www.phpfreaks.com/postgresqlmanual/page/ke
Truth is, most people want ACID (and transaction processing) because it saves them from having to think too much. You perform a whole bunch of data operations. If it doesn't work you role back. If you are just dropping things into a MySQL structure, you have to think harder about what will happen if an operation crashes mid way.
As for the parent post, he's just bought into the illusion that database gurus think at a transcendental level. It is a common ailment. Object gurus, business gurus, religious gurus, political gurus are all treading on the same clouds. It is a common ailment that usually cures itself with time.
With a title like High Performance MySQL, the reader should not expect assistance with tasks such as installation or the basics of SQL.
Nope. For that the reader might require something simple, like MySQL Basics for Visual Learners, a $2 download which I wrote.
"MySQL, if you can read this, your using it right now."
DING DING!
By the tone of your post you seem angry about this book, and scared to read it in case you learn something new and doubt what you have done in the past.
...
1. How to setup replication with different architectures including master/slave, master/slave/slave (another level), replication rings, dual master replication with slaves and more.
Simple, just read a few pages on the MySQL documentation. [mysql.com]
Some people prefer printed book form, as some publishers are aware.
2. RAID (software vs. hardware), RAID types, IDE vs. SCSI. I know. I'm sure you've thought of these things already but most people will find something new here because there is a lot on it. It is quite dense.
What on earth does this have to do with MySQL specifically?
You might benefit from learning about it if you are trying to optimise an SQL server?
And furthermore, if you dont know about _RAID_, and need to be told the benefits of SCSI/Raid, do you really think you need MySQL clustering ?
Well, once the reader has read these chapters they can make this judgement. You've manage to make this judgement for EVERYONE without reading the chapter for which I congratulate you.
Any PC/Server with enough ram will push queries like a maniac. Also, check #4...
queries... I think this book isn't just a book about getting fast queries.
3. Selection of an OS. Seriously. It actually compares threading models on OSes compared to performance with MySQL. This is great stuff.
Great, now that i've bought this book, i'm supposed to _CHANGE_ OS in my server farm, just to gain a few more queries/sec ?
Steady there! Whoa! This isn't advertised as a book for people who are happy with what they have and know they did it right by instinct. This could just be a book for people who want to learn something before they decide what to do NEXT.
4. Selection of a filesystem on an OS. Yes, nitty gritty. Includes talk about journaling vs non-journaling benefits and warnings.
Filesystems? Oboy, anyone who knows about SQL performance knows that if you access _DISK_ during execution, you are screwed anyway.
And whats going to happen to your data when you power down? I suppose you've been committing all your data to files on the disk? It might just be useful to know about the benefits and dangers of the various filesystems.
Anyone who knows anything about indexing knows that btrees and rtrees are optimised for pulling data off random access media. There's better index forms if you are going to be ram based. I forget what they are because all my data is disk based.
5. Load balancing mySQL. How to do this through a load balancer. Why and how it's different than load balancing web servers.
This seems like a decent chapter.
6. How to backup live using replication.
DOH! We are talking backup here, right? Backup means archiving a copy of the database. MySQL allows you to use the magic command "cp" (or scp, or rsync, or..) Best # so far
Totally wrong. If you want to back up a database that is being modified, you need to lock it and that would stop updates. This certainly needs a chapter.
You could read a few pages on mysql.com about this instead if this book offends you. It sounds like you might need to. Some people prefer web sites to books.
7. I love this: Detailed information on how the replication system actually works. I love understanding the inner workings because then I know how/why things went wrong.
The updated node sends commands to the other machines to update as well. If something goes wrong, you could rollback the transaction.
If there are transactions? I can use mysqlbinlog and convert to ascii logs, edit the logs, and replay the edited logs. If you didn't know about this, maybe you acknowledge the need for a chap
blog.sam.liddicott.com
Really, SQL isn't even relational, it's just a rubber-stamp of DB2 (Yay, ANSI!). If you need relational flexibility, use a relational query language.
Uh, MSDE?
I don't respond to AC's.
You graduated from college knowing everything about software development and database administration did you? Or just maybe did you actually have to learn some things as you went along like every other mortal on the planet because you didn't know it all?
Serving pageviews (SELECTs) is one thing, the things I listed are another. Considering Yahoo! used to use flatfiles and BDB, I'd say MySQL was a step up in organization without being overpowered.
It's not about sucking, it's about features. MySQL has the least, so it's at the bottom of the barrel when people go shopping based on features they need.
www.sqlite.org
It seems to go like this.....
DBAs and those who actually know how to use all the features in the high-end database exploit the features in the database. PostgreSQL supports the most features of the free DBs and it's pretty fast when you use connection pooling compared to MySQL, so they go with it.
People see this and immediately declare MySQL crap, dredge up all those reasons not to use it, and declare PostgreSQL the winnar! [sic] People love conflict....
So, you can use whatever you want. I'm sure Yahoo! picked MySQL because of fast SELECTs and that's all that is really important to them. I pick PostgreSQL because I don't want to have to convert from MySQL to PostgreSQL because I need some feature.
"The authors acknowledge that the book isn't meant for the MySQL beginner. Even so, the book does spend a little time going over a refresher of MySQL. If the reader is looking for a book on the basics of MySQL, though ..."
Speaks well for the authors ... as though they're pro (assuming that section isn't just fluff).
I'd take that as an indication of quality.
-- When you look to see how the system works, you usually find that it doesn't.
1. write a 294 pages book about an oxymoron
2. profit!!!!
There you are, staring at me again.
Now if I remember correctly, SAP DB was the ADABAS-D code that MySQL picked up when they partnered with SAP last year. I belive they renamed it MAX DB.