MySQL Database Design and Optimization
This book focuses on MySQL 4.0/4.1 but also gives consideration to v.3.23 users as well as a nod toward v.5. The layout of each chapter gives a description of the topic of the chapter, followed by the meat of the chapter, a summary and what's next (how the context of this chapter ties into the subject of the next). There are numerous "notes", cautionary flags, tips, screen shots, code examples as well as thoughts from each author that provide explanatory asides to the content. The authors also provide references to other volumes, as needed.
A glance through the table of contents will give the reader a precise overview of what to expect in this book: Review of MySQL Basics; MySQL Column and Table Types; Keys, Indexes and Normalization; Optimizing Queries With Operators, Branching and Functions; Joins, Temporary Tables and Transactions; Finding the Bottlenecks, MySQL Programming; and Looking Ahead.
Chapter 1: Review of MySQL Basics gives a very quick (under 50 pages) summary of how to connect to the MySQL server; MySQL's identifiers and naming conventions for databases, tables and columns; a review of MySQL's syntax, writing basic queries and using basic commands (create, drop, select, insert, update, delete); and a discussion of the use of table, column and expression aliases. This section, while adequate, is clearly intended as an analysis of core information necessary to proceed to further chapters.
Chapter 2 follows with MySQL Column and Table Types, which deal with datatypes and structures used to store the data. The goal here is to help the reader design effective tables (and therefore create a well-designed and efficient database) suited to the particular type of data at hand. Numeric types are covered in depth; strings, the null value, ENUM and SET are also addressed as well as common "gotchas" and developer errors.
Keys, Indexes and Normalization come naturally in Chapter 3, with optimal data handling the goal: the chapter addresses getting data in efficiently and getting the results out efficiently, eliminating redundant data, appropriate uses of indexes and common index creation errors.
The core of the book is clearly Chapter 4, "Optimizing Queries with Operators, Branching, and Functions." Here, optimization skills are honed; manipulation and filtering of data is one of MySQL's strengths and this chapter shows the reader how to replace less-than-ideal program logic with SQL constructs to precisely adjust query performance. There's a good demonstration here of outputting a list of member data to a web page. The ultimate goal in this chapter is to provide the reader good skills that translate into better efficiency and faster database interaction. As the authors point out, one obvious logical consequence of this is easier migration between platforms and programming languages.
The next reasonable step is to look at additional features that MySQL has up its sleeve that will save the developer time and effort in the overall scheme of application development. Chapter 5, "Joins, Temporary Tables, and Transactions" discusses three of these additional features. The authors carefully point out that each of these eliminate excess queries needed to pull data, decrease code overhead, minimize the need to store data as application logic, decrease the number of bugs that appear in code and help guarantee data integrity (an aspect of database design that unfortunately often takes a back seat to other priorities as developers are often not concerned with the validity of data in a real world sense; i.e. from the user's perspective).
Chapter 6, "Finding the Bottlenecks," addresses modifying system configuration variables outside of the default and how these can dramatically affect performance. The authors look at some available free tools that help monitor server performance and enable configuration changes including mytop, WinMySqlAdmin, phpMyAdmin and the new MySQL Administrator (available from MySQL AB). MySQL caching capabilities and the ability to decrease repetitious read/writes to disk (good table, key and query caching within MySQL) are discussed. Finally, database interoperability and abstraction layers are mentioned in terms of performance penalties vs. making life easier for the programmer.
MySQL Programming is the topic of Chapter 7, where a very good discussion of the MySQL API is provided. There are a lot of useful examples in this chapter covering many of the common MySQL APIs available (PHP's MySQL and MySQLi, Pythons's MySQLdb, ODBC, Perl's DBI), along with feature discussions and examples.
The final chapter, "Looking Ahead," examines MySQL v.4.1, 5.0 and 5.1 and some eagerly awaited new features, including stored procedures, stored functions, views and triggers.
This is a well-rounded volume on MySQL design. There are excellent examples and the flow of the text is conversational without being rambling and unstructured. The authors have obviously taken great pains to minimize tangents and extraneous information; pithy, but with sufficient detail in mind. The reader is left with neither the sense of being overwhelmed nor longing for an explanation for a glossed-over topic. This book is pretty much a "must have" for a MySQL programmer looking to bridge the gap between novice and professional.
You can purchase Beginning MySQL Database Design and Optimization from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
Wow! So I guess other databases don't get Slashdotted? That's impressive! (note sarcasm).
MySQL is popular because its easy
You say that like it's a bad thing... is something better if it's difficult to use?
And as an aside, I worked at a biotech. We had an 80 gig database with a quarter billion entries. We were running platform neutral SQL and used MySql, Oracle and MS-Sql server.
We ditched the "commercial grade" databases because they were to slow! MySql did a great job under enormous load.
Agile Artisans
Need a subquery
Not everyone does
FULL JOIN
Again, not everyone does
Check constraints
Still don't, or the application can
trigger logic
You know what I'm seeing a pattern here. I guess it seems that while MySQL doesn't do everything you want, it does enough of what other people want so it would seem that at least to other people, MySQL does not suck.
As far as not throwing errors, we've either used different versions, or your talking about something I haven't come across, since I've seen MySQL throw plenty of errors.
"I use a Mac because I'm just better than you are."
Don't you want to start with just database design and SQL before you would want to move to a book about a specific RDBC implementation? If it is just about database design then the title of the book is wrong.
Then again, if you wish to explain about setting up the database itself, access rights and so on, then the book might be for beginners. Once again, the title would not fit the book.
As anyone should know, the steps in software development are: get it working, get it right, get it optimized. Let's hope that the book does not go to deep into the optimized part in a too early stage.
1) I don't think you're required to buy a MySQL licence unless you need other licence requirements or support from the company.
2) You have any proof that Postgres is faster than mysql? Everything I've seen indicates that mysql is still *slightly* faster than postgres. Most of the systems I've worked with I haven't seen much of a difference. Personally I only use Postgresql for everything, but on other systems mysql seems to be about the same.
It's unfortunate that your comment got modded up, since it makes a claim that isn't true.
I use MySQL in maybe 60% of my projects, and I have been bitten by a majority of those issues. MySQL has some serious problems as an SQL-compliant database (let alone as a *relational* database), and that site simply points them out in an objective manner. In fact he says they aren't bugs or mistakes, but *documented, yet unexpected, behavior*.
I would recommend anyone who is using MySQL (by choice or otherwise) to read and understand that article (and keep in mind some of those problems have been fixed, which he points out).
Also, as a general comment about the closing sentence of your post, have you considered that some pieces of software might be better than others? Or one piece of software might adhere more closely to a standard than another? These are all topics worth discussion.
I have started a company and am in the process of putting my precious mission critical data in a MySQL database. RAID 1 is used to prevent loss of data, and apart from that DVDs will be burnt using the superdrive.
My database app is low volume data traffic to/from the db.
Can anyone tell me where MySQL fails when it comes to mission critical stuff? I'm very eager to find out.
Bert
I'm not so certain. A lot of folks think that MySQL is a good idea; it seems to me that it is in almost every case a mistake, and so posting such things helps ensure that this is well-known. Much like linking to GNU/Linux resources when an article concerns Windows.
FWIW, I've used both PostgreSQL and MySQL.
First of all, I like and use MySQL. I have it running on two public web sites and I use at home for my intranet. It is fast, easy to use, and works well for my purposes. It may not be perfect, but I think it is good. Anyway, on with the reply:
Constraints, triggers, and stored procedures are important to have in a database. While I can make up for MySQL's shortcomings in application code, this is not a good idea. I should not have to validate data in the application. If I could put that code in the database, it will be more robust. Not only would that provide a single point to validate from the application, it would also provide validation for scripts and console access.
Full joins and subqueries help, but in a different way. I should not have to execute two queries to get one result set (minus intermediate data). Queries should be atomic -- one piece of code that executes together. For example, let's say I have two queries that could be combined with a sub select. In between the queries running, something changes the data, invalidating my intermediate values and second query. While the underlying data is fine, it appears to the user that there is data corruption. The second query could return incorrect data or just error out for no valid reason. Sure, most nested selects are not necessary, but sometimes they are. Full joins are not usually necessary, but in about 0.0001% of the queries they are.
Maybe for the typical blog site these features are not all that critical, but for enterprise-class databases they are. The database application I use at work is bad enough in Oracle (poor DB design), it would fall apart at the seams in MySQL. Constraints, triggers, etc. help keep it working until we can reengineer it.
24 beers in a case, 24 hours in a day. Coincidence? I think not!
> > Need a subquery
> Not everyone does
but when you do, you are Sh^Hadly Out of Luck.
> > FULL JOIN
> Again, not everyone does
see above.
> > Check constraints
> Still don't, or the application can
I hope you'll pardon my putting this so bluntly, but this is an absolutely classical n00b mistake. In my experience, a useful data store doesn't have "the" application or "the" interface. It has several to start with and eventually grows many of them. That is why the data store itself has to maintain any data integrity/business rules inviolably. The alternative is to keep n (for some large n) code bases perfectly in synch and functioning exactly the same way.
> > trigger logic
> You know what I'm seeing a pattern here. I guess
> it seems that while MySQL doesn't do everything
> you want, it does enough of what other people
> want so it would seem that at least to other
> people, MySQL does not suck.
It doesn't do enough of what people need when they're doing things that involve money, for example. There just isn't any way to use MySQL for an application that requires Generally Accepted Accounting Practices (GAAP) short of building a true relational database as middleware.
> As far as not throwing errors, we've either
> used different versions, or your talking about
> something I haven't come across, since I've
> seen MySQL throw plenty of errors.
See the "gotchas" page, so often quoted here, re: the failure to throw errors when it's supposed to.
What part of "A well regulated militia" do you not understand?
PostgreSQL doesn't have stored procedures, yet anyway. When you talk about stored procedures in SQL, you aren't talking about a function that returns something (which postgres users have been able to define forever.) You're talking about a procedure in the Pascal sense: it performs a sequence of operations on the database, without having any result to return.
That said, its easily faked by having a function in postgres where everyone ignores the return value.
Hi, I'm one of the authors of the book.
:)
1. Thanks to Mary for the positive review.
2. Thanks to Mike Hillyer for his invaluable help with the book. Say what you like about Visual Basic (I happen to loathe it, myself), Mike's an excellent programmer, and his knowledge of MySQL is superb. In fact, part of the way through the process of writing this book, he was hired by MySQL AB to work with the teams developing the Connectors and the new GUI tools. His site VBMySQL.com provides a valuable and unique resource for VB and other Windows developers wanting to build DB applications who'd like to use an actual database instead of Access and don't feel like condemning themselves or their users to paying for SQL Server. Rather than flame him for his language and platform choices, you should commend him for introducing many Windows programmers to an Open Source technology. (BTW, you might be interested in knowing that he also uses Linux and programs in C++ as well.) It was a privilege to have him work on the book with us, and it's a privilege to work with him now at MySQL AB. And he's a damn good writer.
3. We wrote the book because there's a lot of MySQL installations out there, and a lot of very badly done MySQL databases. Granted, there are some things that MySQl isn't (yet). But it is fast and stable -- or can be. And it's certainly possible to throw those advantages away through poor DB and application design by people who don't know the difference between a database and a spreadsheet or who don't know how to leverage SQL to do their heavy lifting for them. We chose not to spend a great deal of time with enforcing foreign keys because a great many administrators are still running MySQL 3.23 and don't bother to make InnoDB available. Besides, if you expect people to understand key constraints, you have to get them to normalise first, and many devs don't even do that.
4. We wanted to encourage PHP developers to make the transition to ext/mysqli as soon as possible.
5. I don't know what other people may have experienced with Apress, but they've been damned nice to me, and I can tell you that Gary Cornell does answer his email, even when it comes from a lowly writer who's not yet even signed a contract. Speaking of which -- their contracts are much better than Wrox' or Wiley's. And since I've been associated with them, they've dumped at least one bad editor and another one that I'd heard some not-so-favourable things about.
6. While we didn't cover this in the book, fans of Postgres might wish to take note: We already have a working Cluster implementation, and we're anxious to see what yours will look like.
Il n'y a pas de Planet B.
Coding without subqueries teaches you a lot about SQL.
Seriously, before reading stuff about how to get around not having subqueries, I was writing much less efficient SQL code.
Now, I rarely ever need subqueries even though they're available -- I've learned to optimize many of them into joins, or pre-query the information I want seperately since I'll usually reuse it several times elsewhere.
- Michael T. Babcock (Yes, I blog)