MySQL Database Design and Optimization
This book focuses on MySQL 4.0/4.1 but also gives consideration to v.3.23 users as well as a nod toward v.5. The layout of each chapter gives a description of the topic of the chapter, followed by the meat of the chapter, a summary and what's next (how the context of this chapter ties into the subject of the next). There are numerous "notes", cautionary flags, tips, screen shots, code examples as well as thoughts from each author that provide explanatory asides to the content. The authors also provide references to other volumes, as needed.
A glance through the table of contents will give the reader a precise overview of what to expect in this book: Review of MySQL Basics; MySQL Column and Table Types; Keys, Indexes and Normalization; Optimizing Queries With Operators, Branching and Functions; Joins, Temporary Tables and Transactions; Finding the Bottlenecks, MySQL Programming; and Looking Ahead.
Chapter 1: Review of MySQL Basics gives a very quick (under 50 pages) summary of how to connect to the MySQL server; MySQL's identifiers and naming conventions for databases, tables and columns; a review of MySQL's syntax, writing basic queries and using basic commands (create, drop, select, insert, update, delete); and a discussion of the use of table, column and expression aliases. This section, while adequate, is clearly intended as an analysis of core information necessary to proceed to further chapters.
Chapter 2 follows with MySQL Column and Table Types, which deal with datatypes and structures used to store the data. The goal here is to help the reader design effective tables (and therefore create a well-designed and efficient database) suited to the particular type of data at hand. Numeric types are covered in depth; strings, the null value, ENUM and SET are also addressed as well as common "gotchas" and developer errors.
Keys, Indexes and Normalization come naturally in Chapter 3, with optimal data handling the goal: the chapter addresses getting data in efficiently and getting the results out efficiently, eliminating redundant data, appropriate uses of indexes and common index creation errors.
The core of the book is clearly Chapter 4, "Optimizing Queries with Operators, Branching, and Functions." Here, optimization skills are honed; manipulation and filtering of data is one of MySQL's strengths and this chapter shows the reader how to replace less-than-ideal program logic with SQL constructs to precisely adjust query performance. There's a good demonstration here of outputting a list of member data to a web page. The ultimate goal in this chapter is to provide the reader good skills that translate into better efficiency and faster database interaction. As the authors point out, one obvious logical consequence of this is easier migration between platforms and programming languages.
The next reasonable step is to look at additional features that MySQL has up its sleeve that will save the developer time and effort in the overall scheme of application development. Chapter 5, "Joins, Temporary Tables, and Transactions" discusses three of these additional features. The authors carefully point out that each of these eliminate excess queries needed to pull data, decrease code overhead, minimize the need to store data as application logic, decrease the number of bugs that appear in code and help guarantee data integrity (an aspect of database design that unfortunately often takes a back seat to other priorities as developers are often not concerned with the validity of data in a real world sense; i.e. from the user's perspective).
Chapter 6, "Finding the Bottlenecks," addresses modifying system configuration variables outside of the default and how these can dramatically affect performance. The authors look at some available free tools that help monitor server performance and enable configuration changes including mytop, WinMySqlAdmin, phpMyAdmin and the new MySQL Administrator (available from MySQL AB). MySQL caching capabilities and the ability to decrease repetitious read/writes to disk (good table, key and query caching within MySQL) are discussed. Finally, database interoperability and abstraction layers are mentioned in terms of performance penalties vs. making life easier for the programmer.
MySQL Programming is the topic of Chapter 7, where a very good discussion of the MySQL API is provided. There are a lot of useful examples in this chapter covering many of the common MySQL APIs available (PHP's MySQL and MySQLi, Pythons's MySQLdb, ODBC, Perl's DBI), along with feature discussions and examples.
The final chapter, "Looking Ahead," examines MySQL v.4.1, 5.0 and 5.1 and some eagerly awaited new features, including stored procedures, stored functions, views and triggers.
This is a well-rounded volume on MySQL design. There are excellent examples and the flow of the text is conversational without being rambling and unstructured. The authors have obviously taken great pains to minimize tangents and extraneous information; pithy, but with sufficient detail in mind. The reader is left with neither the sense of being overwhelmed nor longing for an explanation for a glossed-over topic. This book is pretty much a "must have" for a MySQL programmer looking to bridge the gap between novice and professional.
You can purchase Beginning MySQL Database Design and Optimization from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
...SQL Performance Tuning is an excellent book. It has a lot of good discussion on when to use certain SQL contructs and how to check your database to ensure you're actually getting improvements.
PLUG: Which SQL queries are taking the most time? PQA home page, download.
The Army reading list
right here
___
If you think big enough, you'll never have to do it.
OPTIMIZE TABLE `tablename`
h tml
or you can look here
http://dev.mysql.com/doc/mysql/en/OPTIMIZE_TABLE.
The RFDs (Request for Discussions) for both PostgreSQL and MySQL are on news.group. In about one month, both groups will be voted on, if it passes, the groups will be found under comp.databases.*.
& Su rveyID=36
If you want more information, visit news.groups with your usenet server.
Right now, there aren't ANY postgresql or mysql groups under the big 8 comp. domain.
Remember to stay tuned for the CFV so they get voted into the domain! Here is a nice web poll you can take to voice your support of the groups getting into the big 8 usenet hierarchy:
http://scripts.postgresql.org/survey.php?View=1
Vote yes, so they know there is support for a big 8 comp.databases.postgresql newsgroup as one does not exist yet!
MySQL is popular because it is known and because every $20 a month web hosting service supports it. Period. It's not any easier than other databases... the fact that MySQL supports a limited subset of SQL and only allows you to use the basics doesn't mean that more powerful databases don't also let you use the basics.
At least they finally added support for subqueries. That was always the nail in their coffin as far as I was concerned.
-1 Uncomfortable Truth
If you are in the UK you can get the book here for 10% off and free delivery up to Dec 25th :)
A few are floating around for £20 as well.
Don't forget stored procedures. MySQL doesn't support them (I think the new version is going to, though). On PostgreSQL you have their own SP language or you can write the procedures in C... also, I think transaction support is not as good in MySQL as in other databases, but I don't know for sure.
Oh, and sequences. MySQL has the auto increment property for a field, but doesn't have sequences that can be independent of a table.
Go hug some trees.
A reply to #1 (because I have no idea what #2 is really like, I've heard stories both ways)
MySQL with the release of version 4.0 changed the licence of it's libraries from LGPL (commercial code can link to these libraries) to GPL (commercial code cannot link to these libraries, anything that does becomes GPL). Any program you write that uses the mysql libraries is considered a derivative work by the GPL. So unless you write your own mysql libraries (to the undocumented wire-protocol of mysql 4.0), you have to buy a license for anything not open source.
This was the biggest reason for my move to Postgres. I make my living building web applications for people, my clients get the source code so that they can make modifications, but not the right to redistribute the source code (unless I use anything GPL, then they get the full GPL licence). So it's nice to be able to use a proper database server, and support the developers of that database server without having to open the source of everything I create. (Most things I create however, would just be another useless entry on freshmeat, so I don't feel too bad about not contributing these things back)
Why do people keep saying MySQL doesn't have transactions or subqueries? For heaven's sake, please update your 4-year-old facts!
Oh, and PostgreSQL is actually better for ANSI SQL compliance. It's one of the most SQL-standard compliant database I've ever seen (certainly more standard than Firebird). Call me when Firebird has TIMESTAMP WITH TIME ZONE datatype.
Aside from taking advice from an AC who says something "sux" (real technical term there. Care to back it up?) There are no "auto incrementing" datatypes in postgresql, only default nextval('somesequence'). Note the "default" there.
Before calling this "broken" ask yourself how you'd insert a row into a database with two columns having the same number, without having an external object to get that number from. I use this all the time in billing systems that deal with families. One field for the ID, one field for the ID of who gets the bill, usually the same. In postgresql I can use nextval('sequence') for the first field and lastval('sequence') for the second, in a single command. In MySQL InnoDB only appears to support a single AUTO_INCREMENT column, so I'd have to insert the record, locate the record and read its id, then update the record with that id in the second field.
Not only that, MySQL can only count by one, unlike postgres which can use whatever number is next in the sequence: positive, negative, or even zero.
Use InnoDB tables instead of MyISAM tables and you can have foreign key contraints. Also, if you consider an application where data is only ever input through the web or some other front-end GUI, foreign key contrains aren't necessarily needed (I'll admit they're still good to have and use), because you can control the input through restricted UI elements like drop-downs, radio buttons, etc.
And MySQL 4.1 HAS been certified production-ready, for what that's worth.
Coding without subqueries is a pain though, you're absolutely right, although not a deal-breaker. We got by with temporary tables for a long time, and now we can use subqueries to do the same thing.
Views are on the 5.0 agenda I know, not so sure about synonyms, but I'll not try to argue that you can do everything with MySQL that you could with Postgres. I think what it boils down to is that very many of the common applications of MySQL out there simply do not NEED the extra functionality these tools provide, so it's a fine choice.
"Recta non toleranda futuaris nisi irrisus ridebis"
I'm using 4.1 (its not a production release yet)...
MySQL 4.1 series marked as stable
Also, if you consider an application where data is only ever input through the web or some other front-end GUI, foreign key contrains aren't necessarily needed (I'll admit they're still good to have and use), because you can control the input through restricted UI elements like drop-downs, radio buttons, etc.
:)
I know a lot of people do that (myself guilty on occasion), but you're the first person to publicly admit it
Constraints are very important.
First, in the app, you check the input for basic security reasons (remove quotes and backslashes and whatnot). Even if it's a radio button, that wouldn't slow me down from sending your website invalid data. Check it every time.
The constraints are important because if you have several parts to an application, or perhaps several applications all accessing the same data, you don't want one part to adversely affect another. If one part of the app tries to insert invalid data that would confuse another part of the application, the constraint will prevent the insertion and raise an error right there. Then you very quickly know where the bug is. If you had no constraint, the bug could appear 3 days later in some report of sales data, and you'd have no idea where to find the bug.
Constraints have saved me a LOT of time bug-hunting.
Social scientists are inspired by theories; scientists are humbled by facts.
The lack of referential integrity means you cannot guarantee the state of the data in the database, unless you run a whole bunch of queries to check.
If there is only one way to get data in, through one application that can enforce the integrity on entry, and check it on retrieval, you may be OK. As long as you know nobody will ever be updating it directly in the database.