MySQL Database Design and Optimization
This book focuses on MySQL 4.0/4.1 but also gives consideration to v.3.23 users as well as a nod toward v.5. The layout of each chapter gives a description of the topic of the chapter, followed by the meat of the chapter, a summary and what's next (how the context of this chapter ties into the subject of the next). There are numerous "notes", cautionary flags, tips, screen shots, code examples as well as thoughts from each author that provide explanatory asides to the content. The authors also provide references to other volumes, as needed.
A glance through the table of contents will give the reader a precise overview of what to expect in this book: Review of MySQL Basics; MySQL Column and Table Types; Keys, Indexes and Normalization; Optimizing Queries With Operators, Branching and Functions; Joins, Temporary Tables and Transactions; Finding the Bottlenecks, MySQL Programming; and Looking Ahead.
Chapter 1: Review of MySQL Basics gives a very quick (under 50 pages) summary of how to connect to the MySQL server; MySQL's identifiers and naming conventions for databases, tables and columns; a review of MySQL's syntax, writing basic queries and using basic commands (create, drop, select, insert, update, delete); and a discussion of the use of table, column and expression aliases. This section, while adequate, is clearly intended as an analysis of core information necessary to proceed to further chapters.
Chapter 2 follows with MySQL Column and Table Types, which deal with datatypes and structures used to store the data. The goal here is to help the reader design effective tables (and therefore create a well-designed and efficient database) suited to the particular type of data at hand. Numeric types are covered in depth; strings, the null value, ENUM and SET are also addressed as well as common "gotchas" and developer errors.
Keys, Indexes and Normalization come naturally in Chapter 3, with optimal data handling the goal: the chapter addresses getting data in efficiently and getting the results out efficiently, eliminating redundant data, appropriate uses of indexes and common index creation errors.
The core of the book is clearly Chapter 4, "Optimizing Queries with Operators, Branching, and Functions." Here, optimization skills are honed; manipulation and filtering of data is one of MySQL's strengths and this chapter shows the reader how to replace less-than-ideal program logic with SQL constructs to precisely adjust query performance. There's a good demonstration here of outputting a list of member data to a web page. The ultimate goal in this chapter is to provide the reader good skills that translate into better efficiency and faster database interaction. As the authors point out, one obvious logical consequence of this is easier migration between platforms and programming languages.
The next reasonable step is to look at additional features that MySQL has up its sleeve that will save the developer time and effort in the overall scheme of application development. Chapter 5, "Joins, Temporary Tables, and Transactions" discusses three of these additional features. The authors carefully point out that each of these eliminate excess queries needed to pull data, decrease code overhead, minimize the need to store data as application logic, decrease the number of bugs that appear in code and help guarantee data integrity (an aspect of database design that unfortunately often takes a back seat to other priorities as developers are often not concerned with the validity of data in a real world sense; i.e. from the user's perspective).
Chapter 6, "Finding the Bottlenecks," addresses modifying system configuration variables outside of the default and how these can dramatically affect performance. The authors look at some available free tools that help monitor server performance and enable configuration changes including mytop, WinMySqlAdmin, phpMyAdmin and the new MySQL Administrator (available from MySQL AB). MySQL caching capabilities and the ability to decrease repetitious read/writes to disk (good table, key and query caching within MySQL) are discussed. Finally, database interoperability and abstraction layers are mentioned in terms of performance penalties vs. making life easier for the programmer.
MySQL Programming is the topic of Chapter 7, where a very good discussion of the MySQL API is provided. There are a lot of useful examples in this chapter covering many of the common MySQL APIs available (PHP's MySQL and MySQLi, Pythons's MySQLdb, ODBC, Perl's DBI), along with feature discussions and examples.
The final chapter, "Looking Ahead," examines MySQL v.4.1, 5.0 and 5.1 and some eagerly awaited new features, including stored procedures, stored functions, views and triggers.
This is a well-rounded volume on MySQL design. There are excellent examples and the flow of the text is conversational without being rambling and unstructured. The authors have obviously taken great pains to minimize tangents and extraneous information; pithy, but with sufficient detail in mind. The reader is left with neither the sense of being overwhelmed nor longing for an explanation for a glossed-over topic. This book is pretty much a "must have" for a MySQL programmer looking to bridge the gap between novice and professional.
You can purchase Beginning MySQL Database Design and Optimization from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.
...SQL Performance Tuning is an excellent book. It has a lot of good discussion on when to use certain SQL contructs and how to check your database to ensure you're actually getting improvements.
PLUG: Which SQL queries are taking the most time? PQA home page, download.
The Army reading list
nono. you're right.
MySQL is there cause people aren't willing to learn the full potential of SQL which postgresql takes advantage of.
In general, people try MySQL because
1) It's popular among php boards and its name is reknown
2) It's easy
Then you see those sites with mysql query errors when under heavy load or whichever silly programming mistakes.
MySQL is popular because its easy
MySQL is popular because its easy
Funny, I thought bashing certain software packages was popular because it was so easy!
I tried for 5 years to come up with a clever sig...only to realize that I am not clever.
right here
___
If you think big enough, you'll never have to do it.
Ok, but the others are really not that hard. Take Firebird for instance. It installs with a double-click. Its self-tuning, nothing for the user to configure if he/she does not want to. Just as easy as MySQL to integrate into PHP. And with a low memory footprint.
There really is no reason to stick with the hack job that is MySQL except for the fact that you already know it. I realize that is not an insignificant reason, but the next time you have to deal with the BS of MySQL, consider switching.
3) because it's cheaper than MSSQL Server and performs as good if not better.
A good way save money for a larger company, and a
sometimes the only choice for a startup.
If you think
3) It's probably already available at your hosting provider, whereas Postgresql probably isn't (vicious cycle with #1)
Wow! So I guess other databases don't get Slashdotted? That's impressive! (note sarcasm).
MySQL is popular because its easy
You say that like it's a bad thing... is something better if it's difficult to use?
And as an aside, I worked at a biotech. We had an 80 gig database with a quarter billion entries. We were running platform neutral SQL and used MySql, Oracle and MS-Sql server.
We ditched the "commercial grade" databases because they were to slow! MySql did a great job under enormous load.
Agile Artisans
MySQL is not easy. Need a subquery? You're out of luck, pal. A FULL JOIN? Sorry. Check constraints? No dice. Throw an error instead of taking a bad guess when you've fatfingered an input? Oops. And God help you if you need some trigger logic for auditing purposes, because MySQL AB certainly won't.
What part of "A well regulated militia" do you not understand?
I know this is kinda off topic, but it is kinda about optimizing MySQL, but I deleted 60% of the rows in a 60MB MySQL table on my webhost and the size hasn't changed. Is there some command to "pack" the space?
mySQL may not be appropriate in a mission-critical situation, but that does not make it bad for all situations; if you need speed in prescedence of everything else then mySQL is probably the right tool for the job. If you need data integrity, ACID compliance etc then PostgrSQL, Oracle etc are the right tool for the job.
Outright saying one or the other is a POS only makes you look stupud.
There's mischief and malarkies but no queers or yids or darkies within this bastard's carnival, this vicious cabaret.
And as an aside, I worked at a biotech. We had an 80 gig database with a quarter billion entries. We were running platform neutral SQL and used MySql, Oracle and MS-Sql server.
We ditched the "commercial grade" databases because they were to slow! MySql did a great job under enormous load.
You may be right there. I've heard good things about MySQL and smaller databases. Oracle, at least, is not really designed for small databases as well as it is for large ones, I think. All of the overhead doesn't really begin to provide value until you're using a good bit of data.
I don't respond to AC's.
OPTIMIZE TABLE `tablename`
h tml
or you can look here
http://dev.mysql.com/doc/mysql/en/OPTIMIZE_TABLE.
The RFDs (Request for Discussions) for both PostgreSQL and MySQL are on news.group. In about one month, both groups will be voted on, if it passes, the groups will be found under comp.databases.*.
& Su rveyID=36
If you want more information, visit news.groups with your usenet server.
Right now, there aren't ANY postgresql or mysql groups under the big 8 comp. domain.
Remember to stay tuned for the CFV so they get voted into the domain! Here is a nice web poll you can take to voice your support of the groups getting into the big 8 usenet hierarchy:
http://scripts.postgresql.org/survey.php?View=1
Vote yes, so they know there is support for a big 8 comp.databases.postgresql newsgroup as one does not exist yet!
Then why not use PostgreSQL? For companies it's cheaper (MySQL licence for commercial use is almost as much as the basic Windows 2003 Server Licence), and for open source lovers it's even more free.
And that of course ignores the fact that it's also technically superior - faster, proper data integrity, and real SQL (sorry, but for an SQL programmer, the kiddie pool syntax available in MySQL is The show stopper)
MySQL is the very thing open source is supposed to be against - software being used just because everyone else is using it, without regard for if it actually has any merit. MySQL is the Windows of the database world.
MySQL is popular because it is known and because every $20 a month web hosting service supports it. Period. It's not any easier than other databases... the fact that MySQL supports a limited subset of SQL and only allows you to use the basics doesn't mean that more powerful databases don't also let you use the basics.
At least they finally added support for subqueries. That was always the nail in their coffin as far as I was concerned.
-1 Uncomfortable Truth
If you are in the UK you can get the book here for 10% off and free delivery up to Dec 25th :)
A few are floating around for £20 as well.
Insert "MySQL isn't a real database so this book makes no sense" comment here.
INSERT INTO mysql_sucks (comment) VALUES ('MySQL isn't a real database so this book makes no sense');
I'm sure "SlashdotMedia" will improve on all the wonders that Dice Holdings blessed us all with
I just checked it out on my usenet server, and it is true. There are NO MySQL and Postgresql groups in the comp.databases.* hierarchy on usenet.
I would recommend anyone who uses these databases to stay tuned to news.groups and find out how to vote for the creation of these groups on usenet. The result would be the creation of the following groups:
comp.databases.mysql
comp.databases.postgresql
escaping is good.
Need a subquery
Not everyone does
FULL JOIN
Again, not everyone does
Check constraints
Still don't, or the application can
trigger logic
You know what I'm seeing a pattern here. I guess it seems that while MySQL doesn't do everything you want, it does enough of what other people want so it would seem that at least to other people, MySQL does not suck.
As far as not throwing errors, we've either used different versions, or your talking about something I haven't come across, since I've seen MySQL throw plenty of errors.
"I use a Mac because I'm just better than you are."
Don't forget stored procedures. MySQL doesn't support them (I think the new version is going to, though). On PostgreSQL you have their own SP language or you can write the procedures in C... also, I think transaction support is not as good in MySQL as in other databases, but I don't know for sure.
Oh, and sequences. MySQL has the auto increment property for a field, but doesn't have sequences that can be independent of a table.
Go hug some trees.
Don't you want to start with just database design and SQL before you would want to move to a book about a specific RDBC implementation? If it is just about database design then the title of the book is wrong.
Then again, if you wish to explain about setting up the database itself, access rights and so on, then the book might be for beginners. Once again, the title would not fit the book.
As anyone should know, the steps in software development are: get it working, get it right, get it optimized. Let's hope that the book does not go to deep into the optimized part in a too early stage.
I've been doing the PHP/SQL thing for money for a while now, and I've been able to meet all my needs so far. I've taken a peek at Postgres, but I haven't found a good explanation and usage examples for some of the features that mysql lacks. I understand nested queries, transactions, and foreign keys, however I haven't found anything that helps me understand stored procedures, views, or triggers.
Computers are useless. They can only give you answers.
-- Pablo Picasso
I'm getting off on a tangent here, but:
I don't think the grandparent is trolling MySQL's ability to handle heavy loads but rather its tendency to attract n00bs.
You should disable display of PHP errors on production web sites, for instance ("php.ini" even tells you to), and use the server's error log instead (or a separate log for PHP if you prefer), but few webmasters seem to bother anymore. It's also easy enough to catch MySQL errors and code pretty error messages for them for when your site is getting Slashraped, so there's no excuse for letting PHP barf cryptic database error messages everywhere (no matter what you're database server is running) as many sites do.
Seriously, is there any other reason for these MySQL book reviews? They all sound the same, and who buys books on MySQL optimization anyway? The manual and 'EXPLAIN' should be enough for anyway.
Enough ranting, continue your too-highly-moderated offtopic flamewars.
"'Yrch!' said Legolas, falling into his own tongue."
Postgres doesn't yet unless you've got cygwin, and the version of postgres that will run on Windows (version 8, I think?) isn't out yet.
But it's pretty close, so I'll try it and then wait a few months until other people have tested it for me ;)
How is this a rant? Are you saying the problems they list don't exist?
I like that site cause it contains no spin: it just lists the facts and provides references to the documentation. Is it the facts that bother you?
___
If you think big enough, you'll never have to do it.
Nice review, but if the book is not from O'Reilly, I probably won't buy it.
...Install PostgreSQL
ZERO
1) I don't think you're required to buy a MySQL licence unless you need other licence requirements or support from the company.
2) You have any proof that Postgres is faster than mysql? Everything I've seen indicates that mysql is still *slightly* faster than postgres. Most of the systems I've worked with I haven't seen much of a difference. Personally I only use Postgresql for everything, but on other systems mysql seems to be about the same.
It's unfortunate that your comment got modded up, since it makes a claim that isn't true.
I use MySQL in maybe 60% of my projects, and I have been bitten by a majority of those issues. MySQL has some serious problems as an SQL-compliant database (let alone as a *relational* database), and that site simply points them out in an objective manner. In fact he says they aren't bugs or mistakes, but *documented, yet unexpected, behavior*.
I would recommend anyone who is using MySQL (by choice or otherwise) to read and understand that article (and keep in mind some of those problems have been fixed, which he points out).
Also, as a general comment about the closing sentence of your post, have you considered that some pieces of software might be better than others? Or one piece of software might adhere more closely to a standard than another? These are all topics worth discussion.
I have started a company and am in the process of putting my precious mission critical data in a MySQL database. RAID 1 is used to prevent loss of data, and apart from that DVDs will be burnt using the superdrive.
My database app is low volume data traffic to/from the db.
Can anyone tell me where MySQL fails when it comes to mission critical stuff? I'm very eager to find out.
Bert
Ever here of a product called MAX DB, put out by MySQL AB? It does all those things, and more.
How true. For instance, emacs is a much better piece of software than vi.
taken! (by Davidleeroth) Thanks Bingo Foo!
I just assumed that it congealed in a gutter somewhere.
Experienced users will find quite a lot of valuable information that will extend their existing knowledge base.
Experienced users.. MySQL .. Experienced users... MySQL .. man, talk about cognitive dissonance!!
I guess some experienced users are stuck *maintaining* MySQL databases so maybe that's what he's talking about..
What would be nice is a book review (and book) showing developers the best practices to PHP and Pear DB development so that PHP programmers can create apps that are SQL database agnostic -- i.e. can have a MySQL or PostgreSQL backend without much code change in PHP. A mere change in the PHP line telling it what server, where, and login will only be required. I'm sure this book is probably already written, anyone care to point me where?
Linux at home
INSERT INTO mysql_sucks (comment) VALUES ('MySQL isn't a real database so this book makes no sense');
Are you sure MySQL could handle such a complicated procedure?
MySQL is the very thing open source is supposed to be against - software being used just because everyone else is using it...
You've clearly read Slashdot at least once, since you're posting here. What rational person could observe the mindless conformity of Slashdrones and then expect nonconformity and critical thought from the same group of people?
Open source amounts to a few hundreds projects making worthwhile contributions to society, while mindless code monkeys uselessly scratch their own itches in the the other 6.3 million projects. If the filthy bastards would take a shower, they'd have fewer itches.
MySQL runs on Windows
Postgres it is then. Thanks for the tip.
And PHP is the MySQL of the language world. Sweet irony.
A reply to #1 (because I have no idea what #2 is really like, I've heard stories both ways)
MySQL with the release of version 4.0 changed the licence of it's libraries from LGPL (commercial code can link to these libraries) to GPL (commercial code cannot link to these libraries, anything that does becomes GPL). Any program you write that uses the mysql libraries is considered a derivative work by the GPL. So unless you write your own mysql libraries (to the undocumented wire-protocol of mysql 4.0), you have to buy a license for anything not open source.
This was the biggest reason for my move to Postgres. I make my living building web applications for people, my clients get the source code so that they can make modifications, but not the right to redistribute the source code (unless I use anything GPL, then they get the full GPL licence). So it's nice to be able to use a proper database server, and support the developers of that database server without having to open the source of everything I create. (Most things I create however, would just be another useless entry on freshmeat, so I don't feel too bad about not contributing these things back)
First of all, I like and use MySQL. I have it running on two public web sites and I use at home for my intranet. It is fast, easy to use, and works well for my purposes. It may not be perfect, but I think it is good. Anyway, on with the reply:
Constraints, triggers, and stored procedures are important to have in a database. While I can make up for MySQL's shortcomings in application code, this is not a good idea. I should not have to validate data in the application. If I could put that code in the database, it will be more robust. Not only would that provide a single point to validate from the application, it would also provide validation for scripts and console access.
Full joins and subqueries help, but in a different way. I should not have to execute two queries to get one result set (minus intermediate data). Queries should be atomic -- one piece of code that executes together. For example, let's say I have two queries that could be combined with a sub select. In between the queries running, something changes the data, invalidating my intermediate values and second query. While the underlying data is fine, it appears to the user that there is data corruption. The second query could return incorrect data or just error out for no valid reason. Sure, most nested selects are not necessary, but sometimes they are. Full joins are not usually necessary, but in about 0.0001% of the queries they are.
Maybe for the typical blog site these features are not all that critical, but for enterprise-class databases they are. The database application I use at work is bad enough in Oracle (poor DB design), it would fall apart at the seams in MySQL. Constraints, triggers, etc. help keep it working until we can reengineer it.
24 beers in a case, 24 hours in a day. Coincidence? I think not!
Okay, why?
In the database world, there is a solid, underlying foundation: relational theory. This was developed, what, 30 years ago? It has *provable* characteristics.
So when you want a database, pick the one that has the most of those characteristics. Unfortunately, most folks seem to think it's Emacs vs. vi. Just pick the one that "feels" better, or that they learned first. That's fine for text editors, but not, say, for the foundation of a mission-critical finance system.
And Vim is better than either..
You call 80 GB a "small" database? Granted, I'm sure certain entities like credit reporting bureaus and the federal gov't have terabytes of data, but 80 gig seems fairly large to me..
You make a good point. From now on I will not program in C++ until I find an open source compiler that respects the export keyword.
Just to inform you, a few weeks ago the server I manage crashed despit the mirroring (one disk failed, and the ext3 filesystem has corrupted entry).
The machine have mysql and postgresql databases. result is simple: almost all mysql users have trashed tables, absolutly _NO_ users have any glitch with postgresql. If your data have any values, forget mysql, only use postgresql. Periode.
For me, data integrity is the highest priority, all other aspects are just cosmetic.
I've used MySQL for years in very high volume, large scale, distributed "mission critical" applications (if you call running your whole company "mission critical"). Not once has the database failed to perform it's job. Networking, Hard Disks, Memory, etc... They all failed. It's funny to read all the self described "experts" on here rip the product when the list of household names successfully and reliably using it on a daily basis is long and distinguished (like my johnson).
> > Need a subquery
> Not everyone does
but when you do, you are Sh^Hadly Out of Luck.
> > FULL JOIN
> Again, not everyone does
see above.
> > Check constraints
> Still don't, or the application can
I hope you'll pardon my putting this so bluntly, but this is an absolutely classical n00b mistake. In my experience, a useful data store doesn't have "the" application or "the" interface. It has several to start with and eventually grows many of them. That is why the data store itself has to maintain any data integrity/business rules inviolably. The alternative is to keep n (for some large n) code bases perfectly in synch and functioning exactly the same way.
> > trigger logic
> You know what I'm seeing a pattern here. I guess
> it seems that while MySQL doesn't do everything
> you want, it does enough of what other people
> want so it would seem that at least to other
> people, MySQL does not suck.
It doesn't do enough of what people need when they're doing things that involve money, for example. There just isn't any way to use MySQL for an application that requires Generally Accepted Accounting Practices (GAAP) short of building a true relational database as middleware.
> As far as not throwing errors, we've either
> used different versions, or your talking about
> something I haven't come across, since I've
> seen MySQL throw plenty of errors.
See the "gotchas" page, so often quoted here, re: the failure to throw errors when it's supposed to.
What part of "A well regulated militia" do you not understand?
Its a troll :)
I'm not a fan of MySQL tho. Yes, it is fast... As long as you are only querying data.
If you have multiple users updating data, MySQL isn't such a good option.
for enterprise-class databases
Then don't use MySQL there, and that was my point. There's no shortage of DB products to use, so use the right tool for the job. I never said no one needed stored procedures, triggers or subqueries, I said some people don't need them. Every time a MySQL story comes up, people act like its the only DB around and they're forced to use it.
SAP seems to disagree that MySQL has no place as an enterprise database though.
"I use a Mac because I'm just better than you are."
In my example, we saw the same speed difference with database creation (which had lots of reads and maybe 10% writes) and also when we used the database in our application, which was all reads.
I haven't really tried Postgres though, so I can't speak to it... I've got to play with sometime and see if it grabs me.
Agile Artisans
access is also popular because it's easy. can you see the trend?
If you mod me down, I will become more powerful than you can imagine....
I would be curious to know how you think Postgress recovers data that no longer exists?
Perhaps a more likely conclusion to draw from your "experience" is that your MySql tables were on the part of the disk that crashed and the Postgres tables were not.
Btw, by the time you get out of high school, take a spelling class. You're worse than most of the posters here.
Agile Artisans
I assume you only code in assembler?
Agile Artisans
I've found FireBird to be an excellent choice for better ANSI SQL compliance, support for ACID transactions, single-file database portable across platforms, etc. It's also one of the easiest to administer that I've come across. I like that the embedded version is a single shared libary that implements the client API. I've not used it for a truly large database. I've used DB2, Oracle, SQL Server (the only database I've used that became corrupted and lost production data) and Informix.
As for MySQL, the lack of transactions and poor support for SQL has put me off it for real projects (no subqueries--life's too short).
Database design should be a generic RDBMS book for the most part. It does not make much sense to repeat table design techniques and philosophy for each RDBMS product. (However, giving vendor-specific tips and limits is understandable.) It might be cheaper to purchase and write a generic book about table design because it can be written and printed for multiple products. Then again, many publishers simply copy-and-paste semi-generic topics with slight custom tuning.
Table-ized A.I.
Views, synonyms and referential integrity (foreign key constraints) would be very nice too.
When I find out why VHS became more popular than technically superior Betamax, I'll figure out why Mysql is more popular than Postgres.
If you need speed over everything else then just use BerkeleyDB and cut out the cutesy middle layer. (MyCuteLittleSQL;^) tm)
"And that of course ignores the fact that it's also technically superior - faster, proper data integrity, and real SQL (sorry, but for an SQL programmer, the kiddie pool syntax available in MySQL is The show stopper)"
I'd love to see your detailed explanation on this one. Faster?? Are you high? The planner in Postgres sux. Proper data integrity?? Why does it allow me to stick a value in a field that's supposed to be auto-incremented without producing an error??
Yes, database normally can recover from a crash since the real ones use something called a journal. When they come up in an inconsistent stater, they read the journal and figure out how to get into a consisent state. The poster you are responding to is not the only person to point out MySQL and data corruption issues. It is far too common to be considered a productioin database. Also, you don't use views, synonyms or more importantly, referential integrity??? Come on, the poster, who was probably not a native English speaker may not spell well, but at least he doesn't need to take a class on databases.
In such a scenario, I'd have to say mySQL isn't really there. MySQL is a terrific solution for all kinds of web publishing / application needs. But if finances are at stake, you gotta go with a product where you can sue someone over stuff that goes wrong.
$5 / month hosted VPS on linux = awesome!
Bert-
Honestly, use PostgreSQL if you need a SQL database. There are a ton of reasons not to use MySQL, especially for M.C. data. I ship a product that supports nearly every db known to man (PostgreSQL, DB2, SQL Server (MS & Sybase), Sybase SQL Anywhere, Oracle, NCR Teradata, Informix, SAP DB) but we will not support MySQL anytime soon. Data corruption problems, lack of referential integrity, no triggers, no views, no synonyms, no stored procedures, these aren't things that are optional for a modern database. If you don't need any of that, do yourself a favor and use Berkley DB.
No wonder I can't find useful answers for mysql questions!
"Evil thrives when good men do nothing"
... more like a boxer playing chess
----
Not to be confused with Col.
Aside from taking advice from an AC who says something "sux" (real technical term there. Care to back it up?) There are no "auto incrementing" datatypes in postgresql, only default nextval('somesequence'). Note the "default" there.
Before calling this "broken" ask yourself how you'd insert a row into a database with two columns having the same number, without having an external object to get that number from. I use this all the time in billing systems that deal with families. One field for the ID, one field for the ID of who gets the bill, usually the same. In postgresql I can use nextval('sequence') for the first field and lastval('sequence') for the second, in a single command. In MySQL InnoDB only appears to support a single AUTO_INCREMENT column, so I'd have to insert the record, locate the record and read its id, then update the record with that id in the second field.
Not only that, MySQL can only count by one, unlike postgres which can use whatever number is next in the sequence: positive, negative, or even zero.
Face it, if you want to write commercial apps you have to CHOOSE.
I worked in a _HUGE_ database project and it was slow as hell because we couldn't use native solutions for optimization, because we were required to maintain code compatibility. They had licensed Informix, and we got to maintain it compatible (to justify a gazillion dollars investment in an already obsolete DB).
So, want to use Limit? No thanks. Want to find out the thread ID's? No thanks. Want to improve performance by using native mySQL functions? No thanks. Named locks? logic functions inside queries? etc etc...
I use phplib DB_SQL to keep layers separation. Frankly, if you want to be 100% "neutral" you would end up using only the most basic features of SQL (select... from... where. ta-da!), and that's useless.
Let the other guys handle SQL adaptation when the business decides to adopt another implementation. Still, if you code multi-tier you won't have much problem there. I see multi-tier programming having many more advantages than simply maintaining SQL neutrality.
PostgreSQL doesn't have stored procedures, yet anyway. When you talk about stored procedures in SQL, you aren't talking about a function that returns something (which postgres users have been able to define forever.) You're talking about a procedure in the Pascal sense: it performs a sequence of operations on the database, without having any result to return.
That said, its easily faked by having a function in postgres where everyone ignores the return value.
Yes, subqueries make the programmer's life a LOT easier. But IMHO using joins works in most cases. There's a little problem, it requires a perfect understanding of how they work, or you'd get in trouble. Also, I've read that joins is generally more efficient than doing subqueries.
Oh yes, I'm using MySQL 4.0x and it has unions, too.
If everything else fails, use temporary tables.
I would disagree with the comment that it is not built for the enterprise. I work for a major telecom company and we use it to track SS7 signaling(Every call made in the US generates 5 events that are logged for traces) . Our vendor has load tested it extensivly. If they believe and I have witnessed it take a mass calling event and log it properly then I would call that reliable. Mass Calling is >=100,000 calls at 5 messages per call. Half a million entries in about an hour and a half should work just fine.
Seems to me as if people always thought of MySQL 3.23. InnoDB is supplied with MySQL 4, and it's A.C.I.D. compliant.
Problem is, very few people use it because they don't know it (InnoDB is relatively new, 2 years ago we didn't have MySQL 4).
yo dadda sucks my ass
Chernobyl 'not a wildlife haven' - BBC News
MySQL is popular because it is good. It's simple to install, the C client is probably the most straightforward in an RDBMS, it's reliable, and it performs well enough for many small to midsize web sites. Is it an Oracle? Well, no, but, then again, not everyone needs a 400hp car and not everyone needs a 20k database server!
This is my sig.
Can I use this database as a replacement for large databases, such as Oracle Databases?
I have no idea whether MySQL implements this properly, but given the idiocy they used to spew about relational integrity, I sure wouldn't bet on it. After all, giving wrong answers is usually faster.
I do understand stored procedures, views and triggers, but I don't understand foreign keys !... may be I missing some chapters in the book !
Betamax had higher image quality, but paid the price by having shorter run time on each tape. VHS became more popular because it's longer two hour run time allowed a user to fit an entire movie on the tape.
I think this is directly applicable to the MySQL vs. PostgreSQL debate because for a very long time MySQL has been available and easy to use on Windows and on other platforms. PostgreSQL has not been available on Windows except through the use of Cygwin with problems.
In both cases its about fitting user's (especially novice users) needs.
MySQL's behavior is erratic in that it is e.g. incredibly incosisitent when it comes to NULL-related behavior. Just admit it and move on with your life, you'll be a happier person.
One post hardly consitutes carpet bombing. In fact, the only single person I see prolonging this thread (by responing to these so-called "anti-MySQL" people as you like to label them) is you.
I, for one, am happy that I know about that page in case I ever decide to deploy MySQL for anything. Much of the documented behavior is downright scary and absolutely necessary to know about if you care about correctness at all.
HAND.
So, you've been lucky to not hit those cases or you just haven't been doing enough testing to notice the problems. Even so: One anecdote does not equal universal truth, dumbass.
FAT32 can hold 80GB of mp3s with no problem, and FAT32 is not exactly Oracle.
/dev/null. It can take the INSERTs as fast as I can send them, and never bogs down my system with pesky I/O bandwidth, and never tells me it's full.
It's not how much a DB can hold, it's how easy/fast it is to get exactly the information about your information you want; nothing more, nothing less. The ease and performance of getting such information must scale also, not just the amount the DB can hold.
Heck, you know what's a great database?
So, 80GB can be a lot if you have demanding requirements about the details of your data and the relationships with other data. If not, it's just some bytes (that you optionally want back). Then it doesn't matter, just use FAT32.
Social scientists are inspired by theories; scientists are humbled by facts.
Commercial DBs often require a degree of tunning. They may also be doing things like keeping transaction logs.
Plus if you are using platform neutral SQL that works on MySql, you are basically using a small subset of SQL. You could probably get much better perfomance using the imporved features they offer.
The questions is, do you need those things, are they worth the extra cost? If not, MySql is going to be fine for you.
I like postgresql, but thanks for trashing that guy's post.
PostgreSQL has great recovery from a power failure. But if the bytes aren't there, the bytes aren't there. In fact, if the inodes aren't there, as far as postgresql is concerned, the bytes aren't there.
I assume that MySQL has some basic ability to recover from power failure. I'd be a little concerned that the database would be inconsistant, but it would probably not be corrupt. I'd like to know if anyone has any experience with power failure corrupting a DB of either type (pg or mysql).
Social scientists are inspired by theories; scientists are humbled by facts.
Access's backend DB sucks. Access does not suck. It's a valuable tool to manage databases and also to create client access software to distribute to the people who need to access the data. Web interfaces have made it less necessary, but I bet it's a lot quicker in Access still.
Social scientists are inspired by theories; scientists are humbled by facts.
The lack of referential integrity means you cannot guarantee the state of the data in the database, unless you run a whole bunch of queries to check.
If there is only one way to get data in, through one application that can enforce the integrity on entry, and check it on retrieval, you may be OK. As long as you know nobody will ever be updating it directly in the database.
Grandparent: and apart from that DVDs will be burnt using the superdrive.
Parent: If there is only one way to get data in, through one application that can enforce the integrity on entry, and check it on retrieval, you may be OK.
Sounds like you have two routes to your data right there, whatever application does the updating *and* mysql_dump (or whatever it is called that does the backups) Ask yourself, is there any moment in my application I do not want a backup to be made? The correct use of transactions should ensure there is not. Running backups at night is a proven unseccesfull sollution to this ;-) Just a side note, your application threatsafty should be perfect to trust you application for data integrity control. It is just as close to perfect as your database system code right?
Postgresql isn`t the only free as in speech alternative, there is sap-db as well which even has a lot of oracle compatibility features.
Hi, I'm one of the authors of the book.
:)
1. Thanks to Mary for the positive review.
2. Thanks to Mike Hillyer for his invaluable help with the book. Say what you like about Visual Basic (I happen to loathe it, myself), Mike's an excellent programmer, and his knowledge of MySQL is superb. In fact, part of the way through the process of writing this book, he was hired by MySQL AB to work with the teams developing the Connectors and the new GUI tools. His site VBMySQL.com provides a valuable and unique resource for VB and other Windows developers wanting to build DB applications who'd like to use an actual database instead of Access and don't feel like condemning themselves or their users to paying for SQL Server. Rather than flame him for his language and platform choices, you should commend him for introducing many Windows programmers to an Open Source technology. (BTW, you might be interested in knowing that he also uses Linux and programs in C++ as well.) It was a privilege to have him work on the book with us, and it's a privilege to work with him now at MySQL AB. And he's a damn good writer.
3. We wrote the book because there's a lot of MySQL installations out there, and a lot of very badly done MySQL databases. Granted, there are some things that MySQl isn't (yet). But it is fast and stable -- or can be. And it's certainly possible to throw those advantages away through poor DB and application design by people who don't know the difference between a database and a spreadsheet or who don't know how to leverage SQL to do their heavy lifting for them. We chose not to spend a great deal of time with enforcing foreign keys because a great many administrators are still running MySQL 3.23 and don't bother to make InnoDB available. Besides, if you expect people to understand key constraints, you have to get them to normalise first, and many devs don't even do that.
4. We wanted to encourage PHP developers to make the transition to ext/mysqli as soon as possible.
5. I don't know what other people may have experienced with Apress, but they've been damned nice to me, and I can tell you that Gary Cornell does answer his email, even when it comes from a lowly writer who's not yet even signed a contract. Speaking of which -- their contracts are much better than Wrox' or Wiley's. And since I've been associated with them, they've dumped at least one bad editor and another one that I'd heard some not-so-favourable things about.
6. While we didn't cover this in the book, fans of Postgres might wish to take note: We already have a working Cluster implementation, and we're anxious to see what yours will look like.
Il n'y a pas de Planet B.
On PostgreSQL you have their own SP language or you can write the procedures in C...
Or in perl or python! Imagine using the same language you use for web scripting for controling data integrity, right there in the database where everybody has to/can use it, where it is neatly wrapped inside your transactions, and where it can do any work with you data to reduce it to a sane size dataset before being send accross the line.
No, mysql is just a dangerouse code for your data. Just list how many filedescriptors mysql keep open on all the tables it uses. For exemple on my server now there is 148 filesdescriptors open from mysql, majority of them are on tables.
Now if you look at postgresql process, you see only a few pipes and socket and only two logs files open, no tables. Even if you make some big query, is not easy to see filedescriptors open on table from postgresql. Postgresql take care to protect at most as possible your data.
The probability to trash a posgresql table is not null, but the window is very thin. Mysql instead manage to have almost 100% probability to trash your data. What happens to my serveur data is just the consequence of the design of the code running on it.
I'll back this up, i look after a lot of database servers + web servers. Never seen postgresql get trashed, even under the worse situation (raid mirror fail, filesystem courpted, most files on partiation are unreadable. postgresql is still running ok??)
I've repeatily seen mysql databases die with minor issues like server under too heavy a load and to many writes at same time, or unclean shutdown etc.
Though the one bonus of mysql db's i've found is that you can move there tables between different archetures without dump and restore. Not that you should be doing that anyway. Only database i've seen where this actually works.
Postgresql keeps a resonsible journal tables stretching back a few days. I've found recover possible before with the journal logs when parts of disk arn't there. Its not a disk error but file system error. If mysql sync'd better and had journaling it would be a lot more reliable.
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
At the bottom of the
And yet they both get the job done, day in and day out. Imagine that!
Take your condescending attitude and fuck off, already. You're not helping anyone at all. You're just being an asshole and the world has enough of those already.
In fact, since you've provided absolutely NO constructive critism at all, I'm inclined to think that you know absolutely nothing about the topic and are just playing the "everything except the industry standard 'best' is just a toy" approach so that other elitists will think you do.
No, MySQL really will trash your tables if you drop power or otherwise end the program without giving it a chance to catch signals and flush buffers. MySQL keeps too many balls in the air for the sake of speed. The OP is right - integrity is FAR more important than speed if the accuracy and persistence of your data matters. PostgreSQL does a much better job of covering its bets -- logged transactions can be recovered even if the buffers aren't flushed before you quit. That's why their updates are slower, they're done right.
BTW - not all people who post here are from USA, and I'd like to see _you_ spell properly in portugese...
Make sure you don't take that backup while anyone is using the database. The lack of ACID means that you could back up half an update and make your backup useless.
(Also make sure you don't try to do this as a live filesystem backup, regardless of what DB you're using, unless they have a system to allow for live filesystem backups. Most DBs keep the recent transactions cached in memory instead of being written to the drive immediately. Use the appropriate dump tools and backup the dump.
"I use MySQL in maybe 60% of my projects, and I have been bitten by a majority of those issues"
Ok, look someone's not reading here... I'll make this one last post, but I really have to stop this thread.
I never said that the items listed were incorrect. They would not work as rhetoric if they were lies. However, the page is designed, not to help (all of the items listed are in the documentation on the MySQL site), but to discourage the use of the database. Any database can be ranted about, but for some unknown reason people choose to rant about MySQL quite a lot.
I think part of this is that PostgreSQL as a project has felt that their time and effort on full ANSI compliance was not rewarded, and so a lot of specifically PostgreSQL folks want to rail on about how MySQL did not make such an attempt, but instead focused on non-standard features like their full-text searching.
That's fine, but let's not confuse those disagreements for any actual problems with either product.
MySQL has some serious problems as an SQL-compliant database (let alone as a *relational* database)
You got those backwards. SQL compliance implies that you are a relational (e.g. set theory applied to a table structure) database. MySQL might be non ANSI SQL 9X compliant, but that does not make it a non-relational database. There are ANSI SQL features which MySQL simply does not support.
and that site simply points them out in an objective manner. In fact he says they aren't bugs or mistakes, but *documented, yet unexpected, behavior*.
It is most certainly not objective, and if you really think that it is, then I don't think we can have this conversation rationally.
Ok, that's it. Thanks guys, but I really do need to go get on with my life. If my comments here have been helpful, great, if not... well, good luck with your work anyway.
Yes, I know there are no auto-incrementing datatypes per se, it creates a sequence for that function, I work with Postgres everyday.
So a sequence is created, the only way Postgres can implement auto-increment. Then I insert a row where I specify a value for a column that has a sequence function and it takes it. In my mind that should generate an error, I'm inserting data where I shouldn't. I guess you don't think it's a problem.
Yes, the sequence flexibility is better than MySQL, but I've never needed a field to count up by 3's so it doesn't matter to me.
(P.S. In MySQL you reference the LAST_INSERT_ID for your example)
I second this. Also see here.
-- Qu'est-ce que la propriété intellectuelle? It is thought control.
I believe that its the other way around. Its based on MySQL and used for things of this nature. I think that I saw it on MySQL's site before the SAP deal, but I could be wrong on both counts.
"I use a Mac because I'm just better than you are."
However, this is slashdot, where open-source is good and proprietary is bad. Therefore, a novice will see that MySQL and Oracle are both databases, but one's open-source, so MySQL must be better for their critical data and systems.
Whoops.
SAP probably still sells more systems that run against Oracle and DB2 than against MySQL. Further, let's not hold up SAP as a shining example of openness -- unless you've ever supported it, you do not know the horrors that lie within. If you ever wanted to see a fascist company from a support perspective (oh, you didn't follow our recommendations? No support for you!)
MySQL is an excellent alternative to Access or as a simple backend for data without ACID or sophisticated design.
To use an analogy, it's a garden trowel. Use it to dig up your flower box. However, it's not the right tool to till or cultivate a cornfield.
Hell, DB2 didn't have sequences until version 8. Of course, one can semi-implement them in databases which don't support sequences by:
Create a table with a single IDENTITY/AUTOINCREMENT column.
Instead of fetching from a sequence, do an insert into this table, then select using whatever your system is for getting an IDENTITY/AUTOINCREMENT column back. Delete the row when done to prevent you from having a giant table of sequential values.
When you want to reset the sequence, drop and recreate the tables.
I don't think you understood my post at all. I wasn't advocating oracle, I was just pointing out that an 80GB database is quite common, and in order to impress someone with a database, you need to show what you can do with 80GB, not that you have 80GB.
Social scientists are inspired by theories; scientists are humbled by facts.
You can do the same thing in MySQL.
It's not a problem. I think every database allows this.
And for your comment to his example, you still have to use two queries (one insert, one update) whereas with Postgres (and Oracle) it would only be one.
Agreed. I've oft seen it discussed in the Gnome development circles (nothing against KDE, I just don't read their dev lists) that a killer app would be an Access-like front end to MySQL/Postgres/etc. for Linux (or even platform agnostic). Access is a great system for designing small to medium sized databases with. The back-end blows under load.
- Michael T. Babcock (Yes, I blog)
I have fully sync'd MySQL binary logs, rotated, lasting 7 days, on a seperate partition, mounted with sync on. I've had hard power drops and not lost data. I lost data on MyISAM tables with no logging a few years ago, but I wouldn't expect that to have worked anyway. I actually had a query log so I could reconstruct the data myself anyway from a tape backup + log files.
I don't expect software to survive situations the OS might not even survive. My servers are battery-backed, but I still use tapes. I use RAID-1, RAID-10 or RAID-5, but still use tapes. I have log-structured filesystems, but I still use tapes.
No tape backup, no security.
Now a binary logging system that writes directly to a secondary tape unit, that I'd like.
- Michael T. Babcock (Yes, I blog)
How the hell is that insightful?
You MySQL haters do realize that we're not just talking about MySQL version 1.0 anymore, right? Linux 1.0 kinda sucked too.
MySQL has binary logging of queries, the InnoDB back-end supports referential integrity as well as consistency (at least in my tests).
Perhaps you didn't read the MySQL manual and shut off autocommit before testing? Or did you actually test it before spouting off?
BEGIN;
SELECT Amount from Account1 where ID = Foo;
UPDATE Account2 SET Amount = Amount + Bar WHERE ID=Foo2;
-- spawn new process --
SELECT Amount from Account2 where ID = Foo2;
# returns original amount or blocks
-- back in original process --
UPDATE Account1 SET Amount = Amount - Bar WHERE ID = Foo;
COMMIT;
-- spawn new process --
SELECT Amount from Account2 WHERE ID = Foo2;
SELECT Amount from Account WHERE ID = Foo;
Both return the right values. Do it yourself.
You pulled the plug half-way through? Big deal, still good.
Its called active development.
- Michael T. Babcock (Yes, I blog)
Of course you can do that. But it's a complex transaction, not very efficient. You have to insert, select, then delete, just to get a new value, instead of just selecting it from a sequence... and yes, you can write a function that inserts, selects, deletes and finally returns the selected value, so that you only have to select from the function, but internally it's doing the same thing...
Go hug some trees.
You're right, I got confused. They're functions. But like you say, you just ignore the return value and execute the function by doing a 'select function(param1, param2)' and that's it. You can define a function that returns void, like in C, so that's practically a stored procedure (except that to execute it you still have to do 'select function()'.
Functions in PostgreSQL can even return rows (at least in version 7.4 which I'm using, you can return a row). Outbound parameters are not yet supported, I hope they include that feature for version 8.
Go hug some trees.
I never said it wasn't a hack or somewhat inefficient -- but it's the best way to get that done in systems which don't support sequences.
Sadly, most people who work in systems without sequences never even consider their existence, so they use identity/autoincrement columns indiscriminately, and you can't undo that kind of design decision simply.
Thank you very much, you just saved me 25 MB.
Do you think MySQL has speed benefits only because MySQL developers are so much better than all other RDBMS developers, and not because of speed hacks that put your data at risk?
Do you think robust RDBMS design is just dumb luck?
LRC, the best-read libertarian site on the web
The stable version does NOT have subqueries. Version 4.1 does, but that is still beta quality software.
That not up-to-date. 4.1 isn't considered beta since 2004-06-24 and had it's first production release over a month ago, on 2004-10-23. See here
The transaction support in 4.0 is horrible and slow. Rollbacks are O(n) operations (for InnoDB, that is) etc...
Slow rollbacks are not really a problem, if you don't abuse them. That is, don't write your transactions with failure as an expectation (like inserting a row and waiting if it fails due to a unique key), but do your own checks (most often you don't need additional queries, because you already have the data or can expand an existing query). Not only does that make your queries faster with most databases (even those with good rollback support), but also makes your application more robust (in case of db changes)
Keep an eye on which arguments are silently dropped in replies. Not always, but often times it's very telling.
Nope. I was running MySql in a non-transactional mode... and that makes all the difference.
Do you think robust RDBMS design is just dumb luck?
Do you think every program that isn't your personal favorite must be "bad"?
Agile Artisans
> MySQL licence for commercial use is almost as much
> as the basic Windows 2003 Server Licence
Apples and oranges. Windows 2003 Server doesn't include a database.
And MySQL licensing doesn't include any per-seat or per-connection crapola either.
Since InnoDB is marked stabled since Feb 2002 and you use it "for years" I doubt you're using transactions.
You should not rely on a non-ACID DB in a mission critical environment. You might have luck for several years - but there's no guarantee.
And there are a lot of ways where MySQL will fail miserably in case of failure, here's just a simple example:
update balances
set balance=balance-1000.0
where id=some-id;
Now your NIC fails (or your router or whatever)
now your application tries to book that to another customer:
update balances
set balance=balance+1000.0
where id=some-other-id;
Oops, that failed. But where's the money gone? It vanished....
The existence of a non-transactional mode is a speed hack that puts your data at risk. You can save a lot of disk I/O by using a random number generator instead of your "database".
It looks like Open Office is targeting MS Access in their next release.
:). Does Open Office 1.9.mXX render a document differently than your other office suite (be it Word, Wordperfect, Works or etc)? Do your part:
I have been testing the 1.9.mXX releases (currently preparing for their first beta release) for a few weeks now, and most issues have been minor formatting changes, and a couple crashes here and there -- all of which have been reported. The one major issue I have come across is the problem where graphics are lost after saving changes to a document Issue 36459 where the fix has already been checked into 1.9.m64.
The pre-beta milestones have improved import/export filters, and many new features most people will like. It needs people to stomp on it in their worst way (ie, regular usage
o Create a seperate file with only the portion that changed from the other suite.
o Create a PDF of the document from the other suite.
o Create a PDF of the document from the latest development snapshot of OpenOffice (using the same pdf program if possible -- I use the Open Source PDFCreator which is a ghostscript front end).
o Zip the files (I use the Open Source 7-zip program)
o Report the issue (and create an account if you don't have one) with your best description of the problem with a step by step list of what actions need to be performed to reproduce the problem.
o Attach the zip file of the three files you created before.
The development code needs more people who are able to write bug reports -- that means don't install it on grandma's computer. If you think "it doesn't work" is a good bug report then please move to the next message, thanks.
There: Something at a specific location.
Their: Owned by someone.
Please make sure your english compiles.
That's a non sequitur. Why would I generalize my MySQL experience to other databases that I've never used? Don't try to shift the blame for MySQL suckage onto its critics.
LRC, the best-read libertarian site on the web
Still, you sort of have a point. There are plenty of applications that work well within MySQL's limitations. Then again, they also work well with other low-end DBMSs, like FireBird/Interbase and PosgreSQL. (Let's leave Oracle out of the discussion -- you don't use that kind of software unless you have deep pockets and serious support infrastructure.) Now, if you know for sure that your application will never evolve beyond MySQL's limitations, I suppose there's no harm in using it. But most application do evolve.
I don't usually care for the usual "It sucks!" versus "It rules!" debates you see on Slashdot. Everything has its strengths and weaknesses. But in this case I have to say that MySQL does suck. Not in the sense that it's totally useless (obviously not, since so many people use it). But it does suck you into a development path that you will come to regret following.
Boo to the Mod who gave this a Troll. This is a simple and true statement. Plus anyone who says 'no worries' is obviously polite ;)