Slashdot Mirror


Good Database Design Books?

OneC0de writes "I am the Director of IT for a small/medium sized marketing company, where I personally write the code that runs our applications. We use a variety of technology at our office, the majority of which rely on MS-SQL and MySQL databases. I am familiar with tables, SQL queries, and have a general understanding of how the SQL databases work. What I'm looking for is a good book, particularly a newer book, to explain general database design techniques, and maybe explain some relational tables. We have some tables that have million of rows, and I'd like to know the best method of designing these tables."

13 of 291 comments (clear)

  1. Database in Depth by Anonymous Coward · · Score: 5, Informative

    Database in Depth: Relational Theory for Practitioners
    Publisher: O'Reilly Media; 1 edition (May 1, 2005)
    Language: English
    ISBN-10: 0596100124
    ISBN-13: 978-0596100124

    Best DB book i have ever owned/read/seen!

    1. Re:Database in Depth by Anonymous Coward · · Score: 3, Informative

      I agree wholeheartedly that Database In Depth is one of the best DB books in print - but would recommend for this reader instead Date's slightly later book SQL and Relational Theory, which replaces the Tutorial D examples with SQL and goes more in depth into how to use SQL relationally.

  2. O'Reilly by JWSmythe · · Score: 3, Informative

    O'Reilly books are your friend. The "... in a Nutshell" books are a good place to start, and then proceed into the more advanced books. They have 25 titles related to MySQL and 53 titles related to Microsoft SQL. There are usually a few to browse through at the large chain book stores.

    --
    Serious? Seriousness is well above my pay grade.
  3. Re:Somewhere, a coder is polishing his resume by OneC0de · · Score: 5, Informative

    I'm the article poster. Our company is relatively small, with an IT staff of less than 5, and total company size less than 50. I write all the code, simply because none of our other IT pros are comfortable enough writing it. If there were "coders" under me to ask, please believe I would use them as a resource first.

  4. Re:A Few Suggestions by RobertM1968 · · Score: 4, Informative

    We have some tables that have million of rows, and I'd like to know the best method of designing these tables.

    I'm a developer, not a database expert. But it seems that every now and then I have to get my hands dirty with data modeling. "The best method" is probably a really vague concept. If you have serious hardware constraints than the best method changes from an easily maintainable system to something more complex. There's give and take in database design and I guess a million rows is really something that a traditional relational database should be able to handle. So I'd suggest any book that teaches data modeling will suit you here.

    eldavojohn makes some excellent points and gives some great suggestions. Keep in mind, like elda suggests, nothing is cut and dry. Configuration, resources, numbers of connections for specific data, etc; all will have an impact (or should) on what you should do and how you should design.

  5. Re:A Few Suggestions by hguorbray · · Score: 5, Informative

    http://www.amazon.com/Case-Method-Entity-Relationship-Modelling/dp/0201416964

    I used this book at Foothill college in an intro to data management class and it taught me more than any of the dozen oracle classes I took once I got past the terminology of tuples, etc

    this one is also well-recommended:
    http://www.amazon.com/Database-Systems-Design-Implementation-Management/dp/0760049041

    and this one is good for people without dba or architect background:
    http://www.amazon.com/Database-Design-Mere-Mortals-Hands/dp/0201752840/ref=sr_1_1?ie=UTF8&s=books&qid=1278629171&sr=1-1

    I would stay away from the vendor specific books as good database design s/b dbms agnostic

    -I'm just sayin'

  6. Good SQL design books: by 8282now · · Score: 5, Informative

    IMHO: Joe Celko's SQL for Smarties (http://www.amazon.com/Joe-Celkos-SQL-Smarties-Programming/dp/0123693799/ref=sr_1_2?ie=UTF8&s=books) has shown itself to be very nice book when the need to go beyond the basics to a little deeper understanding of SQL is needed.

    There are many other books on the subject all the way to source material from Date and Dodd but Celko seems to be well informed and writes fairly well, I think.

  7. Do you know relational algebra? by Cyberax · · Score: 3, Informative

    Do you know relational algebra? If you don't, then I highly recommend:

    Codd, E.F. (1990). The Relational Model for Database Management (Version 2 ed.). Addison Wesley Publishing Company. ISBN 0-201-14192-2.

    It's MUCH better to know the fundamentals of database systems and then try to figure out details than vice-versa.http://ask.slashdot.org/story/10/07/08/2142211/Good-Database-Design-Books?art_pos=1#

  8. Would you like to reduce your development time?.. by meburke · · Score: 3, Informative

    ...and improve your quality and maintainability?

    Back in the 70's and early 80's we learned a methodology called, "Data Structured Systems Design" and the fundamental presupposition was that everything could be expressed logically and accurately by describing it as relationships in set theory. I have not seen anything since that surpasses the quality and maintainability of database applications and systems.

    Someone already mentioned Joe Celko's book "SQL for Smarties" and I would recommend you first read his, "Thinking in Sets" before any of his other books.

    I would also suggest some earlier books by Ken Orr and Jean Dominique Warnier. If you learn the Warnier-Orr approach to DESIGNING the system before doing any coding, you will reduce the time necessary for maintaining the system. I have seen hundreds of small IT shops like yours, and much of the time Systems Analysis and Design is neglected and performed "off-the-cuff" by programmers who can't wait to get to the coding. I didn't originally believe Ken Orr's assertion that spending twice as much time designing the system would result in a sharp time reduction for overall project completion, but through experience and observation I became a believer.

    --
    "The mind works quicker than you think!"
  9. Three practical lessons by Invisible+Now · · Score: 5, Informative

    These three lessons may not all be in any one book, but they can help in the real world:

    1) Learn what SQL Injection is and how to defend against it. It will ruin your day and could severely damage your current employment situation.

    2) Abstract your schema from your front-end applications. Stored procedures are easy to write and can provide security and if well written stop injection attacks. They will let you change your database design without breaking your deployed apps. Just update the internal code in the P. Middleware and objects can do this, too.

    3) Bergstrom's law of sailing says: "You can get away with anything in less than 5 knots of wind." Similarly, any little box or blade with 2 to 4 gs of RAM can easily handle 5 to 10 million row tables. Dedicate the server to MySQL or MS SQL so they can cache and buffer efficiently and they will outperform much bigger boxes trying to run too many schemas and DBs concurrently. Learn to index. Don't be too puritanical about normalization. Returning a customer address should require 6 joins. And remember that moving that moving large recordsets across the LANWAN may take much more time than the server query.

    You probably already know all this... but maybe someone else reading this doesn't.

    --

    "Knowing everything doesn't help..."

  10. Re:Somewhere, a coder is polishing his resume by Lord_Frederick · · Score: 3, Informative

    Our CIO has a programming background and once fixed some database code we were having problems with. This is a 10,000 person organization with an IT staff of around 300. It's not hard to imagine a small company where the IT director takes on some programming tasks.

  11. Re:modeling is even more important by luis_a_espinal · · Score: 3, Informative

    In the modern days of cheap disk, big disk caches, and large ram, proper modelling is more important than strict normalization.

    Back when those books were written, disk was expensive and not cached, RAM was very expensive, and machines had terrible I/O bottlenecks.. Normalization is critical under these circumstances for maximum performance.

    Today, these normalization techniques will increase performance but not as much as you might think. Really it is best to concentrate efforts elsewhere, especially for a one-person shop.

    All of that normalization work requires coding changes and it will undoubtedly make the code much less readable and maintainable.

    <facepalm/> Performance? I wasn't even thinking of that as a reason to understand normalization. I'm thinking data integrity at least in the conceptual model.

    The value of normalization is not so much in performance but in considering and planning what a decent data/information model should look like.

    You normalize your *model*, your blueprint, and de-normalize as you see fit with the resources available. The actual tables and tablespaces might not (will not) look 1-to-1 to the model, but you still have a model of information.

    And when you do deviate from your model, you do so consciously; you know exactly where you are deviating from the model; and you know why. Without a model, or worse, with a badly crapped model, you don't know what you have.

    It is good to exploit the hardware capabilities we have now, BUT without at least having a conceptual understanding of normalization, this is almost always a sure way to get into a corner where the only option is to throw more hardware to the problem.

    The distinction here now is that people deploy hardware strategically, but because they have no choice: shit won't run without. Hardware is cheap. Operational costs are not. Understanding normalization is to (relational) data modeling and building what modularity and structure are to OO design (and software building in general.)

    Normalization is not about performance (even if its immediate effects in the past were performance related.) It is about reduction of unnecessary data redundancy that compromises data integrity.

    Not performance. Data Integrity.

    Even with the hardware that we have today, I still have to see a well-designed model that does not in great part enforce the 2nd normal form and most attributes of 1st normal form (in particular about avoiding duplication of rows and maintaining regular columns.)

  12. Re:A Few Suggestions by postbigbang · · Score: 3, Informative

    Yeah, no one said that people can't strangle themselves and do foolish stuff. Big production apps need to be written by people that know what they're doing.

    But your citations are at the edge of the curve. You're a black belt, and your stuff better run fast and cleanly or your creds are dirt. These are civilians. They learn, and hopefully know when it's time to get a pro into the equation before they hurt themselves.

    Some won't, but the same can be said for car repair and even nuclear physics (viz the LHC forehead slappers).

    Your kewl self knows this stuff cold. Let other people learn, even if they get hurt. If they have to pay to get stuff fixed, it's a risk that they likely knowingly take from the onset.

    I wish there were real tools with real front ends that you could give to a civilian, knowing they couldn't hurt themselves. But like a chainsaw, you have to hope that when they fire things up, they know a little about what they're doing.

    --
    ---- Teach Peace. It's Cheaper Than War.