Beginning SQL?
$ynergy writes "I have been seeing more and more job listings for SQL programmers so, naturally, my interest has been sparked. I have picked up a few materials but soon realized that it would be easier to apply if I had experience using database software. Would everyone agree? So I am looking for resources, online or in print, that would give a beginner a real in depth look at using database SW." There are at least two issues here: a) learning standard SQL (pick a standard, any standard :) and b) learning all the idiosyncrasies of a particular database system. Probably learning basic SQL is the way to start.
The O'Reilly book 'Practical PostgreSQL' is online for free at http://www.commandprompt.com/ppbook/. It has some useful information about PostgreSQL and SQL in general.
I learned SQL (MySQL style I guess, that's all I've ever used, flame me please, it's only a filesystem or something) just by reading the online manual. After you see what it does, there's really not much to it. I think the programming is more on the other end, rather than on the SQL end.
Some Slashdotters may tell you to learn MySQL or PostgreSQL because they are open source. That's true, and it's convenient because they come with almost any Linux distribution. Unfortunately, businesses aren't looking for those skills, so it won't help you.
Here are some Monster stats (for open US jobs):
Philip Greenspun's book SQL for Web Nerds is a very nice introduction to SQL. It would be a good idea to grab a copy of PostgreSQL or one of those Oracle demo CDs that are as common as AOL CDs, and work through the exercises in it.
Please avoid MySQL if you are just learning SQL. When you move to a real database, you'll just have to unlearn all of the workarounds for the features it is missing (real transactions and referential integrity, to name two).
I always had a hard time finding these on the Sybase site, so I thought I would throw them up on my webspace for this post. I used these for my University Databases course and they were golden. They're not great for learning SQL from scratch, but they're an excellent reference for those tricky queries or table manipulations you might run into. I still use them occasionally for non-Sybase DBMSs as well. I hope you find them as useful as I did.
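For anyone unsure what those two features actually buy you, here is a minimal sketch. It uses Python's built-in sqlite3 module purely because it needs no server install (that choice is mine, not the poster's), and the account and payment tables are made up for illustration.

```python
import sqlite3


def demo_integrity_and_rollback():
    """Return (fk_rejected, balances) for a tiny in-memory database."""
    # isolation_level=None means autocommit, so BEGIN/ROLLBACK below are ours.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE payment ("
        " id INTEGER PRIMARY KEY,"
        " account_id INTEGER NOT NULL REFERENCES account(id),"
        " amount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO account (id, balance) VALUES (1, 100), (2, 50)")

    # Referential integrity: a payment for a nonexistent account is refused.
    fk_rejected = False
    try:
        conn.execute("INSERT INTO payment (account_id, amount) VALUES (99, 10)")
    except sqlite3.IntegrityError:
        fk_rejected = True

    # Transaction: start moving 30 from account 1 to 2, then abandon it all.
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
    conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
    conn.execute("ROLLBACK")

    balances = conn.execute(
        "SELECT id, balance FROM account ORDER BY id"
    ).fetchall()
    conn.close()
    return fk_rejected, balances
```

Without transactions you would have to undo the first UPDATE by hand if the second one failed, and without foreign keys the orphaned payment row would simply be stored. Those are the workarounds you end up unlearning.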
My favorites are:
www.sqlcourse.com
www.sqlcourse2.com
These are good beginner sites that let you practice through a Java app.
This issue is a *lot* more complex than it first seems. There's a lot of really bad SQL code out there, and many of the authors don't even realize how little they know.
The problem is that it takes time and experience to really develop a sense for how to use the data. If you're a programmer, you should have at least some familiarity with performance issues even if you don't always pick the best algorithm for the problem. Likewise with a SQL database you really need to understand why 3NF is important, why referential integrity is a really good idea, etc. It's not uncommon for databases to span many gigabytes and a bad design can literally cost millions of dollars as you throw more hardware and expensive database licenses at the problem.
This isn't just theoretical - ghosting can be a problem with 3NF data, and you need to know how to recognize it and fix it. (More precisely, how to fix it without using 1NF or 2NF, which both have serious problems that 3NF fixes.)
Then there's the issue of views. It's easy to understand read-only views, but updateable views make life incredibly interesting. But this is critical - a bad updateable view will create a lot of subtle errors in your database.
Other issues - how do you access the data? This is everything from JDBC or Pro*C to JSP tag libraries. How do you handle bad data, or bad assumptions? (Nothing teaches you how hard it is to get a unique identifier like trying to actually find unique identifiers for real data.)
Finally, many of these sites aren't just looking for SQL knowledge, they're looking for specific packages like Oracle Financials.
I think the best way to illustrate just how much there is to learn is that a friend recently decided to get Oracle certification to help land jobs. She's been focusing on databases for almost a decade, yet she still had to study hard for the exams. I've been doing intermittent database work for even longer and have pulled several rabbits out of my hat - yet I know I would struggle to pass just one part (of four) of the exams.
But on the question at hand, my advice is to get an introductory text and start solving some problems. Create a database listing your CDs, then extend it to handle DVDs and VHS tapes, then extend it again to handle books and magazines. Create an index to keep track of your softball or bowling league stats - the teams, the players, the individual and team stats. You'll learn more from one or two reasonably large problems than you'll learn from a dozen books.
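A rough sketch of where that exercise might end up once the CD list has grown to cover DVDs and VHS tapes. It assumes Python's built-in sqlite3 module as the playground, and the table names (format, item) and sample titles are invented for illustration, not taken from any particular book.

```python
import sqlite3


def build_media_library():
    """Create a small media catalogue and return item counts per format."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        -- One row per kind of media; adding books later is just another row.
        CREATE TABLE format (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        -- Started life as a 'cd' table; format_id is what generalised it.
        CREATE TABLE item (
            id        INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            creator   TEXT NOT NULL,
            format_id INTEGER NOT NULL REFERENCES format(id)
        );
    """)
    conn.executemany(
        "INSERT INTO format (name) VALUES (?)",
        [("CD",), ("DVD",), ("VHS",)],
    )
    # Look the format up by name so the sample data never hard-codes ids.
    conn.executemany(
        "INSERT INTO item (title, creator, format_id) "
        "SELECT ?, ?, id FROM format WHERE name = ?",
        [
            ("Album A", "Band A", "CD"),
            ("Album B", "Band B", "CD"),
            ("Film A", "Director A", "DVD"),
            ("Film B", "Director B", "VHS"),
        ],
    )
    counts = conn.execute(
        "SELECT f.name, COUNT(*) FROM item i "
        "JOIN format f ON f.id = i.format_id "
        "GROUP BY f.name ORDER BY f.name"
    ).fetchall()
    conn.close()
    return counts
```

The interesting part of the exercise is the moment the original CD-only table stops fitting and you have to decide what moves into its own table.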
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
Do *not* learn the actual query language first. Learn database theory and design before anything else. Don't even consider doing anything with a database until you know the six forms of normalization (at a bare minimum you need to know the first three; the second three are "gravy" for many applications and not even appropriate all the time). This includes knowing the requirements to reach each level of normalization within a database.
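To make the point concrete, here is a small sketch of the kind of update anomaly that third normal form is meant to prevent. It assumes Python's built-in sqlite3 module, and the tables and data (flat_order, customer, norm_order, 'alice') are hypothetical.

```python
import sqlite3


def update_anomaly_demo():
    """Return the cities on record for one customer in both designs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Not in 3NF: city depends on customer, not on the order itself,
        -- so it is repeated on every order row.
        CREATE TABLE flat_order (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL,
            city     TEXT NOT NULL,
            product  TEXT NOT NULL
        );
        INSERT INTO flat_order VALUES
            (1, 'alice', 'Denver', 'widget'),
            (2, 'alice', 'Denver', 'gadget');

        -- 3NF: the city is stored once, keyed by the customer.
        CREATE TABLE customer (
            name TEXT PRIMARY KEY,
            city TEXT NOT NULL
        );
        CREATE TABLE norm_order (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL REFERENCES customer(name),
            product  TEXT NOT NULL
        );
        INSERT INTO customer VALUES ('alice', 'Denver');
        INSERT INTO norm_order VALUES (1, 'alice', 'widget'), (2, 'alice', 'gadget');
    """)

    # A careless update touches only one of the duplicated rows.
    conn.execute("UPDATE flat_order SET city = 'Boulder' WHERE order_id = 1")
    flat_cities = conn.execute(
        "SELECT DISTINCT city FROM flat_order "
        "WHERE customer = 'alice' ORDER BY city"
    ).fetchall()

    # In the normalized design there is only one row to change.
    conn.execute("UPDATE customer SET city = 'Boulder' WHERE name = 'alice'")
    norm_cities = conn.execute(
        "SELECT DISTINCT c.city FROM norm_order o "
        "JOIN customer c ON c.name = o.customer ORDER BY c.city"
    ).fetchall()
    conn.close()
    return flat_cities, norm_cities
```

After the updates the flat table claims the customer lives in two cities at once, while the normalized pair still gives one answer.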
I have seen so many database layouts for various applications that have practically brought me to tears through their sheer stupidity. These were layouts designed purportedly by people who "knew SQL." There is a tremendous difference between "knowing SQL" and "knowing proper database design and implementation." Unfortunately, many people who claim to be database programmers do not realize there is a difference and assume that since they know the syntax of SQL, they know how to design a database.
I would recommend that you begin with "Database Systems" by Connolly and Begg. Read it cover to cover, then read it again. When you're done reading it the second time, skip through to the end of each section and do all the exercises without rereading any of the text. Once you can answer a majority of the questions correctly, then begin to consider designing database layouts. Before you look the book up on fatbrain or amazon, be warned that it is not light reading. It's 1,200+ pages, but is well worth it.
The ISBN is: 0201708574.
When you actually understand how databases work and how to effectively use them, you will thank yourself tremendously for taking the time up front. If you dive right in to learning the syntax of the query language without understanding the basics of design and implementation, you will make one stupid mistake after another with no end in sight. Then, someone more knowledgeable than yourself will come along and will have to start everything over from scratch to fix your screwups.
Doing it right the first time is especially important when designing databases for large systems. If you screw something up and don't learn from your mistake until you have millions of records in tables that are being quickly updated 24/7, fixing that mistake is going to be a nightmare and could very well cost your company a tremendous amount of money through downtime and resources spent on the fix and conversion.
Trying to keep this post from getting too long: the key is that there is absolutely no substitute for a solid understanding of the theory behind database design. You simply cannot be anything more than a witless hack at databases without this understanding. You will churn out terrible database layouts almost every time (unless you have an unbelievably lucky streak) and your projects will suffer because of this.
Sorry if this sounds harsh, but it really, truly is worth spending the time to learn the theory and design before trying to apply your efforts to a real world project. Of course, if you're impatient you can play around with a server at the same time you learn the theory. But do not make the mistake of neglecting the theory in favor of quickly learning the syntax.
Enough already with the "use a search engine" comments. I think it's safe to assume that almost any "Ask Slashdot" question could be answered to some degree by typing the question into Google...
What you won't get from Google is a decent, "peer reviewed" answer... just because some tutorial or site is on the 10th page of a Google search doesn't mean it's not the best. Likewise, the first ten results might not be the most relevant. People ask things of Slashdot because they are looking for answers from people who have opinions and experience, not from bots who have been tricked by judicious usage of meta tags. I guess the next "First Post" will be "First Use Google Answer to Ask Slashdot".... geez.
That being said, fair enough on the rest of the comment.
I would recommend Database Processing by David M. Kroenke, ISBN 0130648396.
After having read and understood that book I would start looking at a commercial DBMS like Oracle, DB2 or MS SQL Server, as they are the most frequently used.
In my opinion MySQL and Postgres are fine products, but if you're looking to get an overpaid job, go with Oracle...
Load a billion records into your MySQL database through 20 tables, then do random 10 table joins. That's why.
PostgreSQL doesn't do quite as well as Oracle (the gap is much, much smaller now, though), but it has a smaller starting size.
Rod Taylor
I'd use whatever SQL I could easily lay my hands on, and that allowed me to build some sort of application. I find it real hard to learn a tool just for the sake of learning it. It comes so much more easily when you apply it. I see a number of comments comparing SQLs. Personally, I started with Oracle, and currently use mostly NCR Teradata (Try inserting 30 million 200-byte rows into a table in 8 seconds with Oracle ;-) ). I'd say about 80% of my Oracle knowledge transferred, even though Teradata is a pretty strange animal (very distributed). Do others have opinions about what percentage of SQL knowledge transfers from one flavor to another?
Why can't people be clear?!
First you say: job listings for SQL programmers
Then you think you wish you: had experience using database software
and compound it by believing: that would give a beginner a real in depth look at using database SW.
Then michael (with his double at-sign) comments: a) learning standard SQL (pick a standard, any standard :) and b) learning all the idiosyncrasies of a particular database system. Probably learning basic SQL is the way to start.
You're all wrong!
What do you want? There are *four* separate issues here.
- Learning SQL
- Learning embedded SQL
- Learning DB management (and CASE, etc.)
- Learning DB idiosyncrasies
I will assume that issues 3 and 4 are not the case (because anyone who gets these things mixed up probably would be pretty horrible at designing tables anyway). For issue 2, read the documentation of the language you need. They all do it differently.
Have you read my journal today?
Yes, I know that. But with 2.5 gigs it should be able to do genetic research =)
We learn Oracle quite extensively at school. I have found that PostgreSQL resembles Oracle far more closely than MySQL does.
As for anything, to learn is to do... therefore, I recommend you get both PostgreSQL and Oracle from the websites. Oracle is freely available for educational purposes, and PostgreSQL is free anyway.
Besides, PostgreSQL has a very good SQL reference, which also lists what is and what is not ANSI SQL (boy, it came as a great surprise to me that LIMIT is *not* ANSI SQL!)
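As a small illustration of the LIMIT point: the SQL standard's own spelling (added in SQL:2008) is FETCH FIRST n ROWS ONLY, which PostgreSQL accepts but SQLite does not. The sketch below assumes Python's built-in sqlite3 module, so it can only run the LIMIT form; next to it is a plain correlated subquery that returns the same "top two" rows using nothing vendor-specific. The player table and scores are made up, and the subquery trick relies on the scores being distinct.

```python
import sqlite3


def top_two_scores():
    """Return the two highest scorers, once with LIMIT and once without."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE player ("
        " name TEXT PRIMARY KEY,"
        " score INTEGER NOT NULL UNIQUE)"
    )
    conn.executemany(
        "INSERT INTO player VALUES (?, ?)",
        [("ann", 90), ("bob", 70), ("cy", 85), ("dee", 60)],
    )

    # Vendor extension: understood by MySQL, PostgreSQL and SQLite.
    with_limit = conn.execute(
        "SELECT name FROM player ORDER BY score DESC LIMIT 2"
    ).fetchall()

    # Portable form: keep rows that have fewer than 2 higher-scoring rows.
    portable = conn.execute(
        "SELECT p.name FROM player p "
        "WHERE (SELECT COUNT(*) FROM player q WHERE q.score > p.score) < 2 "
        "ORDER BY p.score DESC"
    ).fetchall()
    conn.close()
    return with_limit, portable
```

The subquery version is slower on big tables, so in practice you would reach for whatever row-limiting syntax your database documents; the point is only that LIMIT is one dialect's answer among several.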
Anyway, I recommend you get both of these database systems, find some tutorials here and possibly using Google, and learn by experimenting... that usually is the best way.
Good luck!
The 2.5 gig also includes stuff like the OEM agent and management server, the management tools (all with GUI), the names server, development libraries (for C and Java, plus others like Cobol and Fortran), network configuration tools (also GUI), the graphical installer tools, documentation, and sample database. There is undoubtedly lots of other stuff but I don't have access to an Oracle installation at the moment to tell you what is in each directory.
Next time you do an install, maybe you can uncheck the parts of the distribution that you don't want.
Ho! Haha! Guard! Turn! Parry! Dodge! Spin! Ha! Thrust!
There is an amazing SQL book by Graeme Birchall called DB2 SQL Cookbook, downloadable as a PDF. It contains all the funky stuff you can do with SQL on DB2. DB2 is pretty close to the ANSI SQL standard, so most of the stuff should work on other databases as well. The URL is: http://ourworld.compuserve.com/homepages/Graeme_Birchall/HTM_COOK.HTM
Pure lameness? I guess being a beginner at something doesn't measure up to your level of "nerdiness"... that's sad. The guy said he had done a bit of research, but it seemed like everything he found was geared to people who already knew what they were doing.... it's like telling a newbie "read the man page"... with experience, this is perfectly acceptable, but when you're just starting out, terse documentation doesn't make sense.
/. they picked a lame question... Geek snobbery isn't nice, and it isn't very useful... and you certainly won't stop "lusers" from being "lusers" by being mean to them.
I guess I won't debate whether it's nerd news or not... but it obviously matters to all those companies who are willing to pay big $$ for a database programmer.
I still maintain my point... if you don't have anything productive to say, don't bother posting... if you think the question was lame, tell CowboyNeal or somebody.
SELECT FreeTips FROM SlashdotAudience WHERE Subject = 'SQL';
======================================
Writers get in shape by pumping irony.
Try www.sqlcourse.com and www.sqlcourse2.com
That's where I learned SQL. It uses an interpreter and a live practice database.
Most people would die sooner than think; in fact, they do.
(* This isn't just theoretical - ghosting can be a problem with 3NF data, and you need to know how to recognize it and fix it. (More precisely, how to fix it without using 1NF or 2NF, which both have serious problems that 3NF fixes.) *)
Sometimes people take the normalization rules to extremes and try to divide entities into sub-entities. I suggest one not get split-happy.
Normalization should remove unnecessarily duplicated data, but if you start to use it as a sub-grouping mechanism, then you can create problems IMO. Just because two fields are somewhat "related" today does not mean they will be tomorrow. The normalization rules as worded sometimes don't consider possible future changes.
Just a caution to keep in mind.
Table-ized A.I.