How Would You Improve SQL?

← Back to Stories (view on slashdot.org)

Posted by Cliff on Thursday November 3, 2005 @10:25AM from the select-*-except-column99-from-table dept.

theodp asks: "It was the best of languages, it was the worst of languages. SQL's handy, but it can also drive you nuts. For example, if you want all 100 columns from a table, 'SELECT *' works quite nicely. However, if you want all but 1 of the 100 columns, be prepared to spell out 99 column names. Wouldn't it not make sense to provide a Google-like shorthand notation like 'SELECT * -ColumnName' (or DROP=syntax like SAS)? So how would you improve SQL?"

3 of 271 comments (clear)

Min score:

Reason:

Sort:

Check out LINQ... by 0kComputer · 2005-11-03 10:38 · Score: 3, Informative

If you want to get an idea of some cool SQL improvements, check out the
http://msdn.microsoft.com/netframework/future/linq /
LINQ (Language Integrated Query) project for c# 3.0. Some cool stuff tht i never really thought about.

For example, their select statements go backwords ie from table, select column1, n2, n3 etc... Seems kinda wacky at first, but it makes sense since you really should know what table your'e selecting from before you specify the columns.

ex.

public void Linq3() {
List products = GetProductList();

var expensiveInStockProducts =
from p in products
where p.UnitsInStock > 0 && p.UnitPrice > 3.00M
select p;

Console.WriteLine("In-stock products that cost more than 3.00:");
foreach (var product in expensiveInStockProducts) {
Console.WriteLine("{0} is in stock and costs more than 3.00.", product.ProductName);
}
}

--
Top 10 Reasons To Procrastinate
10.
Re:Be practical by neura · 2005-11-03 11:52 · Score: 4, Informative

Actually, it'd be faster if you listed out every column name. If you're talking about faster to write out the code for, you're obviously not writing a query for a program that's intended to be used much. There's absolutely no reason you should be deploying code containing a query that does "select *" or anything like it. You're making the database do the work of looking up the list of columns names every time that query runs. There are much more useful things to spend your caching space on (if you have any).

If you really can't stand to write queries containing the actual column names, you should be using some type of abstraction layer in whatever language you're writing your code in.

If you're not writing code and just making queries by hand to test the results, then you're even further off your rocker. (this also applies in general to the statement you made) Why would you ever NOT want to select that last value out of 100? is it going to keep your output from wrapping? (lol)

Also, those of you saying that you should never have 100 columns in your table, you're certifiable lunatics as well. If you have 100 columns that are used in every record and have very little or no duplication per row, there is no reason you should break this up into multiple tables!!! Then the database has to do joins, which again require more processing power and disk usage. It's also hard to maintain multiple tables when you really have one table after you normalize it.

For those of you that say this isn't normalized... I'm not even really sure how to answer that.... If you have several tables all with a strict 1:1 relationship, they should be in ONE table. Anything else is considered denormalized, not yet normalized. (aside from being just plain BAD!)

For those of you that say you'd never need that many columns in one table or split across multiple tables, however you'd like to think the world should work. I have an example of just that. My wife does genetic research, primarily statistical analysis of sequence data (in various forms, but that's the easiest way to sum it up). We've had discussions on this particular topic, where she had been told by someone else that she would get better performance in Oracle if she split her one table into several tables containing a smaller number of columns, each.

This is just simply not true. It also is a perfect example of a situation where you would actually need a large number of columns. There were specific bits of data that needed to be looked up quickly (like, 45'ish). You can't store it all in one column (or even just a few) and use regexes to find the bits you're looking for. You also don't want to be doing a lot of joins unless you really need to, you know.. when you actually have data that would fit into some form of normalization. Technically, you CAN do this stuff, but not if you want decent performance. If you didn't want decent performance, you could just leave the data in a text file and shell out a grep command. *sigh*

Anyway, enough ranting, but seriously people... Get a clue. Get some experience with these issues. Don't just pipe up because "hey, I've worked with databases and while I probably don't understand them very well, I don't know anybody else that understands them at all, so I'm kind of an expert!"
Re:Better NULL handling? by Zathrus · 2005-11-04 05:15 · Score: 3, Informative

On the other hand, using rows instead of columns complicates any interesting data manipulation you're going to do on your web quiz signifigantly

No, it simplifies them. RDBMS's make it very easy to do things on a row basis, not so much on a column basis. It is utterly trivial to get "usefull" [sic] data out of a table structure that's properly designed, but nightmarish to get it if it's not. For instance, if you have 100 columns how do you answer the question "what percentage of respondants answered at least 80% of all questions?". You can't easily. And if you add or remove a question then you will have to touch a great deal more code than if you had implemented the table sanely in the first place.

he OPs point stands - these are not normalization issues

Yes they are! This is practically a poster child of normalization (or lack thereof). In fact it's an utterly trivial example of normalization that doesn't actually raise any difficult issues over speed, accessibility, etc. where you often run into problems with normalization.

is a real weakness of the relational model compared to OO design

There's a vast difference between relational databases and object oriented databases. I've yet to see anyone (including Oracle) do objects-in-relational decently. And SQL is inherently a relational paradigm -- you wouldn't want to use it for an OODB because it would just be inappropriate.