An Alternative to SQL?
Golygydd Max writes "Dave Voorhis from the University of Derbyshire has developed a program incorporating Tutorial D, a language designed to overcome some of the shortcomings of SQL, and developed some years ago by Hugh Darwen and Chris Date. Until now, no-one had done anything with it but Voorhis is hoping for wider adoption; although we think it would be like pushing water uphill." Update: 10/13 12:43 GMT by T : An anonymous reader writes "It's being picky I know, but the university in question is in fact called The University Of Derby, not Derbyshire."
What are the shortcomings of SQL? It seems to be able to handle anything you'd need it to do.
Try using Lotus Domino for a week. You'll be begging to go back to SQL.
"Have you ever thought about just turning off the TV, sitting down with your kids, and hitting them?"
Is there anything that SQL can't do? I've been using various RDBMS for years and it hasn't come up yet.
SQL also has decades of optimizations in reliable code...no one will be dropping their Oracle license over this.
I use SQL a lot and I agree that it has failings. The clumsiness inherent in, say, nested joins is quite amazing when you consider how important databases are in modern industry. This is a consequence of the "near-English"ness that SQL strives for, but that property is also what causes people to adopt SQL in the first place. We'll probably look back at SQL in five years and laugh... but weren't people saying that five years ago?
apterous.org
So to overcome the (not really all that many) shortcomings of SQL we will all learn how to use something completely new. Yeah, adoption is going to be quick and complete...
"goodbye and hello, as always" ~Prince Corwin, from Zelazny's Amber series
"SQL is sloppy and unpredictable; Tutorial D is a correct relational database language."
sounds a lot like
"C is sloppy and unpredictable; Pascal is a correct programming language."
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Seems like it can handle just about everything but maybe I'm not thinking outside the box. The biggest limitation is my lack of knowledge about how to do the things I want to do.
The man who trades freedom for security does not deserve nor will he ever receive either. - Benjamin Franklin
Any language is going to have quirks and inconsistencies between dialects.
SQL has worked so far but if Tutorial D really is better then bring it on.
I've gotten over SQL's "shortcomings" though
If you read the article, this isn't about replacing SQL, but more about testing new ideas and languages that could replace SQL. This is better than just saying, "We have a better language. Switch now or be assimilated.", and I'm glad someone's finally taking this approach. Unfortunately, the article only mentions one specific problem with SQL, but I'm sure there are others that these people might eventually solve.
Hurricane Ivan: A 17th century prison collapsed. All of the inmates escaped.
One big thing I think a database could use is a hierarchy key, instead of using parentids as "foreign keys"; that is just one of the shortcomings. If I wanted to make a threaded thing for my forums, for example, I'd have to make a big PHP script just to sort it properly. I would have loved to have MySQL do it automatically. SQL has a very limited syntax, as well, and is inconsistent. "INSERT INTO table VALUES ('', '', '');" That's one of the only times you see parentheses used in that way. You don't see them here: "SELECT * FROM table WHERE id = '';" or "UPDATE table SET id = '';" I would like to have the ability to easily use my own functions just as you do with any language. It's especially important in database software.
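For the threaded-forum case, later SQL standards added recursive common table expressions, and some engines can now do the sort themselves. A minimal sketch, assuming SQLite (via Python's sqlite3) rather than MySQL, and a hypothetical `posts` table with a `parentid` column; the zero-padded `path` column is just one way to get a thread-order sort:

```python
import sqlite3

def threaded_posts():
    """Return (depth, title) rows in thread order using a recursive CTE."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: each post points at its parent via parentid.
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, parentid INTEGER, title TEXT)"
    )
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", [
        (1, None, "root"),
        (2, 1, "reply A"),
        (3, 1, "reply B"),
        (4, 2, "reply to A"),
    ])
    rows = conn.execute("""
        WITH RECURSIVE thread(id, title, depth, path) AS (
            -- anchor: top-level posts
            SELECT id, title, 0, printf('%08d', id)
            FROM posts WHERE parentid IS NULL
            UNION ALL
            -- step: attach each reply under its parent, extending the path
            SELECT p.id, p.title, t.depth + 1,
                   t.path || '/' || printf('%08d', p.id)
            FROM posts AS p JOIN thread AS t ON p.parentid = t.id
        )
        SELECT depth, title FROM thread ORDER BY path
    """).fetchall()
    conn.close()
    return rows
```

Each reply comes back directly under its parent, with `depth` available for indentation, so no application-side sorting script is needed for this simple case.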
Ask and Discuss your HTML and other web dev stuff
For those interested, the paper describing this language (linked to from the article) is available here. There's a link to the grammar of the language at the end of that paper.
I use SQL quite a lot. It's certainly great for a lot of things, but it does have some limitations here and there. For instance, trying to deal with things like hierarchical structures, or joining on having identical/similar children, is a nightmare in SQL. Even if the query doesn't need to be efficient to run, it can still be extremely complicated to write and test. SQL simply wasn't designed or intended to deal with those sorts of structures.
Unfortunately, short of using external code outside the database, it's so often a choice between using SQL or nothing else for writing a query in a particular database rather than an option between SQL and another language. In some ways it's like being forced to write every program in C or every program in Java or every program in Lisp, where realistically one or another might be better suited to a particular task.
I suppose one of the reasons for only supporting SQL is that a predictable query language makes it easier to arrange data structures so they can be queried most efficiently. Still, it'd be nice to see an alternative front-end language or two supported in one or more of the major databases. Not every query needs to be ultra-efficient, and there have been many times where I would've liked to trade an efficient query execution for a language where what I wanted was more writeable.
The article criticises SQL, but the author has little familiarity with SQL. For example:
"but the syntax is often inconsistent and unless you use one of the many vendor-specific supersets of SQL it can be tricky to express complex series of operations in a concise manner."
But in fact, SQL is so popular because complex expressions need little changing from specific vendor offerings. If people choose to program using the subsets, then well and good, but the ANSI standard is generally thought to be sufficient. This is like arguing for the abolishment of HTML and XHTML because Microsoft make a flawed browser - hopefully the database language is better than the reasoning here.
It then goes on to say "The idea is that there should be no arbitrary restrictions on the syntax of the query language, but at a lower level the database shouldn't run up against idiotic limitations. The limitation in existing implementations that generates the most comment from the various parties in the debate is the problem with "null" values in relational databases. Put simply, a database field has a type (50 characters, for instance, or a floating point number to two decimal places, or an 8-bit integer), but when you don't fill the field in (i.e. it's "null") it loses all its meaning. Even the ANSI standards state that if a field is null it's said not to exist - so if you ask a database for "all entries where field X is not equal to 47" it won't return any of those where field X is null because instead of saying "Null doesn't equal 47", the value "null" is deemed not to be comparable with any non-null field."
Well, for starters, null is not numeric zero, null is the absence of any data whatsoever, and every SQL doc in the world tells you to not mistake it for zero. Any arithmetic expression containing a null always evaluates to null. For example, null added to 7 is null. All operators (except concatenation) return null when given a null operand. That's exactly why it's the ANSI standard.
If you want to find "all entries where field X is not equal to 47" then pass your attribute a value like "0".
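The behaviour being argued about is easy to check. A minimal sketch, assuming SQLite through Python's sqlite3 and a made-up table `t` with a nullable column `x`; the `COALESCE` variant is only one reading of "pass your attribute a value like 0":

```python
import sqlite3

def null_comparison_demo():
    """Compare three ways of asking for 'x is not equal to 47'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, x INTEGER)")
    conn.executemany(
        "INSERT INTO t (id, x) VALUES (?, ?)",
        [(1, 47), (2, 5), (3, None)],  # None is stored as NULL
    )

    def ids(where):
        rows = conn.execute(f"SELECT id FROM t WHERE {where} ORDER BY id")
        return [r[0] for r in rows]

    result = {
        # NULL <> 47 evaluates to unknown, so row 3 is filtered out
        "plain": ids("x <> 47"),
        # spelling out the NULL case brings row 3 back
        "explicit": ids("x <> 47 OR x IS NULL"),
        # substituting a default value for NULL before comparing
        "coalesce": ids("COALESCE(x, 0) <> 47"),
        # arithmetic with NULL yields NULL (None on the Python side)
        "seven_plus_null": conn.execute("SELECT 7 + NULL").fetchone()[0],
    }
    conn.close()
    return result
```

The plain comparison skips the row where `x` is NULL, which is exactly the behaviour the article complains about; the other two forms include it.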
SQL is neither clunky nor obsolete. Tutorial D may actually be a better database modelling method, but the article's criticisms aren't sufficient to exalt Tutorial D whatsoever. The "Project D" syntax and model may possibly be better, but these criticisms aren't a convincing reason to scrap SQL.
Si tacuisses philosophus mansisses. If you had kept quiet, you would have remained a philosopher.
And if it were actually relational, then it might be interesting in the current discussion. But it ain't. That it comes up is funny, in context, because mistaking things like XML for relational is something that Date regularly has massive heart failure over.
I forget what 8 was for.
That may be, but I have found few operations easier to express in relational algebra than in SQL.
for example:
I want the name field from a if its id is in b.
In relational algebra,
PROJECT[A.name](THETA_JOIN[A.id=B.id](A, B))
or
p[A.name](A |X|[A.id=B.id] B)
Sorry, ascii sucks.
In SQL
SELECT DISTINCT A.name FROM A,B WHERE A.id=B.id;
I find the SQL version to be more readable, etc. The same functionality is provided by both and is easily transferable.
cartesian product becomes SELECT * FROM A,B
natural join becomes (A NATURAL JOIN B)
theta join becomes SELECT DISTINCT * FROM A,B WHERE ...
selection becomes SELECT * FROM A,B WHERE ...
projection becomes SELECT columnA, columnB FROM A,B
With nested queries, everything is easily translatable from relational algebra to SQL (Technically, all SELECTS should be SELECT DISTINCT, but whatever). Otherwise, temporary tables can be used.
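To make the translation concrete, here is a rough sketch that runs the example query both ways. The `theta_join` and `project` helpers are hypothetical toy operators over lists of dicts, not any real library, and SQLite stands in for the database:

```python
import sqlite3

def dedupe(rows):
    """Relations are sets, so drop duplicate rows (keeps first occurrence)."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out

def theta_join(a, b, predicate):
    """Pair every row of a with every row of b, keep pairs passing predicate."""
    return dedupe([{**ra, **rb} for ra in a for rb in b if predicate(ra, rb)])

def project(rows, *attrs):
    """Keep only the named attributes, then remove duplicates."""
    return dedupe([{k: row[k] for k in attrs} for row in rows])

def names_via_algebra(a_rows, b_rows):
    # PROJECT[A.name](THETA_JOIN[A.id=B.id](A, B))
    A = [{"A.id": i, "A.name": n} for i, n in a_rows]
    B = [{"B.id": i} for (i,) in b_rows]
    joined = theta_join(A, B, lambda ra, rb: ra["A.id"] == rb["B.id"])
    return sorted(row["A.name"] for row in project(joined, "A.name"))

def names_via_sql(a_rows, b_rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE B (id INTEGER)")
    conn.executemany("INSERT INTO A VALUES (?, ?)", a_rows)
    conn.executemany("INSERT INTO B VALUES (?)", b_rows)
    rows = conn.execute(
        "SELECT DISTINCT A.name FROM A, B WHERE A.id = B.id"
    ).fetchall()
    conn.close()
    return sorted(r[0] for r in rows)

# Sample data: duplicate ids in B and a repeated name in A
A_ROWS = [(1, "ann"), (2, "bob"), (3, "ann"), (4, "cy")]
B_ROWS = [(1,), (2,), (3,), (3,)]
```

With the sample data both functions return the same names, and the duplicates show why the SQL side needs DISTINCT to match the set semantics of the algebra.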
The real reason relational algebra seems easier to deal with is because you can use symbols to represent things and there are no data types. It is an abstract language that cannot be implemented because it is under-defined.
If you try expressing a hairy SQL query with relational algebra syntax, you will end up with a hairy relational expression.
see, if you were born around 1940 you could have been using IMS/DB, VSAM, ISAM, IDMS, etc back in the 70s.
Tons of opportunities there for low-level access to your data. Of course, there's a reason that all those database management systems were abandoned for a 'busted ass super high level language'. It's because they sucked to maintain, they didn't evolve well as business requirements changed over time, and if you had the *most* basic of business questions - you'd never get an answer without a month of writing code.
But don't despair - pick up a little more SQL and you may find it isn't that tough.
The cost is not in learning. People learn new languages every day. The problem is in compatibility.
Let's say that tomorrow MySQL adopts Tutorial D. Ok, so you are going to write code that talks to MySQL and you think about Tutorial D.... well, MySQL still talks SQL, and it turns out that so does everything else. Someday you may want to switch databases because suddenly DB2 is all the rage. If you use Tutorial D, you can't switch.
Any move away from SQL would have to be a broad industry move or would have to come from an effort that is so divorced from the current uses of databases that compatibility simply is not an issue.
By creating a new language, "Tutorial D", developers are excluding the other languages as much as they're including new features in the new language. Why not just add a Java package that includes the new syntax? To get anywhere in software development, even Tutorial D code will have to interoperate with existing systems and programmers with existing skills. Someone will have to code a "Tutorial D" JDBC driver, and ODBC, and all kinds of middleware that eats performance, developer time, and introduces the maintenance pitfalls of complexity. And by adding a package to an existing language, they can skip reimplementing the features of the existing language that they include in this new one, like loops, branches and character output. The effort seems as vain as the endless 19th Century conceits of inventing complete philosophical systems from scratch, to serve the reputations of egomaniacs dominating university debates. Why can't everyone just speak Object, with procedural slang and set-theoretical poetry?
--
make install -not war
I might also introduce keywords POSSIBLY and CERTAINLY that collapse tri-state logic (true, false, maybe) into boolean logic. Thus, POSSIBLY(a = 5) would be true when a is UNKNOWN but CERTAINLY(a = 5) would be false.
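A small sketch of how those two keywords might behave, assuming Python's None stands in for an unknown value and for the UNKNOWN truth value. The function names are made up for illustration:

```python
def sql_eq(a, b):
    """SQL-style equality: if either side is unknown, so is the result."""
    if a is None or b is None:
        return None  # UNKNOWN
    return a == b

def possibly(truth):
    """Collapse three-valued logic to boolean: UNKNOWN counts as true."""
    return truth is not False

def certainly(truth):
    """Collapse three-valued logic to boolean: UNKNOWN counts as false."""
    return truth is True
```

So `possibly(sql_eq(None, 5))` is True while `certainly(sql_eq(None, 5))` is False; for known values the two agree.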
Date advocates a different approach - no NULL at all. Instead, he has some sort of parallel table structure; a row in one table for the value being present and in another for the value being absent, with some more complex way of constraining it so there would be no contradictory information in the tables. I don't like this approach - having no NULLs seems simpler than having two, but not once you add in the weirdness of constraints. And not once you realize many tables have multiple nullable columns. Joining so many tables together would get ridiculous quickly.
In practice NULL seems to not be a huge problem for me. Occasionally a field can be either unknown or inapplicable, and I need to distinguish between the two; I have to do a kludgy thing with another field and a CHECK constraint. But for the most part, it's just an extra half second of thought when writing the logic, which isn't too bad. But it does trip newcomers. It would be worth fixing if you were designing a new relational query language from scratch.
Yeah, yeah. We all know that hardly anything the W3C does is a true, bona fide ISO standard, but they are still standards as far as most people are concerned. No need to be so pedantic.
You just have to expand your mind a teensy bit. You can do with a single line of SQL what may take several pages to do with an object-oriented language. I can understand how someone who isn't used to that much power may find SQL a bit confusing at first, but with a little effort and experimentation, it starts to become second nature.
Not that it really matters to ME. I took a test for my current job that included a few very basic SQL queries. Not only did I ace it, but I was later told that most of the other candidates who claimed to know SQL wound up handing the test back blank. So, your refusal to adapt to a new and powerful tool is giving me a distinct competitive advantage. Perhaps I should just say "thank you."
--A/C
Why don't you participate in the development of one of the OSS database systems (like Postgres) to [add temporary views]? Bingo, there is your 'variable' functionality without inventing a whole new language.
1. It is not that likely that I will get a chance to use it at work.
2. SQL stinks in other ways that I would like to see fixed. In other words, I think it is time to explore a complete overhaul. Why be stuck with a "good enough" language forever? We are finally getting away from COBOL as the dominant biz language, so how about we work to get away from its relational cousin known as SQL?
They keep tacking stuff onto COBOL to try to modernize it, but the result is a language only a mother can love. They are even adding OOP to it.
Table-ized A.I.
> Basically, if you have to mentally add an implicit "AND X IS NOT NULL" to every condition, wouldn't it be better to make everything explicit and clear instead?
The wacky part is using NULLs as a primary excuse to develop another language. First off - NULLs are optional in relational databases. Don't like them? Fine, don't use them. Declare your columns 'NOT NULL'. It's that easy.
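A quick sketch of what that opt-out looks like in practice, assuming SQLite through Python's sqlite3 and a hypothetical `accounts` table:

```python
import sqlite3

def insert_null_into_not_null():
    """Try to store NULL in a NOT NULL column; return the error text, if any."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    try:
        conn.execute("INSERT INTO accounts (id, balance) VALUES (1, NULL)")
    except sqlite3.IntegrityError as exc:
        # The database rejects the row instead of storing a NULL
        return str(exc)
    finally:
        conn.close()
    return None
```

The insert fails with an integrity error, so a column declared this way never holds a NULL to trip over later.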
Secondly, most work-arounds for unknown data suck:
* The easy ones involve keeping an 'unknown' value row in most of your tables. That works great - except for high cardinality columns that aren't lookups - like monetary fields, etc.
* the more common easy work-around is to reserve some value as NULL (2000/01/01 for a date, -1 for an integer column, 'n/a' for a varchar, etc). This really sucks - not only do you have to exclude this from your queries, but you need to know exactly what value to check for, and it's very likely that you'll need a variety of values for each type.
* the approach I've seen most often by the more skilled of the anti-null crowd involves the creation of more tables - in order to isolate the nullable columns onto other tables that have condition rows. That's ok - but it involves an *explosion* in the size & complexity of the data model. There are some benefits - like it may do a better job of explicitly describing the subject process. But the downside is that we already *denormalize* (prejoin) many tables together for better performance - primarily for reporting & analysis. Of course, the anti-null crowd also believes that denormalization is a sin - and that database vendors could theoretically provide great performance without denormalization. Unfortunately, their supporting analysis is based upon transactional rather than analytic systems. Ultimately, this approach looks like something that was baked up by people who might work on database software, but don't actually build real-world systems using it.
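The reserved-value workaround from the second bullet is easy to get wrong in aggregates. A rough sketch, assuming SQLite via Python's sqlite3, two made-up tables, and -1 as the reserved "unknown" value:

```python
import sqlite3

def average_amounts():
    """Compare AVG over a NULL-able column with AVG over a -1 sentinel."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE with_null (amount INTEGER)")
    conn.execute("CREATE TABLE with_sentinel (amount INTEGER NOT NULL)")
    conn.executemany("INSERT INTO with_null VALUES (?)",
                     [(10,), (20,), (None,)])
    conn.executemany("INSERT INTO with_sentinel VALUES (?)",
                     [(10,), (20,), (-1,)])  # -1 means "unknown" here

    def scalar(sql):
        return conn.execute(sql).fetchone()[0]

    result = {
        # AVG skips NULLs on its own
        "null": scalar("SELECT AVG(amount) FROM with_null"),
        # forgetting the sentinel silently skews the answer
        "sentinel_naive": scalar("SELECT AVG(amount) FROM with_sentinel"),
        # every query has to remember the magic value
        "sentinel_filtered": scalar(
            "SELECT AVG(amount) FROM with_sentinel WHERE amount <> -1"
        ),
    }
    conn.close()
    return result
```

The NULL version averages only the known amounts, while the sentinel version gives a wrong figure unless every query remembers to exclude -1.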