"It can be used to specify what a concept means and how it relates to (some) other concepts. "
The RM has this as well.
"We both have to agree to use the same metalanguage, like OWL, and we have to agree on some of the vocabulary in the language that we are making up. But, with a relational database, or indeed XML, you have to agree on all of it."
Nope. In the case of the RM, you'd need to agree on domains (which is basically what OWL is specifying with their 'class' attribute). "Bob IS-A Person" is the same as Bob "element of" domain person.
"Good grief, and you accuse the semantic web people of having their head in the clouds. Performance is a key issue in the success or failure of any system."
I accused no one of such things. I agree performance is critical -- but in this case, performance is damning the SQL PRODUCTS, not the relational model. Would you damn internal combustion engines (on performance grounds alone, orthogonal to other considerations) if you drove a car that got 0.1 MPG because of a flawed engine design? No, you'd go back and try to improve the implementation (namely the engine design).
"SQL is based around a relational data model. The relational data model is not designed to store hierarchives, or graphs or any sort. You can not make graph queries against the data, and it wont work efficiently."
How is the RM not designed to store hierarchies???! That is patently and wholly false!
SQL is bad for a number of reasons, none of which have to do with the Relational Model.
I'm interested as to what formal limitations predicate logic has when it comes to data management (what we're talking about is data management - integrity, manipulation, storage, etc.). Am I saying that predicate logic will never be replaced by something better? No, but you've not shown that predicate logic is inadequate when it comes to data management.
"I'm sorry, I don't know what point you think you're proving" You said that "The relational model is not set theory, it's the relational calculus." which is incorrect. "I don't want to store them in a relation, they're nodes of a tree!"
Why does it matter if the relational model can provide adequate (which it does) manipulation and integrity what it is stored as?
Remember that when you 'implement' a tree (say in C++) you are not creating a tree-like data structure in memory. You're merely making a really big linked-list (to be slightly imprecise) in a linear memory system.
"If I can't express my manipulations within that model, I'm stuffed! (Or rather stuck doing manipulations by hand in a computationally complete programming language extension... say PL/SQL)"
Although, loosely speaking, SQL is relationally complete, it does plenty of things wrong and/or in a very roundabout way.
From DBDebunk:
First, SQL additionally supports lots of things that are deliberately excluded from the relational model (pointers are a good example), so it lets you do lots of nonrelational things--a fact that introduces a great deal of complexity, though it doesn't add any power (you can't do anything useful with those nonrelational features that you couldn't do without them).
Second, there are some relational things that can be done in SQL only with a great deal of circumlocution. Try writing a relational comparison in SQL, for example, or an SQL constraint to express a general join dependency.
Third, SQL is an extremely redundant language, in the sense that all but the most trivial of queries can be formulated in literally thousands of different ways: another fact that makes life extremely difficult for the poor old user--and, likely as not, means the user has to get into performance issues, since those different formulations will almost certainly not all have identical performance characteristics. [Ed. Note: In 1987 I published in Database Programming and Design the results of running the same query expressed 7 different ways against 5 different SQL products and got response times ranging from 2 seconds to 2500 seconds!!! Since then SQL has become even more redundant.]
Fourth, never mind that SQL was originally meant to be a language for dealing with relational databases specifically--the fact is, it's pretty badly designed considered simply as a language. I mean, there are several well-established principles of good language design, but there's little evidence that SQL was designed in accordance with any of them. As a consequence, its syntactic and semantic rules can be very difficult to learn and remember. For example, do you know the rules for writing join expressions and/or union expressions?--which, I can't help adding, are not just different from each other but almost completely different. E.g., A NATURAL JOIN B is legal but A UNION B is not, while SELECT * FROM A UNION SELECT * FROM B is legal but SELECT * FROM A NATURAL JOIN SELECT * FROM B is not.
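To make the redundancy point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customer/orders tables and their rows are made up for illustration and are not from the DBDebunk piece. It states one question ("which customers have at least one order?") three different ways and gets the same relation back each time. Whether the three run equally fast is up to the product, which is exactly the complaint.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         cust_id INTEGER NOT NULL REFERENCES customer(cust_id));
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# Three formulations of "names of customers with at least one order".
formulations = {
    "in_subquery": """
        SELECT name FROM customer
        WHERE cust_id IN (SELECT cust_id FROM orders)
        ORDER BY name""",
    "exists": """
        SELECT name FROM customer c
        WHERE EXISTS (SELECT 1 FROM orders o WHERE o.cust_id = c.cust_id)
        ORDER BY name""",
    "join_distinct": """
        SELECT DISTINCT c.name
        FROM customer c JOIN orders o ON o.cust_id = c.cust_id
        ORDER BY c.name""",
}

# Run every formulation and collect its rows.
results = {label: conn.execute(sql).fetchall() for label, sql in formulations.items()}
for label, rows in results.items():
    print(label, rows)
```

Three is nowhere near "thousands", but it is enough to show that the user, not the language, ends up choosing among equivalent spellings.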
"As for your manipulations, they're only declarative if they're expressible in the logical model... And mine are NOT!"
But they are. See next response.
"The constraints that I'm talking about are precisely the tree structure itself, the form of nodes and their linkage"
Given a set of nodes and linkages you can easily generate a logical model:
Tree( Link, Node_Data (whatever that may be), Parent, Level )
I can't really draw a picture but maybe this will help (taken from Pr
"The problem is that relational model performs very badly with this sort of data."
Performance is an implementation detail. Namely, the SQL DBMS (if that's what they're using) behind the GO DB is inadequately implementing hierarchies (if it implements them at all). It's up to the vendors et al to "get it right" and they haven't yet.
As for the last part, I'm unfamiliar with the problems the GO DB ran into, so I can't say whether it's a problem with the RDBMS approach (unlikely) or with specific SQL products (most likely).
"How will you build a distributed system which can still maintain referential integrity like a RDBMS."
Not sure what you're asking. Does XML/RDF/etc. provide for referential integrity in this fashion? How does the DNS system verify that entries in outlying servers are 'correct'?
But if you're asking something like "If I'm sent a relation XYZ which refers to other relations for integrity don't I have to have those sent to me as well?" -- well, if you want to do any local manipulation (UPDATE/INSERT) of that data and have it still be correct, then you can either have your local DBMS store *something* or you can have your local DBMS perform the constraint checking across the wire. For example, if you get a list of orders with customer IDs embedded in them and you try to locally insert a new one, then a series of constraints would have to fire in order to verify that you can do what you're trying to do. As I said, this can be accomplished a number of ways and is by no means an unsolvable problem (it might be difficult to implement, which is probably why XML mostly doesn't care about that).
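The orders/customer case can be sketched locally. This is only the "local DBMS stores *something*" variant, with everything in one SQLite database; checking across the wire is not shown. The table names, columns, and rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless asked.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        cust_id  INTEGER NOT NULL REFERENCES customer(cust_id)
    );
    INSERT INTO customer VALUES (1, 'Ann');
    INSERT INTO orders VALUES (100, 1);
""")

def try_insert(order_id, cust_id):
    """Return True if the DBMS accepts the order, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, cust_id))
        return True
    except sqlite3.IntegrityError:
        return False

known_customer_ok = try_insert(101, 1)      # customer 1 exists
unknown_customer_ok = try_insert(102, 999)  # customer 999 does not
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(known_customer_ok, unknown_customer_ok, order_count)
```

The application never checks the customer ID itself. The declared constraint fires on insert and the bad row never lands.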
"This would work but only if people were all using the same schema."
Isn't that the point of RDF and OWL etc.? That everyone has to use the same XML tags (or XSD) in order for this whole thing to work?
If the message is not agreed upon beforehand then you are correct in saying that there can be no automatic communication. Of course, XML does not make this problem go away.
The semantic web is a way to interrelate data from disparate systems, no? It's trying to turn the gobs and gobs of 'semi-structured' data into something meaningful. The RDBMS does this already in a really nice way. As I said before -- if people didn't create 'semi-structured' data in the first place we would already have the semantic web. The RDBMS is the best solution to create, store, and manipulate data that the semantic web needs.
"The relational model is not set theory, it's the relational calculus."
Nope, read Codd's paper if you don't believe me.
Basically the RM is (taken from DBDebunk.com):
Theory: predicate logic and set mathematics
Structure: R-tables (precise definition!)
Integrity: domain, column, table and database integrity
Manipulation: R-operations (restrict, project, join, etc.)
"What more I want is to manipulate my data, not just represent it!"
The relational calculus of which you spoke is one way to manipulate it! That's the "Manipulation" part above.
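To show how close the "Manipulation" part sits to plain set mathematics, here is a rough sketch of restrict, project, and natural join over Python frozensets. The representation (a tuple as a frozenset of attribute/value pairs) and the helper names are my own assumptions for illustration; this is not how any DBMS implements them, and it ignores domains and integrity entirely.

```python
from typing import Any, Callable, Dict, FrozenSet, Iterable, Tuple

# A tuple is a frozenset of (attribute, value) pairs; a relation is a set of tuples.
Row = FrozenSet[Tuple[str, Any]]
Relation = FrozenSet[Row]

def relation(*rows: Dict[str, Any]) -> Relation:
    """Build a relation from dicts; duplicate rows collapse because it is a set."""
    return frozenset(frozenset(r.items()) for r in rows)

def restrict(rel: Relation, pred: Callable[[Dict[str, Any]], bool]) -> Relation:
    """Keep only the tuples satisfying the predicate."""
    return frozenset(t for t in rel if pred(dict(t)))

def project(rel: Relation, attrs: Iterable[str]) -> Relation:
    """Keep only the named attributes; duplicates vanish automatically."""
    keep = set(attrs)
    return frozenset(frozenset((a, v) for a, v in t if a in keep) for t in rel)

def join(left: Relation, right: Relation) -> Relation:
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                out.add(frozenset({**ld, **rd}.items()))
    return frozenset(out)

# Hypothetical data: who is managed by employee 1?
employee = relation({"emp_id": 1, "name": "Ada"}, {"emp_id": 2, "name": "Ben"})
manages = relation({"manager_id": 1, "emp_id": 2})
managed = project(
    restrict(join(employee, manages), lambda t: t["manager_id"] == 1),
    ["name"],
)
print(managed == relation({"name": "Ben"}))
```

Every operation takes relations and returns a relation, so they compose freely. That closure property is what the result "comes back as a relation" means.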
"In my example it's precisely my point that the different nodes cannot be in the same relation"
What example of business data would require you to store different entities in the same relation?
"you lose the ability to manipulate your data in the form is actually takes."
The RM is a logical model -- if you were a DBMS maker and you really wanted to implement (e.g. on the physical level) a hierarchy as a series of nodes and pointers you can. However, from an application standpoint you don't need to store the hierarchy in any other fashion.
Tell me -- how do you apply constraints and business rules to your tree in your example? Procedurally, and "by hand" (e.g. you have to code your own rules). The RM provides for declarative constraints AND declarative manipulation tools which are far simpler than the procedural counterparts (and as we all know, the simpler the better, less buggy, etc.).
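As a sketch of what "declarative" buys you for a tree, the following declares three structural rules once and lets the DBMS enforce them on every insert. The node/edge split and the column names are assumptions for this example, loosely following the Tree( Link, Node_Data, Parent, Level ) idea. Note the limits: plain column constraints like these do not rule out longer cycles, which would need a more general constraint than SQLite offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE node (link INTEGER PRIMARY KEY, node_data TEXT NOT NULL);
    CREATE TABLE edge (
        child  INTEGER PRIMARY KEY REFERENCES node(link),  -- at most one parent per node
        parent INTEGER NOT NULL REFERENCES node(link),     -- the parent must exist
        CHECK (child <> parent)                            -- no node is its own parent
    );
    INSERT INTO node VALUES (1, 'root'), (2, 'a'), (3, 'b');
    INSERT INTO edge VALUES (2, 1), (3, 1);
""")

def violates(sql):
    """Return True if the DBMS rejects the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

checks = {
    "second_parent": violates("INSERT INTO edge VALUES (2, 3)"),    # node 2 already has a parent
    "self_loop": violates("INSERT INTO edge VALUES (1, 1)"),        # root pointing at itself
    "missing_parent": violates("INSERT INTO edge VALUES (1, 42)"),  # node 42 does not exist
}
print(checks)
```

No application code inspects the tree here. Each rule is stated once in the schema and cannot be bypassed by a client that forgets to check.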
"You said that there are no graphs that have cycles"
"This kind of data is everywhere. Why do you think XML exists? Why do you think functional programming languages have such a following?"
I'm talking about the mixing of different entity types in your example (float mixed in with string mixed in with whatever). It's clearly a case of confused design. But that's neither here nor there.
Huh? You say I can store a tree, but that I can't query for a (sub)tree is 'moot'?!?
No, I'm saying that your entity types are mixed and so they'd never be in the same relation to begin with.
I think you're confusing graphs and trees.
Check your math -- a tree is a specialized graph.
Maybe I should talk to the wall instead... no, I'll tell you what, I'll get back to my banana farming!
You're the one who has not provided any substance to your argument -- simply because you are unaware of how to do it in the relational model does not mean that it can't be done. Trees (and graphs in general) are mathematically defined using sets and so therefore can be logically represented in the relational model by definition.
A trivial example would be some sort of manager hierarchy:
Employee( EmpID, Name, etc. )
Manages( ManagerID, EmpID )
That is a set-based logical model of the data in the hierarchy "Manages".
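For what it's worth, here is a sketch of the Employee/Manages model in an actual SQL product, with a subtree query. It assumes a DBMS that supports SQL:1999-style recursive common table expressions (SQLite does, via WITH RECURSIVE); the lowercase names, the sample employees, and the subtree helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE manages (
        manager_id INTEGER NOT NULL REFERENCES employee(emp_id),
        emp_id     INTEGER PRIMARY KEY REFERENCES employee(emp_id)
    );
    INSERT INTO employee VALUES (1,'Ada'),(2,'Ben'),(3,'Cal'),(4,'Dee'),(5,'Eve');
    INSERT INTO manages VALUES (1,2),(1,3),(2,4),(4,5);
""")

# Everyone below :root, with their distance from it.
SUBTREE = """
    WITH RECURSIVE reports(emp_id, depth) AS (
        SELECT emp_id, 1 FROM manages WHERE manager_id = :root
        UNION ALL
        SELECT m.emp_id, r.depth + 1
        FROM manages m JOIN reports r ON m.manager_id = r.emp_id
    )
    SELECT e.name, r.depth
    FROM reports r JOIN employee e ON e.emp_id = r.emp_id
    ORDER BY r.depth, e.name
"""

def subtree(root_id):
    """Return (name, depth) for every direct or indirect report of root_id."""
    return conn.execute(SUBTREE, {"root": root_id}).fetchall()

print(subtree(2))  # [('Dee', 1), ('Eve', 2)]
```

The answer to "give me the subtree under Ben" is itself a relation of (name, depth) rows, so it can be joined, restricted, or projected like anything else.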
The question would be: What would you do with such a database?
But more practically, from what it seems, those would not be in the same relation because they do not have the same schema. So the question is sort of moot.
However, if you wanted to, say, make the attribute of the same domain you could easily do this, provided there are no cycles in the graph (which would make it not a graph by definition).
Your results would come back as a relation.
Why do you think that hierarchies cannot be modeled in a relation or that you cannot manipulate them in a relational system?
You'll have to explain it a bit more clearly, then.
"by hand"? Are you referring to some sort of bizarro language which does all sorts of magic for you that is supposed to be some sort of 'gotcha!' moment?
Can you describe the problem using predicate logic and set theory? If so, you can model it relationally.
There's a chapter in "Practical Issues in Database Management" which illustrates how a RDBMS should handle hierarchies (e.g. trees as in your example) and how, stuck with SQL products, one would do it in SQL.
The relational model supports hierarchies and also supports constraints as complex as you would like. You have it confused with poor implementations (SQL products et al).
Not to say it must be in an RDBMS in order to go on the web, just that it must be in a distributed RDBMS in order to achieve the goals that this 'semantic web' business is trying to reach.
"I look forward to the web going down for schema updates."
One major point of the relational model is that internal schema changes would not affect applications -- that's what views are for.
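A small sketch of the view argument, assuming a made-up person table and a hypothetical application that only ever reads through a view called person_v. The base table is restructured underneath and the application's query is untouched. This only demonstrates reads; updating through views is a separate matter that SQL products handle unevenly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL);
    INSERT INTO person VALUES (1, 'Ann Lee'), (2, 'Bob Ray');
    CREATE VIEW person_v AS SELECT id, full_name FROM person;
""")

# The only query the (hypothetical) application knows about.
APP_QUERY = "SELECT id, full_name FROM person_v ORDER BY id"
before = conn.execute(APP_QUERY).fetchall()

# Internal schema change: split full_name into two columns in a new base table,
# then redefine the view so it still presents the old shape.
conn.executescript("""
    DROP VIEW person_v;
    CREATE TABLE person2 (
        id INTEGER PRIMARY KEY, given TEXT NOT NULL, family TEXT NOT NULL
    );
    INSERT INTO person2
        SELECT id,
               substr(full_name, 1, instr(full_name, ' ') - 1),
               substr(full_name, instr(full_name, ' ') + 1)
        FROM person;
    DROP TABLE person;
    CREATE VIEW person_v AS
        SELECT id, given || ' ' || family AS full_name FROM person2;
""")

after = conn.execute(APP_QUERY).fetchall()
print(before == after)
```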
Of course, the same (presumed) result (breakage) would occur should someone change RDF or OWL, too.
The real problem is that people are creating so-called semi-structured data in the first place. This is a band-aid approach to try and make sense of the large amounts of junk on the web. Unfortunately, it won't work (or won't work without significant headache/difficulty).
The real solution is a system of distributed RDBMSs. You create your content in the DBMS and then the DBMS serves it to clients which also have a DBMS embedded in them.
If the client is a 'web' browser, it then displays the content in whatever format you specified.
If the client is a search engine such as Google, it then adds the content to its own DBMS.
This has the additional advantage that no additional 'meta-data' is required because it's integrated into the relational model. There are also plenty of additional benefits that this so-called 'semantic web' nonsense does not have - and none of the drawbacks of the semantic web, either. In short: it's the best solution.
Dijkstra has a few things to say on the topic (which I've posted before but is always relevant):
On Education, Specifically:
The ongoing process of becoming more and more an a-mathematical society is more an American specialty than anything else (It is also a tragic accident of history).
The idea of a formal design discipline is often rejected on account of vague cultural/philosophical condemnations such as "stifling creativity"; this is more pronounced in the Anglo-Saxon world where a romantic vision of "the humanities" in fact idealizes technical incompetence. Another aspect of that same trait is the cult of iterative design.
Industry suffers from the managerial dogma that for the sake of stability and continuity, the company should be independent of the competence of individual employees. Hence industry rejects any methodological proposal that can be viewed as making intellectual demands on its work force. Since in the US the influence of industry is more pervasive than elsewhere, the above dogma hurts American computing science most. The moral of this sad part of the story is that as long as the computing science is not allowed to save the computer industry, we had better see to it that the computer industry does not kill computing science.
And then on Computer Science in general (could be extended to 'science'):
I hope very much that computing science at large will become more mature, as I am annoyed by two phenomena that both strike me as symptoms of immaturity.
The one is the widespread sensitivity to fads and fashions, and the wholesale adoption of buzzwords and even buzz notes. Write a paper promising salvation, make it a "structured" something or a "virtual" something, or "abstract", "distributed" or "higher-order" or "applicative" and you can almost be certain of having started a new cult.
The other one is the sensitivity to the market place, the unchallenged assumption that industrial products, just because they are there, become by their mere existence a topic worthy of scientific attention, no matter how grave the mistakes they embody. In the sixties the battle that was needed to prevent computing science from degenerating to "how to live with the 360" has been won, and "courses" -- usually "in depth"!-- about MVS or what have you are now confined to the not so respectable subculture of the commercial training circuit. But now we hear that the advent of the microprocessors is going to revolutionize computing science! I don't believe that, unless the chasing of dayflies is confused with doing research. A similar battle may be needed.
--Edsger W. Dijkstra, My Hopes Of Computing Science, 1979
"Context of action" -- you'll have to define that term. Otherwise the whole thing is unintelligible.
Business rules and security are data management issues which, by definition, belong in the DBMS.
Re:Blame should be shared between coder and langua (on "PHP and SQL Security")
What are you talking about? This doesn't make much sense.
Programmatic ways to express SQL?!?
Isn't OO development basically tying code and data?
Plus, business rules belong intimately tied to data -- that's so you can't circumvent them. Hence the whole idea of a relational database management system.
Suggesting that the XML be compressed to a binary format is just about crazy and illustrates the point I'm trying to make.
Let's go over the points:
1) it is text based so can be easily edited by humans when necessary
We're using a text format to describe (in essence) a picture. What benefit is there at all to being able to edit the text file by hand? Absolutely none! Do you think 3D animators build these things by hand? No, they load it into a complex and expensive program which visually renders the scene for them, then they drag and drop stuff around. Hell, you'd be hard pressed to find any game developer that draws their models by hand with glVertex.
2) there are XML editors that can simplify the process.
Again, useless in the context of a graphics file format.
3) It has many standard tools and toolchains
So do plenty of binary formats, and there's nothing stopping these 30 vendors from coming up with a simple set of APIs like the XML folks did, too.
4) XSLT is maturing nicely as a transformation engine
Which is again useless.
5) XQuery / XML DBMS
Useless in this context. And useless in general, too. We already have a better query language (SQL) and far better DBMS systems (Oracle, Sybase ASE, PostgreSQL, etc.)
6) Finally, it is by nature extensible, allowing for different ways to put in comments, add in vendor specific extensions that are easily ignored by other vendors (or used when possible), provide for upgrade paths and the like.
These capabilities are not unique to XML. Plus, if this is indeed supposed to be a standard why would you allow for vendor specific extensions? That is counter to the "universal" format in the first place!
So, in short, XML provides neither a practical nor a theoretical benefit to this file format. They should steer well clear of XML and come up with a compact, efficient file format. It doesn't necessarily have to be binary (there are far less verbose text formats than XML) but it sure as hell shouldn't be XML.
It may or may not outperform Oracle. I'm sure Oracle would be plenty fast in that situation, too, so it really doesn't matter. Is "faster" better than "acceptably fast"? I'm sure you'd get much better performance from writing your own DBMS without going through MySQL.
Of course, what about the costs that are incurred when you have to procedurally implement your own constraints in your application logic? Or the cost of cleaning up your data when something goes wrong? These kinds of costs are hidden in the single-query performance test.
"It can be used to specify what a concept means and how it relates to (some) other concepts. "
The RM has this as well.
"We both have to agree to use the same metalanguage, like OWL, and we have to agree on some of the vocabulary in the language that we are making up. But, with a relational database, or indeed XML, you have to agree on all of it."
Nope. In the case of the RM, you'd need to agree on domains (which is basically what OWL is specifying with their 'class' attribute). "Bob IS-A Person" is the same as Bob "element of" domain person.
What sort of real-world entity requires a 'heterogeneous' tree structure?
You can do that in SQL products by simply having 'NULL'able columns and then filling the ones you need. Not that NULLs are any good, mind you.
"You haven't even shown me a query for a homogenous subtree... "
That's because one does not exist (per se) in SQL.
If you use Oracle then you can try CONNECT BY.
The 'recommended' way in SQL would be to have something like this (where EXPLODE is a recursive table function):
"This calculus no more represents all of set theory than trees represent all graphs!)"
What parts of set theory does relational algebra miss?
"Good grief, and you accuse the semantic web people of having their head in the clouds. Performance is a key issue in the success or failure of any system."
.1MPG because of a flawed engine design? No, you'd go back and try and improve the implementation (namely the engine design).
I accused no one of such things. I agree performance is critical -- but in this case, performance is damning the SQL PRODUCTS, not the relational model. Would you damn internal combustion engines (on performance reasons alone orthogonal to other considerations) if you drove a car that got
"SQL is based around a relational data model. The relational data model is not designed to store hierarchives, or graphs or any sort. You can not make graph queries against the data, and it wont work efficiently."
How is the RM not designed to store hierarchies???! That is patently and wholly false!
SQL is bad for a number of reasons, none of which have to do with the Relational Model.
See my previous post.
"I'm sorry, I don't know what point you think you're proving"
You said that "The relational model is not set theory, it's the relational calculus." which is incorrect.
"I don't want to store them in a relation, they're nodes of a tree!"
Why does it matter if the relational model can provide adequate (which it does) manipulation and integrity what it is stored as?
Remember that when you 'implement' a tree (say in C++) you are not creating a tree-like data structure in memory. You're merely making a really big linked-list (to be slightly imprecise) in a linear memory system.
"If I can't express my manipulations within that model, I'm stuffed! (Or rather stuck doing manipulations by hand in a computationally complete programming language extension... say PL/SQL)"
Although, loosely speaking, SQL is relationally complete it does plenty of things wrong and/or in a very roundabout way.
From DBDebunk:
"As for your manipulations, they're only declarative if they're expressible in the logical model... And mine are NOT!"
But they are. See next response.
"The constraints that I'm talking about are precisely the tree structure itself, the form of nodes and their linkage"
Given a set of nodes and linkages you can easily generate a logical model:
Tree( Link, Node_Data (whatever that may be), Parent, Level )
I can't really draw a picture but maybe this will help (taken from Pr
So how does OWL/RDF make 'sense' out of a tag that I just made up?
"The problem is that relational model performs very badly with this sort of data."
Performance is an implementation detail. Namely, the SQL DBMS (if that's what they're using) that the GO DB is using are inadequately implementing hierarchies (if they implement them at all). It's up to the vendors et al to "get it right" and they haven't yet.
As for the last part, I'm unfamilliar with the problems the GO DB ran into, so I can't say whether or not it's a problem with RDBMS (unlikely) or with specific SQL products (most likely)
"How will you build a distributed system which can still maintain referential integrity like a RDBMS."
Not sure what you're asking. Does XML/RDF/etc. provide for referential integrity in this fashion? How does the DNS system verify that entries in outlying servers are 'correct'?
But if you're asking something like "If I'm sent a relation XYZ which refers to other relations for integrity don't I have to have those sent to me as well?" -- well, if you want to do any local manipulation (UPDATE/INSERT) of that data and have it still be correct, then you can either have your local DBMS store *something* or you can have your local DBMS perform the constraint checking across the wire. For example if you get a list of orders with customer IDs embedded in them and you try and locally insert a new one then there would have to be a series of constraints fired in order to verify that you can do what you're trying to do. As I said, this can be accomplished a number of ways but is in no means an unsolvable problem (it might be difficult to implement, which is probably why XML mostly doesn't care about that).
"This would work but only if people were all using the same schema."
Isn't that the point of RDF and OWL etc.? That everyone has to use the same XML tags (or XSD) in order for this whole thing to work?
If the message is not agreed upon beforehand then you are correct in saying that there can be no automatic communication. Of course, XML does not make this problem go away.
The semantic web is a way to interrelate data from disparate systems, no? It's trying to turn the gobs and gobs of 'semi-structured' data into something meaningful. The RDBMS does this already in a really nice way. As I said before -- if people didn't create 'semi-structured' data in the first place we would already have the semantic web. The RDBMS is the best solution to create, store, and manipulate data that the semantic web needs.
"The relational model is not set theory, it's the relational calculus."
Nope, read Codd's paper if you don't believe me.
Basically the RM is (taken from DBDebunk.com):
Theory: predicate logic and set mathematics
Structure: R-tables (precise definition!)
Integrity: domain, column, table and database integrity
Manipulation: R-operations (restrict, project, join, etc.)
"What more I want is to manipulate my data, not just represent it!"
The relational calculus of which you spoke is one way to manipulate it! That's the "Manipulation" part above.
"In my example it's precisely my point that the different nodes cannot be in the same relation"
What example of business data would require you to store different entities in the same relation?
"you lose the ability to manipulate your data in the form is actually takes."
The RM is a logical model -- if you were a DBMS maker and you really wanted to implement (e.g. on the physical level) a hierarchy as a series of nodes and pointers you can. However, from an application standpoint you don't need to store the hierarchy in any other fashion.
Tell me -- how do you apply constraints and business rules to your tree in your example? Procedurally, and "by hand" (e.g. you have to code your own rules). The RM provides for declarative constraints AND declarative manipulation tools which are far simpler than the procedural counterparts (and as we all know, the simpler the better, less buggy, etc.).
"You said that there are no graphs that have cycles"
That was a mistake on my part.
What more do you want?
"is the perfect model for all data forever"
Maybe, maybe not. Explain how it is inadequate given today's "computational power and storage requirements".
"Data must fit into the relational model"
All data can be represented relationally.
Sort-merge is very, very good if you have pre-sorted data (which is often the case with B-tree indexes and/or clustered indexes).
Sounds like the particular query in question wasn't using a sort-merge when it should (could) have.
... let it not be XML-based. If there is a God in heaven he will not let it be in XML!
Yikes! I'm an ASE DBA and I would never recommend MySQL. Heck, even the *free* version of ASE (ASE11.0.0.3 for Linux) is better than MySQL!
If you want a free, open-source DBMS I really like PostgreSQL, although I prefer ASE.
No DBMS currently 'supports the relational model'. They implement parts that are easy and ignore the rest.