Slashdot Mirror


With XML, is the Time Right for Hierarchical DBs?

DullTrev asks: "The hierarchical database model existed before the far more familiar relational model. Hierarchical databases were blown away by relational versions because it was difficult to model a many-to-many relationship - the very basis of the hierarchical model is that each child element has only one parent element. However, we now live in a web world that demands quick access to a variety of data on a variety of platforms. XML is being used to facilitate this, and XML has, of course, a hierarchical structure." Do you think a hierarchical database would really be a better answer for storing XML data over the existing relational counterparts?

"There have been some pushes to create pure XML databases (info on XML in connection to databases is here and info on XML database products is here) with claims that as they support XML natively, they can offer many advantages over relational databases.

Some of these claims include speed, better handling of audio, graphic and other digital files, easier administration, and handling of unexpected elements. Software AG, a German firm, produce and sell a suite of XML products, including Tamino, a native XML database. They have lots of information on why they think their database is great, not surprisingly, but no benchmarks. So, do the Slashdot community think that with XML the time has come for hierarchical databases? Or is it better simply to use a relational database that can output in XML, or script your way to achieve the same goal?"

276 comments

  1. SQL queried XML database in PHP by jayant_techguy · · Score: 2, Informative

    I found this SQL queried XML database in PHP. Seems very kewl.

  2. I don't think so. by webprogrammer · · Score: 2, Insightful

    Hierarchical databases won't take over because their relational counterparts are already so well developed. A relational database can do everything a hierarchical one can, with few exceptions. Even if there is a slight gain to using a hierarchical system, there are far fewer solutions, and consequently the ones that do exist aren't as well developed, so implementing one is more difficult.

    --
    Tim ODonnell (trying to be the most
    1. Re:I don't think so. by Netmonger · · Score: 4, Insightful

      I don't agree - look at LDAP. The benefits of LDAP'fying services are clear. With a hierarchical database, specific queries can target a subset of the entire database, without the overhead of having separate tables and/or databases for varying information. For keeping track of 'real world' objects: People, Printers, IPs, etc., the advantage is that the system used to organize them is similar to the actual grouping going on. Managers have employees 'underneath' them. It's basically taking the organizational concepts used for filesystems and applying them to database design. I haven't done any performance testing of LDAP vs. SQL for a similar schema setup, but from what I understand one of the other benefits is fast lookups. Sounds like a good project! To implement databases in both LDAP and SQL and measure the performance of similar queries!! :)

      --
      -- NeTMoNGeR
    2. Re:I don't think so. by webprogrammer · · Score: 1

      I agree with what you say, but what is it that doesn't allow you to do that in a relational database? There may be instances where a hierarchical database fits (as in the example you give), but how is this a great advantage over a relational database? Yes it may be quicker in some instances, but a properly designed relational based system could force you to connect only to the database that contains the necessary data. With your method, you'd have to connect to the whole database first, then target a subset. Of course, under some circumstances hierarchical may have the advantage. Probably a project like you say could be done fairly easily with something like PHP. I'm not sure about this, but you might even be able to use pre-supplied wrappers so you don't have to bother making sure your LDAP and SQL query mean the same thing. You could just use the same query for both, with wrappers around each. Maybe I'll try that sometime.

      --
      Tim ODonnell (trying to be the most
    3. Re:I don't think so. by petis · · Score: 1

      I agree with you. For example, Oracle implements a hierarchical search with the start with .. connect by prior .. statement. It works like a charm. I don't think this is possible to do in MySQL and friends, but it is probably just a matter of time.

    4. Re:I don't think so. by tzanger · · Score: 2

      I agree with what you say, but what is it that doesn't allow you to do that in a relational database?

      Multiple values for attributes immediately come to mind.

      For instance: Bob Smith has 3 phone numbers. With a hierarchical database such as LDAP, you simply list them as

      • telephoneNumber: (519) 555-1212
      • telephoneNumber: (604) 555-1212
      • telephoneNumber: (905) 555-1212

      In a relational database you must either leave room for the most you think you will run into, use a "joiner" table (the real term escapes me at this moment) or similarly kludge together a solution. Hierarchical databases are a pain in the ass for many things, but storing multivalued data is not one of an RDBMS' strong points.
    5. Re:I don't think so. by Craig+Davison · · Score: 1

      The "joiner" table is not a kludge.

      Besides, many LDAP implementations - ActiveDirectory for example - have a relational DB at their core.

      Disclaimer: I always hated LDAP.

    6. Re:I don't think so. by __aanonl8035 · · Score: 1

      Whats wrong with this?

      |id | custnum | custname | telephone |
      |1 | 45890 | Bob Smith | (519) 555-1212 |
      |2 | 45890 | Bob Smith | (604) 555-1212 |
      |3 | 45890 | Bob Smith | (905) 555-1212 |

    7. Re:I don't think so. by jd142 · · Score: 2, Interesting

      Nah, this example is pretty simple. You don't even have to use a join or bridge table, which is what I've heard them called. Those are only needed when you have two objects that have a many to many relationship. For example, if you were doing a database of computer repairs, you might have a table of customers and a table of techs. Since there would be a many to many relationship here, you'd have a work order table or something, to show that tech1 worked with cust1, cust2, cust1, cust3, cust3, and that cust1 had service calls by tech1, tech2, tech2, tech1, etc.

      In this case, unless you had a table of phone number data that contained information about the number (like who paid for it, the day it was installed, the type of service available, the type of line, etc) you could get by with just one employee/number table, like this:

      bobid phone1
      bobid phone2
      bobid phone3

      which is pretty simple, with a combination key of employid/phonenumb. You could still have a separate table with the phone number info, with the phone number as the primary key if you wanted to track the other data.
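
      A minimal sketch of the employee/number table described above, assuming SQLite through Python's sqlite3 module. The table name employee_phone and the fourth phone number are made up for illustration; only the employid/phonenumb column names come from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee_phone (
        employid  TEXT NOT NULL,
        phonenumb TEXT NOT NULL,
        PRIMARY KEY (employid, phonenumb)  -- the combination key
    )
    """
)
numbers = ["(519) 555-1212", "(604) 555-1212", "(905) 555-1212"]
conn.executemany(
    "INSERT INTO employee_phone VALUES (?, ?)",
    [("bobid", n) for n in numbers],
)

# A fourth (hypothetical) number needs no schema change, just another row.
conn.execute(
    "INSERT INTO employee_phone VALUES (?, ?)", ("bobid", "(416) 555-1212")
)

# The combination key rejects an exact duplicate of an existing pairing.
try:
    conn.execute(
        "INSERT INTO employee_phone VALUES (?, ?)", ("bobid", numbers[0])
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

bob_numbers = [
    row[0]
    for row in conn.execute(
        "SELECT phonenumb FROM employee_phone "
        "WHERE employid = ? ORDER BY phonenumb",
        ("bobid",),
    )
]
print(bob_numbers, duplicate_rejected)
```

      Nothing here caps how many numbers one employee can have, which is the point being argued: no "room for the most you think you will run into" is reserved.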

      Most people overthink relational databases and don't really break things down like they should and make well formed tables. Of course, you can change the table structure based on how the database is going to be used. Sometimes it is better to denormalize the table for search efficiency.

      What I think is most interesting are the OODBMS, but it seems to me that they would have an increased overhead on their searches.

      bob

    8. Re:I don't think so. by Anonymous Coward · · Score: 0

      Excuse me, but joining tables (which is what you would probably do in this example) isn't a kludge in relational databases, it's the bread and butter of what it is actually good for.

    9. Re:I don't think so. by brunson · · Score: 3, Informative

      This is a terrible example. You are trying to describe a scenario that requires a many to many relationship. The intermediary "joiner" or cross-reference table is only necessary if you have a need to keep both joined tables normalized, i.e. you want each distinct telephone number, as well as each person object, to be stored in the database only once.

      You've already given up the possibility of normalizing your phone numbers in the hierarchical model (my roommate's home phone is the same as mine and it shows up in LDAP twice, once for me and once for him), so a simple many to one join to the telephone number table will allow you to list a home phone twice, once for each of us.

      Now, if the data you are modeling truly requires a many to many relationship (your model needs to handle the real world, you can't change the world to fit the limitations of your tools), you have no way of representing that information in a normalized fashion in a hierarchical model. The so called "kludge" of an x-ref table from the relational world is not even an option.

      The hierarchical model is so limited and simplistic that it can be implemented in a single, self-referential table in a relational database, and can even be queried in a recursive manner (Oracle has had 'connect by prior' for dealing with these models since I started with the product 10 years ago).
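
      A rough sketch of that single self-referential table, assuming SQLite through Python's sqlite3 module. It uses a recursive common table expression (WITH RECURSIVE) as a stand-in for Oracle's 'connect by prior', which has different syntax; the node table and its sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL marks the root
        name      TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        (1, None, "company"),
        (2, 1, "engineering"),
        (3, 1, "sales"),
        (4, 2, "alice"),
        (5, 2, "bob"),
        (6, 3, "carol"),
    ],
)

# Walk the subtree rooted at 'engineering', tracking depth as we descend.
subtree = conn.execute(
    """
    WITH RECURSIVE descendants(id, name, depth) AS (
        SELECT id, name, 0 FROM node WHERE name = 'engineering'
        UNION ALL
        SELECT n.id, n.name, d.depth + 1
        FROM node AS n
        JOIN descendants AS d ON n.parent_id = d.id
    )
    SELECT name, depth FROM descendants ORDER BY depth, name
    """
).fetchall()
print(subtree)  # [('engineering', 0), ('alice', 1), ('bob', 1)]
```

      Each row has exactly one parent_id, so the table carries the same one-parent restriction as the hierarchical model, while the rest of the schema stays free to use ordinary joins.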

      From my view as a mathematician, and not a computer programmer, the relational model is so much more robust and powerful than a hierarchical model it hardly warrants discussion.

      --
      09F911029D74E35BD84156C5635688C0
      Jesus loves you, I think you suck
    10. Re: I don't think so. by Inthewire · · Score: 1

      Whats wrong with this?

      |id | custnum | custname | telephone |
      |1 | 45890 | Bob Smith | (519) 555-1212 |
      |2 | 45890 | Bob Smith | (604) 555-1212 |
      |3 | 45890 | Bob Smith | (905) 555-1212 |

      Well...no need to store the custname if you are already storing the custnum...duplication of data, and all that.
      You'd just have another table, keyed by custnum, that would hold 'Bob Smith'

      --


      Writers imply. Readers infer.
    11. Re:I don't think so. by DavidJA · · Score: 1

      Because it breaks the rules of normalization(sp?). How much space is being wasted by repeating the CustName, CustNum & ID 3 times? And what if this customer has orders? Which instance of Bob Smith does the order get associated with?

      A better solution is to create two tables. One for customers (id/custnum/custname) and one for contact numbers (id/custid/phonenumber), with custid being the same number that is stored in the first table.
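
      A small sketch of the two-table layout, assuming SQLite through Python's sqlite3 module. The table names, the choice to have custid point at the customer's id column, and the new name 'Robert Smith' are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (
        id       INTEGER PRIMARY KEY,
        custnum  INTEGER NOT NULL UNIQUE,
        custname TEXT NOT NULL
    );
    CREATE TABLE contact_number (
        id          INTEGER PRIMARY KEY,
        custid      INTEGER NOT NULL REFERENCES customer(id),
        phonenumber TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 45890, 'Bob Smith');
    INSERT INTO contact_number (custid, phonenumber) VALUES
        (1, '(519) 555-1212'),
        (1, '(604) 555-1212'),
        (1, '(905) 555-1212');
    """
)

# The name lives in one row, so a rename touches exactly one row.
renamed = conn.execute(
    "UPDATE customer SET custname = 'Robert Smith' WHERE custnum = 45890"
).rowcount

# Every phone number picks up the new name through the join.
rows = conn.execute(
    """
    SELECT c.custname, p.phonenumber
    FROM customer AS c
    JOIN contact_number AS p ON p.custid = c.id
    ORDER BY p.phonenumber
    """
).fetchall()
print(renamed, rows)
```

      An order table would reference customer.id the same way, which answers the "which instance of Bob Smith" question: there is only one.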

      For more info try here

    12. Re:I don't think so. by drodver · · Score: 3, Insightful

      Why do you assume relational databases are more developed than hierarchical?? The company I work for has been using our own hierarchical database for 25 years. They had the potential to become what Oracle is today but decided to stay focused on the medical industry. The serious problem with relational databases is they have traditionally not handled sparse data well at all. In the case of a patient, every time they come for a visit there are tens of thousands of possible data points that can be entered, but most usually are empty. For tasks such as these relational databases have been completely impractical. With the use of indexing a hierarchical database can do everything a relational database can do.

    13. Re:I don't think so. by DJerman · · Score: 2
      Both are right and wrong. The denormalized tables can be made blindingly fast with appropriate index(es), but the joined tables are more space-efficient and flexible. You must know the "typical query" and know the number and kind of records and know the physical layer before you can do the math and settle the question.

      Third normal form is wrong if you *always* join two tables in the same way. You may waste storage with the un-normalized table but if you always join the tables you're wasting time and swap space (temp segments, whatever) reconstructing the d**n thing over and over again. Build it once and be done. OTOH, if you usually just pick other information and get phone numbers once in a blue moon, normalize away. In oracle, choose a cluster and almost get both benefits (sacrificing both space and time, but less).

      --
    14. Re:I don't think so. by The+Raven · · Score: 1

      Perhaps a poorly designed relational database would be impractical. Something like the following is Standard Operating Procedure in the case of sparse data for relational databases:

      Patient
      -------
      PatientID
      Name

      Data
      ----
      DataID
      DataTypeID
      PatientID
      Value

      If you have no data points for a customer, no space is used. If you have 500, 500 records are stored. Simple as that. It is called a One-to-Many relationship. Hierarchical databases ONLY can do One-to-Many relationships. Flat File databases cannot do relationships at all... Flat databases DO have the problem you mentioned. Relational databases do not. Perhaps you have not looked at relational databases since 25 years ago?
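
      A quick sketch of the Patient/Data layout above, assuming SQLite through Python's sqlite3 module. The column types, the patient names, and the DataTypeID codes are made up for illustration; only the table and column names come from the schema in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Patient (
        PatientID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL
    );
    CREATE TABLE Data (
        DataID     INTEGER PRIMARY KEY,
        DataTypeID INTEGER NOT NULL,
        PatientID  INTEGER NOT NULL REFERENCES Patient(PatientID),
        Value      TEXT
    );
    INSERT INTO Patient VALUES (1, 'Patient A'), (2, 'Patient B');
    -- Hypothetical data types: 10 = pulse, 20 = temperature.
    INSERT INTO Data (DataTypeID, PatientID, Value) VALUES
        (10, 1, '72'),
        (20, 1, '36.8');
    """
)

# Count stored data points per patient; empty data points occupy no rows.
counts = dict(
    conn.execute(
        """
        SELECT p.Name, COUNT(d.DataID)
        FROM Patient AS p
        LEFT JOIN Data AS d ON d.PatientID = p.PatientID
        GROUP BY p.PatientID
        ORDER BY p.PatientID
        """
    )
)
print(counts)  # {'Patient A': 2, 'Patient B': 0}
```

      The tens of thousands of possible data points become possible DataTypeID values rather than columns, so only the ones actually recorded take up space.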

      Raven
      --
      "I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.
    15. Re:I don't think so. by tzanger · · Score: 2

      Disclaimer: I always hated LDAP.

      I'm glad you said that; I dislike LDAP too, but it definitely has its place, as do hierarchical databases. Please see my other comment in this thread for more information.

    16. Re:I don't think so. by tzanger · · Score: 2

      You've already given up the possibility of normalizing your phone numbers in the hierarchical model (my roommate's home phone is the same as mine and it shows up in LDAP twice, once for me and once for him), so a simple many to one join to the telephone number table will allow you to list a home phone twice, once for each of us.

      How have I given up the possibility of normalization? It's normal for more than one person to have the same phone number! I would fully expect the number to be duplicated: If I'm searching for you, I find your number. I don't want to have to split the many-to-one reference later when you move! That makes absolutely no sense for 99% of directory applications!

      Hell I'll even go and figure it out (for LDAP anyway): If your directory is primarily interested in telephone numbers, you would organize it such that the DN would have the telephone number in it. In that case, the number for you would have two person entries: one for your roommate and one for you. Most instances will have DNs without phone numbers in them though, because phone numbers tend to change.

      Now, if the data you are modeling truly requires a many to many relationship (your model needs to handle the real world, you can't change the world to fit the limitations of your tools), you have no way of representing that information in a normalized fashion in a hierarchical model. The so called "kludge" of an x-ref table from the relational world is not even an option.

      Exactly my point: There is no WonderTool; each tool has a specific purpose. I have a more in-depth comment elsewhere in this thread which goes into why RDBMSes aren't suited for this type of application.

    17. Re:I don't think so. by KenSeymour · · Score: 1

      What about updates?

      So if you repeat the name 3 times, and you have an update screen for individual records, what do you do?

      1) find all the other records with the same name and change them as well.
      2) change only the name that was selected
      3) offer the user the choice of changing the name for the one record or all of them.

      So let's say you let the user change one of three Bill Smith entries to William Smith, then you have
      the problem of trying to recall all the records for that person.
      Do you query for first name = 'Bill' or first name = 'William' ?

      If you use normal form, you avoid all these problems.
      I just love people who denormalize tables early in a project over concerns about performance.
      They don't consider the costs of having to fiddle
      with all the data problems that result.

      An engineer's opinions about performance may have been formed back when we were all using 80286 chips and 8 bit disk controllers.
      I would like to encourage people to at least try
      a normalized database.
      Get a competent DBA to help you with indexes to make sure your queries run fast.
      Try different things and measure them.

      You would be amazed what a DBMS can do with your joined tables
      on a well tuned database on good modern hardware.

      --
      "We can't solve problems by using the same kind of thinking we used when we created them." -- Albert Einstein
    18. Re:I don't think so. by Anonymous Coward · · Score: 0

      If there's a one to one relationship, relational databases work fine. For one to many databases, you significantly underestimate the advantages (speed, flexibility, maintenance, scalability) of hierarchical databases.

      I work with hierarchical database software. It was hierarchical since day one (sometime in the 1970s) and currently handles vast amounts of data quickly and reliably. Another division has a product that does essentially the same thing, but uses a relational database, and is only sold to small customers.

    19. Re:I don't think so. by Funkitup · · Score: 1

      I have done quite a lot of work with relational databases and have implemented my own 'object' database. This is similar to a hierarchical database, except that the relationships between objects are defined by other objects, which is a more flexible approach than a purely hierarchical DB. The object database used SQL Server to store the data underneath; however, I am going to redevelop this with a different approach.

      What are the benefits of using an 'object' based approach?

      1. Flexibility of schema
      It is possible to link objects to each other very easily in many possible ways. This allows data to be amalgamated from many different schemas.

      2. Ease of use
      Partly due to the flexibility of the schema, object based databases are fundamentally easier for the layman to comprehend. They tend to be a much clearer abstraction than the obfuscated table format used by relational databases.

      3. Richness of data
      The data (as long as it is intelligently entered) is stored with information that explains the meanings of the relationships between objects far better than column names and table names. In the past I've had to look at database definitions and struggled to actually understand exactly what the data is supposed to represent.

      As I said earlier I am getting around to coding my own object based database, free software of course! The idea is to set up a group of 'info servers' around the world which hold public information. Sort of like the WWW but the information will be much richer and publicly moderated. Anyone interested in more information or helping out (coding or otherwise) should email me at infoglue@yahoo.com.au without the spam protection tags!

    20. Re:I don't think so. by flacco · · Score: 2
      In a relational database you must either leave room for the most you think you will run into,

      No self-respecting DB designer would do this unless there was a very good special-case reason to do so.

      use a "joiner" table (the real term escapes me at this moment) or similarly kludge together a solution. Hierarchical databases are a pain in the ass for many things, but storing multivalued data is not one of an RDBMS' strong points.

      The "joiner" table you refer to is hardly a "kludge". It is an accurate representation of the association between items.

      Here's an example of your hierarchy breaking down:

      Let's say one of the telephoneNumber items is actually the front desk number in an office shared by a few dozen employees. Now let's say the number changes - you have to change that number in several dozen places.

      In a properly modeled database, you change the number in one place, and the "joiner" tables just point to it, so they get the change automatically.
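
      A minimal sketch of the front desk scenario, assuming SQLite through Python's sqlite3 module. The table names, the generated employee names, and the specific old and new numbers are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE phone  (id INTEGER PRIMARY KEY, number TEXT NOT NULL UNIQUE);
    CREATE TABLE person_phone (          -- the "joiner" table
        person_id INTEGER NOT NULL REFERENCES person(id),
        phone_id  INTEGER NOT NULL REFERENCES phone(id),
        PRIMARY KEY (person_id, phone_id)
    );
    INSERT INTO phone VALUES (1, '(519) 555-1212');
    """
)

# Three dozen hypothetical employees all share the front desk number.
for i in range(1, 37):
    conn.execute("INSERT INTO person VALUES (?, ?)", (i, f"employee{i}"))
    conn.execute("INSERT INTO person_phone VALUES (?, 1)", (i,))

# The number changes: one row is updated, not thirty-six.
changed = conn.execute(
    "UPDATE phone SET number = '(604) 555-1212' WHERE id = 1"
).rowcount

# Count how many people now resolve to the new number via the joiner table.
seen = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT ph.number)
    FROM person AS p
    JOIN person_phone AS pp ON pp.person_id = p.id
    JOIN phone AS ph ON ph.id = pp.phone_id
    WHERE ph.number = '(604) 555-1212'
    """
).fetchone()
print(changed, seen)  # 1 (36, 1)
```

      The joiner rows hold only ids, so they never need touching when the number itself changes.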

      Personally I think directory information would be much better represented in a relational DB, but I understand the trade-off in the interest of speed.

      --
      pr0n - keeping monitor glass spotless since 1981.
    21. Re:I don't think so. by Anonymous Coward · · Score: 0

      If by "Hierarchical Databases" you mean "XML-ish Databases", then I kind of agree, but only time will tell. There's a lot of data that will fit this model nicely.

      If by "Hierarchical Databases" you mean "all non-flat file relational databases" or "all multi-dimensional databases", then I disagree with you 100%. There is a whole field of multi-dimensional databases out there, there are plenty of solutions and the implementation is almost always easier, cheaper and faster. In addition, the long term support is essentially guaranteed to be easier and require less man-power than any of the non-free (e.g. Oracle, M$SQL, Informix, etc.) solutions out there.

    22. Re:I don't think so. by MikeBabcock · · Score: 2

      The only place I've ever found relational databases to be lacking is in the self-referencing department.

      You have an ID, a value and a ParentID, for example, where the ParentID refers to the ID in its own table. This is fine for describing the data, but querying it well is very difficult and the subject of many discussions.

      Look up JBase or other Pick derivatives for some non-relational databases (multi-value, to be specific).

      --
      - Michael T. Babcock (Yes, I blog)
    23. Re:I don't think so. by tzanger · · Score: 2

      Let's say one of the telephoneNumber items is actually the front desk number in an office shared by a few dozen employees. Now let's say the number changes - you have to change that number in several dozen places.

      True, but I would have tried to design the hierarchy to avoid that. To be more specific, refer to this .pdf (ps and an ugly jpeg also available). This is what I'm using for my directory format (I'm writing a perl Outlook .csv to .ldif convertor) -- Company-wide information goes under the company, and only the differences are put under the contact's BusinessContactInfo branch. There's only one place that needs changing there...

      There's also the possibility of just using an LDIF modify command. I'm not really great at LDAP yet but I believe that it is possible to have the LDAP server walk the tree and modify the telephone number from (xxx) yyy-zzz to (aaa) bbb-ccc. Just because it's possible doesn't mean it's nice to do; walking the tree isn't something I'd like to ask the server to do on a daily basis, hence my attempt at organization.

      The "joiner" table you refer to is hardly a "kludge". It is an accurate representation of the association between items.

      I referred to it as a kludge because to get the benefit of a hierarchical database in a relational one, everything must be done in joiner tables. i.e. you have a table of names. Then a table of names and telephone numbers. Now a table of names and addresses. Don't forget the table of names and spouses. Or the table of names and contact categories. And so on and so on and so on. There's no longer any structure, just a bazillion tables all linking each other. Normalization bliss, perhaps, but a pain in the ass to work with.

      aside: If anyone is interested, the utility will soon be done. The actual convertor is done, but now I'm trying to get the second portion to actually fill the LDAP directory "smartly" to avoid the types of problems brought up by flacco (i.e. when it adds a contact, check to see if the company is already there and, if so, whether the BusinessContactInfo is identical to the company's info. If so, strip it out; if not, try to figure out how best to add it). If anyone's interested in helping me make the directory design better or maybe just wants a copy of the csv-to-ldif script, let me know. I'm new to all this but I want to get my company's umpteen-thousand contacts out of Outlook-Only land.

      aside 2: Does anyone know how to get ghostscript to spit out nice png or jpeg files from postscript input? ps2pdf works great for .pdf, but I can't seem to figure out how to turn on the anti-aliasing for png/jpeg.

    24. Re:I don't think so. by ynohoo · · Score: 1

      I've gotta agree with brunson, and I'm coming from a business analysis/programming perspective. Everywhere I've worked, management are constantly coming up with new ways they want to analyse their data, and hierarchical databases just aren't flexible enough to handle it. Sometimes they only want to run a query occasionally, so it does not matter that it is hideously inefficient. The development time for such analysis of a hierarchical database would usually render such requests too expensive to contemplate - and telling management "it can't be done" is a sure way to get yourself, and the hierarchical database, shown the door! And if that inefficient way of analysing the data becomes a regularly used feature, another index is fairly simple to add (with a slight performance hit for its maintenance, of course). Hierarchical databases suffer from the same problems as rigid Object Orientated designs, in that initial design decisions that later prove to be flawed can prove hellish to overcome.

    25. Re:I don't think so. by soulhakr · · Score: 1

      ...but a properly designed relational based system...
      I've seen far too many relational-based systems which were badly designed because the DBA/DBD didn't know how to implement the hierarchical structure properly

    26. Re:I don't think so. by flacco · · Score: 2
      I referred to it as a kludge because to get the benefit of a hierarchical database in a relational one, everything must be done in joiner tables. i.e. you have a table of names. Then a table of names and telephone numbers. Now a table of names and addresses. Don't forget the table of names and spouses. Or the table of names and contact categories. And so on and so on and so on. There's no longer any structure, just a bazillion tables all linking each other. Normalization bliss, perhaps, but a pain in the ass to work with.

      I'm not sure I see it as such a burden to keep entities in separate, non-redundant tables and to represent their associations in a table for that purpose. Certainly less hassle than reorganizing your hierarchy when your needs change.

      WRT your example - since there appears to be a one-one relationship of most of those items to "name", separate tables are unnecessary (unless you have to maintain histories). If you just need current address and current spouse, you include those in the employee table.

      And, of course, you wouldn't use "name" as the primary key :-)

      --
      pr0n - keeping monitor glass spotless since 1981.
    27. Re:I don't think so. by tzanger · · Score: 2

      WRT your example - since there appears to be a one-one relationship of most of those items to "name", separate tables are unnecessary (unless you have to maintain histories). If you just need current address and current spouse, you include those in the employee table.

      You're falling into the trap... Yes it's true that most times it is a one-to-one relationship. However to think that everyone in the world has just one address or one spouse is folly, and that is exactly what I was getting at: You must either keep damn near everything as a two-column table to allow for many-to-one and many-to-many mappings, or you must hardcode in limits. Neither is very palatable, but in a hierarchical database it's easy and fast, so long as you don't want to update the thing all the time. Directories are meant almost as a WORM technology (Write Once (in this case Occasionally), Read Many). And since namespace redesigns are painful in directories, you need to take a great deal of care when setting one up in order to meet the needs of 99% of people.

      It's like I had stated in another response: There are places for both systems; I would even go as far as to say most times you want a relational database. But to completely rule out hierarchical databases isn't a wise thing to do.

    28. Re:I don't think so. by Anonymous Coward · · Score: 0

      Please get a book on data structures and look up 'normalization'. You sound like an idiot.

    29. Re:I don't think so. by tzanger · · Score: 2

      Please get a book on data structures and look up 'normalization'. You sound like an idiot.

      If you had read any of my other comments in this thread you would know by now that it is you who sounds like an idiot. My quoting the word "normalization" and then using "normal" in a different meaning in the next sentence only proves how easy it is to sidetrack anonymous cowards.

    30. Re:I don't think so. by __aanonl8035 · · Score: 1

      As in most things there are a variety of ways of looking at it. I found an interesting article on the subject here.

  3. Reversed Question by devnullkac · · Score: 5, Insightful
    From a purist perspective, I suspect the question is actually reversed: we shouldn't be talking about "XML data" as if it were somehow the core representation. Its usual intent is as a transmission format and, as such, needn't correspond directly to the organization of the source data.

    Rather than discard the advantages of relational and object databases, should we instead ask how XML can be used to represent those kinds of relationships?

    --
    What do you mean they cut the power? How can they cut the power, man? They're animals!
    1. Re:Reversed Question by sporty · · Score: 2
      Having worked with representing just the table layouts in XML, it's really not that hard to represent, say, an NxM set of data. That's always been easy, it's a 2d table. It winds up being a very shallow XML document, no more than say 3 levels: root node, node representing the data record, node representing an entry in the record.
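
      A short sketch of that three-level layout, assuming Python's standard xml.etree.ElementTree module. The element names (table, record) and the sample rows are made up for illustration.

```python
import xml.etree.ElementTree as ET

columns = ["id", "custnum", "custname"]
rows = [
    (1, 45890, "Bob Smith"),
    (2, 45891, "Alice Jones"),  # hypothetical second customer
]

root = ET.Element("table", name="customer")    # level 1: root node
for row in rows:
    record = ET.SubElement(root, "record")     # level 2: one data record
    for column, value in zip(columns, row):
        entry = ET.SubElement(record, column)  # level 3: one entry in the record
        entry.text = str(value)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Reading the document back recovers the same 2d table (as strings).
parsed = ET.fromstring(xml_text)
round_trip = [tuple(entry.text for entry in record) for record in parsed]
print(round_trip)
```

      The depth never grows with the size of the table: more rows add records under the root, and more columns add entries under each record.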


      You are 100% right, in that we should discard relational db's. Objects are a little more natural for a representation in XML. If an object contains objects, even if they are of the same type, ala trees, it's a more natural representation than a 2d table.

      --

      -
      ping -f 255.255.255.255 # if only

    2. Re:Reversed Question by Florian+Weimer · · Score: 4, Insightful

      You have a point. In addition, we should ask ourselves: "Do we really need XML if it doesn't fit in our established technology framework?"

      Often, the answer is a plain "No", from a technical standpoint. However, you have to market your product somehow, and this means that you need Java, Linux, LDAP, XML, and SOAP. (As time passes, some entries will drop off the beginning of this list, and others will show up at the end.)

    3. Re:Reversed Question by The+Raven · · Score: 1

      devnullkac did NOT recommend that we discard relational databases! Exactly the opposite, he claimed that we should be looking for efficient ways to represent relational information in XML, not ways to abandon relational databases in favor of hierarchal ones.

      Remember that XML is not the 'end use' for data. XML is a way to get data from place to place... from one database to another, one OS to another, one country to another. People don't 'use' XML any more than they 'use' TCP... it's a way to get data from one place to another.

      So the question should not be 'How do we efficiently represent XML in our databases?' The question should be 'How do I efficiently represent my database in XML?'

      Raven
      --
      "I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.
    4. Re:Reversed Question by sporty · · Score: 2

      It's really not that hard, especially with 2d data, per my post. And even if you do table joins, you can represent the joined data with XML to represent which tables which data came from and which did not.

      --

      -
      ping -f 255.255.255.255 # if only

    5. Re:Reversed Question by Anonymous Coward · · Score: 0
      Java, Linux, LDAP, XML, and SOAP. (As time passes, some entries will drop off the beginning of this list, and others will show up at the end.)

      That's a bit presumptuous to say the least. Leaving Linux aside for the moment, try to remember that Java, LDAP and XML were created to solve particular problems - at which they have succeeded quite well. SOAP and .NET were created purely to try and grab market share away from the previous technologies, without conferring any substantial benefits and relying entirely upon marketing muscle. It's not at all clear that a sufficient majority will be fooled.

    6. Re:Reversed Question by haruharaharu · · Score: 2

      Java, LDAP and XML were created to solve particular problems - at which they have succeeded quite well. SOAP and .NET were created purely to try and grab market share away from the previous technologies

      And they are all being used in various places where they don't belong, just because they are the fads of the day. How long before 'Who moved my cheese?' finds its way into this list?

      --
      Reboot macht Frei.
    7. Re:Reversed Question by Zeinfeld · · Score: 3, Informative
      Java, LDAP and XML were created to solve particular problems - at which they have succeeded quite well. SOAP and .NET were created purely to try and grab market share away from the previous technologies

      That is a crock. XML was developed explicitly to fix the problems in SGML. LDAP was developed to fix the problems in X.500. In both cases it was the poor design of the predecessor that was being fixed.

      Henrik F-N was working on SOAP-like ideas long before he joined Microsoft. Again all SOAP does is to fix known incompetence in CORBA. Gates devised .NET to solve two problems, first how to get a foothold in the enterprise space, second how to improve on C++ without the proprietary lock that Sun had imposed on Java.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    8. Re:Reversed Question by Antti+R · · Score: 2, Insightful

      Gates devised .NET

      Isn't it just lovely to develop for a platform where the motivation for every development is a commercial plot to maximize platform controller's profit margin?

      [...] second how to improve on C++ without the proprietary lock that Sun had imposed on Java.

      More like, how to get a proprietary grip on language and a platform like Sun has with Java.
      And no, rubber-stamping some of the interfaces designed solely by you (to best fit into win32, of course) at ECMA while leaving the thinnest win32 wrappers (like the gui classes) merely de-facto standards, does not make C#/.NET non-proprietary.

    9. Re:Reversed Question by haruharaharu · · Score: 2

      That is a crock.

      No it isn't. What I said was that various technologies are being misapplied for the sake of product packaging. This should be no surprise. What I didn't say was that it was always the case.

      --
      Reboot macht Frei.
    10. Re:Reversed Question by Ian+Bicking · · Score: 2
      Rather, non-relational databases already exist -- there's lots of object databases. And when you consider that many of these object databases support objects that can also be serialized to XML (e.g., for XMLRPC), then you see that they are pretty much orthogonal to an XML database.

      Object databases are being used more and more, I think -- though they aren't taking off or even biting into RDBMS's much...

    11. Re:Reversed Question by john@iastate.edu · · Score: 2
      There's already something that works along these lines called ODBCSocketServer. Basically it's an ODBCServer that sits on your WinBloze box and ships out the results in XML. Something like:

      <result>
      <row><col name="foo">6</col>
      <col name="bar">john</col></row>
      ...
      </result>
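
For what it's worth, a result shaped like that fragment is easy to turn back into records on the receiving end. This is only a sketch based on the fragment quoted above (with the elided part dropped); it is not a claim about ODBCSocketServer's full output format, and `rows_from_result` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Sample copied from the fragment in the post, minus the elided "..." part
SAMPLE = """
<result>
<row><col name="foo">6</col>
<col name="bar">john</col></row>
</result>
"""

def rows_from_result(xml_text):
    """Turn each <row> into a dict keyed by the col 'name' attribute."""
    root = ET.fromstring(xml_text.strip())
    return [
        {col.get("name"): col.text for col in row.findall("col")}
        for row in root.findall("row")
    ]

records = rows_from_result(SAMPLE)
print(records)
```

Note that every value comes back as text; any typing has to be reapplied by the consumer.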

      --
      Shut up, be happy. The conveniences you demanded are now mandatory. -- Jello Biafra
    12. Re:Reversed Question by Anonymous Coward · · Score: 0

      Well put. RDBMS came along because they will do things hierarchical storage systems won't. Comes to that, from an OLAP point of view, there's a lot that RDBMS won't do. And somewhere down the road, someone will devise a data storage system that will perform other tasks one cannot carry out with OLAP. (And to complete the circle, there are still things hierarchies do best.)

      Let's leave XML as what it was meant to be: a data communications format.

    13. Re:Reversed Question by briansmith · · Score: 1

      Again all SOAP does is to fix known incompetence in CORBA.

      And, exactly what CORBA "problems" does SOAP fix? I think the only problem with CORBA is that it has a lot of features and practical guidance about effective use of CORBA has been lacking. The fact that Microsoft refused to implement it slowed its adoption a lot as well. However, CORBA was and is technologically sound and I doubt SOAP will ever be anything comparable (and never ever more efficient).

    14. Re:Reversed Question by Florian+Weimer · · Score: 2
      That is a crock. XML was developed explicitly to fix the problems in SGML. LDAP was developed to fix the problems in X.500. In both cases it was the poor design of the predecessor that was being fixed.
      The main problem with SGML and X.500 is the complexity of the specification. X.500 is so complex that no complete implementation exists. At the beginning, it made perfect sense to throw away all the unnecessary cruft to get implementable specifications.

      However, XML with all its surrounding standards has already gone beyond SGML in terms of complexity, and people are reinventing X.500 DAP features for LDAP. In the end, the same complexity problems surface again.

    15. Re:Reversed Question by leandrod · · Score: 1

      The only known problem with SGML, X.500 and CORBA is that SGML and X.500 were too complex for Microsoft programmers, and CORBA besides that was also too open for Microsoft corporation.

      SOAP on the other hand may be seen as forcing a Web view on the world...

      --
      Leandro Guimarães Faria Corcete DUTRA
      DA, DBA, SysAdmin, Data Modeller
      GNU Project, Debian GNU/Lin
  4. You didn't mention the best native XML DB by Anonymous Coward · · Score: 1, Interesting

    Excelon has a very full-featured XML database.
    We use it exclusively and it kicks ass.

    Well, the current version does. Pre 3.0 sucked ass.

    http://www.exceloncorp.com

  5. Hierarchical == Object-Oriented Databases? by disarray · · Score: 3, Insightful
    Wouldn't object-oriented databases qualify as hierarchical (or some of them, at least)? A rather lengthy story ran a while back covering various reasons why object-oriented databases are useful, followed by various comments on cases where they aren't and why they aren't as common as relational ones today. The bottom line seems to be that they are in use today. One notable example comes to mind: LDAP. The aforementioned story has more. Despite the rather preachy tone, it's an interesting read.

    1337ness for sale.

    1. Re:Hierarchical == Object-Oriented Databases? by ez76 · · Score: 1

      Wouldn't object-oriented databases qualify as hierarchical (or some of them, at least)?

      Be careful not to confuse class "hierarchies" with relationships among objects, which are generally graphs, not rooted trees.

    2. Re:Hierarchical == Object-Oriented Databases? by JordanH · · Score: 1
      Shhhh... You aren't supposed to notice that.

      Why MUST we be forced into one-size-fits-all RDBMS solutions?

      Someone had to come up with the buzzword-compliant "Object-Oriented" Database to break the hypnotism the Relational Database vendors and theorists have over the industry.

      It seems to me that a lot of data is hierarchical in nature. It's represented that way in programs and sometimes, you just want it to be persistent. A hierarchical database is sometimes just the thing you need, but we're forced into taking our nice hierarchies and deconstructing them into tables to make them fit in the one tool for persistent data storage that's blessed.

    3. Re:Hierarchical == Object-Oriented Databases? by swillden · · Score: 2

      Wouldn't object-oriented databases qualify as hierarchical (or some of them, at least)?

      Object-oriented databases are what used to be called network databases, and can represent arbitrary graphs. Any network database can be hierarchical, just by imposing some limitations on the kinds of linkages that are allowed. In fact, network databases allow the most flexible data structures; anything you can build with pointers.

      In fact, the correct model for storing XML data *is* a network model. The relational model obviously doesn't fit, but although it's less obvious, the addition of the XLink specification to XML means that the hierarchical model doesn't either. XML documents can have arbitrarily complex structure because of all the pointers -- they map perfectly onto an OODB.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    4. Re:Hierarchical == Object-Oriented Databases? by flacco · · Score: 2
      It seems to me that a lot of data is hierarchical in nature. It's represented that way in programs and sometimes, you just want it to be persistent.

      That's inflexible. As you said, the data is REPRESENTED that way in programs, but it's only a representation. You might want it in a different hierarchy later (and there is almost always a "later", whether you're the one who encounters it or not).

      --
      pr0n - keeping monitor glass spotless since 1981.
    5. Re:Hierarchical == Object-Oriented Databases? by m.batsis · · Score: 1

      Object-oriented databases would not only serve for simple XML data; they would be quite useful in holding RDF(S) statements without ruining everything... currently I strip RDF tags of the markup to store them in an RDBMS (and almost any XML for that matter).

      --
      "You laugh at me because I am different. I laugh at you because you're all the same." --Vick Imbornoni
  6. Many to many is hard? FALSE! by hodeleri · · Score: 3, Insightful
    XML has, of course, a hierarchical structure

    Just because XML is a hierarchical markup language does not mean that it can only be used for hierarchical things. Perhaps you should look at RDF which can use many to many mappings through resources and groupings (sequences, bags, and alternates). (A resource in one grouping can refer to another grouping i.e. many to many.)

    1. Re:Many to many is hard? FALSE! by m00nshyn3 · · Score: 1

      Nobody ever said many to many relationships in XML were hard. The original comment said that many to many relationships were hard in hierarchical databases, not hierarchical structures. A straightforward explanation of why many to many is hard in a hierarchical db can be found here.

    2. Re:Many to many is hard? FALSE! by sql*kitten · · Score: 2

      Just because XML is a hierarchical markup language does not mean that it can only be used for hierarchical things.

      Yes, but XML is hugely inefficient for table structures, because of all the redundant metadata.

  7. Discussions by Lozzer · · Score: 3, Informative

    There is lots said on this over at Database Debunkings

    --
    Special Relativity: The person in the other queue thinks yours is moving faster.
  8. XML vs. ERwin by imrdkl · · Score: 3, Insightful
    IANADG, but the folks that do our models still use good, old, ERwin. Something about the relationship-specification capabilities, I guess. I was not aware that XML limited the number of parents specifically. You sure that ain't just a limitation of your programming language? :)

    An afterthought, databases are about storage and speed of insertion/extraction. I honestly don't believe that fitting the database to the data structure is worth the cost or the trouble, just yet.

  9. No Chance... by augustz · · Score: 3, Insightful

    I think these discussions come up part of the time because people want something new and sexy. In this case OO DB's, which 'XML DB's' are a variant of, may have benefits in specific and limited cases. But I have not been impressed.

    Take your classic orders table. Part NO, Customer NO, etc. etc. The number of apps with only one parent is tiny, the flexibility limited, and the whole metadata scanning business awkward.

    For anyone doing any serious larger-scale database work some of this stuff is a joke. The idea these vendors have is that we'll be storing XML data in these DB's, ignoring that even for a simple phone directory, the XML data probably takes up a significantly greater amount of space than a simple relational DB would require.

    And this ignores the significant amount of time and energy invested in toolsets and models for the existing setup. Sure, someone might come out with a chip that runs 2x as fast as an Intel at the same price, but unless it is Intel compatible how many people would buy it or care?

    1. Re:No Chance... by Anonymous Coward · · Score: 0

      This is a very uninformed answer. XML DB's are *not* variants of Object DBs... Some, like Excelon and X-Hive, are built on top of object databases, while others like Tamino and dbXML have implemented native filing systems geared specifically toward storing and indexing XML document structures. Even those DBs that are built on top of object databases don't expose the actual object database layer.

    2. Re:No Chance... by Multispin · · Score: 1

      I wouldn't consider XML DBs a special case of OO DBs. OO DBs imply something more about object relationships. They often imply the notion of methods and operators on these objects.

      Believe me, there are LOTS of very cool things that you can do when you break the relational model. Is the relational model going away? Hell no. Is XML somehow competing with the relational model? Nope.

      I honestly don't see anything XML related being faster than relational for transactional data. However, for knowledge representation or data interchange, hierarchies rock!

    3. Re:No Chance... by Kingpin · · Score: 1

      This is not because people think "XML is sexy. Let's do everything with it!". It's because the DB research world recently has been approaching semistructured data that allows for queries not immediately available in the relational model.

      Furthermore the "document society" has been bringing consistency back into the web by various derivatives of the "webservices" concept, i.e. there's an XML representation of data - which is machine readable, rather than one of quirky man made HTML.

      This convergence between DB research and data representation is what is interesting to lots of people in this area. Will it suddenly make sense to use the hierarchical structure as a logical view on the database? If so, can we make operations like JOIN and UNION (or other) on websites thus causing data enhancement or aggregation?

      The ideas are really interesting, don't quite knock them yet. I can strongly recommend a Google search on "semistructured data", further the book "Data on the Web" by Abiteboul et al. is extremely insightful on this topic.

      --
      Unable to read configuration file '/bigassraid/htdig//conf/14229.conf'
      Geocrawler error message.
  10. Both Worlds by Khazunga · · Score: 1
    What I really would like to see is the ability to have Relational Databases, with hierarchical types for fields. I would be able to query these fields much like I query/transform an XML document, possibly using the XML technologies (XPath, XSL, ...)

    The relational model is very good for most situations, and has been heavily studied and optimized. No one would transition back to pure hierarchical DBs.

    --
    If at first you don't succeed, skydiving is not for you
    1. Re:Both Worlds by friedo · · Score: 2

      A hierarchical model can easily be done in a relational system by simply using a self-referential 2D table. Each element has a unique ID and the ID of its parent.
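
A minimal sketch of such a self-referential table, using Python's built-in `sqlite3` module. The table name `node`, its columns and the sample rows are illustrative assumptions, not anything from the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE node (
        id        INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),  -- NULL marks the root
        name      TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "a"), (3, 1, "b"), (4, 2, "a1")],
)

# Direct children are a simple filter on parent_id
children = [name for (name,) in conn.execute(
    "SELECT name FROM node WHERE parent_id = ? ORDER BY id", (1,))]
print(children)
```

One level up or down is cheap this way; walking to arbitrary depth needs either repeated queries or a recursive query.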

    2. Re:Both Worlds by rp · · Score: 2, Informative

      You can represent the structure, but you can't manipulate it using standard relational logic.

      For example, take a table representing a parent-child relationship. Now try to sort the persons in the table by their number of descendants. SQL has only recently been extended to allow this query to be posed. Perhaps your relational database can handle this kind of query, where you have arbitrary-depth path walking, but you can't expect it to handle them efficiently.
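
A sketch of what that query can look like with a recursive common table expression (`WITH RECURSIVE`, the SQL:1999 extension), run here through Python's `sqlite3`. It assumes an engine that supports recursive CTEs; the `person` table and the names in it are invented for the example, and nothing here speaks to how efficiently a given engine executes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)", [
    (1, None, "Ann"), (2, 1, "Bob"), (3, 1, "Cy"), (4, 2, "Dee"), (5, 4, "Eve"),
])

QUERY = """
WITH RECURSIVE lineage(ancestor, descendant) AS (
    -- base case: every direct parent/child pair
    SELECT parent_id, id FROM person WHERE parent_id IS NOT NULL
    UNION ALL
    -- recursive step: extend each pair by one more generation
    SELECT l.ancestor, p.id
    FROM lineage AS l JOIN person AS p ON p.parent_id = l.descendant
)
SELECT person.name, COUNT(lineage.descendant) AS n
FROM person LEFT JOIN lineage ON lineage.ancestor = person.id
GROUP BY person.id
ORDER BY n DESC, person.name
"""
ranking = conn.execute(QUERY).fetchall()
print(ranking)
```

The `lineage` relation is the transitive closure of the parent-child table, so its size (and the work to build it) grows with the depth of the tree.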

    3. Re:Both Worlds by dgroskind · · Score: 2

      For examples of how to use relational architecture for hierarchical data see Trees in SQL by the irrepressible Joe Celko.

      Briefly summarized, his approach is: "tree structure can be kept in one table and all the information about a node can be put in a second table."
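
A rough sketch of the two-table split, assuming the article means the nested-set numbering Celko is known for (each node gets a left and right number from a depth-first walk, and a node's descendants are the rows whose numbers fall inside its own). Table names, column names and data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tree (node_id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER);
    CREATE TABLE node_info (node_id INTEGER PRIMARY KEY, name TEXT);
""")
# lft is assigned on the way down the tree, rgt on the way back up:
# root(1,8) contains a(2,5), which contains a1(3,4); b(6,7) is root's other child
conn.executemany("INSERT INTO tree VALUES (?, ?, ?)",
                 [(1, 1, 8), (2, 2, 5), (3, 3, 4), (4, 6, 7)])
conn.executemany("INSERT INTO node_info VALUES (?, ?)",
                 [(1, "root"), (2, "a"), (3, "a1"), (4, "b")])

def descendants(conn, node_id):
    """All descendants at any depth, found with one non-recursive query."""
    rows = conn.execute("""
        SELECT info.name
        FROM tree AS parent
        JOIN tree AS child
          ON child.lft > parent.lft AND child.rgt < parent.rgt
        JOIN node_info AS info ON info.node_id = child.node_id
        WHERE parent.node_id = ?
        ORDER BY child.lft
    """, (node_id,))
    return [name for (name,) in rows]

print(descendants(conn, 1))
print(descendants(conn, 2))
```

The trade-off is on the write side: inserting or moving a node means renumbering part of the `tree` table.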

  11. XML is sometimes useful by Anonymous Coward · · Score: 1, Informative

    Yes, I think XML databases can be useful sometimes. Even though relational is faster and better developed, in some cases native XML products have the advantage of being able to store any data without prior setup. I'm using dbXML (http://www.dbxml.org) in a product of mine which allows 3rd parties to store arbitrary data associated with a user.

    Also, you get the full advantage of the XML technologies developed by the W3C and others - the ability to do a simple query, transform that data and then send to a web browser with very little coding involved is a great bonus.

    (I've forgotten my login, time to go create a new one I think)

  12. Indexing? by aralin · · Score: 4, Insightful
    Can anyone explain to me what is suddenly so wrong with a relational database with hierarchical indexing?

    Maybe it's just me, but the goal today is integration, and having a special database for XML and a special database for this and that, just because it's faster for this particular problem, creates a level of complexity that prevents accomplishing even the most trivial tasks.

    Still, XML is only a way to describe data, which may often be relational in structure. Why not store the data in its native form and create XML documents from the database on the fly with filters?

    This question of hierarchical databases is just plain trolling in my eyes.

    --
    If programs would be read like poetry, most programmers would be Vogons.
    1. Re:Indexing? by BenHmm · · Score: 2

      Still, XML is only a way to describe data, which may often be relational in structure. Why not store the data in its native form and create XML documents from the database on the fly with filters?

      Quite. Not only would the XML markup probably take more space than the data itself, but storing it as XML seems to be not only pointless, but also a little shortsighted. What if your XML spec changes? What if you want the data in another form?

      Just storing the data and then dynamically creating the XML doc on the fly is sooo much easier.

    2. Re:Indexing? by captredballs · · Score: 2, Insightful


      The problems that you mention, both concerning storage space and flexibility of the data model are what XML databases are attempting to solve.

      Listing the problems in opposition to the solutions does not make for a good argument.

      --

      I suppose I'm not too threatening, presently, but wait till I start Nautilus
    3. Re:Indexing? by Hector73 · · Score: 1

      Can anyone explain to me what is suddenly so wrong with a relational database with hierarchical indexing?

      Absolutely nothing.

      Someone brought up LDAP earlier. In fact, Netscape's LDAP server is built atop a special version of DB2 with highly optimized indexing.

      The question of using a hierarchical database (whether LDAP or XML) vs. your own custom-built relational indexes boils down to (IMHO) the problem at hand and its tolerance for speed vs. integration stress vs. budget.

  13. All about databases by jayant_techguy · · Score: 1, Informative

    Extropia has a detailed tutorial on databases of all types. XML:DB discusses the differences between object-oriented databases, hierarchical databases, and relational databases in detail. You may be interested in DBX, a DBMS that is written completely in PHP and works using XML style text files as its native format.

  14. XML and RDBMS inconsistencies by russcoon · · Score: 2, Interesting

    In my experience with XML and RDBMS systems, mapping one onto another is always a dicey task. The primary reason (IMHO) is that XML's ability to represent order as well as structure as data doesn't fit into an RDBMS database without some work. I've seen people try to map both XML and regular DB's onto each other, and my opinion is that the results don't "feel right" on one side or the other unless great pains are made to preserve the structure of the XML doc in the DB schema.

    That said, I'm not sure a hierarchical DB will necessarily be any better than something like an OODBMS with well-modeled objects.

  15. XML an alternative to db for me... by kellyboy · · Score: 1

    It's my choice of DB for my website, since my web hosting company doesn't provide MySQL or any other DB. It has all the XML modules you could need to use XML as a database. It's convenient, it's very portable, and it's easy to read.

    1. Re:XML an alternative to db for me... by Anonymous Coward · · Score: 0

      XML is not a database - it's a format (doh!). I mean, my gosh, how do you get ACID - especially given the concurrency?

      Either you have only one user at a time on your site, or incredibly simplistic data storage needs, or worse [inconsistent data]...

  16. Hierarchical vs relational dbs by ShmakDown · · Score: 2, Insightful

    I don't think that hierarchical db's have any real chance of taking over or replacing relational dbs in the future. There may start to be more of a place for them, but many application service providers that use XML still have a fair amount of relational data that needs to be maintained. XML is mainly being used for communication protocols and not so much for internal data structure storage. I think the more likely db trend in the future will be for many users to maintain both relational and hierarchical databases.

    --
    WeFunk
    1. Re:Hierarchical vs relational dbs by wadetemp · · Score: 1

      If you've ever looked at any of the OLAP technologies, it's very simple to take data in an existing relational structure and map it to a hierarchical structure for easier user analysis. This isn't necessarily maintaining both types of databases, but rather building one type off another type for analysis purposes. *whisper (Microsoft has some great software to let you do this on SQL Server...) :)

  17. LDAP, the hierarchical database that works by dmelomed · · Score: 1

    With all this hype about XML and the ubiquity of SQL, LDAP directories do not get the attention they deserve. How many of you have installed SQL-based authentication at your site, just to find out how limited a solution it is (maintain more than one database for all kinds of authentications, do you?). Not only does LDAP allow for a flexible hierarchical directory, it's also a standardized Internet protocol whereas SQL isn't. With LDAP, many applications work out of the box because it's a standard. Oh yeah, there's also the OSS server available at openldap.org.

    1. Re:LDAP, the hierarchical database that works by Anonymous Coward · · Score: 0

      One place I worked was using SQL for such things - when I asked about LDAP, the answer was "but we have people who know SQL" - if you hit a square peg hard enough, it'll probably go through the round hole....sigh...

  18. Take two by Anonymous Coward · · Score: 0

    IMHO, XML documents should be handled in both hierarchical and relational ways: the first for an efficient long-term storage and the other for transactions.

    1. Re:Take two by ShmakDown · · Score: 1

      I agree, the problem there lies in finding the best way for conversion, or using a different approach like redundancy. I like the idea of being able to store in both ways so that the lookup still happens quickly.

      --
      WeFunk
  19. The priorities are wrong.... by el_mex · · Score: 2, Interesting
    A data format will NEVER dictate a system's design. XML is nothing other than a data format.


    The relational model has no major shortcomings. The only thing XML offers that is not already very well done is easier data interchange. As a database administrator, I can tell you there is NO chance XML will dictate a change of how we store data. There are much higher priorities in database management than easier data interchange.

    1. Re:The priorities are wrong.... by ShmakDown · · Score: 1

      Wake up. Data formats dictate system design all the time! Systems work better when they are fine tuned to work with their data sets well!

      --
      WeFunk
    2. Re:The priorities are wrong.... by el_mex · · Score: 1
      Data formats dictate system design all the time!


      Really? So I guess if a system processes a data file with a flat structure it would make good design if the database is designed flat to fit the data format?


      Systems work better when they are fine tuned to work with their data sets well!


      You're blurring the line. I referred to data formats as they pertain to a data file, not as they do to a database design (If I had meant "database design" I would have said "database design").

    3. Re:The priorities are wrong.... by captredballs · · Score: 1

      "No" major shortcomings? Traversing complex data sets can often be incredibly difficult in relational databases. Additionally, relational data models are generally very difficult to modify.

      In particular, I'm thinking about complex scientific data sets where you may wish to "select" based on criteria that may not be keyed.

      --

      I suppose I'm not too threatening, presently, but wait till I start Nautilus
    4. Re:The priorities are wrong.... by el_mex · · Score: 1
      I do not get your comment...

      I'm thinking about complex scientific data sets where you may wish to "select" based on criteria that may not be keyed

      By "keyed" do you mean indexed? Why is that hard? Time-consuming, yes (with large data sets), but "incredibly difficult"? Why is it incredibly difficult? Is it not involving just a single SQL statement, keyed or not?

      Again, would XML address this?

    5. Re:The priorities are wrong.... by doubtme · · Score: 1
      The relational model has no major shortcomings.

      While you may be correct, I can't help but wonder how you are meant to implement any form of generalisation or inheritance in an RDBMS, without a huge mess of tables and complex relationships.

      I mention this, because it is something I find myself wanting to do all the time, for example, when storing data that originates in OO programs. Being able to store it in an RDBMS has heaps of advantages for me (primarily that it is easier and less buggy to load and save data) - but I can't easily store the different info of different derived classes.

      If you have any suggestions on how to solve this, I'd definitely appreciate it!
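
One common answer, sketched here under stated assumptions, is a table per class: a base table holds the shared fields and each derived class gets its own table keyed by the same id. The `shape`/`circle`/`rect` schema and the helper names are invented for illustration, and this is only one of several mappings people use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- base class: fields every shape has, plus a tag naming the derived class
    CREATE TABLE shape  (id INTEGER PRIMARY KEY, kind TEXT NOT NULL, label TEXT);
    -- derived classes: only their extra fields, sharing the base row's id
    CREATE TABLE circle (id INTEGER PRIMARY KEY REFERENCES shape(id), radius REAL);
    CREATE TABLE rect   (id INTEGER PRIMARY KEY REFERENCES shape(id),
                         width REAL, height REAL);
""")

def save_circle(conn, label, radius):
    with conn:  # both inserts commit together, or neither does
        cur = conn.execute(
            "INSERT INTO shape (kind, label) VALUES ('circle', ?)", (label,))
        conn.execute(
            "INSERT INTO circle (id, radius) VALUES (?, ?)", (cur.lastrowid, radius))
    return cur.lastrowid

def load_circle(conn, shape_id):
    # Reassemble the object by joining the base row to the derived row
    return conn.execute(
        "SELECT s.label, c.radius FROM shape AS s "
        "JOIN circle AS c ON c.id = s.id WHERE s.id = ?", (shape_id,)).fetchone()

cid = save_circle(conn, "wheel", 2.5)
print(load_circle(conn, cid))
```

It does cost one extra table and one extra join per level of inheritance, which is more or less the mess the question is complaining about, just kept orderly.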

      --

      There's no $$$ in 'team'...
      www..--..net - for incisive, w
  20. Why relational databases dominate by coyote-san · · Score: 5, Insightful

    Relational databases didn't come to dominate the database market because they pushed aside equally valid alternatives, they dominate the market because relational databases implement relational calculus. Indeed, that's the very touchstone that distinguishes relational databases from something like DBM and its many descendants.

    And *that* is important because it assures the designer and user that every possible operation is well-defined and (hopefully) correctly implemented. The exact syntax for a "join" may differ, and a specific implementation may be flawed, but everyone agrees to a common baseline.

    For hierarchical databases to really take off, they need to have an equally strong mathematical underpinning. For now, AFAIK, there is none other than that you get when you map a hierarchical database into relational tables and use exactly those relational properties. That's a good start, but if you're only using the properties in relational databases, why not stick with them?

    As for XML, that's completely irrelevant. It's a good format for transferring data, but that's about it. You can store hierarchical data in an XML file, but you can also use it to store purely relational data or completely unstructured data (in some CDATA block).

    --
    For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
    1. Re:Why relational databases dominate by Zeinfeld · · Score: 5, Insightful
      Relational databases didn't come to dominate the database market because they pushed aside equally valid alternatives, they dominate the market because relational databases implement relational calculus.

      That's rubbish. Back in the 1960s when the first relational databases emerged nobody had a formal specification for a relational calculus. Today we can create a formal calculus for any data model, the Entity relational model is no different in that regard.

      SQL is a very 1960s / COBOL way of looking at a data structure. Most of the people using it simply do not have the breadth of experience of other data models to know its strengths or weaknesses. Most of the posts in the thread are as empty as those in an editor choice flamewar.

      The entity relationship model has been discarded by the programming language community in favor of typed set theory. Java and C# both have representations of sets, lists, etc., the only reason to use an entity relational model is to get persistence for the data structure.

      So you get this impedance mismatch and a pile of code whose sole purpose is to rewrite the data structures used in the program so that they match the data structures used in the persistence store.

      What we need is a persistence store with a data model that matches our programming language data model. Unfortunately most of the attempts to do this are half baked. All it should take is to add transaction statements into the language, so that when you declare a procedure to be transactional it will be all or nothing.
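
To make the "declare a procedure to be transactional" idea concrete, here is a small approximation in Python on top of `sqlite3`. The `transactional` decorator, the `account` table and the `transfer` function are all hypothetical, and this only makes the database side all or nothing; it does not unify the language's data model with the store, which is the larger point above.

```python
import functools
import sqlite3

def transactional(func):
    """Run func(conn, ...) so its database changes are all-or-nothing."""
    @functools.wraps(func)
    def wrapper(conn, *args, **kwargs):
        with conn:  # commits on normal return, rolls back if an exception escapes
            return func(conn, *args, **kwargs)
    return wrapper

@transactional
def transfer(conn, src, dst, amount):
    conn.execute(
        "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
    (left,) = conn.execute(
        "SELECT balance FROM account WHERE name = ?", (src,)).fetchone()
    if left < 0:
        raise ValueError("insufficient funds")
    conn.execute(
        "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

transfer(conn, "alice", "bob", 30)
try:
    transfer(conn, "alice", "bob", 500)  # fails part-way; the debit is undone
except ValueError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)
```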

      Unfortunately Sun made a pact with Oracle over Java and so they have remained stuck in the obsolete SQL world. C# looks to me to be a much better opportunity, Microsoft has little to lose from unifying the data model of the language with that of the persistence store and everything to gain.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    2. Re:Why relational databases dominate by tim_maroney · · Score: 5, Insightful

      So you get this impedance mismatch and a pile of code whose sole purpose is to rewrite the data structures used in the program so that they match the data structures used in the persistence store.

      Exactly. What's more, this pile of code takes months to write even for a few dozen object types; it doesn't understand the idea of dependencies between objects so you have to add a whole layer to make sure that objects get persisted in the right order; it's incredibly hard to change, so the system design can't iterate; and simple objects like collections proliferate tables to the point of significant performance losses. It's a terrible way to build a software system unless the user model just happens to be adequately modeled by a fill-in-the-blanks table.

      This is why serious applications traditionally roll their own file formats. It's actually less work to manage most data models from scratch than it is to map them into the straitjacket of a relational database. Custom file formats serve in essence as hand-rolled object databases. Unfortunately, the rise of the three-tier client-server architecture has made the RDBMS layer an unquestioned assumption, with the result that modeling two dozen object types winds up generating over 50,000 lines of convoluted, slow and buggy source code. Modeling the same objects from scratch on a custom B-tree would take less than one fifth the code size. Doing it in a good ODBMS would be almost as trivial as specifying the data structures in XML.

      On my latest project, we ran into a strange issue when specifying the user interface of a discussion system. The designers wanted to mark read and unread messages per user -- in other words, functionality critical to providing a friendly user experience, which rn had fifteen years ago. The engineers hit the roof and said it was impossible. It turned out the reason was that this is an intrinsically hard problem on an RDBMS, although it's a trivial problem to solve in a hand-rolled .newsrc text file. Over the course of the project we ran into tons of these issues, and the interface design took a severe beating because of compromises to the limitations of an RDBMS back-end.

      Tim

    3. Re:Why relational databases dominate by fingon · · Score: 1

      My mind boggles at the last paragraph of yours; ANYTHING you can represent in a text file, you can also represent in RDBMS, and usually easier..

      One should get competent people first ;-) (all it takes is basically mapping of (article,user)->state, which can be simply one more table).

      --
      -- pending
    4. Re:Why relational databases dominate by jmerelo · · Score: 1

      As for XML, that's completely irrelevant. It's a good format for transferring data, but that's about it.
      You can store hierarchical data in an XML file, but you can also use it to store purely relational data
      or completely unstructured data (in some CDATA block).


      That's simply not true. XML is good for representing data structures, and that makes it good for transferring information. But it's good as a (very basic) semantic model for data. And it will be even better as the steps of the Semantic Web are ascended.

    5. Re:Why relational databases dominate by Anonymous Coward · · Score: 0

      The designers wanted to mark read and unread messages per user -- in other words, functionality critical to providing a friendly user experience, which rn had fifteen years ago. The engineers hit the roof and said it was impossible. It turned out the reason was that this is an intrinsically hard problem on an RDBMS, although it's a trivial problem to solve in a hand-rolled .newsrc text file. Over the course of the project we ran into tons of these issues, and the interface design took a severe beating because of compromises to the limitations of an RDBMS back-end.

      I do not see a problem with a table that has two fields: MESSAGE_ID, and USER_ID. If the user reads the message put an entry in the table.
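
      A minimal sketch of that two-field table, assuming SQLite through Python's sqlite3 module. The table names, column names and sample messages are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (message_id INTEGER PRIMARY KEY, subject TEXT);
    -- One row per (user, message) pair that the user has read.
    CREATE TABLE read_marks (
        user_id    INTEGER NOT NULL,
        message_id INTEGER NOT NULL REFERENCES messages(message_id),
        PRIMARY KEY (user_id, message_id)
    );
""")
conn.executemany("INSERT INTO messages VALUES (?, ?)",
                 [(1, "first"), (2, "second"), (3, "third")])

def mark_read(user_id, message_id):
    # INSERT OR IGNORE makes re-reading a message a harmless no-op.
    conn.execute("INSERT OR IGNORE INTO read_marks VALUES (?, ?)",
                 (user_id, message_id))

def unread_for(user_id):
    # Unread = messages with no matching read_marks row for this user.
    rows = conn.execute("""
        SELECT m.message_id FROM messages m
        LEFT JOIN read_marks r
          ON r.message_id = m.message_id AND r.user_id = ?
        WHERE r.message_id IS NULL
        ORDER BY m.message_id
    """, (user_id,))
    return [row[0] for row in rows]

mark_read(42, 2)
mark_read(42, 2)
print(unread_for(42))  # [1, 3]
```

      The table grows with every message every user reads, whereas a .newsrc file stores ranges of article numbers, so this settles whether the feature is expressible, not how well it scales.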

    6. Re:Why relational databases dominate by elronxenu · · Score: 1
      relational databases implement relational calculus

      Yes, but the programmers and designers do not! In a previous job I inherited a large system built principally on SQL. The queries were sometimes up to a full page of convoluted SQL. Even those cow-orkers most familiar with the system were at a loss to understand what a lot of it did.

      I'm sure formal proofs of correctness were not possible and probably never entered the heads of the implementors. The system behaved inconsistently and exhibited different errors each month. The code was buggy, but the actual erroneous statements were not obvious. Maybe it was something subtle like a missing row in an 8-table join, or an unexpected NULL value in the row for the third previous invoice.

      The mere fact of this unreliable behaviour coupled with the inability of senior staff to pinpoint the erroneous statements leads me to believe that there's a more fundamental problem at work here than sloppy design or bad implementation. The conclusion is that the SQL language itself and by implication its relational calculus underpinnings, are not suitable for programming certain types of systems.

    7. Re:Why relational databases dominate by Anonymous Coward · · Score: 0

      That's rubbish. Back in the 1960s when the first relational databases emerged nobody had a formal specification for a relational calculus.

      Dude, that's rubbish. E.F. Codd first published on the relational model in 1970, and the first relational database followed a few years later. (Here's the first supporting article I found on google, but I remember this from college in the 80s, when relational databases and SQL still weren't standards.) Your dates are off by a decade and your assertion is incorrect.

      So you get this impedance mismatch and a pile of code whose sole purpose is to rewrite the data structures used in the program so that they match the data structures used in the persistence store[. . .] What we need is a persistence store with a data model that matches our programming language data model.

      Relational databases are successful because you are exactly wrong. Persistence belongs in a separate, language-neutral layer, so that different applications can share the same data store. In practically every system I've looked at, a proprietary accounting package written in C, web-based extensions to the system written in Java (or Perl) and reports written in SQL all work with the same tables.

      Sure, you can move the O/R mapping out of the client code, but creating a database that has interfaces to each of these languages simply moves the impedance mismatch out of the client and into the database. (Or you could use separate data stores, and spend your career working on synchronization problems. Or you could rewrite everything--your accounting package, your report writers, your web extensions-- in a not-yet-invented transaction-aware extension of a single language, in which case your system will be as obsolete as you think SQL is by the time that you finish it.)

    8. Re:Why relational databases dominate by Unordained · · Score: 1

      it's a many-to-many. what's so hard about that? while you're at it, store when they read it and what they thought of it...

      rdbms' can model anything, given thought. the main problem is recognizing derivation-type relationships within their structures. i'm a bit irritated that i have slightly different types of information being kept in a single table, with indicators as to which "type" of object the rows are. it's a bit like Unions in C, or the way X does its messaging. On the other hand, it's really easy to run queries, and the forms that handle those tables that contain multiple "derived" types have trivial code to identify and handle the slightly different requirements of each "type."

      i'd like to see object-oriented databases. i'd like to model my objects that way. i'd gladly give up PK's/FK's for "pointers" and a simpler way of life. but i've read the specs for OODBMS', and it looks like they don't fully grasp what OO entails. it's sad.

      until then, i'll rely on an rdms, because it works, and most of the time, it does what i want it to do...

      firebird!

      -philip

    9. Re:Why relational databases dominate by Zeinfeld · · Score: 2
      My mind boggles at the last paragraph of yours; ANYTHING you can represent in a text file, you can also represent in RDBMS, and usually easier..

      Personally I don't find writing $2.5 million+ checks to Oracle easy. However that is what one engineer's plan would have required. We wrote a custom db with limited schema support for $0.5 mil and blew $0.3 mil on RAM chips.

      Something that Oracle shareholders should recognize. The principal IP of Oracle is all to do with optimizing the movement of r/w heads over disk platters. That knowledge is effectively obsolete since RAM is approaching the cost of disk (today's RAM price is what disk prices were 4 years ago). RAM is in any case much cheaper than Oracle licenses.

      --
      Looking for an Information Security student project suggestion?
      Try http://dotcrimeManifesto.com/
    10. Re:Why relational databases dominate by angel'o'sphere · · Score: 1

      Seems you do not know what the "relational calculus" is, and the moderators of your post also.

      The relational calculus is from the previous century.

      And there are a lot of variants.

      The basic operations are selection, projection and join.

      Just like the standard binary operations AND and OR or the standard arithmetic operations +, -, *, /.
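
      The three operations can be sketched in a few lines, assuming Python and modeling a relation as a list of dicts. These operators are usually presented as relational algebra; the `employees` and `departments` data and the helper names are made up for illustration:

```python
# Relations modeled as lists of dicts; all rows are illustrative.
employees = [
    {"emp": "ann", "dept_id": 1},
    {"emp": "bob", "dept_id": 2},
    {"emp": "cy", "dept_id": 1},
]
departments = [
    {"dept_id": 1, "dept": "sales"},
    {"dept_id": 2, "dept": "ops"},
]

def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

def join(left, right, column):
    """Natural join on a single shared column."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]

sales = select(join(employees, departments, "dept_id"),
               lambda row: row["dept"] == "sales")
print(project(sales, ["emp"]))  # [{'emp': 'ann'}, {'emp': 'cy'}]
```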

      Hierarchical databases have no well-defined mathematical model. (However, there are numerous models thinkable, and most hierarchical DBs implement a vendor-defined model.)

      Your rants about language data models versus database data models also seem a bit naive.
      Also uninformed.

      E.g.: Enterprise JAVA BEANS, from the Sun JAVA J2EE API specifications, put transactions, role-based (authentication) access to methods, persistence etc. fully out of the language into configuration files.

      The configuration files are read at deployment time (that means installation) to generate wrapping code for transaction/persistence management.

      The reason is quite easy: hard-coded transactions in your business logic are the thing you DEFINITELY do not want.

      Because it renders your business logic useless when the transactional behaviour is about to change. Or it renders your transactional code useless when the business logic changes.

      If you like to have persistence inside of your programming language, including transactions, then dig into object oriented data bases.

      The JDO API of JAVA J2EE might be interesting for you.

      Most of that can be read up on java.sun.com.

      Most posters here pointing out that XML is DATA and not a database are right. If you want to manage hierarchical data at all, use an object oriented database.

      Whether an OO DB is more performant than an RDBMS is mainly a question of "the typical query".

      If the typical query delivers joined and selected data which is finally projected to remove columns, an OO database makes no sense, as it is optimized to deliver trees (graphs) of connected objects.

      XML data is in most cases just a tree, which is performant in an OODBMS but might also be performant in an RDBMS if the XML is generated by a filter.

      It simply depends on whether you are navigating or querying and how the schema of the DB looks in relation to the typical query.

      Regards,
      angel'o'sphere

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  21. Re:Hierarchical != Object-Oriented Databases by nusuth · · Score: 1

    Organisation of containers is not the same as organisation of data. Think of it this way: you can store all objects derived from FooObject if you have a node structure FooObject *Item, yet the nodes themselves could be stored in a linked list (compare with a simple db table), or in a graph (cw a relational database with many-to-many relationships) or in a tree (cw a hierarchical database). The difference between relational and object oriented databases is basically what type of things they can store. OO ones can store whole objects, while relational ones store fields in tables. Those fields are usually simple data types. I can't think of a reason that an OO database cannot also be a hierarchical database, yet that does not have to be the case either.

    --

    Gentlemen, you can't fight in here, this is the War Room!

  22. I'm currently working on a paper about this... by Carnage4Life · · Score: 3, Interesting

    Hi,
    I wrote a paper on native XML databases and SQL databases that support XML that appeared on Slashdot a little while ago. While doing research for that paper I asked myself the same question, whether instead of coming up with hybrid methods to store relational and hierarchical data we should store XML in already existing hierarchical databases. Unfortunately things are not so clear cut.

    First of all, a lot of data out there is relational, and people aren't ready or willing to transition all that data to XML-based storage, so mixing of relational and XML data will probably be with us for a while. The biggest problem with object oriented databases is that they didn't understand this fundamental issue, but it seems that the XML database vendors understand that hybrid data will be with us for quite a while, which is why Tamino supports importing data from relational sources and even ships with a SQL engine.

    Secondly, XML documents have a lot of metadata beyond the hierarchical parent-child relationships, such as processing instructions, comments and entities, which require more intelligence in the support from the database than just storing parent-child relationships.

    Finally, all the major [commercial] relational database vendors have included some sort of native support for XML, including XML types, and there is an ANSI standard in the works for combining XML and SQL. From what I've seen, none of the hierarchical databases plan to support XML as much as the relational databases have or plan to.

    Now if you were simply asking whether a native XML database can be built on top of a hierarchical database then I believe the answer is yes. Then again native XML databases can and have been built on object oriented databases and relational databases so it makes sense that they can be implemented in a database system that is more suited to handling hierarchical data.

    1. Re:I'm currently working on a paper about this... by Anonymous Coward · · Score: 0

      Wow, you actually posted something which was not hopelessly inane. Still self promoting, though, so I have to give you the standard 'Thanks for your dopey comments.'

      Keep trying, though!

  23. XML Data Bloat by trp0 · · Score: 2, Insightful

    It certainly seems like the same thing is happening with XML that happens with any new toy: "my friend told me XML was cool for stuff, so I'm going to convert everything to XML so I can be cool too."

    I was pretty sure that XML was useful in that it was a human-readable data-encoding mechanism that "average" users could get a grip on and utilize in sharing information between heterogenous systems, but it seems like people are completely missing the point these days in how to use XML effectively.

    A lot of the benefit of using XML is quickly becoming negated by everyone coming up with their own DTDs and the lack of standard formats for encoding data that is to be shared. As an example, here at the university I attend, there is a project for sharing information about biological species' population data amongst sister organizations. The goal is to make the information possessed by all these organizations available to all the others. The trouble is that they have all come up with their own format for storing the data they collect and cannot agree on what standard should be used, so each organization is encoding all their information with a different XML labeling scheme. My first question was: "Why in the heck are you using XML to encode the data anyway?" It seems easier and saner to just store it in your relational database and make the database accessible to sister organizations, who can then encode the information however they want for their end-users through their client applications, rather than having the organization holding the information impose order on people wanting access to it.

    To make a long story short, XML encoding doesn't help you store the information more efficiently at all, and with the state of the "formatting standards" today it doesn't even really provide an efficient way of sharing information between organizations or an efficient way of encoding the information for transmittal to other organizations. It seems as if people are missing the forest for the trees in how XML can be useful in its relation to data encoding, and we should stick with our trusty ole relational and object-oriented database models as they have shown their usefulness and efficiency.

    1. Re:XML Data Bloat by el_mex · · Score: 1
      Agree with you 100%. XML is not a database. XML is just a formatted, human-readable export file. If you run a database on top of the XML file, you will come up with a non-optimal system. The fact that it is human readable takes away from the computer's ability to read the data easily.

      I dismissed XML altogether when people started to claim it was going to save the world. The situation is still ridiculous, some people just do not get it. In my database I want uptime, redundancy, speed, and recoverability. Does XML address any of these issues?

    2. Re:XML Data Bloat by Skapare · · Score: 3, Insightful
      XML is just a formatted, human-readable export file.

      Human readable?

      I suppose you don't mind it when someone sends you mail, and you see a bunch of tags all over the place because it's in HTML. XML is just the same kind of thing ... all cluttered with tags. The computer can read XML easier and more quickly than humans. Sure it could read it even faster if it didn't have to parse all those tags. But I wouldn't call this a design intended for humans to read.

      --
      now we need to go OSS in diesel cars
    3. Re:XML Data Bloat by JordanH · · Score: 3, Informative
      • Human readable?

        I suppose you don't mind it when someone sends you mail, and you see a bunch of tags all over the place because it's in HTML. XML is just the same kind of thing ... all cluttered with tags. The computer can read XML easier and more quickly than humans. Sure it could read it even faster if it didn't have to parse all those tags. But I wouldn't call this a design intended for humans to read.


      The XML isn't human readable, but browsers and other applications can make pretty good guesses at a nice human readable representation.

      Further, you can define style sheets to produce different views, with data that would be unimportant to a particular human (or application) elided.

      It may be oversold, but the point is that the data definition is well defined such that writers and readers (often human readers, also applications) can interact more easily. It's about portability of data, of which readability is a subset.
  24. XML not meant as a replacement for RDBMSs by bwt · · Score: 4, Interesting

    Or is it better simply to use a relational database that can output in XML, or script your way to achieve the same goal?"

    I believe that RDBMS's should add functionality to read/write XML, especially as the XML Schema recommendation is basically done.

    The idea that XML should be the permanent storage format is a bad one. There is a lot of power in a normalized data model -- it enforces data integrity, while eliminating data fragmentation automatically, and it minimizes transaction resources.

    Consider XML representations for different entities that all share some kind of child entity. For example: people, businesses, and schools all share addresses. In XML, you want the addresses to appear in the description of the individual object. Does that mean you want to store the addresses separately that way? Absolutely not, because then when you enforce constraints or ask questions about addresses, your data is fragmented in three places. For that matter, how do you know all the entities that might use addresses? In an RDBMS, you can inspect all the foreign keys to the address entity. What's the XML analog?
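
    The foreign-key inspection can be sketched as follows, assuming SQLite through Python's sqlite3 module. The four tables are a made-up version of the people/businesses/schools example, and `tables_referencing` is a hypothetical helper:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (address_id INTEGER PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE people     (name TEXT, address_id INTEGER REFERENCES addresses(address_id));
    CREATE TABLE businesses (name TEXT, address_id INTEGER REFERENCES addresses(address_id));
    CREATE TABLE schools    (name TEXT, address_id INTEGER REFERENCES addresses(address_id));
""")

def tables_referencing(target):
    """List the tables that declare a foreign key to `target`."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    users = []
    for table in tables:
        # PRAGMA foreign_key_list yields one row per foreign key;
        # column 2 of each row is the referenced table.
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            if fk[2] == target:
                users.append(table)
    return users

print(tables_referencing("addresses"))  # ['businesses', 'people', 'schools']
```

    Other database engines expose the same information through their own catalog views rather than this PRAGMA.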

    1. Re:XML not meant as a replacement for RDBMSs by camusflage · · Score: 2

      I believe that RDBMS's should add functionality to read/write XML, especially as the XML Schema recommendation is basically done.

      Oh, like Microsoft SQL? I know, I know. Mod me down, but MS has been on XML like Justice on an antitrust suit.

      --
      The truth about Scientology, Xenu, and you: Operation Clambake
    2. Re:XML not meant as a replacement for RDBMSs by leifw · · Score: 1
      Oh, like Microsoft SQL? I know, I know. Mod me down, but MS has been on XML like Justice on an antitrust suit.

      Gosh, and I thought Microsoft was doing a great deal to support XML. I hadn't heard that they'd packed up early, quit the XML party, and gone home.

  25. Pros and Cons by Multispin · · Score: 2, Informative

    I work for a company that has been doing hierarchical DBMSs for years. The company is Applied Technical Systems. We make a database engine called CCM.

    XML is a great way for exchanging data, but the term XML databases is very misleading. If the database engine actually stores data in native XML, it's going to be *very* slow. I think the point behind XML is that nobody should really have to care what your backend is as long as you can export reasonable XML. Note that I say reasonable XML. An XML export that simply encodes the rows and fields in a table to XML with <row> and <col> tags is NOT reasonable. It conveys no actual knowledge of the real structure of the data.
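
    To make the difference concrete, here is a small sketch assuming Python's xml.etree.ElementTree. The customer data and both export functions are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Made-up data: one customer with two orders.
customer = {
    "name": "Acme",
    "orders": [{"id": "17", "total": "9.50"}, {"id": "18", "total": "3.00"}],
}

def flat_export(c):
    """Row/column dump: only cell positions survive, not the structure."""
    table = ET.Element("table")
    for order in c["orders"]:
        row = ET.SubElement(table, "row")
        for value in (c["name"], order["id"], order["total"]):
            ET.SubElement(row, "col").text = value
    return table

def structured_export(c):
    """Nested export: element names and nesting carry the meaning."""
    cust = ET.Element("customer", name=c["name"])
    for order in c["orders"]:
        ET.SubElement(cust, "order", id=order["id"], total=order["total"])
    return cust

print(ET.tostring(flat_export(customer), encoding="unicode"))
print(ET.tostring(structured_export(customer), encoding="unicode"))
```

    In the flat form a reader has to know out of band that the second column is an order id; in the nested form the document says so itself.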

    Storing XML data in a relational DB can either be a very hard problem or a very easy one. Let me explain. You could look at some XML and define a DB schema for it, not too hard to do. Problem? It's not generic; a human has to redo it each time the XML structure changes. The alternative is to store it all in one big table and index the hell out of it. Problem? It's slow. At that point you aren't using any structure of the XML or the power of relational DBs.
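
    A rough sketch of the "one big table" alternative, assuming SQLite through Python's sqlite3 module and ElementTree for parsing. The `nodes` layout and the sample document are assumptions, and attributes are ignored to keep it short:

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# One generic table holds every element of every document (hypothetical layout).
conn.execute("""
    CREATE TABLE nodes (
        node_id   INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES nodes(node_id),
        tag       TEXT NOT NULL,
        text      TEXT
    )
""")
conn.execute("CREATE INDEX nodes_by_parent ON nodes (parent_id, tag)")

def shred(element, parent_id=None):
    """Store an element and, recursively, its children; return its node_id."""
    cur = conn.execute(
        "INSERT INTO nodes (parent_id, tag, text) VALUES (?, ?, ?)",
        (parent_id, element.tag, (element.text or "").strip()))
    node_id = cur.lastrowid
    for child in element:
        shred(child, node_id)
    return node_id

doc = ET.fromstring(
    "<library><book><title>Dune</title></book>"
    "<book><title>Emma</title></book></library>")
shred(doc)

# The path library/book/title costs one self-join per step.
titles = [row[0] for row in conn.execute("""
    SELECT t.text FROM nodes l
    JOIN nodes b ON b.parent_id = l.node_id AND b.tag = 'book'
    JOIN nodes t ON t.parent_id = b.node_id AND t.tag = 'title'
    WHERE l.parent_id IS NULL AND l.tag = 'library'
    ORDER BY t.node_id
""")]
print(titles)  # ['Dune', 'Emma']
```

    It is generic, since any document fits, but every extra path step adds another self-join, which is where the slowness described above comes from.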

    I'm a firm believer that efficient XML storage, querying and retrieval will require a hierarchical database. The problem is that there are several features (bugs IMHO) in XML (and XPath) that, in a way, are throwbacks to relational DBs. IDREFs and the notion of document order particularly bug me. I ran into these this summer when I was on a team trying to build an XPath and XQuery front end for CCM.

    We're gradually seeing the XML world change. Early XML documents were similar to the type mentioned above. They were flat. When you start adding depth the information inherent in the structure of the data becomes apparent. Another thing I'm glad to see the industry move away from is the notion that XML resides in files. Many (if not all) of the early XML parsers made this assumption. It was a pain in the ass to parse from some other source, like a buffer in memory.

  26. Strictly speaking by bikiniAtoll · · Score: 1

    Strictly speaking the relational model doesn't specify how data is stored, only how it is retrieved. XML is a storage format; there is no reason why a relational database couldn't use XML for storage.

  27. Repeat after me ... by Serpent+Mage · · Score: 5, Informative

    XML is not a magic bullet. Relational databases won out over the Hierarchical model for a lot of reasons. For instance, the Hierarchical model imposes a number of integrity constraints, such as

    1) No record occurrences except root records can exist without being related to a parent record occurrence. This means that
    a) a child record cannot be inserted unless it is linked to a parent record.
    b) a child record may be deleted independently of its parent however, deletion of the parent record automatically results in the deletion of all its child and descendent records.
    c) the above rules do not apply to virtual child records and virtual parent records.

    2) If a child record has 2 or more parent records from the SAME record type, the child record must be duplicated once under each parent record.

    3) A child record having 2 or more parent records of DIFFERENT record types can do so only by having at most 1 real parent, with all the others represented as virtual parents. IMS limits the number of virtual parents to 1.
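
    Rules 1a and 1b above can be sketched with a toy in-memory hierarchy, assuming Python. The `Record` and `Hierarchy` classes are invented for illustration and do not model IMS or virtual parents:

```python
class Record:
    """A record in a toy hierarchy: every non-root record has exactly one parent."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

class Hierarchy:
    def __init__(self, root_name):
        self.root = Record(root_name)
        self.records = {root_name: self.root}

    def insert(self, name, parent_name):
        # Rule 1a: a child cannot be inserted without an existing parent.
        if parent_name not in self.records:
            raise KeyError(f"no parent record {parent_name!r}")
        child = Record(name, self.records[parent_name])
        child.parent.children.append(child)
        self.records[name] = child

    def delete(self, name):
        # Rule 1b: deleting a record also deletes all of its descendants.
        record = self.records[name]
        for child in list(record.children):
            self.delete(child.name)
        if record.parent is not None:
            record.parent.children.remove(record)
        del self.records[name]

db = Hierarchy("company")
db.insert("sales", "company")
db.insert("ann", "sales")
db.insert("ops", "company")
db.delete("sales")  # "ann" goes with it
print(sorted(db.records))  # ['company', 'ops']
```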

    In addition to these flaws, relational databases have had over a decade to become mature, optimized, and enterprise scalable. Hard drive partitioning for databases such as Oracle lines up with the cylinders, sectors, and tracks of a hard drive to allow the fastest possible read/write times.

    Too often people see that XML "can" do so many things and decide that it should be the way things are done, but XML is NOT a magic bullet, and just because it has the potential to do something does not make it the best methodology for doing so.

    1. Re:Repeat after me ... by Unordained · · Score: 1

      did anybody else get the spam snail-mail from microsoft a few months back claiming that XML, Win2k, etc. would solve every problem that ever came my way?

      Effectively, microsoft claimed that having XML capabilities for import/export in your company would make it immediately possible to interface to other companies with XML data in/out.

      Not -once- did they mention that, say, your databases wouldn't contain compatible information? that your table layouts wouldn't match? that your lookup tables would need to be patched, or maybe that they just plain don't keep the information you want? No. XML made it all work.

      It's sad to see something as potentially useful as XML turned into something it's not, so that even its legitimate uses might be forgotten. It's happened before, it will happen again.

      -Philip

    2. Re:Repeat after me ... by Anonymous Coward · · Score: 0
      XML is not a magic bullet.

      Isn't that supposed to be magic sword or silver bullet...

  28. With SDF, is the time right for relational dbs :) by teambpsi · · Score: 1

    I really wish people would stop focusing on the INTERCHANGE format and focus on the abstract implementation details.

    Just about any hierarchical store CAN be implemented in a relational database -- they are called "intersection entities".

    Trivial and fast (when indexed) to Manage one-to-many and many-to-many relationships.

    Complete with constraint checks if you so desire.
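
    A small sketch of an intersection entity with constraint checks, assuming SQLite through Python's sqlite3 module. The `parts` / `part_usage` schema is a made-up example in which one component can sit under several parents:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE parts (part_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Intersection entity: one row per (assembly, component) pair, so a
    -- component may appear under any number of parents.
    CREATE TABLE part_usage (
        assembly_id  INTEGER NOT NULL REFERENCES parts(part_id),
        component_id INTEGER NOT NULL REFERENCES parts(part_id),
        PRIMARY KEY (assembly_id, component_id),
        CHECK (assembly_id <> component_id)
    );
    CREATE INDEX part_usage_by_component ON part_usage (component_id);
""")
conn.executemany("INSERT INTO parts VALUES (?, ?)",
                 [(1, "bike"), (2, "cart"), (3, "wheel")])
conn.executemany("INSERT INTO part_usage VALUES (?, ?)", [(1, 3), (2, 3)])

def parents_of(component_id):
    rows = conn.execute("""
        SELECT p.name FROM part_usage u
        JOIN parts p ON p.part_id = u.assembly_id
        WHERE u.component_id = ? ORDER BY p.name
    """, (component_id,))
    return [r[0] for r in rows]

def add_usage(assembly_id, component_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO part_usage VALUES (?, ?)",
                     (assembly_id, component_id))
        return True
    except sqlite3.IntegrityError:
        return False

print(parents_of(3))     # ['bike', 'cart']
print(add_usage(1, 99))  # False: part 99 does not exist
```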

    The greatly exaggerated demise of ODBMS should point out the problem of adoption: What problem does this solve that I cannot solve using what I already know?

    or to parody Dr. Ian Malcolm in Jurassic Park

    "you were so busy using BLOBS in relational databases, you didn't stop to consider whether you SHOULD" :P

    --

    Old age and treachery almost always overcome youth and skill.
  29. Source and migration (a digression) by Improv · · Score: 1, Offtopic

    I would suspect that companies/people who run
    Unix would like that faster chip, as Unix is quite
    portable. I have 2 Alphas, 3 PCs, a NeXT, and my
    laptop is an iBook, all of them running Unix.
    At work, I manage various flavors of Unix, many on
    non-x86 hardware. But I digress..

    --
    For every problem, there is at least one solution that is simple, neat, and wrong.
  30. XML is not only hierarchical by Anonymous Coward · · Score: 0


    XML has several standard hyperlinking paradigms (ID/IDREF, XLink, HyTime, TEI) which allow for the creation of non-hierarchical relationships.
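
    A sketch of the ID/IDREF style of link, assuming Python's xml.etree.ElementTree. ElementTree does not read a DTD, so the `id` and `author` attributes below act as ID and IDREF only by convention, and the document itself is made up:

```python
import xml.etree.ElementTree as ET

# The tree nests books under a shelf, while the author="..." attributes
# point sideways at <person> elements by id.
doc = ET.fromstring("""
<library>
  <person id="p1">Herbert</person>
  <person id="p2">Austen</person>
  <shelf>
    <book title="Dune" author="p1"/>
    <book title="Emma" author="p2"/>
    <book title="Persuasion" author="p2"/>
  </shelf>
</library>
""")

# Index every element that carries an id so references can be resolved.
by_id = {el.get("id"): el for el in doc.iter() if el.get("id") is not None}

def author_of(title):
    """Follow the reference from a book to its person element."""
    for book in doc.iter("book"):
        if book.get("title") == title:
            return by_id[book.get("author")].text
    return None

def books_by(person_id):
    """Follow the same link in the other direction."""
    return [b.get("title") for b in doc.iter("book") if b.get("author") == person_id]

print(author_of("Dune"))  # Herbert
print(books_by("p2"))     # ['Emma', 'Persuasion']
```

    The person-to-book relationship here is one-to-many and cuts across the containment tree, which is the non-hierarchical part.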

    Also, I don't like hearing so many people talk about using DB technology to hold XML data. Unless you are talking about a document management system, you really ought to be thinking of XML as an interchange format only.

    In terms of the ability to represent information, XML tries to solve much of the same problem as a database: providing a framework for arbitrary structure. A database does it in a way that is highly optimized for query and modification speed, XML tries to do it in a way that is optimized for interchange and platform-independent processing.

    Storing XML fragments in database fields is an odd thing to do, but I see more and more people doing it. I guess in an ideal world, your database schema would go down to the exact level of granularity you might be using the XML to capture. It seems half-assed to me to use a DB for high-level structure, then inside your records you have some other completely different type of structure. I guess people like this, though, since the DB vendors have all added technology to enable this.

    To me it just means that people spend less time thinking about and designing the information models, and it is yet another case of software features shaping requirements (when of course we all know it should be vice versa)

  31. impedance mismatch by tim_maroney · · Score: 2

    I was surprised to see so many questions of the form "what's wrong with relational databases"? Relational databases have a well-known problem called "impedance mismatch" when mapping multi-linked object structures. Many links on the impedance mismatch issue can be found at this Google search.

    Anyone who has tried to take a natural set of application-side objects and map them onto a relational database is already quite familiar with the problems created by the proliferation of tables needed to map simple application data structures, as well as the large amount of development effort needed to deal with simple relationships that would be trivial to specify in an object model such as Java's or XML's.

    There is clearly a need to move on to object databases, but installed base and skill set inertia have blocked this transition, with the result that database-oriented applications have remained hamstrung in their friendliness and feature set.

    Tim

    1. Re:impedance mismatch by Anonymous Coward · · Score: 0

      Anyone who has tried to reuse the data stored in another application's object database would like to stuff your "impedance mismatch" up your arse. So-called "Object Databases" are fine for the situation when you are sure that nobody will ever want to access your data in a different way to what you first envisaged. The data I'm asked to manage is more important than that. Try writing an ad-hoc report in an object database. I can get the data in a few minutes with SQL. Tell you what, you stick to your esoteric spirituality and I'll do the enterprise database stuff.

    2. Re:impedance mismatch by mattypants · · Score: 1
      installed base and skill set inertia have blocked this transition
      There is a good reason for this and the survival of hierarchical DBMS's; the simple fact that most people understand and make least mistakes in hierarchical models. Is it not unsurprising to find that people like to use tools that allow them to solve their problems in the way they actually think? Most are happy not to get bogged down in the details of precisely which minor disadvantage they are suffering from because by the time they have worked it out they have missed their deadline.

      That is why our products have been selling for over 20 years in the UK - we recognise that hierarchies exist and everything else is related to them.

      As for XML; it's perfect for import and export from a hierarchical DBMS and therefore lends itself admirably to programmers who like straightforward solutions.
    3. Re:impedance mismatch by bwt · · Score: 2

      There is clearly a need to move on to object databases, but installed base and skill set inertia have blocked this transition, with the result that database-oriented applications have remained hamstrung in their friendliness and feature set.

      The "impedance mismatch" is little more than the fact that object oriented approaches generally do not obey the rules of 3rd normal form data modelling, especially in the way they represent many-to-many relationships. If anything, it's a problem caused by object orientation and its assumption that low software development overhead is "the goal". That's true if the data is throwaway or only persistent in small quantities. When the amount of data is large, and you are paying for "big iron" to support many simultaneous users and transactions, as is typical for enterprise grade applications, the software reuse benefits of object oriented methods lose significance relative to structural data integrity enforcement and transaction efficiency.

      OO works great in the GUI and business rule layers, but consider the way OO represents many-to-many relationships. For example, suppose I have students and courses. Generally, I might have students with a collection of course objects, or vice versa, or both. If you use both, then you've got redundant data, and keeping it ACID and consistent adds resource overhead and complexity. If you put the collection in only one of the objects (say, in the student object), then when you ask a question like "who are all the students in class X?" your application will crawl, because you have to ask every student who exists whether they are taking that class. If there are a couple thousand students, then it's not a big deal. If there are 400 million, then it is a very big deal.
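
      For comparison, here is a minimal sketch of the relational side of that students-and-courses example, using Python's built-in sqlite3 module. The table and column names (student, course, enrollment) are made up for illustration. The relationship is stored once in a join table, and the extra index is there so the "who is in class X" question can be answered without visiting every student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- The many-to-many relationship is stored once, as one row per pair.
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    -- Index for the reverse direction (course -> students).
    CREATE INDEX enrollment_by_course ON enrollment (course_id, student_id);
""")
conn.executemany("INSERT INTO student VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "Databases"), (20, "Compilers")])
conn.executemany("INSERT INTO enrollment VALUES (?, ?)",
                 [(1, 10), (2, 10), (2, 20), (3, 20)])


def students_in(course_id):
    """Names of the students enrolled in one course."""
    rows = conn.execute(
        "SELECT s.name FROM enrollment e "
        "JOIN student s ON s.student_id = e.student_id "
        "WHERE e.course_id = ? ORDER BY s.name", (course_id,))
    return [name for (name,) in rows]


def courses_of(student_id):
    """Titles of the courses one student is taking."""
    rows = conn.execute(
        "SELECT c.title FROM enrollment e "
        "JOIN course c ON c.course_id = e.course_id "
        "WHERE e.student_id = ? ORDER BY c.title", (student_id,))
    return [title for (title,) in rows]


print(students_in(10))
print(courses_of(2))
```

      Both directions of the question go through the same enrollment rows, so there is no second copy of the relationship to keep consistent.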

  32. OO? by Nevrar · · Score: 1

    I'm rather ignorant on this subject, but surely XML data could be viewed as being object oriented? If this is the case, then surely an OODBMS or more practically an Object-Relational DBMS could be used.

    In case you were wondering if there are any out there, check out PostgreSQL which is way cooler than MySQL (it's open-source for a start)

    --
    Nevrar
  33. Having worked in the industry... by SerpentMage · · Score: 1

    Having just quit a pure XML company (can't say the name of the company, but let's just say I can now spy on the company instead of working there), I have to say that pure XML databases will most likely not pick up.

    The reason why they will not pick up is not because they are not good in their own domain, but simply because the legacy of SQL is simply too large. To make XML do what SQL does today is about eight years away.

    In those eight years the amount of SQL data will become huge, and the problem will be converting it. For example, if you have a multi-terabyte database, how can you ensure there are no errors in transferring it? Hard disks have an error rate that works out to about one in a billion. Apply that to a multi-terabyte database and, at best, a few kilobytes of data may be faulty. This means that somebody will have a screwed-up account.

    This means the best solution is the status quo, since the status quo works and does the job correctly.

    I even predict that in ten years the "programmer" will almost cease to exist. In ten years we will become data mining extractors. Sure there will still be a task of extracting data using programs, but the main concern will be managing the data and figuring out interesting things from it.

    What all this means is that we will live for a long time to come with SQL. Ok there may be XML adaptors, but it will still be SQL...

    --

    "You can't make a race horse of a pig"
    "No," said Samuel, "but you can make very fast pig"
    1. Re:Having worked in the industry... by Anonymous Coward · · Score: 0

      A corporate spy, eh? That's funny, we just fired a guy by your name...

    2. Re:Having worked in the industry... by SerpentMage · · Score: 1

      You did not get it!!! I am not a spy!!!!

      --

      "You can't make a race horse of a pig"
      "No," said Samuel, "but you can make very fast pig"
  34. Oh god, please no. by rhinoX · · Score: 1


    I took a database course a year ago in which we had to deal with a hierarchical database. It was absolutely awful.

    --
    The copper bosses killed you, Joe. 'I never died', said he.
  35. Some thoughts... by Coventry · · Score: 5, Interesting
    I have been struggling with these issues for awhile now, for various reasons. Why? Because I like Zope, but am, like most developers, more comfortable with relational data structures.

    Zope uses an object database known as the ZODB. Some forms of many-to-many relationships can be handled via selection and multi-selection properties, which are designed to distinguish between a selected element and the list of available elements. The list of elements can be derived from a property on the current object, a property on a parent object, or be created via a method call - allowing for non-traditional (for an OODBMS) cross-linking of objects. Of course, since this sort of thing is a workaround, no true relational links are created... 'Soft Relations' may be OK for MySQL, but in big application development, relationships must be enforced! Thus, the big boys in RDBMS all enforce foreign keys (MySQL does not)...

    Of course, I've found that by careful creation of object hierarchies, very complex applications can be created on top of an OODBMS that are in fact more robust, in some ways, than their relational counterparts. The biggest hurdle (short-term) I see for OODBMS (including ones based upon XML [the ZODB can export objects as XML, but they are stored differently internally]) is the lack of a true query and data manipulation language like SQL. Sure, OQL exists, and is even technically a standard, but it A) sucks and B) is geared towards large Java applications with huge amounts of active objects, not general-purpose OODB queries. Thus, without such a language, OODBMS are all dissimilar in how one queries and creates/updates data, and in many cases the only interface is a truly procedural one! Thus OODBMS users are forced to use proprietary tools and are locked into one system - not to mention that speed of development (something normally associated with OO development and OODBMS in general) is hindered by the excessive amount of procedural calls one needs simply to query their data...

    Recently, an add-on to Zope addressed some of these issues. Called 'ZOQL', it uses a SQL-like syntax and allows for very discrete querying of the ZODB (something one had to do programmatically using the 'ZCatalog' before) with all of the familiar aggregate and comparison operators SQL users love... Of course, this _still_ doesn't address the issue of soft relationships:

    I think the biggest hurdle for OODBMS in the long term (tools like ZOQL are interfaces to existing systems, and thus can be implemented easily) is the lack of relationship handling. It seems that most RDBMS force a developer to think about the data in relational terms, and most OODBMS force you to think in terms of objects... Most problems can be mapped to either of these domains, but you are forcing the data-model type onto the problem. What is needed is a hybrid system, an 'Object-Relational' DBMS. That is to say, OODBMS makers need to move beyond the traditional OO idea that relations are only of the following types:
    • Object A is an Object B
    • Object A has a/many Object B(s)
    What RDBMS systems excelled in (and thus fell into popular use for) was ease of management and allowing common data to be moved and grouped. A 'look-up table', for instance, which simply holds a list of common data (an enumerated list) and can be centrally maintained, is a boon in the RDBMS world. For example, you have a lookup table of car manufacturers, and one of them changes its name... Instead of updating all N cars that are made by the manufacturer, you simply update the single record in the lookup table. Since each car would have something akin to a 'Manufacturer_ID' column linking it to the lookup table, the cars belonging to the manufacturer are all taken care of.
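
    The lookup-table rename described above can be sketched in a few lines with Python's built-in sqlite3 module. The manufacturer names, car models, and column names here are invented for illustration; the point is only that the rename touches one row no matter how many cars reference it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The lookup table: one row per manufacturer, maintained centrally.
    CREATE TABLE manufacturer (
        manufacturer_id INTEGER PRIMARY KEY,
        name            TEXT NOT NULL
    );
    CREATE TABLE car (
        car_id          INTEGER PRIMARY KEY,
        model           TEXT NOT NULL,
        manufacturer_id INTEGER NOT NULL REFERENCES manufacturer(manufacturer_id)
    );
""")
conn.execute("INSERT INTO manufacturer VALUES (1, 'Acme Motors')")
conn.executemany("INSERT INTO car VALUES (?, ?, 1)",
                 [(1, "Model A"), (2, "Model B"), (3, "Model C")])

# The manufacturer changes its name: exactly one row is updated.
cur = conn.execute(
    "UPDATE manufacturer SET name = 'Apex Motors' WHERE manufacturer_id = 1")
rows_changed = cur.rowcount

# Every car picks up the new name through the join.
car_makers = conn.execute(
    "SELECT c.model, m.name FROM car c "
    "JOIN manufacturer m ON m.manufacturer_id = c.manufacturer_id "
    "ORDER BY c.car_id").fetchall()

print(rows_changed)
print(car_makers)
```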

    How does one do this in a hierarchical system? Well, the easy answer would be that each manufacturer object contains all the cars that manufacturer makes. Simple, right? WRONG. Why?

    Because each car also has a body type (compact, sedan, SUV, truck, van, etc...) - which in a relational database would simply be another lookup table, but in an OODBMS poses data management issues. Do we put body type higher than manufacturer? If so, then we have to maintain the list of manufacturers for each body type, causing headaches. Or do we put body type below manufacturer, which means maintaining a separate list of body types for each manufacturer? These lists of course need to match exactly if we ever plan on being able to search or do reports based upon all cars of a specific body type.
    Sadly enough, this sort of separate-enumeration relationship isn't implemented (well) in any OODBMS I've found.
    Take the ZODB for example: its selection and multi-selection lists try to handle this situation, but fail because relational integrity is not maintained! That is to say, behind the scenes it's not a true reference to a value in the enumerated list, but just a text entry representing a value in the list. If the value in the list changes, the selection property does not update, leaving you with the equivalent of MySQL's bastard children, the orphaned records.
    This sort of soft-relationship handling is ugly and BAD for maintainability, but OODBMS users are faced with two ugly choices each time they map such a relationship: do I store this as a plain-text property and just update N records each time it changes, or do I map it into the hierarchy and deal with the headaches incurred by doing so...?
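
    To make the difference between a 'soft' and an enforced relationship concrete, here is a small sketch using Python's sqlite3 module. It is not ZODB code; the two car tables (car_hard, car_soft) are hypothetical stand-ins for "a true reference to the enumerated value" and "a text copy of the value". Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is opt-in in SQLite
conn.executescript("""
    CREATE TABLE body_type (
        body_type_id INTEGER PRIMARY KEY,
        label        TEXT NOT NULL UNIQUE
    );
    -- Hard relation: the car row points at a body_type row.
    CREATE TABLE car_hard (
        car_id       INTEGER PRIMARY KEY,
        body_type_id INTEGER NOT NULL REFERENCES body_type(body_type_id)
    );
    -- Soft relation: the car row only carries a copy of the label text.
    CREATE TABLE car_soft (
        car_id          INTEGER PRIMARY KEY,
        body_type_label TEXT NOT NULL
    );
    INSERT INTO body_type VALUES (1, 'SUV');
    INSERT INTO car_hard VALUES (1, 1);
    INSERT INTO car_soft VALUES (1, 'SUV');
""")

# The enumerated value is renamed in the central list.
conn.execute("UPDATE body_type SET label = 'Sport utility' WHERE body_type_id = 1")

hard_label = conn.execute(
    "SELECT b.label FROM car_hard c "
    "JOIN body_type b ON b.body_type_id = c.body_type_id").fetchone()[0]
soft_label = conn.execute(
    "SELECT body_type_label FROM car_soft").fetchone()[0]

# Removing a value that is still referenced is rejected for the hard relation.
try:
    conn.execute("DELETE FROM body_type WHERE body_type_id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

print(hard_label, soft_label, delete_blocked)
```

    The hard relation follows the rename and cannot be orphaned; the soft copy silently keeps the old text, which is the failure mode described above.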

    I don't think I've answered the question, but hopefully I've at least shed some light on the subject for members of both the OODBMS camps and RDBMS camps... Now if only a useful ORDBMS were to come along...

    (Note that PostgreSQL and some other RDBMS can actually be used in a semi-OO manner, but this is usually reserved for inheritable structures of data to be used for specific extensions to the data model - thus the SUV table inherits from the Cars table and adds some columns - but all other relationships SUV has will still be relational.)
    --
    man is machine
    1. Re:Some thoughts... by maxm · · Score: 2, Interesting

      It could be solved easily enough, I think, and I am currently writing a module that I believe would solve most of the problems I have when using Zope.

      All that is needed is a relation "product".

      relations.add([obj1, obj2], [obj3, obj4, obj42])
      relations.getRelations(obj1)
      >>>[obj3, obj4, obj42]
      relations.getRelations(obj3)
      >>>[obj1, obj2]

      Every object in Zope is defined by its id and its path, so it could be done relatively easily.

      Then you would get the advantages of a relational model in the ZODB.

      You could even use a different instance of the class for different object types, just as you make many relation tables in a traditional RDBMS.
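
      One way the relation "product" sketched above might look in plain Python is shown below. This is only a guess at the idea, not the module being written: the Relations class is hypothetical, it keeps everything in an ordinary dict rather than in the ZODB, and plain strings stand in for the id-and-path keys of real Zope objects. Only the method names add and getRelations come from the example above.

```python
class Relations:
    """Symmetric many-to-many links between hashable keys (e.g. object paths)."""

    def __init__(self):
        self._links = {}  # key -> list of related keys, in insertion order

    def _link(self, a, b):
        related = self._links.setdefault(a, [])
        if b not in related:  # avoid recording the same link twice
            related.append(b)

    def add(self, left, right):
        """Relate every item in `left` to every item in `right`."""
        for a in left:
            for b in right:
                self._link(a, b)
                self._link(b, a)

    def getRelations(self, obj):
        """Return a copy of the keys related to `obj` (empty if none)."""
        return list(self._links.get(obj, []))


relations = Relations()
relations.add(["obj1", "obj2"], ["obj3", "obj4", "obj42"])

print(relations.getRelations("obj1"))
print(relations.getRelations("obj3"))
```

      A separate Relations instance per relationship type would play the role of a separate join table.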

      --
      Max M - IT's Mad Science
    2. Re:Some thoughts... by angel'o'sphere · · Score: 1

      Your post is errr.... strange.

      I can not follow it, so I do not dare to point out what is wrong.

      Except, of course :-), for one thing:

      In an OODBMS there is no MAPPING from objects in the programming language to objects in the DB.

      The objects are stored one to one.

      An OODBMS has only three purposes:

      1) Allow queries on OO structures (graphs of objects).
      2) Allow transactionally safe changes to many distinct objects.
      3) Isolate point 2) if different users access the same objects.

      Your example of cars and manufacturers is too complex to dissect here. However:

      class Manufactor { /* your manufactor data here */ };

      class Car { };

      // Generic relation: one parent object and its children.
      template <class Parent, class Children>
      class One_Parent_to_many_Children {
          Parent parent;
          Children* children;

      };

      typedef One_Parent_to_many_Children<Manufactor, Car> CarManufactors;

      These are C++ classes. Most people would simply have a pointer to the Manufactor in the Car class, of course.

      If you have the above, the objects in the database are ....
      Ermm.... Cars, and Manufactors and relations, nothing else. Nothing to map.

      For selections you make OQL queries. Yes, you cannot REMOVE COLUMNS like you can in SQL. But you would not want to, or would you like to get HALF cars out of the DB? You are working in a C++ world above, so you want to get Cars and Manufactors out of the DB, and not something which is returned by a:

      select Manufactor, Cartype from Manufactors, Cars, where production_date .
      Each Manufactor object refers, in its list of produced Cars, to the Cars fitting the query above.
      If you stick to the relation class I sketched above, you might only get Manufactors, and then you would make a new query for each Manufactor in the vector to get the Cars. If you had used a pointer, as most C++ programmers would have done, the OQL query above would be just fine.
      The result would have been trees/graphs of Manufactors with the Cars built by them. FULL-FLEDGED C++ objects, ready to call virtual functions on.
      Not some array of text data to be converted into objects.

      Regards,
      angel'o'sphere

      P.S. I have configured "PLAIN OLD TEXT" for submitting. Less and greater signs are ALWAYS interpreted as HTML tags. BOLD etc. does not work, as the closing tag is not recognized .... BUG, or am I just not able to post correctly?

      --
      Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  36. A Hierarchy of Myth by droleary · · Score: 2, Interesting

    While a hierarchy is often used by humans to organize and structure things, that should in no way impact how the data/information/objects are treated as individuals. Look at the common file system hierarchy and it's easy to see that burying files under a hierarchy of directories actually makes access to that information harder. It wasn't so noticeable when we were all just managing a few MB of files, but now people are beginning to store large picture, movie, and sound libraries. File managers have mistakenly stuck with the hierarchy instead of using information associated with the file itself (ID3 tags, etc.) to organize it all. What is really needed is a better approach to representing metadata so that information can be accessed directly based on those metadata attributes and not have it hidden in the hierarchy. I have a short essay on this from the work I've been doing on a Meta Object Manager (MOM), but it needs to be cleaned up before it could be published.

    The desire to impose a hierarchy on the data itself instead of considering a hierarchy as simply one view on the data is a step backwards. Nobody who manages large amounts of data is looking to jam it into a static hierarchy, and so XML is not an answer, nor is any hierarchical representation.

  37. Best of Both Worlds? by modulo · · Score: 1

    I use Intersystems' Cache at work - under the hood it's hierarchical, so it would seem to be a good fit for XML (I hear more XML stuff is coming for version 4.2), but it also projects everything as relational tables through ODBC, and simultaneously as objects, through ActiveX and Java. (They're dropping CORBA, not enough interest apparently.) I find it a little tough to program natively in, but it's gotten a lot better with version 4.x. And it runs quite well under Linux. That's the platform I use :-)

    --

    ...but the language is MUMPS, which I will not utter here

  38. persistence layer by budGibson · · Score: 2, Informative

    In design, the logical construction of the program and its data structures should be relatively independent of the physical implementation of said.

    Basically, as I read your question, you are using a logical design that is hierarchical (an object structure expressed in XML) and wondering if it would not make more sense to store it in a hierarchical database. Maybe.

    However, relational databases form the current state of the art and have been highly optimized such that any theoretical performance gains from better matching of logical structure to physical lay-out in the database are likely outweighed. More generally, by insisting on a match between logical and physical lay-out, you would potentially be limiting yourself to a specific physical implementation, one that may not provide good performance relative to others.

    A better solution to your problem might be something referred to as a persistence layer. This adds another layer of abstraction to your application, in the form of a mapping between your logical design and your actual physical mode of storage. There now exist publicly available free (as in beer, and in some cases open-source) tools that will automate this mapping. Generally, any performance hit from the abstraction should be made up in the speed of the superior physical implementation, and the freedom to switch later is also important.
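
    As a rough illustration of what such a layer buys you, here is a hand-rolled sketch in Python. It is not Castor, JDO, or any real tool; the Customer class and the two stores (InMemoryStore, SqliteStore) are hypothetical. The application function sees only save and load, so the physical storage can be swapped without touching it.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    name: str


class InMemoryStore:
    """Physical storage #1: a plain dict."""

    def __init__(self):
        self._rows = {}

    def save(self, customer):
        self._rows[customer.customer_id] = customer

    def load(self, customer_id):
        return self._rows.get(customer_id)


class SqliteStore:
    """Physical storage #2: a relational table."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS customer "
                     "(customer_id INTEGER PRIMARY KEY, name TEXT)")

    def save(self, customer):
        self._conn.execute("INSERT OR REPLACE INTO customer VALUES (?, ?)",
                           (customer.customer_id, customer.name))

    def load(self, customer_id):
        row = self._conn.execute(
            "SELECT customer_id, name FROM customer WHERE customer_id = ?",
            (customer_id,)).fetchone()
        return Customer(*row) if row else None


def rename(store, customer_id, new_name):
    """Application logic: knows the logical model, not the storage format."""
    customer = store.load(customer_id)
    customer.name = new_name
    store.save(customer)
    return store.load(customer_id)


results = []
for store in (InMemoryStore(), SqliteStore(sqlite3.connect(":memory:"))):
    store.save(Customer(1, "Budd"))
    results.append(rename(store, 1, "Bud"))

print(results)
```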

    Two that exist for Java are Castor, available from ExoLab, and a pilot implementation of Sun's emerging Java Data Objects standard (see http://java.sun.com for that tool).

    1. Re:persistence layer by JimRoepcke · · Score: 1

      Another excellent persistence layer for Java (RDBMS to Java object models) is EOF, part of WebObjects.

      http://www.apple.com/webobjects/

      This has been in development for about 10 years now. I use it every day and it's wonderful.

  39. Bad Question by smack.addict · · Score: 2

    First, hierarchical databases' weaknesses are not limited to many-to-many relationship modeling. The simple fact is that some data is better represented in a hierarchical fashion (a directory service IS a hierarchical database) and other data in a relational fashion. XML is a tool for exposing that data to external sources regardless of its internal representation(s).

  40. Mapping XML onto a relational model by ccf · · Score: 2, Interesting

    The enXyme project attempts to map XML data onto a relational database schema. The goal is to allow complex, specific queries of XML data. It's not easy to capture ALL of XML, with all its possibilities, but you can do parts of it. The project already has a basic XML schema parser: a script that takes an XML schema and generates a series of SQL CREATE statements that reflect the hierarchy described by the schema.
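
    To give a feel for the general idea (this is not enXyme's code, and it cheats by inferring the hierarchy from a sample document instead of parsing a real XML Schema), here is a small Python sketch. Every element with child elements becomes a table with a foreign key to its parent, and every leaf element becomes a text column. The function names and the sample document are made up, and tag names are pasted into SQL unescaped, so this is only safe for trusted input.

```python
import sqlite3
import xml.etree.ElementTree as ET

SAMPLE = """<library>
  <book>
    <title>Example Title</title>
    <year>2001</year>
    <chapter><heading>Intro</heading></chapter>
  </book>
</library>"""


def derive_tables(elem, parent=None, tables=None):
    """Map each element that has child elements to a table (name -> columns)."""
    tables = {} if tables is None else tables
    if elem.tag in tables:
        return tables  # this tag was already mapped; first occurrence wins
    columns = [f"{elem.tag}_id INTEGER PRIMARY KEY"]
    if parent is not None:
        columns.append(f"{parent}_id INTEGER REFERENCES {parent}({parent}_id)")
    tables[elem.tag] = columns  # registered before recursing: parents come first
    for child in elem:
        if len(child):  # nested element: gets its own table
            derive_tables(child, elem.tag, tables)
        elif f"{child.tag} TEXT" not in columns:  # leaf element: a text column
            columns.append(f"{child.tag} TEXT")
    return tables


def create_statements(tables):
    return [f"CREATE TABLE {name} ({', '.join(cols)})"
            for name, cols in tables.items()]


statements = create_statements(derive_tables(ET.fromstring(SAMPLE)))
for sql in statements:
    print(sql)
```

    Attributes, mixed content, repeated leaf elements, and recursive element types are all ignored here, which is a small taste of why capturing ALL of XML is hard.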

    I guess a pure XML database like the ones mentioned in the article would be better at this, but the advantage is that relational dbs are already in wide use.

    --

    Structured data. Structured searching. The Enzyme Project
  41. Let's get this out of the way by ppetru · · Score: 1

    XML is a format. Using it just because you can (which unfortunately represents most of its current usage) is stupid. Examples:

    • XML databases. Excuse me? What sane person would store a massive amount of data with the on-disk format being XML? What's the point?
    • Configuration files (especially the Java stuff is plagued by this). Why on earth would you want to replace "theparameter = thevalue" by "<theparameter value="thevalue">" ? I've seen software embedding an XML parser for the sole purpose of parsing the config file... I don't know about you, but that seems just plain wrong to me.
    • Messaging protocols designed to be lightweight and/or very low overhead. Now don't get me wrong, I think that XML-RPC is very nice and everything, but to use it (for example) for in-cluster message passing of a distributed database would add so much overhead (with exactly zero benefits) that only a marketing guy could take such a decision.

    The list could go on and on and on. Just remember: use each tool when it's needed. Don't just put XML in there because it's a nice buzzword, mkay?

    PS: on the subject of the article... Again, I fail to see where XML comes in. Hierarchical DBs have been around for a long while, and even if they've been overshadowed by RDBMSs, it doesn't mean they're dead (I'm working on a project which uses a hierarchical database simulated on top of BerkeleyDB. Works wonderfully, and there's no need for SQL. Or XML, for that matter).

    --

    Petru
    1. Re:Let's get this out of the way by tbien · · Score: 1

      I totally agree...

      I once talked to a guy on my current project, and he said that every kind of data wants to be stored in its naturally fitting format - what he meant was XML... Sorry, but I don't think so, Sir!

      XML is nice when I have to do inter-application communication with unknown communication partners - that's it!!!

      The power of XML was its simplicity, but with all those senseless 300+ page specifications around it, it's worth nothing!

    2. Re:Let's get this out of the way by shatteredpottery · · Score: 1
      "...would add so much overhead (with exactly zero benefits) that only a marketing guy could take such a decision."


      AHA! I think you've hit on the real problem here... God knows the marketroids at my company are constantly nagging everyone to use XML for everything, and often come back from a call saying something like "I told Client Z that we use XML for this, this, and that. We do, right?"
      --

      A witty saying is worth nothing - Voltaire

  42. Got it backwards by lesinator · · Score: 1

    I think what this really points to is that XML is not the end-all, be-all of formats for exchanging data. All that XML does is bring structured data formats to where databases were in the late 70s (when the relational model began to take off). Just as databases took the path from flat files to indexed files, to hierarchical databases, to relational databases, XML is just the next step in the path from fixed-width columns, to delimited values, to key/value pairs, to XML. We need at least one more revolutionary change before the transmission structures (XML) catch up to the live datastores (relational databases).

  43. Nested relational databases are a better fit by lawley · · Score: 1
    If you have a look at the literature, for example the proceedings of VLDB, you'll see that nested relational databases are, in general, the preferred theoretical model. There are a few quirks to iron out where the semantics don't quite align, but on the whole they are the better approach.


    Why? Because the relational component still allows set-at-a-time interaction for efficient querying, but path expressions can also be used to navigate through the nested structure.

  44. Hierarchical DBs run the world right now by philgross · · Score: 1

    IMS, IBM's hierarchical database system, was originally developed for the Apollo space program. It is far from dead as you can read here. IMS systems are currently storing around 15000 petabytes of data, and executing 50 billion transactions/day. All stored in a hierarchical database system.

    1. Re:Hierarchical DBs run the world right now by aviator · · Score: 1

      Agreed. Like it or not, IBM's IMS hierarchical DB is still in use for many critical databases/systems. I haven't done any IMS work in years but it would be interesting to see if IBM has added some type of "direct" support for XML...

    2. Re:Hierarchical DBs run the world right now by wildgift_mac_com · · Score: 1

      I'm a db novice, but it seems pretty obvious that relational dbs are so popular because there's a lot of accounting data that fits the relational model.

      When you have different needs, you need a different underlying model. A file system seems to be like a hierarchical database, no?

      I'm guessing that if you need to access specific "paths to data" in a very very large database, the relational model will start to break down, and hierarchical databases (which I assume use pointers) could be faster.

      I mean, a terabyte filesystem seems feasible and sounds somewhat manageable (and cheap), but a terabyte relational db for heterogeneous file-like data seems rather complex and expensive.

      Just don't ask me to do a "find" on the terabyte disk. :-)

  45. The client benefits from XML, not the server. by Twillerror · · Score: 1

    The client is the one that needs the information in easier to read form, not the server.

    Relational databases are fast, dependable, and are well proven. The data storage that they provide is very adequate for most situations, if not all.

    What really needs to happen is SQL needs to evolve to include XML, and better support for some other situations that constantly arise when writing complex queries.

    I wish that the top clause could be used on a join statement.

    left join top 1 some_table st ( ot.pk = st.fk ) with order by st.some_column.

    Or that there was some syntax for querying out trees of data. Or there was a way to embed code in the joins that could analyze what had been joined at each loop. The data is stored fine in a relational database, sometimes SQL doesn't have the power to extract in a simple manner. I'm sure you can think of many other things that SQL could do a lot better.
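
    For what it's worth, the "left join top 1" wished for above can be approximated in standard-ish SQL with a correlated subquery. The sketch below runs it through Python's sqlite3 module. The names other_table, some_table, pk, fk, and some_column follow the pseudo-query above; the id column on some_table and the sample rows are assumptions added so the subquery has something unique to pick. Other engines spell the row limit differently (TOP, FETCH FIRST, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE other_table (pk INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE some_table (
        id          INTEGER PRIMARY KEY,
        fk          INTEGER REFERENCES other_table(pk),
        some_column TEXT
    );
    INSERT INTO other_table VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO some_table VALUES (1, 1, 'b'), (2, 1, 'a'), (3, 2, 'z');
""")

# For each other_table row, join at most one some_table row:
# the first one when ordered by some_column (ties broken by id).
rows = conn.execute("""
    SELECT ot.pk, st.some_column
    FROM other_table ot
    LEFT JOIN some_table st
      ON st.id = (SELECT s2.id
                  FROM some_table s2
                  WHERE s2.fk = ot.pk
                  ORDER BY s2.some_column, s2.id
                  LIMIT 1)
    ORDER BY ot.pk
""").fetchall()

print(rows)
```

    Rows of other_table with no match still appear once, with NULL in the joined column, which is the left-join behaviour the wish asks for.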

    Looking at one-to-many or many-to-many relationships in the standard two-dimensional data set is a pain, especially when you have several joins that can duplicate data. ColdFusion (I know it sucks) has a feature that allows you to query a result set, which makes extracting the data you need a little easier, but XML would make this a moot point.

    Microsoft is developing an XML extension for SQL Server that will return the data as XML. The bigger issue is that ODBC needs to do it before they (MS) do, so that we don't get stuck using Microsoft XML over OLE DB, which means no Linux support. You can read about the MS solution at http://www.microsfot.com/sql.

    Looking at the MS solution, it is still too complicated, because the SQL language does not allow you to direct the XML output. Also, if foreign keys are set up it shouldn't need them, and the join statements should be more than adequate to describe how the XML should be formatted.

    Another aspect is update/insert/bulk inserts. This could make doing updates to linked tables easier. As well as inserting complex structures of data. Also, it could be used to check data before it got inserted into the db and failed.

    Data transfers would be much nicer as well. Taking what are one-to-many relationships in your database and trying to make them horizontal instead of vertical is a real pain in the ass. It is also usually much slower because of the extra joining needed.

    OLAP applications could stand to benefit as well. OLAP produces data with multiple dimensions. Usually this is accomplished with binary data formats that are specific to vendors; with XML, multi-dimensional data can be treated the same as any other result set.

    I think we are a good year or two away from there being good XML support for databases, but it will increase application performance, as front-end guys and gals can issue fewer queries to the database to get the data they need, and in a more logical fashion as well.

    1. Re:The client benefits from XML, not the server. by Anonymous Coward · · Score: 0

      I agree with this.
      1. Make sql queries to views/tables. A basic select dump.
      2. Return table as XML data to app that can parse it or use it.
      3. User plays, slice and dices XML data client side.
      4. User updates from client to DB with XML data fragments.
      5. Are we missing something here? When data is to be read and interpreted, do it client side. When data is to be managed and updated, do it server side.

  46. Experience with XML over ER engines by nsample · · Score: 3, Informative

    Anyone can explain to me what is suddenly so wrong about relational database with hierarchical indexing?

    Maybe its just me, but the goal today is integration and having a special database for XML and special database for this and that just because its faster for this particular problem creates such a level of complexity, which prevents accomplishing even of the most trivial tasks.


    Forgive me for tooting my own horn on this one, but I believe that (for once on /.) there is a correct answer.

    I summarize the answer in a paper written for VLDB 2001 (www.vldb.org). The paper presents joint work between Stanford, Berkeley, and RightOrder, Inc. It can be found online here (in PDF).

    What we found is that relational systems, with appropriate indexes for XML data, give the advantages of both worlds. XML is a hierarchical representation in only the loosest sense. It's written linearly in a flat text document, just as a child learns to write things down on a piece of paper. However, you wouldn't convince anyone but that same child that something written on paper can only represent two-dimensional objects just because the paper itself is flat. XML in many variants is plainly richer in concept than its simple hierarchical representation and thus quite suited to ER. I believe a previous poster mentioned RDF... a perfect example.

    Punchline: XML is neat, XML is tasty, but XML is not inherently more or less expressive than ER; it just requires a little critical thinking (and index tweaking) to tune ER engines to deal with it. (Once tuned, the ER engines dominate all others in performance.)
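
    As a toy illustration of the general approach (explicitly not the index design from the paper, just one common way to do it), the Python sketch below shreds an XML document into a single relational node table, records each node's root-to-node path, and indexes on that path. The table layout, the shred function, and the sample document are all assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = ("<catalog>"
       "<car><maker>Acme</maker><body>SUV</body></car>"
       "<car><maker>Apex</maker><body>van</body></car>"
       "</catalog>")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        node_id   INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(node_id),
        path      TEXT NOT NULL,  -- root-to-node path, e.g. /catalog/car/maker
        value     TEXT            -- text content, NULL for container elements
    );
    CREATE INDEX node_by_path ON node (path, value);
    CREATE INDEX node_by_parent ON node (parent_id);
""")


def shred(elem, parent_id=None, prefix=""):
    """Store one element as a row, then recurse into its children."""
    path = f"{prefix}/{elem.tag}"
    text = (elem.text or "").strip() or None
    cur = conn.execute(
        "INSERT INTO node (parent_id, path, value) VALUES (?, ?, ?)",
        (parent_id, path, text))
    node_id = cur.lastrowid
    for child in elem:
        shred(child, node_id, path)


shred(ET.fromstring(DOC))

# "Makers of all cars whose body is SUV", expressed as a plain self-join.
suv_makers = [maker for (maker,) in conn.execute("""
    SELECT m.value
    FROM node b
    JOIN node m ON m.parent_id = b.parent_id
    WHERE b.path = '/catalog/car/body' AND b.value = 'SUV'
      AND m.path = '/catalog/car/maker'
""")]

print(suv_makers)
```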

  47. 'Joiner tables' are not Kludges by The+Raven · · Score: 1

    Taking out attributes with multiple values and putting them into a linked table is core to the functionality of relational databases.

    Customer
    --------
    CustomerID
    FirstName
    LastName

    PhoneNumber
    -----------
    PhoneNumberID
    PhoneNumberTypeID
    CustomerID
    Text

    This kind of relation is basic functionality in relational databases. This ain't no kludge.
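
    Spelled out as runnable SQL via Python's sqlite3 module, the two tables above might look like the sketch below. The column names come from the listing; the PhoneNumberType table, its Label column, and the sample rows are assumptions filled in so the example runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        FirstName  TEXT NOT NULL,
        LastName   TEXT NOT NULL
    );
    -- Assumed lookup table behind PhoneNumberTypeID.
    CREATE TABLE PhoneNumberType (
        PhoneNumberTypeID INTEGER PRIMARY KEY,
        Label             TEXT NOT NULL
    );
    -- One row per phone number, so a customer can have any number of them.
    CREATE TABLE PhoneNumber (
        PhoneNumberID     INTEGER PRIMARY KEY,
        PhoneNumberTypeID INTEGER NOT NULL REFERENCES PhoneNumberType(PhoneNumberTypeID),
        CustomerID        INTEGER NOT NULL REFERENCES Customer(CustomerID),
        "Text"            TEXT NOT NULL
    );
    INSERT INTO Customer VALUES (1, 'Ada', 'Example'), (2, 'Bo', 'Sample');
    INSERT INTO PhoneNumberType VALUES (1, 'home'), (2, 'mobile');
    INSERT INTO PhoneNumber VALUES (1, 1, 1, '555-0100'), (2, 2, 1, '555-0101');
""")


def phones_for(customer_id):
    """All (type label, number) pairs for one customer."""
    return conn.execute(
        'SELECT t.Label, p."Text" FROM PhoneNumber p '
        'JOIN PhoneNumberType t ON t.PhoneNumberTypeID = p.PhoneNumberTypeID '
        'WHERE p.CustomerID = ? ORDER BY p.PhoneNumberID',
        (customer_id,)).fetchall()


print(phones_for(1))
print(phones_for(2))
```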

    Hierarchical databases have so many limitations. Even simple things, like employee lists, suffer under the restrictions of a hierarchical database. Employees that work in multiple departments, or have multiple supervisors. Employees with multiple spouses (think International). Projects with three leads. Employees working on multiple projects.

    Relational databases were created for a reason. Abandoning all those improvements just to fit more cleanly into the XML hierarchical model is ludicrous to me.

    Raven

    --
    "I will trust Google to 'do no evil' until the founders no longer run it." Hello Alphabet.
    1. Re:'Joiner tables' are not Kludges by tzanger · · Score: 2

      Taking out attributes with multiple values and putting them into a linked table is core to the functionality of relational databases.

      I'm no newbie to RDBMSes; I know that this is a core concept in the system. What I was referring to (and rather poorly, might I add) is that to get the same functionality as a hierarchical system you need to use these reference tables for everything, lest you run across something which breaks your mold. This reduction to absurdity is eliminated in hierarchical designs.

      I'm not a hierarchical guru; I have more experience with, and can relate (ha!) better to, RDBMSes than hierarchical databases. But to say that a relational (table-based) database solves all your problems just because you can organize everything with relations is absurd. You lose the entire concept of a database entry or object by doing this. Instead of having a table consisting of contact information, you have a table for names, a table for spouses, a table for phone numbers, a table for fax numbers, a table for email addresses, a table for postal addresses... Lord help you if your documentation is ever lax or, worse yet, you lose it (or the table views or the driving software) altogether! Each methodology has its place.

    2. Re:'Joiner tables' are not Kludges by Anonymous Coward · · Score: 0

      Taking out attributes with multiple values and putting them into a linked table is core to the functionality of relational databases.

      True, but it's also a disadvantage in a lot of situations. That's why post-relational, aka multi-dimensional, aka multi-value, aka Pick-style databases came about. Admittedly, these databases are not as hierarchical as an XML-ish database, but they still are and to lump them in with an overly broad lambasting of hierarchical databases in general is a disservice.

      Hierarchal databases have so many limitations. Even simple things, like employee lists, suffer under the restrictions of a hierarchal database. Employees that work in multiple departments, or have multiple supervisors. Employees with multiple spouses (think International). Projects with three leads. Employees working on multiple projects.

      All these situations are handled very easily in a multi-dimensional database. In fact, they are handled much more easily in a multi-dimensional database than a regular flat-file relational database.

  48. Henry Baker's opinion of relational databases... by Barry+Wilkes · · Score: 1

    Some time ago I came across the following letter by Henry Baker regarding relational databases. It made interesting reading.

    Perhaps there is an element of 'when your only tool is a hammer, everything is a nail.' to relational databases. They are certainly so pervasive now that any idea of using something different would be seen as taking a HUGE risk.

  49. Maybe in the future.. but not just yet by Anonymous Coward · · Score: 0

    I have pulled a great deal of hair out at work because a client insisted that we use an XML-based object database (eXcelon).

    While the concept is incredibly powerful, the technology is still too young. The moment you try to do anything important, like an aggregation for example, life becomes very difficult.

    I've noticed that there's a lot of cool new stuff being proposed for XSLT2. Maybe that'll help the market along a bit. As far as modeling the data is concerned, it really does kick butt over relational for many (but not all) applications - but it's simply not ready to hit the masses yet.

  50. depends by Anonymous Coward · · Score: 0

    Most businesses tend to use relational DBs because of the speed they need. I think most businesses will stick to that to a degree.

  51. E-R databases (can do either R or H) by Anonymous Coward · · Score: 0

    Let us throw something really nasty into this.

    For some reason this has been ignored in the obvious "popular" discussions.

    Neither RDBMSs nor HDBMSs can understandably represent all data structures efficiently.

    The "best" I have found are Entity-Relationship Databases. No, I do not mean those ER diagrams that most DB packages and modeling tools deal with.

    Imagine a world (DBMS) where every "Entity" is allowed to be Related to any other "Entity" without regard for its data contents (that may be the origin of the link but afterwards the linkage can remain, if wanted, even if data changes on either side).

    Also imagine that the Relationship itself can have any arbitrary data associated with it (including the aforementioned original data that caused the relationship to be linked).

    Now imagine that the Relationship is NOT 1-to-1. It can be allowed to be any Relationship between any N Entities of any (probably) predefined types.

    Now imagine that a Relationship is allowed to be an Entity itself.

    Actually, pure Entities that do not describe Relationships between other Entities probably do not exist in real life. A person can be related to another person, but in special cases the person is the relationship: a legal father and mother of a child may have no relationship to each other except as parents of that child.

    The other forms of DBs (Network, Relational, Hierarchical) are just efficiency-motivated reductions or special cases of the above.

    The overhead in maintaining the arbitrary links, extra tables, keys, and joins is the real reason the E-R model is not a popular implementation. However, it can be easily implemented with a Relational database. A big advantage is that the data in the DB can change without necessarily changing the logical data interrelationships.
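
    A minimal sketch of that last point, using Python's standard sqlite3 module. The two tables (entity, member), the role names, and the sample people are illustrative assumptions, not a standard design. Every relationship gets a row in the entity table as well, so a relationship can carry its own data, link any number of entities, and itself take part in another relationship.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Every thing, including every relationship, is an entity.
    CREATE TABLE entity (id INTEGER PRIMARY KEY, kind TEXT, data TEXT);
    -- A relationship is an entity that links N member entities.
    CREATE TABLE member (
        relationship_id INTEGER REFERENCES entity(id),
        entity_id INTEGER REFERENCES entity(id),
        role TEXT
    );
""")

def add_entity(kind, data=None):
    """Insert an entity and return its id."""
    return db.execute(
        "INSERT INTO entity (kind, data) VALUES (?, ?)", (kind, data)).lastrowid

def relate(kind, members, data=None):
    """Create a relationship entity linking any number of (entity_id, role) pairs."""
    rel_id = add_entity(kind, data)
    db.executemany("INSERT INTO member VALUES (?, ?, ?)",
                   [(rel_id, entity_id, role) for entity_id, role in members])
    return rel_id

def members_of(rel_id):
    """Return (data, role) for each member of a relationship, ordered by role."""
    return db.execute("""
        SELECT e.data, m.role FROM member m JOIN entity e ON e.id = m.entity_id
        WHERE m.relationship_id = ? ORDER BY m.role""", (rel_id,)).fetchall()

mother = add_entity("person", "Alice")
father = add_entity("person", "Bob")
child = add_entity("person", "Carol")
# One three-way relationship: the parents are linked only through the child.
parentage = relate("parentage", [(mother, "mother"), (father, "father"), (child, "child")])
# A relationship can itself be a member of another relationship.
ruling = relate("custody_ruling", [(parentage, "subject")], data="joint custody")

print(members_of(parentage))
```

    The cost mentioned above is visible even here: reading anything back requires a join through the member table, and nothing in the schema says which roles a given kind of relationship should have.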

    Once you have analysed things this way, you no longer need database theorists to deal with normal everyday applications.

  52. Persistent, Post-parsed DOM in Zope!!! by supton · · Score: 1
    ParsedXML in Zope is a perfect example of how to do this: put DOM in an ODB, and make it accessible for interaction with other sets of heterogeneous data. With Zope, XML, persistent objects, and relational data stores can sit alongside one another. Pick the right storage tool for the job: Zope will support it. Built-in services for FTP, WebDAV, and XML-RPC, plus add-ons for SOAP, CORBA, etc., make this ideal for XML in a complex application and heterogeneous environment. Plus you get to do it all in Python, which makes it perfect. ;)

    http://www.zope.org

    ParsedXML

    1. Re:Persistent, Post-parsed DOM in Zope!!! by supton · · Score: 1

      and I forgot to mention add ons for an object query language, XPath, and XSLT...

    2. Re:Persistent, Post-parsed DOM in Zope!!! by khuber · · Score: 1

      I just went to zope.org.

      The tour's "what can I do with zope?" page has 4
      demo sites.

      appwatch.com - closed down
      stormlinux.com - asks for a zope login which I
      don't have
      zope.org - amazingly, this one works
      technocrat.net - errors with connection refused
      connecting to technocrat.net:9673

      So, apparently Zope is good for running zope.org
      and nothing else...

      -Kevin

    3. Re:Persistent, Post-parsed DOM in Zope!!! by supton · · Score: 1

      Then you are an idiot if you only buy into marketing materials or old case studies... Zope is used by a bunch of big media companies, the US Navy, NATO, and a bunch of other folks, including the company I work for, a Top-20 US daily newspaper (for many significant and growing portions of our site).

      Yes, technocrat.net is not there anymore; perhaps if you paid attention to slashdot, and were not oblivious, you would have figured out why: Bruce Perens moved on to HP, and didn't have time for Technocrat anymore, but it was a good, popular weblog site, often linked to by slashdot.

      Read past the brochureware; how hard is: apt-get install zope zope-parsedxml? Or for that matter downloading the source package and compiling? Try it sometime!

    4. Re:Persistent, Post-parsed DOM in Zope!!! by khuber · · Score: 1

      Oh, yeah I forgot - the Navy site was also down
      when I posted my original followup.

      I have actually played with zope before.
      It isn't really useful to me since
      nearly everything at work is Java-based,
      though certainly it has many good ideas.

      -Kevin

    5. Re:Persistent, Post-parsed DOM in Zope!!! by supton · · Score: 1

      For Java folks, I think the usefulness of Zope (well, just ZODB) will come as 3rd party CORBA support matures; using ZODB as a data-store for Java apps could be very interesting.

    6. Re:Persistent, Post-parsed DOM in Zope!!! by khuber · · Score: 1
      So I don't get modded down to hell, this is an on-topic connection to Zope, hierarchical DBs, XML, and data storage.



      supton> For Java folks, I think the usefulness of Zope (well, just
      supton> ZODB) will come as 3rd party CORBA support matures; using ZODB
      supton> as a data-store for Java apps could be very interesting.



      Possibly. SOAP (or XML-RPC) may be an easier route than CORBA though.
      It wasn't really an option when I last looked at Zope, but SOAP (and
      Zope) have matured since. CORBA is kind of a dirty word.



      I was thinking about this today - using SOAP as a transport between
      Zope and Java. I'm sure my thoughts here aren't new at all, but I
      work on a large system so I can relate the ideas to at least one
      complex real system.



      The question to me doesn't seem to be whether we should bring back
      hierarchical databases, but whether any storage format or API is a
      good choice for manipulating data, period. I will come back to this.



      Currently we store documents for a large document delivery system in
      our own internal binary format which is partially parsed and indexed
      so that we can retrieve, manipulate, and display documents and parts
      of documents more quickly. The XML is sent to servlets on a web front
      end for the most part, across COM in some instances, and even to a
      mainframe.



      As time goes on, our ability to make this process more dynamic
      increases due to increased processing power. New features will drive
      the need for more dynamic data, more ways to make the data useful
      information.



      So, Zope says it integrates with RDBMSs and I'm sure it can, using
      ODBC. Zope could access some of our data directly, and some data over
      SOAP. (Believe me we aren't going to stop using DB2. We need
      industrial strength data storage for many terabytes of data with
      backup, failover, replication and all that fun stuff, and DB2 is the
      primary corporate database system for us).



      Now, RDBMSes can handle hierarchical data (n-ary trees) fine in the
      sense that they can store them efficiently enough. However, SQL is
      not a convenient way to get at it because you have to chase pointers
      around essentially and the queries get to be complicated. What we do
      is layer Java classes on top of data access, including these complex
      structures. Everyone and their mom does this.
      Most SQL is buried in the system and we "see"
      object representations in servers and applications. This is all fine
      and good for now; it's just not that general or dynamic. Access for
      each type of data in the RDBMS is usually through one set of classes and
      is almost always specific to the service that is the primary user.
      For example, you can get results from a search engine, but you can't
      directly access the search engine's index data. You just get results
      from your query. This is okay for encapsulating data access, but bad
      for reusing the same original data for other purposes.
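
      As a hedged illustration of the "chasing pointers" point: the sketch below stores a tree as an adjacency list in SQLite (through Python's sqlite3 module) and fetches one subtree with a recursive common table expression. The node table and its contents are invented for the example. Recursive queries only arrived with SQL:1999 and are not available in every engine; without them you issue one query per level, which is the kind of logic that ends up buried in wrapper classes.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Adjacency list: each row points at its parent (hypothetical schema).
    CREATE TABLE node (
        id INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES node(id),
        name TEXT
    );
    INSERT INTO node VALUES
        (1, NULL, 'document'),
        (2, 1, 'chapter 1'),
        (3, 1, 'chapter 2'),
        (4, 2, 'section 1.1'),
        (5, 4, 'paragraph 1.1.1');
""")

def subtree(root_id):
    """Return (depth, name) for root_id and all of its descendants."""
    return db.execute("""
        WITH RECURSIVE walk(id, name, depth) AS (
            SELECT id, name, 0 FROM node WHERE id = ?
            UNION ALL
            SELECT n.id, n.name, w.depth + 1
            FROM node n JOIN walk w ON n.parent_id = w.id
        )
        SELECT depth, name FROM walk ORDER BY depth, id
    """, (root_id,)).fetchall()

# Print the subtree rooted at 'chapter 1', indented by depth.
for depth, name in subtree(2):
    print("  " * depth + name)
```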



      In other words, I don't think it's so much RDBMSes or databases in
      general that are the problem. They can be used to -store- complex
      data. SQL may just not provide a great API to -use- the data. A lot
      of XML isn't semantically hierarchical anyway, any more than an ASCII
      document is necessarily referring to one long string.

      So a hierarchical storage system wouldn't necessarily be better since
      it wouldn't really work to represent non-tree data. It may be wonderful
      for hierarchical-only data, but we have a lot of different kinds of data.



      I think the real issue is mapping between many different data storage
      formats so that you can use many different APIs and different
      components on the same data, not looking for the holy grail storage
      system or query language.



      We already convert objects sent between services to XML dynamically
      because most traffic goes through message queues, so at runtime we can
      really represent the data any way we want. Java JDK 1.4 will even
      serialize directly to XML.



      So how could you use Zope, an ODBMS, or some other future system as an
      alternative API to the data in the RDBMS? ZODB could just be used to
      cache the data Zope gets from Java or the database. That way, you
      don't have to replace DB2 or create a whole new set of databases that
      have copies of the data. You can provide alternative XML/OO-friendly
      access to the same data by routing through Zope, and keep it efficient
      by caching.



      I doubt anyone made it this far :), but, in summary, one way to tackle
      "hierarchical" XML data or other structured data is to use a non-RDBMS
      system at runtime only. The company I work for does this to an
      extent with Java objects, but you could make it more general. The pieces are:



      1) components that extract data from each source (LDAP, RDBMS, files,
      etc.) and present it as XML. Ideally you could make a generic system
      for all data types in each source. In the worst case you'd provide
      one XML translator for each group of data.



      2) proxies that convert this XML into some runtime native object
      representation for each language (Java, Python, C, etc.).



      3) possibly a metadata format such as RDF to describe the structure of the data.



      Now every component can access any data as XML without caring what the
      original format was. You have to know what the data means to use it, but not
      how to decode it. Then you can provide many types of APIs or
      services, in different languages if desired, that manipulate the same
      data more conveniently than the native API of the data source like SQL
      or LDAP.
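
      The first two pieces can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite stands in for the real data source, the document table and the Document class are made up, and the XML layout (one element per row, one child element per column) is just one possible generic mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass

def rows_to_xml(db, table):
    """Piece 1: present every row of a table as generic XML.

    The table name is interpolated directly, so it must come from trusted code.
    """
    cursor = db.execute(f"SELECT * FROM {table} ORDER BY 1")
    columns = [col[0] for col in cursor.description]
    root = ET.Element(table + "s")
    for row in cursor:
        item = ET.SubElement(root, table)
        for col, value in zip(columns, row):
            ET.SubElement(item, col).text = str(value)
    return ET.tostring(root, encoding="unicode")

@dataclass
class Document:
    id: int
    title: str

def xml_to_documents(xml_text):
    """Piece 2: a proxy that turns the XML into native Python objects."""
    return [Document(id=int(e.findtext("id")), title=e.findtext("title"))
            for e in ET.fromstring(xml_text).findall("document")]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany("INSERT INTO document VALUES (?, ?)", [(1, "Intro"), (2, "Appendix")])

xml_text = rows_to_xml(db, "document")
print(xml_text)
print(xml_to_documents(xml_text))
```

      The consumer in piece 2 never sees SQL; swapping the source for LDAP or flat files would only mean writing another extractor that emits the same XML.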



      -Kevin

  53. XML is not a database format by Ars-Fartsica · · Score: 2
    SGML/XML is an interchange format and this is the only domain it has proven useful over the years. For years people have tried to construct useable SGML and/or XML databases and the results have been consistently disappointing.

    For more complex hierarchical relationships, an object database is more apt, or an XML translation kit for your relational DBs.

  54. ?? by einhverfr · · Score: 2

    Horses for courses.

    LDAP is an example of a really good and well developed implementation of the hierarchical database idea. However, try keeping track of what your customers bought from whom with LDAP. So while LDAP (and other hierarchical DBs) do certain things better, don't try to run a CRM suite off one.

    The basic problem is that the entire database is rarely hierarchical in nature even though some queries may be.

    --

    LedgerSMB: Open source Accounting/ERP
    1. Re:?? by tzanger · · Score: 2

      LDAP is an example of a really good and well developed implementation of the hierarchical database idea. However, try keeping track of what your customers bought from whom with LDAP. So while LDAP (and other hierarchical DBs) do certain things better, don't try to run a CRM suite off one.

      LDAP isn't designed to do that. It's funny that you picked a CRM application, because that's the type of thing I've been playing with.

      Everyone that comes in contact with our company goes into an LDAP directory (benefits: works with almost every email client, replicates great along the boundaries we have, provides logical/protocol barrier between the contact data in the directory and the business data in the RDBMS) and then Postgres takes care of the actual relational (business) data. The ties between the RDBMS and the directory are done by DN; the directory format was carefully designed to avoid DN changes while still making the DN "make sense" when browsing the tree.

      Our products, once manufactured, are assigned a serial number and entered into the directory as well, under a different node. We get the benefit of being able to track our product like we can our customers, and the RDBMS takes care of all the stuff that changes on a frequent basis (trouble tickets for Customer Service, quotations, acknowledgements, shipping schedules, etc.). The directory is only used to store the data that shouldn't change (or will only change very infrequently) during the lifetime of the entry. Looks good on paper; we'll see how well it works in reality. :-)

  55. Exactly by Ars-Fartsica · · Score: 2
    XML is a poor choice for data storage. Arguably it is also a poor choice for data modelling as it does not have a strict constraint model.

    Relational databases are here to stay and will be with us for at least the next fifty years. It is better to think of ways of translating relational data than supplanting it.

    1. Re:Exactly by flacco · · Score: 2
      Relational databases are here to stay and will be with us for at least the next fifty years. It is better to think of ways of translating relational data than supplanting it.

      Yeppers - taking the LDAP example, the best of both worlds would be to keep the actual data in a relational DB, and use a tool to "publish" it as an LDAP directory, or just use an LDAP interface to that data, along with an indexing scheme that optimizes for LDAP-like queries.

      --
      pr0n - keeping monitor glass spotless since 1981.
    2. Re:Exactly by ttfkam · · Score: 2

      Yeah, it'll be around for the next fifty years (or more). Just like COBOL. Better to find ways of interfacing with COBOL than to look at other languages in your development environment. Yes, that was sarcasm.

      Strict constraint model? Like datatypes (XML Schema)? Like document structure (XML Schema)?

      XML has this model. It is simply not as used yet (not as optimized yet). There is nothing inherent to XML that precludes its use for data storage. The fact that it is plain text in its *serialized* form is immaterial to its internal storage format in a hierarchical database. Nor does the fact that it's XML preclude the possibility of indexing the information just as you would a table column.

      Something that relational databases are not as good at handling: web accessible data where the data does not allow for rigid guidelines. For example, in a web magazine, many articles are somewhat structured with author, date, title, etc., but otherwise tend to be very free-form.

      Not a problem in and of itself, but what happens when you try to search it? How do you differentiate between a search for info in the title of a component use case and a search within a bibliography? So you create a relational model that handles an arbitrary number of use cases and bibliography entries -- all indexed by article. But some use cases have more information than others. Some have associated graphics. Suddenly we are shown not a many-to-many relationship, but a many-to-many-to-many-to-"Aw screw it. It'll take two days to query" relationship. Do you put markup data in the database? A regular expression on all of the content? Yeah, THAT's efficient.
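
      For comparison, here is roughly what that title-versus-bibliography distinction looks like when the article stays as XML. This is a small sketch using Python's xml.etree.ElementTree and its limited XPath support; the element names and the sample article are made up for illustration, and a real XML database would presumably answer this from an index rather than by walking the document.

```python
import xml.etree.ElementTree as ET

ARTICLE = """
<article>
  <title>Caching strategies for XML stores</title>
  <author>A. Writer</author>
  <body>
    <usecase><title>Caching parsed documents</title><p>Free-form text...</p></usecase>
    <usecase><title>Invalidation</title><p>More free-form text about caching.</p></usecase>
  </body>
  <bibliography>
    <entry><title>Notes on caching</title></entry>
    <entry><title>Query languages</title></entry>
  </bibliography>
</article>
"""

def titles_matching(root, path, word):
    """Return the text of the title elements selected by `path` that contain `word`."""
    return [t.text for t in root.findall(path) if word.lower() in t.text.lower()]

root = ET.fromstring(ARTICLE)
# Same search term, two different structural contexts.
print(titles_matching(root, "./body/usecase/title", "caching"))
print(titles_matching(root, "./bibliography/entry/title", "caching"))
```

      The path expression carries the context, so use cases with extra fields or graphics need no schema change; they are simply more elements under the same parent.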

      We tend to think that anything that we put into a relational database can be adequately represented in XML. And we'd be right. Unfortunately many people believe that the reverse is also true. It is not.

      Others have made the point of LDAP and naively assumed (as I once did) that a full blown relational database on the backend would be a better solution than the pansy in-process, flexible data model, file-based BerkeleyDB that's commonly used. What was found? User queries (a VERY common query) from a listing of 400 users took about five seconds with BerkeleyDB behind OpenLDAP, and over a minute (!) using PostgreSQL behind LDAP. Why? Trying to represent a hierarchical tree in a relational model added more overhead than it was worth.

      An object database may have performed better than the relational model, but if you are mainly handling text or simple datatypes such as dates and integers (as most databases do), why not use XML and optimize for that case?

      People scream that their relational database is enough and can be used for anything that an XML database can be used. These people sound very much like people screaming that a singly linked list is inherently better than a red-black binary tree. After all, they both hold data just as well. In fact, a linked list does it more efficiently (look! fewer pointers!) and there's nothing stopping you from sorting the singly linked list (plenty of efficient algorithms already out there for this). Yes, that was also sarcasm. And objects are useless, just use C. :-/

      Use the right tool for the job. In many (most?) cases, a relational database fits the bill. Sometimes an object database is called for. Sometimes a hybrid of the two. Is it so hard to accept that maybe, just maybe, when the only thing that you do is XML processing and XML data sharing (more and more common these days) that a dedicated XML datastore might be what the doctor ordered?

      --

      - I don't need to go outside, my CRT tan'll do me just fine.
  56. Relational is still needed by bbeaton · · Score: 1

    With the single exception of potentially higher speed, which also requires all hierarchical access code to follow the structure carefully, relational still has facilities untouchable by other structures.

    The commonest strength is not simply the way relational handles many-to-many, but its simple extension to recursive relationships; relational is one of the few paradigms with this ability.

    Relational also supports the network model, again another major weak area in the hierarchical model. I've yet to come across a potential database structure that I can't handle relationally; that is absolutely not true for every other paradigm.

  57. Wait a minute ... by vlad_petric · · Score: 1
    What sane person would store a massive amount of data with the on-disk format being XML?

    Don't get it wrong ... As you said, no sane person would use the XML format for storage of massive data. However nothing's wrong with using XML for external representation and querying (see XQuery for example).

    The whole point with XML is interoperability. When designing a distributed application, one could choose from:

    IIOP/CORBA (or variants like Java/RMI, etc.): theoretically very good interoperability, but practically almost none when you use tools from different vendors

    design a custom binary protocol. Most of these designs are short-sighted - making them extensible is not an easy task. Moreover, most of them adopt an ostrichish attitude towards both byte ordering and character encodings

    use XML: built-in extensibility, extremely good interoperability & support for different encodings. Certainly more of a bandwidth hog than the previous two, however, as we all know, the battle between bandwidth and processing power was won by bandwidth. When bandwidth is the bottleneck, one could still use standard HTTP compression to alleviate the problem.

    XML-RPC is nothing but interoperability taken to the extreme: XML is the packaging format, mostly because any architecture/os can properly decode it, and HTTP is the carrier, because it can pass all proxies, and, again, almost everything understands HTTP
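
    To show what "XML is the packaging format" means in practice, here is a small sketch using Python's standard xmlrpc.client module. No server or network is involved; it only encodes a call to XML and decodes it again. The method name and arguments are hypothetical, chosen purely for illustration.

```python
import xmlrpc.client

# Hypothetical method name and arguments, purely for illustration.
request_xml = xmlrpc.client.dumps((42, "widget"), methodname="inventory.lookup")
print(request_xml)

# Any XML-RPC implementation can decode the same text back into native values.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

    The printed request is plain text that would normally travel as the body of an HTTP POST, which is why it gets through proxies and why byte ordering never comes up.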

    As far as configuration files are concerned, one extra reason for having XML as format is that it's cheaper & faster to use a DOM parser than to write your own ...

    The Raven.

    --

    The Raven

  58. XQuery by changos · · Score: 1

    Not saying that I agree with turning databases into XML, but it's really fun to write these xml queries. You can check it out here

  59. The multi-legged turkey by Dasein · · Score: 2, Insightful

    Okay, I've worked for two different network model database companies -- the network database model is just an extension of the hierarchical model to allow graph schemas instead of a strict hierarchy. I've also worked with two companies that were mapping hierarchical structures onto relational databases.

    You can think of data structures as (leaving ternary relationships and such aside) some sort of network of relationships. When you think of it this way, relational and network model databases have more similarities than they have differences, especially when you consider that using surrogate keys is the moral equivalent of a network model "pointer".

    Okay so you have this network of relationships, mapping a hierarchical structure onto that is simply picking a starting point and traversing the structure from that "viewpoint" without visiting a node via the same relationship twice (simplified algorithm but...) One of these groups used to think about this like you had a multi-legged turkey. You grab one leg and hold it up. All the other legs hang down -- you grab another leg and a different set of legs hang down.
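
    The turkey can be sketched in a few lines of Python. This is the same simplified algorithm described above, not anyone's production code: the graph, its node names, and the hang_from helper are all hypothetical. Pick a starting node, follow each relationship at most once, and whatever hangs below is the hierarchy for that viewpoint.

```python
def hang_from(graph, start):
    """Build a nested tree by traversing `graph` from `start`.

    `graph` maps each node to the set of nodes it is related to (undirected).
    Each relationship (edge) is followed at most once, so cycles terminate.
    """
    used_edges = set()

    def visit(node):
        children = {}
        for neighbour in sorted(graph[node]):
            edge = frozenset((node, neighbour))
            if edge in used_edges:
                continue  # already arrived via this relationship
            used_edges.add(edge)
            children[neighbour] = visit(neighbour)
        return children

    return {start: visit(start)}

# Hypothetical data: a small network of related record types.
graph = {
    "customer": {"order"},
    "order": {"customer", "product"},
    "product": {"order", "supplier"},
    "supplier": {"product"},
}

# Grab one leg, then another: same data, two different hierarchies.
print(hang_from(graph, "customer"))
print(hang_from(graph, "product"))
```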

    So, if you buy that, does it really make sense to represent any sort of network of information in a hierarchical form? Well, yes and no. It makes sense from a presentation and maybe interchange perspective but not from a native storage perspective. It's simply too constrictive, and you end up representing relationships that don't fit into a neat hierarchy programmatically in the application code instead of explicitly in the database schema. 25 years from now, someone is trying to reverse engineer your code and figure out how all this data is related -- blech. Ever wonder why IMS applications are generally left alone and newer applications are not usually written to IMS? This is part of the reason why. (Yes, there are some, but they are the exception.)

    Throw in to this my experience working with a bank that had hierarchical data and the extent to which they went to circumvent that restriction, and I'd say that native hierarchical storage for XML is a bad idea. Granted it's tempting but it seems ill advised since it's very likely that your data will survive long beyond the lifecycle of the system used to originally store it.

    <RANT>
    The original question didn't provoke this but I've seen a couple of responses about using XML as a native data storage format. Let me say that, unless the data is very static, it's a monumentally stupid idea to do that. XML is not a replacement for a database.

    I find that most of the people who really want to do this are ignorant of all the work that goes into real database systems. They don't understand lock management, transactions, rollback and recovery, free space management, or the scalability issues that real databases take care of under the covers. If you feel tempted, read this

    Throw in this plus the representation of non-hierarchical relationships with IDs, and sooner or later you will find yourself in a text editor tracking down ID/IDREF pairs to find out where your data is corrupted. Or writing scripts to validate your "entire data set" -- above a few megabytes it can be really painful.
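
    For the curious, such a validation script looks roughly like the sketch below, written with Python's xml.etree.ElementTree. The attribute names (id, ref) and the sample data are assumptions; a real document would declare its ID/IDREF attributes in a DTD or schema. Note that it has to parse and walk the whole file, which is exactly why it hurts as the data grows.

```python
import xml.etree.ElementTree as ET

def dangling_refs(xml_text, id_attr="id", ref_attr="ref"):
    """Return (tag, ref) pairs whose reference points at no element id.

    The attribute names are assumptions made for this example.
    """
    root = ET.fromstring(xml_text)
    ids = {e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None}
    return [(e.tag, e.get(ref_attr)) for e in root.iter()
            if e.get(ref_attr) is not None and e.get(ref_attr) not in ids]

# Hypothetical data set with one broken reference (c3 does not exist).
DATA = """
<bank>
  <customer id="c1"/>
  <customer id="c2"/>
  <account id="a1" ref="c1"/>
  <account id="a2" ref="c3"/>
</bank>
"""
print(dangling_refs(DATA))
```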

    For God's sake, don't expect to use XML to store data that you are going to update with any regularity.
    </RANT>

    --
    You are not a beautiful or unique snowflake -- but you could be if you got off your ass.
  60. Use NDS, don't reinvent the wheel. by candyuk · · Score: 1

    Just use NDS.

    It's simple: use NDS as a binding agent between various data storage mechanisms.

    As well as being scalable, fast and bulletproof, it has the added advantage of built-in security at every level.

    ----

    --
    Modern definition of an expert: Someone who comes from far away with a powerpoint presentation.
  61. Re:Henry Baker's opinion of relational databases.. by (outer-limits) · · Score: 1

    The reason relational databases are so pervasive is that they work so well. Having worked with both, the flexibility of the RDB is what is so powerful. There is a book out on the remainder piles that is a serious attempt to create the 'object oriented' relational database by CJ Date. One of the big failings of current computer development is the insistence of developers on using hierarchical data structures for everything from OO class systems to file directories. Hierarchical might be quick and easy for getting something going, but the long-term restrictions and complexities it introduces are never-ending.

    --

    Microsoft - Where would you like to go today, Maybe Jail?

  62. LDAP/X.500 limitations. by Jason+Pollock · · Score: 2

    LDAP/X.500 hierarchical databases are all well and good, until you want to run a query that asks which customers have what services, especially when everything is keyed on phone number. You know what we ended up doing? Pulling the entire database out of LDAP every night, and putting it into Oracle to run the reports. Nothing sucks more than a full table scan in an X.500 database.
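
    A toy model of the problem in Python, with a plain dict standing in for the directory. The entries, customer names and services are invented, and a real directory server is obviously not a dict; the point is only the difference between the access pattern the tree is keyed for and the reporting query that cuts across it.

```python
from collections import defaultdict

# Hypothetical directory: one entry per phone number, the key the tree is built on.
directory = {
    "555-0100": {"customer": "Acme", "services": ["voicemail", "call-waiting"]},
    "555-0101": {"customer": "Acme", "services": ["voicemail"]},
    "555-0102": {"customer": "Globex", "services": ["call-waiting"]},
}

def lookup(phone):
    """The access pattern the hierarchy is built for: a single-key lookup."""
    return directory[phone]

def customers_by_service():
    """The reporting query: no key helps, so every entry must be visited."""
    report = defaultdict(set)
    for entry in directory.values():
        for service in entry["services"]:
            report[service].add(entry["customer"])
    return {service: sorted(names) for service, names in sorted(report.items())}

print(lookup("555-0101")["customer"])
print(customers_by_service())
```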

    I do agree that hierarchical databases are great where you are only going to access the data from a single key, like passwords and email addresses... But they should probably be provisioned from an external RDBMS if you are looking to do reporting.

    Jason Pollock
    1. Re:LDAP/X.500 limitations. by dmelomed · · Score: 1

      Of course, if you want relational features, run relational database separately, OR run LDAP on top of your relational DB.

    2. Re:LDAP/X.500 limitations. by tzanger · · Score: 2

      Of course, if you want relational features, run relational database separately, OR run LDAP on top of your relational DB.

      Know of any? I would love to have one open-protocol, open-format database backend but be able to run different front-ends on it. (SQL, LDAP, etc.)

    3. Re:LDAP/X.500 limitations. by Anonymous Coward · · Score: 0

      Too lazy to login, but if the backSQL portion of OpenLDAP is complete, that might be what you need.

    4. Re:LDAP/X.500 limitations. by dmelomed · · Score: 1

      There's an OpenLDAP feature already in the works. And if you search their mailing lists, it looks like someone even got it to work over MSSQL!

  63. Data storage format is irrelevant by dgroskind · · Score: 5, Insightful

    XML may be hierarchical but the data it is used to markup is not necessarily hierarchical. For instance, XML can be used to markup conventional fielded (flat file) data to serve as an interchange format.

    More importantly, XML is used to impose some structure on inherently unstructured text. The structure it provides is based on some assumptions of how the data will be used or how it will be presented. If the data is used in some other way, the markup can be useless.

    An example is a book. For XML purposes, it can be described as structured by chapter, section, subsection, and paragraph. For information purposes, tags are assigned to represent the ideas, terminology, names and other index-like content. There is virtually no structure in these index type of tags but they convey the most important information in the book.

    Or not. These tags are assigned based on assumptions about what readers are interested in. A different set of assumptions would produce a different set of tags even though the structure of the document would stay the same. If the sentences and paragraphs are shuffled and excerpted for some other publication, even the structure becomes irrelevant.

    How this inherently unstructured information is stored is relevant to how it is managed, that is, how it is backed up, how access is controlled, how changes are tracked. However, when it comes to putting the information to some useful purpose, it is the retrieval mechanisms that are important. The issues here are how easily the user can specify the type of information he wants and how accurately the mechanism can find it. This process is usually independent of the underlying structure and uses some higher level concepts of relevance and context.

    The question of whether to use a hierarchical, relational or object-oriented data structure misses the point for textual data, for which XML is commonly used, because none of these structures capture meaning.

    Topic maps make a heroic stab at capturing meaning in XML markup but still only within a set of assumptions. I suspect a true meaning markup language is theoretically impossible, or at least theoretically very far in the future.

  64. LDAP is a *protocol* by bigbird · · Score: 2, Informative

    Contrary to some of the comments I've read here, LDAP isn't an implementation of a database, it is a *protocol* for accessing directories. LDAP data could be stored in anything - a hierarchical database, a relational database, an object database or a flat file. Let's not confuse the issue under discussion.

  65. PostgreSQL has support for hierarchical data by Grue · · Score: 1

    PostgreSQL already has support for hierarchical data. I've messed with it a bit, and it's nice. Unfortunately, there's a few problems. Number one is that if you want to keep your project completely portable, you should probably store it in a relational format, just because more databases store that way.

    Is there any work on mapping OO models to relational? Surely there exists some sort of mathematical relationship between the two.
    I also had a problem with people stating that XML will never dictate the structure of the database backend. This is fairly naive. Object models oftentimes are a more natural representation of the structure of a problem domain. So why try to squash that to a relational model if you lose information in the process?

    Or if you're losing efficiency by converting back and forth? If you do it enough, it only makes sense to put it on a lower level. Either with a data binding framework like castor or in the database server itself. Fuck, if nothing else, it'd be a good reason to push/sell the newest version of a DB server.

    Josh

  66. if you only have a hammer... by mj6798 · · Score: 3, Insightful
    everything looks like a nail. The relational model is pretty good for its original purpose: allowing non-specialists quick access to large amounts of statistical and business data (sales records, etc.) via an easy-to-learn query language. But for many other applications, it has proven to be completely insufficient.

    Indeed, that's the very touchstone that distinguishes relational databases from something like DBM and its many descendants.

    The alternative to relational databases is not "DBM", it is object oriented, tree structured, logical, and other kinds of database models. Those are just as well defined as relational databases.

    And *that* is important because it assures the designer and user that every possible operation is well-defined and (hopefully) correctly implemented. The exact syntax for a "join" may differ, and a specific implementation may be flawed, but everyone agrees to a common baseline.

    Relational databases provide a common baseline for a primitive set of relational operations. Real-world implementations of those models have been augmented by zillions of operations that weren't part of the original relational model and that often don't even fit into the relational model. And without those extra operations, relational databases would not be useful in practice.

    For now, AFAIK, there is none other than that you get when you map a hierarchical database into relational tables and use exactly those relational properties.

    Are you kidding? It is a major pain trying to express hierarchical data in a relational database model: the relations that describe hierarchical data and the operations that you might want to execute often require complex, multiple, inefficient queries and updates, and the relational model provides few tools to ensure that the corresponding relations remain consistent.
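
    To make the complaint concrete, here is a minimal sketch using Python's sqlite3 module. It assumes a hypothetical adjacency-list table named node in which each row stores its parent's id; the table, its columns and the sample rows are invented for illustration. Fetching a whole subtree with plain SELECTs takes one query per level of the tree.

```python
import sqlite3

# Hypothetical adjacency-list table: each row points at its single parent.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [(1, None, "root"), (2, 1, "a"), (3, 1, "b"), (4, 2, "a1"), (5, 4, "a1x")],
)


def descendants(conn, root_id):
    """Collect all descendants of root_id, issuing one query per tree level."""
    found, frontier, queries = [], [root_id], 0
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id, name FROM node WHERE parent_id IN ({marks}) ORDER BY id",
            frontier,
        ).fetchall()
        queries += 1
        found.extend(name for _, name in rows)
        # The children just found become the parents to look under next.
        frontier = [node_id for node_id, _ in rows]
    return found, queries


names, queries = descendants(conn, 1)
```

    For the five-row sample this takes four queries; the last one returns nothing and ends the loop. Some engines offer recursive query extensions that avoid the loop, but those go beyond the basic relational operations.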

    The semantics of tree structures are trivial to define. People do it in programming language classes all the time. And it is trivial to formulate a database model corresponding to it. In fact, if you have an object-oriented database that respects language semantics, you get hierarchical databases automatically when you define an abstract tree datatype.

    Still, so-called "relational" databases will continue to dominate the market for a long time to come. That's not because the relational model is particularly well-suited to a lot of applications. In part, that's because "relational databases" are not purely relational anymore: they generally include numerous facilities for object-oriented and hierarchical databases, under a "relational veneer". They even include the old "navigational" database systems, combined with the widespread use of stored procedures that do whatever they want whenever they want it on the database server.

    In different words, traditionally relational databases will provide increasingly better support for hierarchical and object-oriented data, but they will continue to also support the relational model, as well as relational access to these other data types. And newly developed databases with other kinds of data models will provide an SQL or other relational frontend to their content. And marketing will continue to include "something-relational" in all the advertising because otherwise the old database hands won't buy it.

  67. Store or Query? by rfmobile · · Score: 1

    "Do you think a hierarchical database would really be a better answer for storing XML data over the existing relational counterparts?" This is like asking "would multi-gigabyte tape or sub-gigabyte cd be better for storing audio?" Depends on your needs - do you need random access or is brute capacity more important? In the case of XML, the answer comes from what you want to do after you've stored the data. XML can represent hierarchical data by nesting tags. It can represent relational data by matching ID and IDREF attributes (part of the XML 1.0 spec). Another issue is whether a schema or DTD is available. With schema information you can optimize you storage strategy whether relational or hierarchical. Of course you *can* store hierarchical data in an relational database with the hierarchy preserved as a parent/child relation. For the longest time (decades) relational DBs have had SQL as an effective powerful way to express a query. Now we have XQuery as a query language for XML. XQuery may come to rival SQL but the experts are still figuring out how to implement and optimize XQuery. Optimized implementations of SQL are commonplace and have had years of testing. So ... figure out what your goals are and go from there! -rick

  68. Commercial Support more important than tech... by jptxs · · Score: 1

    ...most of the time. Unless something shows such a HUGE benefit from a technology standpoint that the business side cannot ignore it, commercial support will always win out. Just ask anyone trying to push Linux in corporate arenas right now (read:me). The relational database has IBM, Oracle, M$ and all manner of other flora and fauna behind it. It's not going anywhere. And, as someone further up noted, if you can't beat it, integrate it. Most of the major relational DBs have facilities which allow for use of models other than relational (OO, Hierarchical, even Navigational).

    --
    we speak the way we breathe --Fugazi
  69. LDAP server is a database. by rfmobile · · Score: 1

    Yes, the "P" in LDAP stands for protocol. You use it to talk to an LDAP server. The LDAP server *is* a database so the points are still valid.

    1. Re:LDAP server is a database. by Grue · · Score: 1

      Not exactly. The LDAP server (i.e. slapd) is a server that answers requests from LDAP clients. But the backend database can still be anything. In my slapd.conf it's set to ldbm, but you could use a relational database if you wanted. Here's a snippet from the configuration file to point that out:

      #
      # ldbm database definitions
      #

      # The backend type, ldbm, is the default standard
      database ldbm

      Josh

    2. Re:LDAP server is a database. by bigbird · · Score: 1
      Yes, the "P" in LDAP stands for protocol. You use it to talk to an LDAP server. The LDAP server *is* a database so the points are still valid

      My point is that an LDAP server can be *any* database - and is quite likely to be a relational one, not a hierarchical one. LDAP is simply a hierarchical protocol.

    3. Re:LDAP server is a database. by rfmobile · · Score: 1

      Like I said, an LDAP server *is* a database. You've just confirmed that. I've set up OpenLDAP before and even created a custom schema for it so I understand your point - but it does not contradict what I've said. -rick

    4. Re:LDAP server is a database. by rfmobile · · Score: 1
      The original question read:

      "Do you think a hierarchical database would really be a better answer for storing XML data over the existing relational counterparts?"

      My point (and that of others) is: an LDAP server is a valid store for XML data and so should be considered relevant to the question above. You were dismissing LDAP as merely a *protocol*.

    5. Re:LDAP server is a database. by bigbird · · Score: 1
      My point (and that of others) is: an LDAP server is a valid store for XML data and so should be considered relevant to the question above. You were dismissing LDAP as merely a *protocol*.

      It *is* just a protocol for accessing a datastore in a hierarchical manner. An LDAP server is just a translation layer that sits over a database - generally a relational one. And yes, an LDAP server is possibly a useful store for XML. But this doesn't imply anything about hierarchical databases.

  70. Logical Relationships by Prong · · Score: 1

    First off, one of the primary selling points of RDBMSes originally was a standard DML that was relatively easily learned. Anyone who has ever written any IMS DML knows what a pain going down multiple trees to get the correct result set can be in a "hierarchical" model. Plus the fun of possibly losing your place in the chain. Think multiple, interrelated linked lists and you start to get an idea of the pain involved. Of course, this doesn't mean that the average programmer writes good SQL.

    Second, "hierarchical" vs. "relational" is a logical concept. People seem to get wrapped up in which is the better way of storing things and forget that they are dealing with an abstraction. It is completely possible to put a SQL front end on another style of DBMS and force coders to think in sets (CA did this with IDMS 10 years ago). Conversely, you could store LDAP data in Oracle and you'd still think of it as a tree structure.

    As far as XML goes, if I were only dealing with data that would be handled in XML documents, and there was a standard "XML database management system", then I would be inclined to use that XDMS. However, I suspect that the real world set of that intersection is rather small.

  71. It's all about normal form by Curt+Cox · · Score: 1

    Relational databases have distinct advantages over other databases that far outweigh any superficial affinity with XML:
    - you can design schemas where the data is always consistent (no insert, update, or delete anomalies)
    - schema changes are easier than with other database types
    - ad hoc querying is easier

  72. fucking idiots by Anonymous Coward · · Score: 0

    better answer for storing XML data

    So now my data has to bend to a buzzword? XML is a system for data representation, not storage, you idiot fuck

  73. The missing points by sergeaux · · Score: 1

    Okay, technical reasons are good, but they do not seem to rule the world all the time. I want to mention some purely political ideas.

    1. From my point of view, it is the relational model that dominates because of the collective efforts of several big companies that managed to persuade everyone of its technical superiority, and then to deliver unparalleled implementations. (It was hard to develop a good RDBMS in those ancient times.) So everybody with non-relational databases was declared down and out and only permitted to silently die.

    2. There exists a direct relationship between a DTD and the relational model (just map !ELEMENT declarations into relational tables and impose constraints). The question is whether anyone cares to develop a query language which operates over XML primitives instead of relational ones, so that it would be more convenient for plain old hackers and newly brewed developers (and possibly housemaids). I personally do not believe that such a universally accepted language will be created rapidly.

    3. Everybody has become clever enough to develop different complicated ways of doing simple things. But not everybody is wise enough to make complicated things simple. And XML seems overly bloated to me.

  74. if so, then XML is the wrong solution by wytcld · · Score: 4, Informative

    99+% of all corporate data that isn't in a flat-file or (possibly three-dimensional) spreadsheet is in relational tables. The typical task that XML has been designed for is to standardize data exchanges between differently-structured relational systems, by providing sets of tags specific to the standards of specific industries. The whole point of XML is to enable companies to continue to use their current investment in relational databases, without the drag of having to do custom data conversions when dealing with suppliers or distant divisions in the company.

    If you're going to throw out the installed investment in relational databases, you might as well just design a common database standard per industry (rather than an XML data exchange standard) and let them all exchange native data rather than translating in and out of any exchange format. Obviously that won't happen.

    Now, if you're a new firm, you might decide it's easier to go OO or hierarchical or keep your data in slips of paper in a shoe box. But most of the available tools and solutions will continue to respect that relational works real, real well for inventory, manufacturing, accounts ... just about everything industry consists of. So if there's an impedance mismatch between relational and XML that's enough to make trouble, it's XML that should be replaced by another model.

    What design changes would be required to produce XML's relational equivalent?

    --
    "with their freedom lost all virtue lose" - Milton
  75. Hierarchical is good for some things by Anonymous Coward · · Score: 0
    The flexible hierarchical structure of XML works better for knowledge that is more characterized as a decision tree than as a data store. Suppose my application is some kind of a catalog of prices. For some products in the catalog, the price depends on how many you are buying, i.e. volume discounts. For others, there are different prices by location, whether or not you are a frequent flyer, belong to AARP, whether you are taking delivery on a Tuesday, any combination of the above, etc, etc, except that in some locations, a whole different price structure is followed, and there are some special deals. And all of this changes rapidly, depending on who knows what. Relational databases are pretty ugly for such applications. It is likely much easier to program an agent that can navigate a tree (e.g. of XML) to find the price than it is to (1) translate such a setup into a real ugly set of relational tables and then to (2) revise the table structure every time the rules change.

    Bottom line is that relational tables are good for data, but hierarchical is better for describing behaviors.
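
    As a rough sketch of the agent idea above, the following walks a small XML rule tree to find a price. The element and attribute names (product, rule, attr, value, price) and the sample prices are invented for illustration; only Python's standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule tree: a nested rule refines the price set by its parent.
RULES = ET.fromstring("""
<product name="widget" price="10.00">
  <rule attr="location" value="NYC" price="12.00">
    <rule attr="member" value="AARP" price="11.00"/>
  </rule>
  <rule attr="day" value="Tuesday" price="9.00"/>
</product>
""")


def price_for(node, order):
    """Descend into the first matching rule at each level; the deepest match sets the price."""
    for rule in node.findall("rule"):
        if order.get(rule.get("attr")) == rule.get("value"):
            return price_for(rule, order)
    # No child rule applies, so this node's own price stands.
    return float(node.get("price"))
```

    Adding a new special deal is then a matter of nesting another rule element; no table structure has to change.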

  76. Delphi/Kylix uses it too... by SAN1701 · · Score: 1

    I use Delphi in my daily work, and we can use its TClientDataSet components family both to:

    1- Create a client-side cached data-packet in client-server/multi-tier environments.

    2- Use "briefcase model" functionality (copy data to your notebook in the morning, go do your business and merge it back with the database at the end of the day).

    3- Create small, single-user applications that store data in XML format.

    I mean, despite its obvious hierarchical capabilities, XML is well adapted to relational stuff too.

  77. Hierarchical = Multidimensional DB by northsea · · Score: 1

    Hierarchical or multidimensional databases have been around for a while. There are several pros and cons for both relational and multidim, but one can never replace the other. The performance of one is better for certain types of analysis than the other and vice versa. Check out the OLAP marketplace to see what multidim databases look like.

  78. Hierarchical dbs: Burroughs (Unisys) DMSII by Anonymous Coward · · Score: 0

    I programmed for about 14 years for an electric utility that used DMSII on a Unisys A-Series. DMSII is a hierarchical database. One of the things that I miss using Oracle now is the ability to do a "SELECT LAST ...", which was used to return the last record of a select. In Oracle you must do a normal select and retrieve the last record first by using the appropriate ORDER BY. The select still builds a result set of all of the records instead of returning only one record. If you are doing a lot of these (one for each account, for example) and there are many rows which potentially could be returned, doing a SELECT LAST can be a real time saver.
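
    For readers who have not seen the pattern, here is a sketch of the "sort descending and keep one row" workaround, using Python's sqlite3 module as a stand-in. This is neither DMSII nor Oracle syntax, and the reading table, its columns and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical meter readings, several per account.
conn.execute("CREATE TABLE reading (account INTEGER, read_date TEXT, kwh INTEGER)")
conn.execute("CREATE INDEX reading_by_account_date ON reading (account, read_date)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [
        (1, "2001-09-01", 310),
        (1, "2001-10-01", 295),
        (2, "2001-10-03", 120),
        (1, "2001-11-01", 330),
    ],
)


def last_reading(conn, account):
    """Return the newest row for one account: sort descending and stop after one row."""
    return conn.execute(
        "SELECT read_date, kwh FROM reading WHERE account = ? "
        "ORDER BY read_date DESC LIMIT 1",
        (account,),
    ).fetchone()
```

    Whether the engine walks an index backwards or sorts the whole result set first depends on the product and on the indexes available, which is the cost being complained about here.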

    Hierarchical databases can be as complex as any relational db; you may need to do a couple of selects instead of one on a relational DB. I believe that the hierarchical model is still useful.

  79. See RDF by shirro · · Score: 2, Interesting

    I don't think XML by itself carries enough metadata to understand much beyond whether a document is valid or not. I think RDF and RDFS have a big role to play in getting XML database-ready.

    Perhaps hopping on the XML database bandwagon before RDF technologies mature could be a mistake. Forget the semantic web, I want to see the semantic database.

    W3 RDF
    A Good RDF resource

  80. Native XML Databases by idomeneo · · Score: 4, Informative
    I recently wrote an article introducing native XML databases for xml.com. My main point there, which applies to this discussion too, is that native XML databases are a tool like any other. For some jobs they're right and for some they're not. I've been working on the technology in the form of dbXML for about a year and a half and in some cases it's great and in others it really stinks. It's all about the right tool for the job.

    It's easy to dismiss a new database technology as irrelevant because of the dominance of the RDBMS, but you should really learn more about it and when it is appropriate and when it's not. It's not going to replace relational, and isn't intended to. Here are a few links where you can learn more beyond what's available on Ronald Bourret's site mentioned in the original post.

    The XML:DB Initiative
    The dbXML Project (open source native XML database) Soon to become an Apache XML project named Xindice
    eXist (another open source native XML database)

    My blog on the subject.
    Kimbro Staken

  81. databases by Anonymous Coward · · Score: 0

    Relational databases are here to stay, I'm afraid. The relational databases that are available are incredibly well developed and used for just about everything; I doubt anything will change that anytime soon.
    I'm not sure why anyone would want to go back to the old model, but just the same, hierarchical databases will never become more popular than their relational counterparts.

  82. Lots 'o Heirarchical Databases out there... by pegacat · · Score: 4, Informative

    A bit surprised to hear that 'Hierarchical databases were blown away by relational versions' - since I'm pretty sure they've been paying my pay check for the last three years... :-)

    There are a large number of heirarchical databases out there. The big fellas are the X500 directories (X509 certs came out of this work). More common are X500's demented kid sisters, the LDAP directories ( rfc2251). The DNS system also fits the description 'heirarchical database'.

    As far as XML goes, there are people storing XML in directories - although they're still fussing about exactly how to do it. There are a bunch of people trying to come up with standards - check the directory services markup language people www.dsml.org.

    There are people trying to sell XML-enabled directories - Novell sells an XML directory, but most directories can be used to store XML (including our 'eTrust Directory').

    As a final quicky - when do you use a directory over an RDBMS? Directories are good for naturally heirarchical data with few cross connections. They are usually optimised for slow writes/fast reads. They are *very* good for distributed data (e.g. DNS, international organisations etc.). The X500 spec defines a very fine grained security model, which can also be useful. However, if your data is closely cross-linked with lots of relationships... well, use an RDBMS!

    --
    Whoever fights with monsters should see to it that he does not become a monster in the process.
    1. Re:Lots 'o Heirarchical Databases out there... by chizor · · Score: 1

      i am a quibbler, and i am astounded you claim to work so intimately with hierarchical systems, all without knowing how to spell the word.

      --
      ... !
  83. pervasive interoperability by alienmole · · Score: 2
    The point you miss is that when all the tools support XML, from developer's application-building tools to client tools like browsers, there can be unforeseen "network" benefits to using XML.

    You mentioned configuration files: if all you're talking about is a linear config file, then XML might not give much benefit. But if your config file has a hierarchical structure, XML does provide a benefit, since it provides a well-defined and standard way to represent that hierarchical structure. In addition, XML-aware editors make it easier to work with these files, plus you don't have to write a specialized parser for it, plus you can display the file in an XML-aware browser, plus you can run automated transformations on the file, plus...

    XML is one technology where the mindless adoption by people who don't really understand why they're adopting it may in fact be of benefit to everyone in the long run.

    XML is the first and closest thing we have to a universal standard data format. We're better off having such a thing than not having it. Since XML is the first such format, naturally it has its problems and limitations. But it's a step in the right direction, and we'll only find out how best to improve it if we use it heavily.

    On the subject of the article, though, you're right. Whether a database's native representation is XML is irrelevant.

  84. Examples? by SuperKendall · · Score: 1

    Where do you think any of the technologies mentioned have been misapplied? Do you have any examples?

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
    1. Re:Examples? by leandrod · · Score: 1

      Simple. XML was meant to be used for text encoding -- never for data management. Text encoding can very well lead to data transmission, but even there a well-defined and agreed upon plain text file usually suffices -- banks in Brazil have had all document exchange fully automated for decades now using just plain text files. Now XML for data management... it is a regression to thirty years ago, when we did not have relational theory to really understand data.

      --
      Leandro Guimarães Faria Corcete DUTRA
      DA, DBA, SysAdmin, Data Modeller
      GNU Project, Debian GNU/Lin
    2. Re:Examples? by Anonymous Coward · · Score: 0

      What is XML if not a well-defined and agreed upon plain text file?

      If you hate XML, just say so, don't be a jackass.

    3. Re:Examples? by Dwonis · · Score: 2

      Java applets. They were a good first shot at a portable sandbox interface, but there is better.

  85. Not so simple by drodver · · Score: 3, Interesting

    The problem with your scheme is that as the list of information grows the amount of time to do the access grows linearly. With systems where there are thousands of concurrent users, all making large numbers of data accesses, your solution would grind the system to a halt. While technically possible to do this with a relational database, it is impractical. Take into account that 90% of the database's use is reads, and the performance of such a system would be even worse. Also take into account that the system has to be able to handle some data which does not change and some data that has to be grouped based upon a visit date. Your system would have no efficient way to group certain elements by date.

    1. Re:Not so simple by wilhelm9 · · Score: 1

      From your post it is not clear why hierarchical systems perform better in the situations you point out.

      I wish to know exactly why the hierarchical paradigm will perform better when there are thousands of users processing their data. Or why hierarchical systems are better at handling data which does not change, or why querying for certain elements by attributes (date) is better using a hierarchical database?

    2. Re:Not so simple by the_2nd_coming · · Score: 1

      perhaps the answer is to have an HDB for each patient...sort of like an XML document (I mean, isn't that what they are anyway?), then index those XML DBs with a relational database. Then something like Oracle can handle sparse data well and return the query fast...all the database is going to care about is that an XML file exists; it does not even have to worry about what info was in the document.

      --



      I am the Alpha and the Omega-3
    3. Re:Not so simple by drodver · · Score: 2

      With a hierarchical system the data is in a tree. To find specific information about something you just specify which branches of the tree you want. No searching is required; using a relational database means you have to do a linear search, which is O(n), since you don't know what order the data is in. With a hierarchical database the lookup time for the data is constant, or O(1); for example, to find someone's birthdate all you might have to say is PeopleDB["Smith, Jones"]["Birthday"]. In the cases in which you do have to search, say for days someone came in for a visit, the amount of data to search through is much less than in a relational database. Once you specify the person you are looking for, you have already filtered out all the other people's information from your search. This might mean that instead of searching through a million elements you have to search through ten.
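
      A small sketch of the two access patterns being contrasted, in plain Python. The nested dictionaries stand in for a tree addressed by key and the list of tuples for a table scanned without any index; all names and sample values are invented. A relational index would avoid the scan as well, so this illustrates the claim rather than settling it.

```python
# Hypothetical in-memory data; names and fields are invented for illustration.
people_tree = {
    "Smith, Jones": {"Birthday": "1960-02-29", "Visits": ["2001-09-12", "2001-10-30"]},
    "Doe, Jane": {"Birthday": "1975-07-04", "Visits": ["2001-11-02"]},
}

flat_rows = [
    ("Smith, Jones", "Birthday", "1960-02-29"),
    ("Doe, Jane", "Birthday", "1975-07-04"),
    ("Smith, Jones", "Visit", "2001-09-12"),
    ("Doe, Jane", "Visit", "2001-11-02"),
    ("Smith, Jones", "Visit", "2001-10-30"),
]


def birthday_by_path(tree, person):
    """Follow two keys; with hash tables each step is constant time on average."""
    return tree[person]["Birthday"]


def birthday_by_scan(rows, person):
    """Examine rows one by one, as a table without an index would have to."""
    for name, field, value in rows:
        if name == person and field == "Birthday":
            return value
    return None
```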

    4. Re:Not so simple by dgroskind · · Score: 1

      Your system would have no efficient way to group certain elements by date.

      Using Raven's data scheme and appropriate indexes, why wouldn't a query like this one solve the problem:

      SELECT d1.PatientID
      FROM data d1, data d2
      WHERE d1.DataTypeID='DateOfVisit'
        AND d1.Value>=20010911 AND d1.Value<=20011119
        AND d2.DataTypeID='Diagnosis' AND d2.Value='Anthrax'
        AND d1.PatientID=d2.PatientID
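
      To try the idea out, here is a runnable sketch with Python's sqlite3 module. The table layout (one row per patient, data type and value) is my reading of the scheme under discussion, the sample rows are invented, and dates are stored and compared as eight-character text rather than numbers so that one Value column can hold both dates and diagnoses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one row per (patient, attribute, value); values kept as text.
conn.execute("CREATE TABLE data (PatientID INTEGER, DataTypeID TEXT, Value TEXT)")
conn.execute("CREATE INDEX data_by_type_value ON data (DataTypeID, Value)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        (1, "DateOfVisit", "20011001"), (1, "Diagnosis", "Anthrax"),
        (2, "DateOfVisit", "20010801"), (2, "Diagnosis", "Anthrax"),
        (3, "DateOfVisit", "20011015"), (3, "Diagnosis", "Flu"),
    ],
)

# Self-join: d1 supplies the visit date, d2 the diagnosis, for the same patient.
QUERY = """
SELECT DISTINCT d1.PatientID
FROM data d1 JOIN data d2 ON d1.PatientID = d2.PatientID
WHERE d1.DataTypeID = 'DateOfVisit' AND d1.Value BETWEEN ? AND ?
  AND d2.DataTypeID = 'Diagnosis' AND d2.Value = ?
ORDER BY d1.PatientID
"""

patients = [row[0] for row in conn.execute(QUERY, ("20010911", "20011119", "Anthrax"))]
```

      Only patient 1 has both a visit inside the date range and the matching diagnosis. How well this scales depends on the indexes and the engine, which is the point in dispute.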

    5. Re:Not so simple by ttsuchi · · Score: 1

      I think the difference between the hierarchical and relational databases is more of how you access data than how you store it nowadays. Relational databases are inherently not suited for storing sparse data sets as you said. However you can create indexes (which are hierarchical in nature) to increase read performance (with possible write performance degradation). In OORDBMS, you can create ref table columns to solve the problem of variable number of 'attributes'. On the other hand it seems lots of hierarchical DB systems (like LDAP) are implemented with relational databases in the back end. So the only major difference between relational and hierarchical database is the syntax used to access data, and lots of RDBMS vendors are starting to implement direct XML interfaces... :)

      --
      Life: n. Stuff that happens between coding.
    6. Re:Not so simple by wilhelm9 · · Score: 1

      I think the point is that particular implementations of hierarchical systems (just like some relational ones do) expose references to the data directly to the application. By using these references the application can return quickly to the record it is working on. I can't believe any hierarchical system can return arbitrary records at O(1) the first time they are accessed. It has to involve some kind of search.

  86. XML and LDAP by Napiers+Bones · · Score: 1

    Relational databases are popular because, with a hierarchical database, the hierarchy has to exist prior to the entry of data. Together with single-parent inheritance, this imposes a structure upon a systems designer which does not necessarily represent a real-world process. With time, the process alters, and the model fractures.

    The same occurs with that special breed of database, the directory. LDAP owes its single-parent inheritance to X.500. This model is becoming less useful in representing the complexity of electronic commerce transaction events.

    Multiple inheritance is supported by programming languages such as Python, but not by LDAP. This is where the big change will be seen - in iPlanet, eDirectory and Active Directory.

  87. You all laughed in 1990 by PinchDuck · · Score: 1

    But now my COBOL/IMS experience will pay off BIG!!! HAHAHAHAHAHAHA

  88. +1 Interesting, Informative & possibly Insight by alienmole · · Score: 2

    Nice to see some solid and useful info being posted on /. every now and then...

  89. already answered by samantha · · Score: 2

    The question of whether hierarchical or relational databases are better, more flexible, dependable and so on has already been answered decades ago. I fail to see how a popular hierarchical data transfer protocol re-opens the question. If XML is going to be pushed far beyond what it was designed for then it needs to become far less hierarchical rather than having the rest of the world, and especially mainstream DBMS, re-arranging itself according to XML's limitations.

  90. Microsoft Access is cutting edge then... by Anonymous Coward · · Score: 0


    Well why don't you idiots just use MS Access? It's file based (ISAM). It's a piece of crap so you guys would love it.

  91. Speed, scalability and maintainability by cuteface · · Score: 1

    ...while all these issues can be addressed with time and widespread adoption, until then businesses or individuals are unlikely to jump in. It's a chicken and egg issue, I guess. Just my two cents worth.

    --
    Reality is what we taste, smell, see, hear and touch yet we cannot comprehend it...only approximate it.
  92. No magic bullet... by Anonymous Coward · · Score: 1, Informative

    Like some other posters have pointed out, XML is no magic bullet. Sometimes I really don't get the hype. I really wonder whether CSV will become the next big thing -- comma separated values!

    Why store a database in XML when you could store it in some high performance binary layout that maps to a disk's layout?

    Sure, XML is good for human-readable data, just as HTML was in a more primitive way, but that doesn't make it very efficient for enterprise-level stuff, and especially the *storage* of enterprise-level data.

  93. XML database by Anonymous Coward · · Score: 0

    Don't we already have a huge XML database indexed hierarchically? The WWW!

  94. XML hierarchies are VIEWS of data. by flacco · · Score: 2
    Hierarchies are often used to represent a view of data that is appropriate for a given purpose, but different scenarios will often need different views of the same data, so it's not a good idea to lock the data into one hierarchy or another.

    Today you might want to store your history of employee-department assignments by nesting employees under departments, but at some point you may also want to nest work histories under employees.
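
    A minimal sketch of that point in Python: the same flat assignment records are nested two different ways on demand. The record layout and the sample names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical flat assignment records: (employee, department, start year).
ASSIGNMENTS = [
    ("Ann", "Sales", 1998),
    ("Ann", "Support", 2000),
    ("Bob", "Sales", 1999),
]


def nest(rows, outer, inner):
    """Build a two-level hierarchy from flat rows; outer/inner are column positions."""
    tree = defaultdict(list)
    for row in rows:
        tree[row[outer]].append((row[inner], row[2]))
    return dict(tree)


by_department = nest(ASSIGNMENTS, outer=1, inner=0)  # employees under departments
by_employee = nest(ASSIGNMENTS, outer=0, inner=1)    # work history under employees
```

    Neither hierarchy is stored; each is derived from the flat records when a particular view is needed.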

    --
    pr0n - keeping monitor glass spotless since 1981.
  95. repeat after me: all data is not a tree by patSPLAT · · Score: 1

    The normalized form of many (if not most) kinds of data is not hierarchical.

    Think about it: how do you represent complex "types" of things in a strict tree? Is that person a client? Or are they a contractor? Or are they an employee? How do you represent these things while keeping just agile enough for the future?

    The stupidest thing about xml is that it is strictly hierarchical. XML is an extension of printing idioms, and it leads to the same dopish proliferation of records that is documented in Brazil.

    Do yourself a favor: Don't force everything into the same hole. That's why we have squares and circles.

    1. Re:repeat after me: all data is not a tree by flacco · · Score: 2

      I think you're wrong. How about: "Not all data is a tree" instead?

      --
      pr0n - keeping monitor glass spotless since 1981.
  96. XML can be relational by jmerelo · · Score: 1

    You can use id attributes, and references to those IDs in other elements, to imitate relational DBs in XML. Of course, it's not "natural", but it can be done.
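
    Roughly like this, sketched with Python's standard xml.etree.ElementTree. The element names and the id and dept attributes are invented for illustration, and since ElementTree does not read a DTD, the attributes are treated as keys purely by convention here rather than as declared ID/IDREF types.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<company>
  <department id="d1" name="Sales">
    <team name="East"/>
  </department>
  <department id="d2" name="Support"/>
  <employee name="Ann" dept="d1"/>
  <employee name="Bob" dept="d2"/>
  <employee name="Cy" dept="d1"/>
</company>
""")

# Hierarchical access: the team is found by walking down nested tags.
team = doc.find("./department[@name='Sales']/team").get("name")

# Relational access: resolve each employee's reference against the ids, like a join on a key.
by_id = {d.get("id"): d.get("name") for d in doc.iter("department")}
staff = {e.get("name"): by_id[e.get("dept")] for e in doc.iter("employee")}
```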

  97. XML is not a database. CDATA, order of fields? by kptBlaha · · Score: 1

    I agree that any hierarchical database can be replaced by a relational one. But XML is not a database format in general. The data in an XML file are mixed with parts of free text and the order of elements matters. XML (SGML) was invented as a markup language - i.e. a method for inserting metadata into your plain text (for example Docbook).
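
    A short illustration of mixed content, using Python's standard xml.etree.ElementTree. The para and key element names are made up for the example. The free text sits before, between and after the child elements, so the sentence can only be rebuilt by respecting document order.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring("<para>Press <key>Ctrl</key> and then <key>C</key> to copy.</para>")


def flatten(elem):
    """Rebuild the running text in document order: text, then each child and its tail."""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(flatten(child))
        parts.append(child.tail or "")
    return "".join(parts)


# Pulling out just the marked-up values loses the surrounding sentence.
keys = [k.text for k in para.iter("key")]
```

    Treating the key elements as rows in a table keeps the values but drops the text between them and the order they appeared in.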

  98. The entire reason by Anonymous Coward · · Score: 0

    why relational databases took off in the seventies and eighties was that they encapsulated the entire list of operations people wanted to do on data at the time within an extremely easy-to-understand concept. One data type, one set of operations.

    Why object-orientated databases suddenly started to take off in the nineties, was that the list of operations people wanted to do on their data, suddenly expanded as computers got more powerful and memory got larger.

    So data got larger, including things that weren't previously seen as data. Object-oriented databases went back to the old hierarchical database "model" to do so, while everybody forgot that the relational concept deals not primarily with allowable data/operations and data structures, but instead with the way data was manipulatable.

    Now data has expanded yet again, and people think XML is something new. Surprise, surprise! It isn't. All that has happened is that the list of allowable operations on data has expanded, due to the pervasive influence of the Internet. The central concepts of the relational database model are still valid. All that needs to be done is to expand the allowable data/operations to cover XML. Simple.

  99. Object-Oriented Relation Databases by sigwinch · · Score: 2
    I mention this, because it is something I find myself wanting to do all the time, for example, when storing data that originates in OO programs. Being able to store it in an RDBMS has heaps of advantages for me ... but I can't easily store the different info of different derived classes.
    PostgreSQL provides inheritance for tables (which is why they call it an object-relational database). I haven't used that feature yet, but it looks perfect for persistent storage of OO data.

    Speaking of DBMSes, one of the electrical engineers at work wanted to learn how to use the Oracle DB one of our projects is built on. Somebody told him "No problem, it's dead easy. Almost everything can be done with only two commands, SELECT and UPDATE. Simply learn those two and you'll know everything there is to know about DBs." Apparently, the CS guru who witnessed this nearly imploded. Wish I'd seen it...

    --
    Kuro5hin.org: where the good times never end. ;-)

    1. Re:Object-Oriented Relation Databases by doubtme · · Score: 1
      PostgreSQL provides inheritance for tables [postgresql.org] (which is why they call it an object-relational database). I haven't used that feature yet, but it looks perfect for persistent storage of OO data.

      Hmmm... very sexy. I will have to look at that... I can see it being incredibly useful for a project I am planning at the moment.

      --

      There's no $$$ in 'team'...
      www..--..net - for incisive, w
  100. If you had a gun, pointed at your foot... by The+Panther! · · Score: 1

    Uhm.. Have you ever really used a database, or are you karma surfing?

    The alternative to relational databases is not "DBM", it is object oriented, tree structured, logical, and other kinds of database models. Those are just as well defined as relational databases.

    I can't defend the original poster's assertion that DBM is the only alternative. I do argue that object oriented models are different in nature, since they allow functional bindings to data. The original question regarded hierarchical structure, but said nothing of bound functionality. Your example of object-oriented databases being alternatives may be true, but they are not equivalent.

    As for the rest of your examples which are relationally different, they are subsets of RDB. Anyone could implement a hierarchical database interface using an RDB back end fairly easily, with no more than a constant performance penalty. However, it is impossible to do the reverse. Please, educate me if you think otherwise.

    Your arguments have the ring of an emotional belief that implicit tree structures have some direct benefit that relational ones do not. They don't. Relational systems can explicitly describe a tree structure, which among other things makes it possible to extract small portions and move parts of them around (network, disk, memory) without changing their relationship or representation. A tree structure cannot without creating a relational database equivalent (through use of pointers or indices into a table, etc). This should be more than adequate to prove there is value in relational systems where trees are insufficient and inefficient for equal operations. Contrariwise, you cannot produce an operation for which trees are inherently faster. More natural to think about as a human, perhaps, but not mathematically advantageous in any way.

    Are you kidding? It is a major pain trying to express hierarchical data in a relational database model:

    Sorry, that's simply wrong. The operations required to modify a data structure are related to the data structure, not its representation. That's why they have classes on teaching data structures as concepts, and not on implementing them. The same amount of data must be present in some form, even if you as a user are not aware of it. The relationship property of two nodes in a tree IS data. But in a tree, you often don't have direct access to it.

    the relations that describe hierarchical data and the operations that you might want to execute often require complex, multiple, inefficient queries and updates, and the relational model provides few tools to ensure that the corresponding relations remain consistent.

    The complexity for any representation is of the same Big-O order, regardless of the database type. I don't see where you can do more work in fewer statements with a non-RDB system which would be less complex or less inefficient, without throwing out some functionality. Also, relational databases usually support constraints to maintain valid relationships. Please feel free to chime in here with some actual examples anytime, though.

    [...]they generally include numerous facilities for object-oriented and hierarchical databases, under a "relational veneer".

    Humans think hierarchically. Computers don't. It's natural to have explorer apps to help visualize data, but data does not want to be hierarchically organized.

    For the record, object-oriented is not meaningful to associate with hierarchical order in exclusion of relational order. It is a common folly to think an object must be hierarchical in nature to be an object.
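
    To make the point about constraints concrete, here is a minimal sketch using SQLite through Python's sqlite3 module (my choice of engine, purely for illustration; the table name "node" is made up). A self-referencing foreign key rejects a child whose parent does not exist. It does not, by itself, stop someone from turning the tree into a cycle, so the constraint covers valid references rather than tree shape:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE node ("
    " id INTEGER PRIMARY KEY,"
    " parent INTEGER REFERENCES node(id))"  # NULL parent marks the root
)
conn.execute("INSERT INTO node VALUES (1, NULL)")
conn.execute("INSERT INTO node VALUES (2, 1)")

# A child pointing at a parent that does not exist is rejected.
try:
    conn.execute("INSERT INTO node VALUES (3, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The same constraint happily accepts a cycle (1 -> 2 -> 1).
conn.execute("UPDATE node SET parent = 2 WHERE id = 1")
cycle_parent = conn.execute(
    "SELECT parent FROM node WHERE id = 1").fetchone()[0]
```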

    --
    Any connection between your reality and mine is purely coincidental.
  101. HDB BS. by Anonymous Coward · · Score: 0

    I see plenty of armchair database experts out there talking about relational calculus. Yeah, I'm sure there are plenty of DBAs with deep knowledge of RDBs.

    How many of you have actually worked with a real Hierarchical Database like IMS? Perhaps we could get an intelligent post rather than the pseudo-intellectual, fresh-out-of-college posts /. is known for?

  102. Damn right! - OT by The+Panther! · · Score: 1

    I wish I knew more about filesystem programming, because I've long wished to write a simple file system that uses a structure which is independent of the presentation of files.

    It would be simply wonderful to create a file system view, per user, which exists not only to restrict what they can see (almost like being chroot'd with lots of mounts in that directory), but also to make certain things more accessible or differently organized based on properties you feel are important. Doing so currently requires a shitload of symbolic links and manual maintenance when adding or removing files. Instead, you should be able to mount a file set under a name and put a query in that file set, so that it appears to be a directory with files that match some given attributes. Then you build a hierarchy of those, since that's a natural way to think about things.
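
    For what it's worth, the "directory as a query" idea can be mocked up in a few lines without touching a real file system. This is only a toy in-memory sketch; FileEntry, QueryDir and walk are hypothetical names invented here, and the attributes are plain dictionaries rather than anything a real file system stores:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FileEntry:
    path: str                                   # where the file really lives
    attrs: dict = field(default_factory=dict)   # free-form metadata


@dataclass
class QueryDir:
    """A 'directory' whose contents are whatever matches a predicate."""
    name: str
    predicate: Callable
    children: list = field(default_factory=list)


def walk(qdir, files, prefix=""):
    """Yield (virtual dir, real path); child queries refine the parent's set."""
    here = f"{prefix}/{qdir.name}"
    matched = [f for f in files if qdir.predicate(f)]
    for f in matched:
        yield here, f.path
    for child in qdir.children:
        yield from walk(child, matched, here)


files = [
    FileEntry("/home/u/a.c", {"lang": "c", "project": "kernel"}),
    FileEntry("/home/u/b.py", {"lang": "python", "project": "kernel"}),
    FileEntry("/tmp/c.py", {"lang": "python", "project": "web"}),
]

view = QueryDir(
    "kernel",
    lambda f: f.attrs.get("project") == "kernel",
    [QueryDir("python", lambda f: f.attrs.get("lang") == "python")],
)

listing = list(walk(view, files))
```

    Adding or removing a file only changes the list of entries; the views need no symlink maintenance because they are recomputed from the attributes.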

    The lack of categorization, or meta data, for files has been a thorn in users' collective side for decades, and with the death of Mac metadata in OS X, there are no real proponents out there for improvement.

    Oh well... In my dreams...

    --
    Any connection between your reality and mine is purely coincidental.
    1. Re:Damn right! - OT by droleary · · Score: 2

      I wish I knew more about filesystem programming, because I've long wished to write a simple file system that uses a structure which is independent of the presentation of files.

      This doesn't require you write a fs, but rather it suggests an abstraction layer above any particular file/object store, be it data stored in a hierarchy on the file system or in an XML file or data stored in a database.

      It would be simply wonderful to create a file system view, per user, which exists not only to restrict what they can see (almost like being chroot'd with lots of mounts in that directory), but also to make certain things more accessible or differently organized based on properties you feel are important. Doing so currently requires a shitload of symbolic links and manual maintenance when adding or removing files. Instead, you should be able to mount a file set under a name and put a query in that file set, so that it appears to be a directory with files that match some given attributes. Then you build a hierarchy of those, since that's a natural way to think about things.

      Dead on. I wanted to give an example from my paper here, but the Slashdot lameness filters aren't allowing it.

      The lack of categorization, or meta data, for files has been a thorn in users' collective side for decades, and with the death of Mac metadata in OS X, there are no real proponents out there for improvement.

      Actually, Mac OS X metadata handling is richer than in previous versions, getting away from a file-centric model and closer to a user-centric one. It still isn't up to snuff, though, which is why I'm writing Mary, my Meta Object Manager, using Cocoa. So I guess you could say there is at least one proponent. :-)

  103. go read a book by mj6798 · · Score: 2
    I recommend reading SQL for Smarties, which explains at great lengths how to implement hierarchical data structures in SQL. You'll see that it's not trivial to do well. And you'll learn something. If you ever actually deliver a high-performance commercial application based on a relational database, it will come in handy, I promise.

    The relationship property of two nodes in a tree IS [a relation].

    Indeed it is. However, something like the "parent(x,y)" relation satisfies particular properties that the relational model has no support for enforcing. Furthermore, algorithms over trees are intrinsically recursive and usually require a recursive exploration of such a relation; you cannot express that with a bounded number of relational queries--it requires iterating queries together with transactioning across those queries, procedural code that falls outside the relational model and that, incidentally, is also very slow when implemented on top of standard relational databases.

    The complexity for any representation is of the same Big-O order, regardless of the database type.

    In real-world applications, constants matter, a lot in fact, so even if the algorithms weren't just the same big-O, but the same big-Omega, there would still be an issue. Second, the issue is not as clear-cut as you seem to think: depending on the specific relational database model one adopts, you may end up paying extra logarithmic or even linear factors in the size of the database.
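
    A small sketch of the recursion problem, using SQLite via Python's sqlite3 module with a made-up "node" table. The first function is the procedural approach described above: one query per tree level, driven by a loop outside the query language. The second uses a recursive common table expression, which SQL:1999 added as an extension beyond plain relational queries and which not every engine supports; take it as an illustration, not a claim about any particular product:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        parent INTEGER REFERENCES node(id)  -- NULL marks the root
    );
    INSERT INTO node VALUES
        (1, 'root', NULL),
        (2, 'a',    1),
        (3, 'b',    1),
        (4, 'a1',   2),
        (5, 'a2',   2);
""")


def descendants_iterative(conn, root_id):
    """One query per tree level, looping in application code."""
    found, frontier = [], [root_id]
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id FROM node WHERE parent IN ({marks}) ORDER BY id",
            frontier).fetchall()
        frontier = [r[0] for r in rows]
        found.extend(frontier)
    return found


def descendants_recursive(conn, root_id):
    """Single statement using a recursive common table expression."""
    rows = conn.execute("""
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM node WHERE parent = ?
            UNION ALL
            SELECT node.id FROM node JOIN sub ON node.parent = sub.id
        )
        SELECT id FROM sub ORDER BY id
    """, (root_id,)).fetchall()
    return [r[0] for r in rows]
```

    Both return the same set of ids for this tree; the number of round trips in the first version grows with the depth of the tree.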

  104. Jim Starkey's opinion by Anonymous Coward · · Score: 0

    I thought it would be interesting to quote Jim Starkey, original designer of InterBase and various other RDBMS on this subject:


    It is very interesting to see ideas come and go then return again.

    The earliest database management system I know of was of what might be called the amorphous data model -- a database was a collection of records, each consisting of totally arbitrary attribute/value pairs. It had some nice retrieval characteristics, but systematic processing wasn't in it. Then came the hierarchical systems, IBM's IMS and the ARPA Datacomputer. The hierarchical systems were regular and the tree architecture seemed intuitive, but updates were a nightmare and complex structures almost impossible to model. Then came the CODASYL "network" data model of records and named "sets". It could model anything, was readily updatable, but was utterly impossible to layer a query language on. Then came Codd's relational database (for two points, can anyone explain the name?) which was highly regular, easily updated, expressive, extensive, and easily retrievable. So good, in fact, that once the performance problems were worked out it put absolutely everything out of business.

    Then the object data model showed up -- inheritance, polymorphism, so very OO. The venture guys threw great big sacks of money at them. But scrape off the hype and you find a warmed over CODASYL database with all of the warts, glitches, and gotchas that resulted in a mass extinction a decade earlier. Another mass extinction.

    So now, whoopee, another hierarchical representation, complete with all the problems that made IMS such a mega-pig. XML as an operational data representation is not just a terrible idea, it's the reincarnation of a very famous terrible idea.

    I'm now waiting for the organic medium database management system. Based on an all natural renewable resource (farmed trees fueled by our sun) using an offshoot of optical storage technology -- hole/no hole -- arranged in a user friendly intuitive format of 12 rows and 80 columns.

    When will they ever learn?
    When will they ever learn?

    Jim Starkey

  105. Go hierarchical :-) by elronxenu · · Score: 1

    Relational databases have their benefits when the data and the access modes fit neatly into the relational model. Over-normalisation of data is a sign, though, that the relational model is breaking down in that instance.

    Hierarchical is a much better fit for an object-based data model: "this IP address is a host, and it's running these services; that IP address is a router, and its connections are ..."

    I was telling an ex-cow-orker about IBM's IMS hierarchical database recently, how the access modes facilitate more correct programming ("get next object; do something with it; update and get following object") and easier access to related data ("get the next object contained within this object"). Although he grew up on PostgreSQL his response was "cooool!".

  106. Not a good example. by jcr · · Score: 2

    No, in any RDBMS, you just have the numbers in a table of phone numbers, and use a one-to-many relationship from the person to the numbers.

    Sure, many database "designers" (and I use that term very loosely) will do something boneheaded like define a fixed ten-decimal-digit number as the phone number field, but that's not the RDBMS's fault.
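
    A minimal sketch of that layout, using SQLite through Python's sqlite3 module; the table and column names are invented for the example. The number is stored as free text so extensions and international formats fit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE phone (
        person_id INTEGER NOT NULL REFERENCES person(id),
        kind      TEXT NOT NULL,
        number    TEXT NOT NULL     -- free text, not a fixed ten digits
    );
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Raj');
    INSERT INTO phone VALUES
        (1, 'home',   '555-0100'),
        (1, 'mobile', '+44 20 7946 0000 ext. 12'),
        (2, 'work',   '555-0199');
""")


def numbers_for(conn, name):
    """Return every (kind, number) pair for one person, via a join."""
    return conn.execute("""
        SELECT phone.kind, phone.number
        FROM person JOIN phone ON phone.person_id = person.id
        WHERE person.name = ?
        ORDER BY phone.kind
    """, (name,)).fetchall()
```

    A person can have zero, one or twenty numbers without any change to the schema.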

    -jcr

    --
    The only title of honor that a tyrant can grant is "Enemy of the State."
    1. Re:Not a good example. by tzanger · · Score: 2

      No, in any RDBMS, you just have the numbers in a table of phone numbers, and use a one-to-many relationship from the person to the numbers.

      I've already replied to this argument a number of times; I'm not trying to brush you off, but please refer to my other comments on the matter. Join tables aren't the solution to everything; at some point the system breaks down into nothing but tables with two columns and everything referring back to a name or some other common data bit. Hmm... come to think of it, that kinda sounds like a hierarchical system...

  107. Yes! Give that man the gold star! by jcr · · Score: 2

    we shouldn't be talking about "XML data" as if it was somehow the core representation.

    Exactly. The question of whether to use hierarchical databases is orthogonal to the question of whether to use XML. NetInfo, the DNS, LDAP, and Smalltalk source code trees are all good examples of where you should use a hierarchical approach, and this was true long before anyone thought of XML.

    -jcr

    --
    The only title of honor that a tyrant can grant is "Enemy of the State."
  108. But why? by Peter+Harris · · Score: 1
    The problems that you mention, both concerning storage space and flexibility of the data model are what XML databases are attempting to solve.

    Listing the problems in opposition to the solutions does not make for a good argument.

    I still don't see what problems the "XML" databases are themselves intended to solve.

    It's reasonable to point out obstacles in the way of a course of action when there is so little on the "benefit" side of the equation to make the cost worthwhile.

    Maybe there is a mathematically elegant hierarchical representation for data, but XML isn't it. My vote would be for s-expressions.

    --

    -- What do you need?
    -- Gnus. Lots of Gnus.
  109. Been there, done that. Read Date by shaunok · · Score: 1

    One of the advantages of being an old git is that you can see history repeating itself. First relational vs hierarchical, then objects appeared and we fought the relational vs object (usually hierarchical) wars, and now with XML we do the same dance again. The original rationale behind relational databases still holds true and I recommend Date as a good read. Most of the stuff in his books seems like common sense, but it wasn't so obvious at the time. It just goes to show how deeply relational thinking is embedded in the way we do things. Relational databases will win again; after all, the data is the thing between the angle brackets. No doubt there will be some applications where XML-specific databases are the go, but in general relational DBs will be modified until the XML functionality required is available.

    1. Re:Been there, done that. Read Date by brenfern · · Score: 1

      Hear Hear.

      The relative novelty of XML means that people are still enthusiastic enough to really want to make it work, but there's really little advantage in using a "native" hierarchical model. I am particularly enjoying the spectacle of the XML community re-inventing the wheel with XSL/XQL/SOAP etc.

      Any so-called "impedance mismatch" in RDBMSs is trumped in spades by the problems of using XML/hierarchies, which not only scale badly to high volume but also lack integrity constraints and are less type-safe than databases.

      Oracle have refid, which I have never needed to use but understand is able to solve the remaining 1% of XMLers' problems with RDBMSs.

  110. Storage mechanism isn't the issue by Yousef · · Score: 1

    The main reason for relational databases was the speed with which queries could be carried out compared with other mechanisms, and the simple discrete mathematics that was inherently used to get the queries to work.

    --
    -- "To ask a question is to show ignorance; Not to ask a question means you'll remain ignorant."
  111. Puleeeease -- I Had to WORK with hierarchial DBs by Dakota+Rider · · Score: 1

    I'm old enough to have had to actually WORK with hierarchical databases. The system was on the old Data General systems, and I can't recall the name... but I hated it almost as much as I hated "pure" Pascal.

    True liberation came in the form of relational databases and C. I /know/ how much more freedom of expression and action both provide. Not that they are better in the same way, they just both represent a better (in many ways) technology.

    Count me out of the wave of de-evolution back to hierarchical databases. Let's not let XML be the cart that precedes a dead horse.

  112. personal hierarchical db by p_pp_n · · Score: 1

    I made myself a little application for storing my thoughts, addresses, todo lists, random snippets of command-line hacks, quotes, links etc.

    I found no other application out there capable of doing it, so I made my own (hnb). It turns out other outliners, as I found out these apps are actually called, existed, but none of them did it the way I felt was natural.

    I think hierarchical db's are nice for things that humans are supposed to organize and reorganize, but I doubt it would be really sensical to move from rdbms to pure hdbms because of xml.

    --
    The mind is it's own place and in itself, can make a heaven of hell, a hell of heaven.
  113. The truth is ... by Anonymous Coward · · Score: 0

    There is no hierarchy that a relational scheme cannot describe. (Unless it's not a well-defined hierarchy to begin with (e.g. Parent(A)=B, Parent(B) = A etc))

    Relational tables are hierarchical, but the tree structure is implicit, it's implied by the relation. It can be generated from the relations, (very efficiently with today's hardware and software).

    SELECT students WHERE teacher='Miss Jones' and school = 'P.S. #1' certainly elaborates a school-teacher-student hierarchy, doesn't it?

    And relations can go where no singly rooted tree dare tread: bi-directional graphs, DAGs, lattices etc.
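
    Here is the school example as something you can actually run, assuming a single made-up "enrollment" table in SQLite (via Python's sqlite3 module). The first query is a valid-SQL version of the one above; the loop then generates the implied school/teacher/student tree from the flat rows:

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrollment (school TEXT, teacher TEXT, student TEXT);
    INSERT INTO enrollment VALUES
        ('P.S. #1', 'Miss Jones', 'Tim'),
        ('P.S. #1', 'Miss Jones', 'Uma'),
        ('P.S. #1', 'Mr Smith',   'Vic'),
        ('P.S. #2', 'Ms Lee',     'Wes');
""")

# The query from the comment, spelled out with a FROM clause.
students = [row[0] for row in conn.execute(
    "SELECT student FROM enrollment "
    "WHERE teacher = ? AND school = ? ORDER BY student",
    ("Miss Jones", "P.S. #1"))]

# The tree is implicit in the relation; build it explicitly on demand.
tree = defaultdict(lambda: defaultdict(list))
for school, teacher, student in conn.execute(
        "SELECT school, teacher, student FROM enrollment "
        "ORDER BY school, teacher, student"):
    tree[school][teacher].append(student)
```

    The same rows could just as easily be grouped teacher-first or student-first, which is the flexibility a single stored hierarchy gives up.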

  114. Fabian Pascal bashes XML and OO for all it's worth by dannyspanner · · Score: 1

    I was wondering about exactly this a little while ago when I came across this very interesting site called Database Debunking by Fabian Pascal. Very interesting to see such a "contrarian" view. He's the guy who wrote Practical Issues In Database Management.

  115. Debunking non-relational database (including SQL) by leandrod · · Score: 1

    Go for http://www.firstsql.com/dbdebunk/... there Fabian Pascal, Chris J Date and occasionally Hugh Darwen, the greatest living experts on database management systems, constantly debunk the need for object, hierarchical and network databases, including XML... all we need is a properly implemented relational database management system, something better than SQL.

    --
    Leandro Guimarães Faria Corcete DUTRA
    DA, DBA, SysAdmin, Data Modeller
    GNU Project, Debian GNU/Lin
  116. Re:Fabian Pascal bashes XML and OO for all it's wo by VP · · Score: 1

    He also states that there are currently NO real RDBMSs out there, just SQL databases, which offer limited support for the full relational calculus. To me this means that a hierarchical or an object database environment could be used to build a proper RDBMS. The thing Fabian Pascal warns about is not to confuse the underlying technology with the relational model: if something uses tables, it does not mean it is an RDBMS.

  117. perfomance issues by Anonymous Coward · · Score: 0

    If you have ever had any experience with an XML database, you will never talk about performance in such a context again.

  118. The OO solution by MarkusQ · · Score: 2
    Because each car also has a body-type (compact, sedan, SUV, truck, van, etc...) - which in a relational database would simply be another lookup table, but in an OODBMS poses data management issues.

    The OO way to answer this is that body-type is a class and compact, sedan, SUV, etc... are instances of it. Each car would have some instance of body-type as a member. I've implemented this sort of thing in a roll-your-own OODB (in Ruby) and in a OODB-on-SQL (in Delphi); in both cases it was painless. The only thing that is remotely tricky is to avoid infinite loops in your low-level serialization code, by doing lazy streaming or by having a serialization flag, or stack, etc. just in case some later person creates a body-type (e.g. batmobile) that somehow refers back to one or more instance of car.
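
    A toy sketch of both halves of that, in Python rather than the Ruby or Delphi I used, so treat the names (BodyType, Car, serialize) as invented for illustration. Body types are ordinary shared objects, and the serializer keeps a table of objects it has already seen so a back-reference becomes a short "ref" entry instead of an infinite loop:

```python
class BodyType:
    def __init__(self, name):
        self.name = name
        self.examples = []      # may refer back to cars, creating a cycle


class Car:
    def __init__(self, model, body_type):
        self.model = model
        self.body_type = body_type


def serialize(obj, seen=None):
    """Flatten an object graph to dicts, emitting a reference on revisits."""
    if seen is None:
        seen = {}
    if isinstance(obj, (str, int, float, type(None))):
        return obj
    if isinstance(obj, list):
        return [serialize(item, seen) for item in obj]
    if id(obj) in seen:
        return {"ref": seen[id(obj)]}       # already written: point at it
    seen[id(obj)] = len(seen)
    out = {"id": seen[id(obj)], "class": type(obj).__name__}
    for key, value in vars(obj).items():
        out[key] = serialize(value, seen)
    return out


# One shared instance per body type, not one per car.
sedan = BodyType("sedan")
car_a = Car("Model A", sedan)
car_b = Car("Model B", sedan)

# The troublesome case: a body type that refers back to a car.
batmobile_body = BodyType("batmobile")
batmobile = Car("Batmobile", batmobile_body)
batmobile_body.examples.append(batmobile)

result = serialize(batmobile)
```

    The "seen" table is the serialization-flag idea; lazy streaming or an explicit stack would be alternative ways to get the same guarantee.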

    -- MarkusQ

    1. Re:The OO solution by Coventry · · Score: 2

      You're missing the idea of multiple levels of classification here... take a look at my example for the detail on deciding about a manufacturer or body-type as the base class.

      --
      man is machine
    2. Re:The OO solution by MarkusQ · · Score: 2
      You're missing the idea of multiple levels of classification here... take a look at my example for the detail on deciding about a manufacturer or body-type as the base class.

      You can have as many levels as you wish, and even store things that don't fall into nice "levels." Just use the OODB to store objects and then have members in the objects that refer to other objects (not enumerations) for your classifications. Thus you don't need an instance of SUV for each manufacturer that makes one...because SUV is a (single) object. In the same way, Datsun and Delorean etc. are all objects.

      The class of a particular instance of make (say, Model-T or Bug) wouldn't be a manufacturer or a body-type, it would be the class make. And, like all instances of make, it would have (as members) both a manufacturer and a body-type.

      The problem isn't with using an OODB, but with using an enumeration (or a collection of strings) when what you want is an object.

      -- MarkusQ

  119. relational vs hierarchical by zr · · Score: 1

    relational databases are easy to query, but rather difficult to navigate, hierarchical are easy to navigate, but hard to query. take your pick..

  120. Re2:Some thoughts... by angel'o'sphere · · Score: 1

    Your post is errr.... strange.

    I can not follow it, so I do not dare to point out what is wrong.

    Except, of course :-), for one thing:

    In an OODBMS there is no MAPPING from objects in the programming language to objects in the DB.

    The objects are stored one to one.

    An OODBMS has only three purposes:

    1) Allow queries on OO structures (graphs of objects).
    2) Allow transaction-safe changes on a lot of distinct objects.
    3) Isolate point 2) if different users access the same objects.

    Your example of cars and manufacturers is too complex to discuss or dissect here. However:

    class Manufactor { /* your manufactor data here */ };

    class Car { };

    // One parent object together with a pointer to its children.
    template <class Parent, class Children>
    class One_Parent_to_many_Children {
        Parent parent;
        Children* children;
    };

    typedef One_Parent_to_many_Children<Manufactor, Car> CarManufactors;

    These are C++ classes. Most people would simply have a pointer to the Manufactor in the Car class, of course.

    If you have that above, the objects in the database are ....
    Ermm.... Cars, and Manufactors and relations, nothing else. Nothing to map.

    For selections you make OQL queries. Yes, you can not REMOVE COLUMNS, like you can do in SQL. And you do not like to do that, or do you like to get HALF cars out of the DB? You are moving in a C++ world above, so you like to get Cars and Manufactors out of the DB and not something which is returned by an:

    select Manufactor, Cartype from Manufactors, Cars where production_date < '2001-11-19';

    Because that SQL query only returns a list of two attributes of Car or Manufactor.
    The similar OQL query:

    select Manufactor from Manufactors, Cars where Car.production_date < '2001-11-19';

    yields ... Errr ... what do you guess?
    It yields your favourite container class configured as query result; in this case it might be an STL vector.
    Each Manufactor object refers in its list of produced Cars to the Cars fitting the query above.
    If you stick to the relation class I sketched above, you might only get Manufactors, and then you make a new query for each Manufactor in the vector to get the Cars. If you had used a pointer, as most C++ programmers would have done, the OQL query above would be just fine.
    The result would have been trees/graphs of Manufactors with Cars built by them. FULL FLEDGED C++ objects, ready to call virtual functions on.
    Not some array of text data to be converted into objects.

    Regards,
    angel'o'sphere

    P.S. I have configured "PLAIN OLD TEXT" for submitting. Less and greater signs are ALWAYS interpreted as HTML leads. BOLD etc. does not work as the closing tag is not recognized .... BUG or am I just not able to post correctly?
    ARGGGGGG after posting I saw even the less_than operator in the SQL fragments messed my post up, sorry for posting twice :-(

    --
    Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
  121. XML for everything. by Edward+Kmett · · Score: 2

    I have been toying for a while now with how to represent data efficiently in a manner in which it can be efficiently walked with an XML document-object-model, if not in XML itself.

    My core concern is actually building the indexing tools to handle the matching of insane numbers of actual Xlinks (or a mildly simplified form thereof), to make a dataset that is distributed and provided by multiple subsystems hold together.

    My most recent iteration used a simplified form of Xlink to join data supplied by code that implemented a specialized DOM model (code that understood how to browse a given object model, for reflection purposes), with data taken from a flat text XML document (browsed with a different implementation of that model).

    The real concern I have with this sort of system is the time it takes to register a document against all of the current X-links and deriving an efficient system for parent-child connection when multiple parents can exist (transparent links from other nodes). Registering a new node or worse, changing data in a node is an expensive operation in this model.

    My need for extensive X-link support is to be able to provide a sort of on-the-fly XSL translation of one branch of data into another. The links would then connect to the translated data rather than the original.

    A secured view of the data could be provided by handing a link to a DOM node that was walking the XSL view, rather than the original data. User security becomes a XSL-like document.

    My filesystem becomes a branch of the tree. It is admittedly an awkward security mechanism to do XSL-style matching against the tree, but it does not have to be done in XSL itself, another utility which walked a base dataset and returned a filtered view in the same DOM model provides the same security mechanics.

    There are some issues with the current mechanisms for conveying XML information (DTDs do not localize well, etc).

    Method invocations are shaky at best and rely on sharing handles of some sort (SOAP, CORBA, etc). I transmit data either by copying a branch of the tree, or handing off a handle to a CORBA or SOAP object that can walk the DOM on the local system.
    Either way looks the same to the client.

    A hierarchical model has a certain level of appeal because of the simplification of the new-branch registration process, but severely limits the effectiveness of the tree processing tools you can build. OTOH, the relational model could be reconstructed with some stylesheet tricks.

    My present use is in an operating system project as the mechanism for accessing system resources, the 'file system', user data, etc.

    There is appeal when it comes to generating a backwards compatible view of the data, because you can provide a translator which takes the current data and translates it back to a view compatible with what a given application expects, etc. Method invocations through a function call-translator can allow for constrained arguments to methods, etc.

    The transparent linking model also has appeal for simplified remote method invocation, a filename is just an Xlink, etc.

    Active Directory, eat your heart out. ;)

    --
    Sanity is a sandbox. I prefer the swings.
  122. Such as? by SuperKendall · · Score: 1

    So, what would you say worked out better than applets?

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
    1. Re:Such as? by Dwonis · · Score: 2

      Well, the Amiga SDK is a start. Also, note that just because there is currently little better doesn't mean that the status quo is the best thing possible.

  123. WARNING MS JUNKIE WEBSITE by Anonymous Coward · · Score: 0

    Do not click the link, MS loser site

  124. RDBMS vs. XML? by Anonymous Coward · · Score: 0

    I'm a database software engineer and I am trying to understand all of these replies by people attempting to be professional database developers or high-level DBAs. First off, there are products to transform hierarchical/network-format databases into a relational structure. IMHO they all suck. A major one is by Centura/Raima called ExecSQL, which works with their db.Star and previous Raima Data Manager systems.

    The relational data model, even when normalized, will contain redundant data not found in hierarchical systems; however, doing an outer join or view on a hierarchical system is much harder, as the internal structures are not representative of relations, domains, attributes, and tuples. I've written several relational and hierarchical/network systems and have found that while relational may be a minute bit slower, the advantages are greater.
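
    To illustrate the outer-join point, here is a rough sketch in Python, assuming a toy hierarchy in which each order is stored under exactly one customer. The data, field names, and function are invented for the example and do not reflect any particular product.

```python
# Hierarchical layout: each order lives under exactly one customer.
TREE = {
    "alice": [{"order": 1, "product": "widget"}],
    "bob": [
        {"order": 2, "product": "widget"},
        {"order": 3, "product": "gadget"},
    ],
}

# Products are not part of the customer -> order hierarchy at all.
PRODUCTS = ["widget", "gadget", "gizmo"]


def left_outer_join(products, tree):
    """Pair every product with its orders; products never ordered get None."""
    by_product = {}
    # There is no product "relation" to join against, so every branch
    # of the tree has to be walked to find the matching orders.
    for customer, orders in tree.items():
        for o in orders:
            by_product.setdefault(o["product"], []).append((customer, o["order"]))

    rows = []
    for p in products:
        matches = by_product.get(p)
        if matches:
            rows.extend((p, customer, number) for customer, number in matches)
        else:
            rows.append((p, None, None))  # the "outer" part of the join
    return rows


rows = left_outer_join(PRODUCTS, TREE)
```

    In a relational engine the same result is a single LEFT OUTER JOIN between a products table and an orders table; here the join logic has to be hand-written around the shape of the tree.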

    XML vs. RDBMS?
    Why is XML, a storage format, being compared with a software system? I and many other manufacturers can easily write an RDBMS engine around the XML data model. IMHO XML sucks as far as speed and reliability compared to the traditional storage methods employed by all major databases. Transactions stored in XML are great; however, why? Because XML is a storage system? Get a brain, don't spread false gospel in here. I like the article that said, "we want something new and sexy". How true, and how many uninformed /. readers preach these replies as fact. These poser posters are probably the ones saying that RDBMSs don't need transactions, a la MySQL. Hmm, how can the database be reliable internally without transaction support? You all need to get some ACID information. I like reading intelligent articles but dislike 500 factually incorrect posts/replies by uninformed, so-called industry professionals.

  125. this site by Anonymous Coward · · Score: 0

    sucks salty donkey balls