Slashdot Mirror


Do XML-based Databases Live Up to the Hype?

douthitb asks: "I have recently started work as a contractor with a company developing/improving an application for exchanging large amounts of data. The current solution exchanges data via XML, but the data itself is stored in a SQL Server database. There is a concern about the overhead involved with wrapping and unwrapping the XML to get the data in and out of a relational database. The proposed solution is to use Tamino, an XML-based database. Neither I nor any of the other developers have any experience with Tamino, but the desired result is to remove the bottleneck of converting the XML back and forth. Does anyone have experience using Tamino (or any other XML-based database)? What benefits and/or difficulties did you have in using an XML database, as opposed to its relational counterpart? How large of a learning curve should be expected with a product like this? Do XML databases really live up to the hype? A similar topic was discussed on Slashdot way back when, so I was hoping to get some more up-to-date feedback on the subject." "Sales reps from Software AG, the makers of Tamino, were brought in to discuss the benefits of their product with us. They, of course, presented Tamino as the end all, cure all database system (it will even clear your acne and make you popular with the girls!). The management of the company I'm contracting with were basically eating out of the sales reps' hands, without asking any of the "tough" questions about what the product can do; I was less convinced. Doing some initial searching on the Internet, I have had trouble finding much information about Tamino outside of the Software AG website."

105 comments

  1. I've worked with the Tamino kit... by (H)elix1 · · Score: 5, Insightful

    The thing the XML databases are nice for is if folks can't really lock down the schema. Often you have the case where you are mapping attributes to columns, which works fine in a relational database. Then things change over time.... Usually turning a nice relational design into a mess. Being able to use Xpath is great when you are searching for nodes too, once you get your arms around the syntax and assuming the stuff you are storing is XML. Some of the other bits in their toolkit were interesting.
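
    To illustrate the XPath point, here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports only a subset of XPath. It is not Tamino's query interface, and the document and element names are made up. The second record has a <phone> element the first lacks, which is the "schema drifts over time" case.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the second customer gained a <phone> element that the
# first never had, and nothing had to be migrated to allow that.
DOC = """
<customers>
  <customer id="1"><name>Ada</name><city>London</city></customer>
  <customer id="2"><name>Bob</name><city>Paris</city><phone>555-0100</phone></customer>
</customers>
"""

root = ET.fromstring(DOC)

# Path expression: every <customer> that has a <phone> child.
with_phone = [c.get("id") for c in root.findall("./customer[phone]")]

# Path expression: names of customers whose <city> text is 'London'.
in_london = [c.findtext("name") for c in root.findall("./customer[city='London']")]

print(with_phone, in_london)
```

    A full XPath or XQuery engine accepts far richer expressions than this subset, but the shape of the query is the same: you address nodes by path and predicate rather than by table and column.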

    If things are fixed, there are a lot of other options out there for faster manipulation. XMLBeans (now an Apache project, formerly BEA) is good stuff. Hibernate is lovely kit for mapping objects to a relational DB.

    1. Re:I've worked with the Tamino kit... by Anonymous Coward · · Score: 2, Interesting

      The thing the XML databases are nice for is if folks can't really lock down the schema

      If you don't know the structure of your data, you're not dealing with data at all, but incoherent noise, which should be treated as an opaque object.

      There are few shortcuts in life, and data storage is no exception. If you don't take the time to understand your data OR admit you don't understand it and treat it as an opaque object, you will likely get burned. Sometimes you won't, but don't let that fool you. You can drive for years without using your seat belt, until you get in an accident...

      Just some food for thought for the budding data designers in the audience tonight.

    2. Re:I've worked with the Tamino kit... by smittyoneeach · · Score: 1

      Mod parent up!
      XML may make sense when you're forced to temporarily 'shim' two things together, but it puts the 'k' in kluge.

      --
      Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
  2. yeah, i support a tamino server at work. by Anonymous Coward · · Score: 4, Informative

    it runs in tomcat or similar. it's really crashy. we can't wait to get rid of it.

    1. Re:yeah, i support a tamino server at work. by Anonymous Coward · · Score: 0

      No, I must respectfully say that I think you are confusing Tamino with something else here. I use Tamino too, and can definitely tell you that it does *not* run in Tomcat, but as a standalone server.
      You can access the database through a web application running in Tomcat, but the server runs separately.

      I would also have to disagree about the stability. It runs great for us on Windows - what are you running it on?

  3. Berkeley DB XML by SchnauzerGuy · · Score: 4, Informative

    I haven't tried it, but the regular Berkeley DB is highly regarded, and both are open source and (depending on your situation) free, so it is definitely worth a look.

    Berkeley DB XML 2.0

    1. Re:Berkeley DB XML by selectspec · · Score: 2, Informative

      Berkeley DB (XML) is great for some applications, but it lacks high availability (remote replication, clustering, etc).

      Tamino seems to claim recent support for "Enterprise High Availability" but I'm not sure what that means.

      Before I'd decide on XML, SQL, flat files, OODBMS, RDBMS etc, I'd want to know four things:

      1. How will it be secured.
      2. How will I back it up and recover it.
      3. How will I replicate/mirror/cluster it locally and over distances in case of a failure/disaster.
      4. Do upgrades require downtime.

      Then I'd discuss the academic issues.

      --

      Someone you trust is one of us.

    2. Re:Berkeley DB XML by Tet · · Score: 1
      Berkeley DB (XML) is great for some applications, but it lacks high availability (remote replication, clustering, etc).

      Have you looked at the Berkeley DB XML High Availability product? It definitely supports multi node clusters. Remote replication should be fairly trivial to achieve, too (although I haven't personally tried it).

      --
      "The invisible and the non-existent look very much alike." -- Delos B. McKown
  4. Oracle and XSQL by Rich · · Score: 4, Interesting

    Oracle and XSQL/XSLT works fine for the database we use at work. The overhead of wrapping and unwrapping the data doesn't seem to be any problem.

    1. Re:Oracle and XSQL by Wabbit+Wabbit · · Score: 3, Interesting

      The overhead of wrapping and unwrapping the data doesn't seem to be any problem.

      Yeah, but how much data? And how many calls/second?

      A few years ago I worked on a day trading system that talked to a SQL Server database and we were going to use XML to wrap the data but found that it did add significant time to the commits, and in that business time was $$$ so we left it out. (and yes, we spooled commits out of a separate thread, etc. etc. but don't ask; it was a complicated architecture that I was saddled with, and there was still some db code in the core codebase).

      --
      Nothing is inexplicable; only unexplained -Tom Baker, Doctor Who
    2. Re:Oracle and XSQL by Rich · · Score: 2, Informative

      It's a decent size - the results of around a million security assessments. The number of transactions per second is low, but the amount of data needed to generate reports is quite high.

  5. The devil's in the details by LeninZhiv · · Score: 5, Informative

    The first question to answer is, why is this data in a relational database to begin with? More to the point, is this application the only one that accesses the data, or are there other, non-XML centric databases that make use of the same data? The relational model gives you flexibility that XML does not for dealing with the data in arbitrary and unforeseen ways (XML can be quite flexible with XSLT, but a programmer must still intervene for each and every new way you want to use the data, with a much bigger performance hit). The normalised relational database stores your data in a mathematically sound way that puts the priority on integrity of data independently from its past, present or future structure; XML preserves data structure based on its present use while leaving the door open to moving from that to any arbitrary future use... which of the two ideals is more attractive depends on the nature of the data and how many applications need to use it.

    Relational databases with good XML support (my background is DB2 but most major databases should be able to do this) reach a good compromise by giving you access to normalised relational data as XML (which you can complement with XSLT if that's what needs to be done), while preserving it internally reduced to its bare essence as data (according to relational calculus' idea of what constitutes the bare essence of data, anyway.)

    On the other hand, for single-app applications, or data that is more file oriented than datum-oriented (databases of XML documents where the document rarely or never needs to be abstracted from the data it contains), XML databases offer simplicity and efficiency by removing the need to work out a relational data model. Why break up your structured documents into a DBA's hand-tuned data model when 99.9% of your queries will just build these data sets back into XML documents (even when DB2, Oracle, and I assume SQL Server can automate this last task)? An XML database can give you more flexibility in querying than an all-XSLT solution, while saving a lot of unnecessary work over an SQL-to-XML solution for what is really an XML-to-XML application.

    As I see it, that's the big picture. The actual decision has to come down to your applications. An XML database will be less efficient for non-XML applications, plain and simple. Querying XML cannot be made as fast as querying relational tables, meaning extra overhead for non-XML apps. But *your* application incurs overhead in turning relational tables into XML (probably via the RDBMS's internal facility), and in transforming it if necessary. The question is therefore: who makes more queries on the database, this application or other non-XML ones? Who will make more queries in 5 years?

    If you answer 'others' to either question, use a relational database--their XML support is decent now and will only get better, and they're far more popular in business which is an important CYA factor. If you answer 'your app' or 'other XML-based apps' for both questions, it's time to check out what XML databases have to offer right now. I expect other posts to comment on the current state of the art right now, but you can expect things to only get better as industry support for XQuery et al. improves--but don't expect them to *ever* pass up the relational databases in terms of raw performance, it's impossible. But as the evolution from Assembler to C to Java has shown in programming languages, the day may come when raw performance takes a back seat to other concerns.

    1. Re:The devil's in the details by Anonymous Coward · · Score: 2, Interesting

      But as the evolution from Assembler to C to Java has shown in programming languages, the day may come when raw performance takes a back seat to other concerns.

      The point of a database is *data integrity*, not data storage and retrieval. Those are side issues. I can store data very quickly by dumping it to a raw disk device (/dev/hda1). But I will have a hell of a time guaranteeing data integrity (for instance, does each order item have a corresponding inventory item?).

      Your evolution example of C to Java is one of increasing *abstraction* at the expense of speed. In a database, you don't want abstractions, you want your data to come out the way you put it in, and you want to be guaranteed that you will never have an invalid set of data in your database.

      A bug in a C program means you have to rewrite your program. A bug in your data (bad data, in other words) could mean mistakes compounded on mistakes, that you can't ever unwind. I worked once on an order system that didn't cascade to the order line-items when the order itself was deleted or canceled. And royalties were paid to authors based on the order line-items. You can imagine after 5-6 years the shock when they realized they had paid 5-10% too much every single year because deleted order items remained in the database!
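
      The order/line-item failure described above is the kind of thing a declared constraint can rule out. Here is a minimal sketch with Python's standard sqlite3 module. The table and column names are hypothetical, and SQLite stands in for whatever database the original system used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY);
    CREATE TABLE order_items (
        id            INTEGER PRIMARY KEY,
        order_id      INTEGER NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
        royalty_cents INTEGER NOT NULL
    );
    INSERT INTO orders (id) VALUES (1), (2);
    INSERT INTO order_items (order_id, royalty_cents) VALUES (1, 500), (2, 300);
""")

# Cancelling order 2 removes its line items too, so they can no longer
# leak into a royalty total.
conn.execute("DELETE FROM orders WHERE id = 2")
total = conn.execute("SELECT SUM(royalty_cents) FROM order_items").fetchone()[0]

# A line item pointing at a non-existent order is rejected outright.
try:
    conn.execute("INSERT INTO order_items (order_id, royalty_cents) VALUES (99, 100)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(total, orphan_rejected)
```

      The point is that the rule lives in the schema, so every application that touches the tables gets it, not only the one whose author remembered to write the cleanup code.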

      To put it formally, if your database asserts both X and NOT(X) in any given database due to inconsistent data, you can then create a result for ANY ARBITRARY PREDICATE as either true or false. In other words your database is completely broken and can return any arbitrary fact.

      My point is, the poster should first ask themselves if data integrity is of utmost importance. If so, they should learn and understand the relational data model, then learn their database (whatever it is) and how it can be mapped onto the relational model. Since no truly relational databases exist today (SQL is bad joke), you need to perform this mapping step. Then program accordingly.

      If data integrity is not important, then use whatever you want.

    2. Re:The devil's in the details by Anonymous Coward · · Score: 0
      Since no truly relational databases exist today (SQL is bad joke)

      You make this great post (anon for some reason), but then there's that odd ending statement. SQL is not a database. Based on the rest of your post I'm sure you know that. Why throw that in?

    3. Re:The devil's in the details by Anonymous Coward · · Score: 0

      probably meant SQL -databases- are a bad joke, which they are. what else do you call a database that uses SQL?? you can't really call them "relational databases", so you have to call them "SQL databases". SQL being an adjective here.

  6. Adding MORE XML Won't Fix It by Karma+Farmer · · Score: 2, Interesting

    There is a concern about the overhead involved with wrapping and unwrapping the XML to get the data in and out of a relational database.

    So, can you explain how an XML database will fix this?

    Your database still needs to translate the verbose, human-readable XML into an internal storage representation. If you're transferring the data between two SQL databases now, then I can't see why it should matter if you're parsing XML and putting it into a "traditional" row-column RDBMS or parsing XML and putting it into a data structure more suited for storing XML data. The parsing is going to take exactly the same amount of time.

    The XML database would help if you've mapped your data representation to XML, and are having a difficult time persisting it to SQL. For some data representations, going from XML to parsed binary RDBMS representation back to XML may be difficult, and it may be easier to just go from XML to parsed binary representation of XML back to XML again. But either way, you're doing the parsing.

    You're solving the wrong damned problem.

    1. Re:Adding MORE XML Won't Fix It by Anonymous Coward · · Score: 2, Insightful

      Because XML and relational data essentially represent two different data models. If a DB was designed to support XML from scratch, it should be able to perform much better than an existing relational solution. Rewriting XML to relations is a slow process. Rewriting XQuery to SQL is a slow process (mind you, anything you could possibly gain by optimizing XQuery is lost once you hit the SQL layer). Additionally, with a hybrid solution, now you need people that are well-versed in both SQL and XML. Sure, that may not sound like much, but it's hard enough to find someone that really knows SQL, let alone XMLSchema, XPath, XQuery, etc.

    2. Re:Adding MORE XML Won't Fix It by Karma+Farmer · · Score: 1

      Rewriting XML to relations is a slow process.

      That depends on the Schema of the XML and the RDBMS. SQL to XML to SQL requires no rewriting of XML to relations. None. Zero. Zilch. It's simply parsing, and no faster or slower than rewriting XML to an internal XML binary representation.

      Again, if it's specifically XML-centric data, and the difficult part is getting into and out of an RDBMS (and the RDBMS doesn't add any value), then go XML all the way. It's a good way to go (assuming, of course, that your XML database is ACID compliant. I assume they all are).

      But, the question was framed that XML was being used for data transfer between two RDBMS systems. If that's the case, then they're not going to save any parsing time by parsing into a hierarchical system instead of an RDBMS.

  7. "Overhead" is not important here... by smug_lisp_weenie · · Score: 1

    In an RDBMS XML is just treated as a big "blob" of data- Although translating the raw binary data to XML theoretically has some tiny overhead, in practice large data objects take time to transfer over a network and that far outweighs the tiny conversion time the client needs to do to convert the data into an XML document.

    1. Re:"Overhead" is not important here... by OldMiner · · Score: 1

      The poster can correct me if I'm wrong, but I don't think they're just storing chunks of XML into the database. I wager they have a complicated XML document which they are parsing to extract keys and values. Those keys and values are used to make SQL statements which don't include XML at all. The reverse process happens when extracting data -- normal "SELECT * FROM foo WHERE boo = baz" or what have you is used, then that data is used to build an XML tree.

      It is that wrapping and unwrapping that I believe he is talking about.
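
      That round trip can be sketched as follows, using only Python's standard library. The element names, table layout and in-memory SQLite database are assumptions made for illustration; the original poster's actual schema is not known.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical incoming message.
INCOMING = (
    "<orders>"
    "<order id='1'><customer>Ada</customer><total>19.99</total></order>"
    "<order id='2'><customer>Bob</customer><total>5.00</total></order>"
    "</orders>"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total TEXT)")

# "Unwrapping": parse the XML, pull out keys and values, and issue plain SQL.
for order in ET.fromstring(INCOMING).findall("order"):
    conn.execute(
        "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)",
        (int(order.get("id")), order.findtext("customer"), order.findtext("total")),
    )

# "Wrapping": run an ordinary SELECT and rebuild an XML tree from the rows.
root = ET.Element("orders")
for oid, customer, total in conn.execute(
    "SELECT id, customer, total FROM orders ORDER BY id"
):
    node = ET.SubElement(root, "order", id=str(oid))
    ET.SubElement(node, "customer").text = customer
    ET.SubElement(node, "total").text = total

outgoing = ET.tostring(root, encoding="unicode")
print(outgoing)
```

      Both directions walk every element once, so the cost grows with document size. Whether that matters next to network and disk time is exactly what would need measuring.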

      --
      You like splinters in your crotch? -Jon Caldara
    2. Re:"Overhead" is not important here... by smug_lisp_weenie · · Score: 1

      OK- In that case, I agree that that might warrant moving away from RDBMS.

  8. Thumbs Down on XML Databases. by rossifer · · Score: 5, Insightful

    XML databases are possibly useful if you think about them as: an elaborate bucket for storing non-normalized data via an XML interface.

    If your current relational database schema is either 1) small flat files or 2) a few big tables with most/all of the data stored in "blob" columns (i.e. blobs, clobs, byte arrays, or big varchars), you might be a candidate for an XML database. I'd get two experienced DBAs to agree there was no realistic way to normalize the data first, but that's me.

    If you actually need a database (as opposed to a few files, XML or flat) and your data can be normalized (it almost always can), then a relational database will tend to provide important advantages in three areas: unforeseen query handling (OLAP, data mining, etc.), scalable performance, and availability of people with the skills to maintain it.

    As for the tradeoff of converting to XML, a number of the commercial RDBMS's allow you to obtain query results as XML. Though I don't know for certain how they handle inserts and updates, I suspect that there are XML equivalents for those as well. However, even if you have to completely roll your own conversion from SQL to XML, that cost is minimal against the cost of accessing the disk to fulfill the query, which both RDBMS and XMLDBMS will have to do.

    In general, after working with a commercial XML database and attempting to work with another XML database written in house, I'm categorically unimpressed. I think that a lot of engineers have discounted the relational programming model without first understanding it. In my opinion, people familiar with functional and object programming models would do well to learn about relational programming with an eye to determining the appropriate model for different kinds of problems.

    Regards,
    Ross

  9. XML DB? In my expert technical opinion.... by AntsInMyPants · · Score: 2, Insightful

    Ick. I suppose you could do it that way if you want to. Maybe it's just me, but I like to keep data in relational DBs and keep the XML stuff for when I need to provide a way of sending information to outside people who will not have direct access to the DB. Most of the time the DB is being accessed, it is for internal applications which can access the tables via accessor methods. Now I suppose you could just write accessor methods against the XML DB..... Relational DBs for storage, XML as a transmission format. But the types of things I tend to build are quite small, so YMMV.

  10. I agree by Tangurena · · Score: 4, Interesting
    Having worked with a business partner who claimed total XMLosity in their database, I had to rework the parser almost every time we got a data feed from them. Their idea of the data model changed from day to day. Even when we sent nailed down, will never change specs for the structure. They really didn't like the idea that I tossed the raw XML into a memo field every time my components received a message, so when there were nasty fingerpointing meetings, I could drum up a simple SELECT statement and show everyone what was changing each and every week.
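
    A rough sketch of that audit trail, assuming a hypothetical feed_log table and made-up messages and dates: keep each raw message as received, then compare the set of element names from one delivery to the next.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# One row per received message; raw_xml plays the role of the memo field.
conn.execute("CREATE TABLE feed_log (received TEXT, raw_xml TEXT)")
conn.executemany(
    "INSERT INTO feed_log VALUES (?, ?)",
    [
        ("2004-01-05", "<item><sku>A1</sku><price>3</price></item>"),
        ("2004-01-12", "<item><sku>A1</sku><cost>3</cost><currency>USD</currency></item>"),
    ],
)

def tag_set(raw_xml):
    """Return the set of element names used anywhere in one message."""
    return {el.tag for el in ET.fromstring(raw_xml).iter()}

rows = conn.execute(
    "SELECT received, raw_xml FROM feed_log ORDER BY received"
).fetchall()

# For each delivery, list which element names appeared and which vanished
# relative to the previous delivery.
changes = []
for (_, prev), (when, curr) in zip(rows, rows[1:]):
    before, after = tag_set(prev), tag_set(curr)
    changes.append((when, sorted(after - before), sorted(before - after)))

print(changes)
```

    Storing the message untouched costs almost nothing and leaves evidence that does not depend on the parser having understood that day's format.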

    XML is kinda nice for some things, and really rotten for some things. Please do yourself a favor and sit down and try to decide what problem you are trying to solve. XML really stinks when it comes to sets: something that SQL based databases excel at.

    I think that with the XML fetish we have these days, that we are reverting to the preSQL days of CODASYL or IMS (pre 1980s for those of you young'uns).

    1. Re:I agree by Anonymous Coward · · Score: 2, Interesting

      I think that with the XML fetish we have these days, that we are reverting to the preSQL days of CODASYL or IMS (pre 1980s for those of you young'uns).

      Stop bashing Charles Bachman's grand ideas. Dr. Codd used "math" to incorrectly justify bashing Bachman's beautiful techniques in their debates. But Bachman's ideas were more natural and organic. After all, natural selection didn't lead to relational structures in our brain. Do you have a relational brain? No? Why not? Because relational is too artificial--it imposes a structure that ain't there in the real world. Bring back paths and allegedly evil "pointer hopping". That is closer to how the brain works and better models the unpredictable real world. Darwinian evolution is proof of Bachman's ideas! Our brain is a navigational DB because that is the better model. God and/or Darwin voted for it. The grey squishy stuff is a natural, organic, flexible graph not bound by arbitrary "math" rules.

    2. Re:I agree by Tablizer · · Score: 1

      Er, your brain may not work relationally, you flawed freak, but don't assume the rest of us function down at your pathetic level.

      Note to the person who modded the above as "troll". I think it was meant as a joke. It seems they were implying that THEY have a relational brain, like a robot or something.

  11. Obvious by Pan+T.+Hose · · Score: 5, Insightful

    What benefits and/or difficulties did you have in using an XML database, as opposed to its relational counterpart?

    Benefits: XML is new and trendy.

    Difficulties: Ignorance of the decades of scientific research and engineering experience in the field of relational database management systems, relational algebra, set theory and predicate calculus; lack of real atomicity of transactions, lack of guaranteed consistency of data, lack of isolated operations, lack of real durability in the ACID sense, and in short, the lack of relational model; scalability, portability, SQL standard, access to your data after two years and after twenty years; to name just a few.

    How large of a learning curve should be expected with a product like this?

    Certainly smaller than a real, relational database.

    Do XML databases really live up to the hype?

    No.

    I believe that you are confusing an RDBMS with an object store. You should read this excellent comment posted almost three years ago by Frater 219. I understand that you may be inexperienced but you should not be ignorant. Literally decades of scientific research has been put into relational database management systems. Of course you are perfectly free to forget about computer science, jump on the bandwagon and choose whatever buzzword is trendy these days (yesterday it was OOP, today it is XML, tomorrow it will be .NET) but then you have to realise that you are gambling with your data that may be rendered inaccessible in few years (and that is if you are lucky and don't lose its consistency before) and those unfortunate enough to inherit the responsibility of maintenance of your system will curse you to no end wishing you were dead, and not without a reason. You can be fancy with your applications and front-ends, but RDBMSs are probably the most mature computer systems known to man. Ignoring it is foolish, to say the very least. You may say: but my application will always be the only front-end to that data and it will always be an optimal way to work with it! To which I say: Kids these days!

    --
    Sincerely,
    Pan Tarhei Hosé, PhD.
    "Homo sum et cogito ergo odi profanum vulgus et libido."
    1. Re:Obvious by duffbeer703 · · Score: 2, Interesting

      "I believe that you are confusing an RDBMS with an object store."

      Excellent point... I've worked with some huge CORBA systems with semi-custom object databases and have seen firsthand the pain these systems can put you through.

      One of the bigger vendors whose software we use claims to be porting their entire system to an Oracle or DB2 backed system instead.

      Of course, they'll probably use some J2EE monstrosity to implement the new system, so performance will still suck.

      --
      Conformity is the jailer of freedom and enemy of growth. -JFK
    2. Re:Obvious by Anonymous Coward · · Score: 0

      This coming from somebody whose handle is "Pan-T-Hose" (panty hose). Do they also give PhDs for bras? :-)

    3. Re:Obvious by swillden · · Score: 4, Insightful

      Excellent post, as is the Frater 219 post that you referenced.

      I think that both of you stopped short of pushing your arguments to their conclusions, though, so I'd like to add a bit.

      Frater 219 is exactly right that objects and tuples are fundamentally different, but he focused on both from a purely data-oriented point of view, which caused him to understate the issue a bit. A better understanding of the real goals of objects and tuples helps, IMO, to clarify why they're so different -- and the arguments can be extended to consider XML as well.

      Consider the goals behind relational database normalization. It's obvious that the primary goal is one of flexibility, ensuring that the data can be sliced and diced in any way imaginable, easily (which is not always the same as efficiently). A good relational design provides total "transparency", so that no matter what future demands are made, if the information is in the database it can be retrieved, just by asking the right, simple, question.

      Obviously, relational database technology was created because in the past there were systems that structured data in ways that limited the ways in which it could be retrieved and analyzed. RDBMSs solve that problem admirably well.

      So, if data transparency is such a wonderful thing, why does another computing tool, Object-Oriented Software structure, place so much emphasis on data abstraction and even data "hiding"? The answer is: because OO is about behavior, not data.

      The tenets of good OO design are all about partitioning the problem into compact components that interact in flexible ways. Objects have data, but only, really, to provide these fundamentally behavioral entities with the data elements they need in order to function "independently". This doesn't mean that object architectures can be defined without consideration of data, or that none of the ideas about data relationships which would be at home in a relational design have a place in object design, because they do, but the core ideas of object-oriented design are about entities that act in response to stimuli, allowing internal details (like what the supporting data consists of) to be hidden, and allowing substitution of other entities that accomplish the same abstract goals, but may do it in different ways, using different data.

      This is the real fundamental "impedance mismatch" between OO design and relational design, IMO. Relational design focuses almost purely on data, with little attention paid to how the data will be used (well, in practice, that gets a lot of attention when it becomes clear that the nicely normalized model is simply too slow, but that's separate), and object design focuses mostly on behavior, paying attention to data only as needed to point out obviously bad factorings. This means that if you design a very nice object-oriented application and then try to simply persist those objects in relational tables, the result will be a very poor relational database. On the other hand, if you create a nice relational design and then try to create a class for each table, the result will be a painfully sub-optimal OO design.

      So, as Frater 219 pointed out, if you want a database, use an RDBMS, if you want a persistent object store, use an OODBMS. If you want both (as is common), well, you have to deal with the impedance mismatch, and it'll never be pretty, or very efficient. IMO, the best approach is to do the OO and relational designs more or less separately, then work out a solution to translate between them.

      So what about XML? Well, let's look at the goals behind XML.

      One problem with doing that is that there are at least two uses of XML. The first is as markup, in the sense that the document content is really not intended to be understood or processed by machines so much as people. The tags are only used to make machines able to grab hold and manipulate bits of it, without any understanding of the rest of the stuff. HTML is like this. An HTML document is ulti

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    4. Re:Obvious by flockofseagulls · · Score: 2, Insightful
      swillden wrote:
      Consider the goals behind relational database normalization. It's obvious that the primary goal is one of flexibility, ensuring that the data can be sliced and diced in any way imaginable, easily (which is not always the same as efficiently).
      No. Normalization eliminates duplicate information, and ensures that non-key attributes are dependent on (correctly grouped with and referenced by) key fields. Normalization is not primarily about flexibility, it's primarily about data integrity. Data can be "sliced and diced in any way imaginable" in both normalized and unnormalized databases, but data integrity can only be guaranteed in normalized databases.
    5. Re:Obvious by swillden · · Score: 1

      Data can be "sliced and diced in any way imaginable" in both normalized and unnormalized databases, but data integrity can only be guaranteed in normalized databases.

      It's easy to create counterexamples, situations in which it is very difficult to extract data in certain ways with non-normalized data. Also, data integrity can be guaranteed in non-normalized databases, just not by the database engine. Not without add-ons -- triggers and stored procedures, to be exact.

      Normalization serves both purposes but when you contrast relational technology with the hierarchical databases that came before, it quickly becomes clear that the goal driving relational technology is data transparency, and normalization has a lot to do with that.

      --
      Note to ACs: I usually delete AC replies without reading them. If you want to talk to me, log in.
    6. Re:Obvious by flockofseagulls · · Score: 1
      swillden wrote:
      It's easy to create counterexamples, situations in which it is very difficult to extract data in certain ways with non-normalized data. Also, data integrity can be guaranteed in non-normalized databases, just not by the database engine. Not without add-ons -- triggers and stored procedures, to be exact.


      Of course non-normalized data can be hard to query, but that doesn't say anything about data integrity, or whether a normalized database is easier to query.

      Data integrity can be implemented with triggers and stored procedures, but that isn't the same thing as a schema that guarantees integrity. A normalized schema cannot contain duplicate information (that's different from not allowing duplicates through the use of procedural code). If there's no possibility of duplication there is no need to write application code to implement integrity -- that's the whole point of normalization.
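
      The duplication argument can be shown with a small sketch using Python's standard sqlite3 module. The two schemas and their names are invented for illustration. In the flat table the customer's city is repeated on every order row, so a partial update leaves the database contradicting itself. In the normalized pair the city is stored once and there is nothing to get out of step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: the city is repeated on every order row.
    CREATE TABLE orders_flat (order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT);
    INSERT INTO orders_flat VALUES (1, 'Ada', 'London'), (2, 'Ada', 'London');

    -- Normalized: the city depends only on the customer key and is stored once.
    CREATE TABLE customers (name TEXT PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL REFERENCES customers(name)
    );
    INSERT INTO customers VALUES ('Ada', 'London');
    INSERT INTO orders VALUES (1, 'Ada'), (2, 'Ada');
""")

# An update that misses one of the duplicated rows: Ada now has two cities.
conn.execute("UPDATE orders_flat SET city = 'Paris' WHERE order_id = 1")
flat_cities = conn.execute(
    "SELECT COUNT(DISTINCT city) FROM orders_flat WHERE customer = 'Ada'"
).fetchone()[0]

# The same change in the normalized schema touches exactly one fact.
conn.execute("UPDATE customers SET city = 'Paris' WHERE name = 'Ada'")
normalized_cities = conn.execute(
    "SELECT COUNT(DISTINCT c.city) FROM orders o "
    "JOIN customers c ON c.name = o.customer WHERE c.name = 'Ada'"
).fetchone()[0]

print(flat_cities, normalized_cities)
```

      A trigger could patch the flat table after the fact, but that is procedural code someone has to write and maintain. The normalized schema simply has no second copy to disagree with the first.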

      Normalization serves both purposes but when you contrast relational technology with the hierarchical databases that came before, it quickly becomes clear that the goal driving relational technology is data transparency, and normalization has a lot to do with that.


      Maybe you mean that relational databases don't have invisible pointers, that everything in the database is visible. But that is not the goal driving relational technology. Read Codd's papers, or the many articles and books by Chris Date. The history of relational technology is well-documented.
  12. I've used Tamino and here's my story by snowtigger · · Score: 5, Interesting

    A few years back, I was brought in to a small company to build their new software on top of the Tamino DB. XML was "the way of the future" and we were asked to use it as much as we could. Software AG promised that everything would be easy to program and that their software functioned perfectly. Software AG's sales rep used the fact that Tamino was used in production by (insert major national company here) as a major selling argument. I later found out from a friend working there that they had only evaluated Tamino, found it useless, and never used it in production.

    Well, we did finish the software on time, but it was a complete nightmare. Software AG hardly gave us any straight answers (even though they charged big $ for customer support).

    Tamino itself was missing a lot of features and seemed designed as a system for storing documents, totally lacking traditional database qualities (uniqueness, reliability, scalability, ...). We couldn't even get a reliable unique key from the database. The ID we did get "could change" if we were to back up and restore the database. Tamino also scaled very badly, with simple queries taking up to a minute on the fastest PC we could buy.

    Needless to say, the software was thrown away and rebuilt with a reliable SQL database.

    I would strongly discourage anyone from building an application on top of an XML database, especially Tamino. If you really want to build your application on top of an XML database, seriously ask yourself why and what difference it would make. Also, if you really need an XML interface, choose an ordinary SQL database that has an XML plugin.

  13. don't waste your time with XML by Anonymous Coward · · Score: 4, Insightful

    XML is a file format. Repeat after me. A *text file format*.

    It is not a database, nor a data model, nor should it have anything to do with data storage and manipulation. You can store XML documents *in* a database (just like you can store dates, IP addresses, or JPG data). You can index and join on XPath components of an XML file. And you get XML documents *from* a database. But the database itself has little to do with XML. A well-designed XML database is just a well-designed relational database, and XML is just another data type.

    People are now reverse-engineering a hierarchic data model from XML text files. But the hierarchic data model is less general than the relational model, and in fact was used and rejected *40 years ago* as not being general or powerful enough. Funny how history repeats itself.

    Example: for simplicity, the relational model specifies that ALL data must be stored explicitly in the database. For instance if you have three rows of data, you can't assume any particular order unless the order can be calculated from the contents of each row. But XML nodes have implicit order, which means even the simplest XML document mixes data with metadata. Even a simple query requires dealing with both.
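
    A small sketch of the ordering point, assuming a made-up <steps> document and a hypothetical steps table with an explicit position column. In the XML the order exists only as node position; in the table it has to be written down as data before any query can rely on it.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document: the sequence of steps is carried only by node position.
doc = ET.fromstring(
    "<steps><step>drain</step><step>replace</step><step>refill</step></steps>")
xml_order = [s.text for s in doc.findall("step")]  # findall returns document order

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steps (position INTEGER PRIMARY KEY, body TEXT)")

# Make the order explicit as a column, then insert the rows backwards to show
# that the result order comes from the stored data, not from insertion order.
rows = list(enumerate(xml_order, start=1))
conn.executemany("INSERT INTO steps VALUES (?, ?)", reversed(rows))

sql_order = [r[0] for r in conn.execute("SELECT body FROM steps ORDER BY position")]
print(xml_order)
print(sql_order)
```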

    I recommend anyone who has ever uttered the term "XML database" with a straight face to go back and learn some basic relational principles. I think you will agree that all data models are either 1) flawed and incomplete; or 2) reduce to the relational model.

    In CS we don't have a lot of formal models to guide us, as in engineering or other science. Much of CS is entirely ad-hoc. However we do have a sound and complete model for data storage (relational model) and hardly anyone uses it. It boggles my mind. Do people not *want* their programs to work predictably?

    1. Re:don't waste your time with XML by jo42 · · Score: 1

      Thank GNU for this small slice of clear thinking. Every time I hear a programmer babble away and toss XML around, I want to smack him. The most retarded use of XML I've seen so far is on mobile devices running Java crap. Not only are you limited by speed, you also have a limit on the size of your application. One wee program ran much faster when the geniass stopped using XML and just used a plain olde text file. Heck, he even had room to make it fully functional.

      "Use brain. Repeat." - me

    2. Re:don't waste your time with XML by Anonymous Coward · · Score: 0

      Big agreement. XML solves nothing except being a readable metadata format. I'd call it metadata rather than "real" data because it's really up to whoever's writing the app as to how XML is going to get used - and if they screw it up, fortunately, it's probably not going to be so hard to reverse-engineer the format later. But for something on the scale and complexity of a database, XML is a problem, not a solution.

  14. The Problem by SmurfButcher+Bob · · Score: 4, Funny

    ... is that XML is only half of the solution.

    For an XML database to really shine, it needs to be integrated with a TCP/IP filesystem. Once the physical data is stored using TCP/IP (as opposed to FAT or NTFS), the XML database really begins to take off because the data is already in a network format.

    I swear to god there was a Dilbert on this...

    --

    help me i've cloned myself and can't remember which one I am

    1. Re:The Problem by thhamm · · Score: 1

      TCP/IP filesystem? can i fsck that?

    2. Re:The Problem by SmurfButcher+Bob · · Score: 5, Funny

      Well, you really need to have a TCP/IP based File I/O for any performance with an XML database. Although technically, you would probably get better gains by switching to an HTML database. The HTML database would be better, anyway, because it'll run in any web browser, and it doesn't exactly care what filesystem is in use. That, and all these "data integrity" whiners can then use any CSS validator to check the validity of the data. That way, your HTML Programmers can write on whatever platform they wish, enabling a new paradigm for a pan-dimensional database structure to coexist and re-leverage new legacies before they are implemented, in a cost-efficient and transcendentally transparent manner.

      I found that Dilbert, btw! It was an E-Mail based database! Now if you'll please excuse me, I'll be over here, ducking under a table.

      --

      help me i've cloned myself and can't remember which one I am

    3. Re:The Problem by thhamm · · Score: 1

      now you are personally responsible for any mental illness i might get, by thinking about "performance" with XML. you insensitive clod! :)

    4. Re:The Problem by SmurfButcher+Bob · · Score: 1

      Oh, it gets worse.

      I'm proposing that we start a brand new paradigm - .ini-file based databases. If we can make a database revolving around XML, we can use those exact same arguments to make one that revolves around .ini files. And, the .ini file approach will be more extensible and robust, since it has strong legacy support.

      --

      help me i've cloned myself and can't remember which one I am

    5. Re:The Problem by Randolpho · · Score: 1

      I can't believe you got modded *Informative* for that. Too funny.

      --
      "Times have not become more violent. They have just become more televised."
      -Marilyn Manson
    6. Re:The Problem by thhamm · · Score: 1

      no way bob. the line has to be drawn somewhere.

      i'll stick to my good ol pencil'n'paper (tm).
      talking about data integrity >20yrs.

      or maybe in this case, >5yrs. if at all.

    7. Re:The Problem by thhamm · · Score: 1

      no kidding. maybe /. should use an 'xml database' too. fsck performance, its just darn cute. not.

    8. Re:The Problem by 0racle · · Score: 1

      I feel that 'impressive,' and 'made my head hurt,' would be better mods.

      --
      "I use a Mac because I'm just better than you are."
    9. Re:The Problem by hayriye · · Score: 1

      INI files are obsoleted by Windows Registry database!

    10. Re:The Problem by bani · · Score: 1

      I can't believe you got modded *Informative* for that.

      you must be new around here.

    11. Re:The Problem by LadyLucky · · Score: 1
      What an idiot.

      All you have to do is serialize your motherboard through an HTML port.

      --
      dominionrd.blogspot.com - Restaurants on
    12. Re:The Problem by SmurfButcher+Bob · · Score: 1

      Nope, what if you have PARALLEL PROCESSORS? Who's the idiot now?

      --

      help me i've cloned myself and can't remember which one I am

    13. Re:The Problem by Mignon · · Score: 1
      TCP/IP filesystem? can i fsck that?

      You don't have to; if you implement the TCP/IP filesystem, it will already be fscked.

  15. Uh, SQL? by droleary · · Score: 1

    How's about simply using a database dump? What's the point of introducing an XML parser when you already have an SQL parser at the ready?

  16. You're pre-optimizing... by Anonymous Coward · · Score: 1, Insightful

    The current solution exchanges data via XML, but the data itself is stored in a SQL Server database. There is a concern about the overhead involved with wrapping and unwrapping the XML to get the data in and out of a relational database.

    Premature optimization is the root of all evil.

    You say "you're concerned". That means you don't know.

    Why don't you find out?

    If you have a schema and some of your major transactions spec'd out, then do some performance testing and see where your bottleneck may be. For God's sake, don't guess. Know. Find out.
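
    One way to start finding out is to time the wrapping and unwrapping in isolation. This is only a rough sketch: the row shape, the <rows>/<row> element names and the helpers rows_to_xml and xml_to_rows are all invented here, and real numbers would have to come from your own schema, your own XML library and your own hardware.

```python
import timeit
import xml.etree.ElementTree as ET

def rows_to_xml(rows):
    """Wrap (id, name) tuples in a hypothetical <rows> envelope."""
    root = ET.Element("rows")
    for row_id, name in rows:
        el = ET.SubElement(root, "row", id=str(row_id))
        el.text = name
    return ET.tostring(root)

def xml_to_rows(payload):
    """Unwrap the envelope back into (id, name) tuples."""
    return [(int(el.get("id")), el.text) for el in ET.fromstring(payload)]

rows = [(i, f"name-{i}") for i in range(1000)]
payload = rows_to_xml(rows)
round_trip = xml_to_rows(payload)  # sanity check: nothing is lost on the way

runs = 20
wrap = timeit.timeit(lambda: rows_to_xml(rows), number=runs)
unwrap = timeit.timeit(lambda: xml_to_rows(payload), number=runs)
print(f"wrap:   {wrap / runs * 1e3:.2f} ms per 1000 rows")
print(f"unwrap: {unwrap / runs * 1e3:.2f} ms per 1000 rows")
```

    Compare those figures with the time the database query and the network transfer take for the same rows before deciding that the conversion is the bottleneck.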

    I will bet that you will find that neither solution is dramatically faster than the other, and that the benefits of a good RDBMS will make up for any delta in performance from a specialized DB.

  17. I'd pick an alternative by Zareste · · Score: 1

    This is just me, but I tend to write my own database programs to suit my own purposes. It's really not that extraordinary; track entries with overhead, define column-separation string (or put column overhead at the start of the entry), a few other boring details and you can have a whole table in a simple text file.

    Now I'm not an XML expert so this comes with a grain of salt, but I personally don't like the human-readable format because it's really not that hard to get, or code a program that'll read a normal database out for you.

    XML just happened to annoy me because of the overhead. Normally, you can track a file up to 16,777,216 bytes with three bytes of overhead per entry, and store data types (char string int) with a single byte. XML writes out "2" just to store a single number, not to mention the preceding spaces. This certainly doesn't do much for speed and file sizes.

    So this method will just use up a lot more CPU and bandwidth than necessary, even with the tricks servers may use to encode and decode entries. I tend to just stick with the other formats.
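
    To put rough numbers on the size argument, here is a sketch that stores the integer 2 both ways. The binary layout (3-byte length, 1-byte type tag, 4-byte payload) is an assumption chosen to resemble the scheme described above, not any standard format, and the <integer> element name is likewise just an example.

```python
import struct

value = 2

# Hypothetical binary entry: 3-byte big-endian length, 1-byte type tag, 4-byte integer payload.
TYPE_INT = 1
payload = struct.pack(">i", value)
length = len(payload).to_bytes(3, "big")  # three bytes can describe lengths below 2**24
binary_entry = length + bytes([TYPE_INT]) + payload

# The same value as an XML element, with no indentation or newline.
xml_entry = f"<integer>{value}</integer>".encode("ascii")

print(len(binary_entry), "bytes binary")
print(len(xml_entry), "bytes XML")
```

    The gap narrows once the text is compressed, so it is worth measuring on real data rather than on a single field.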

    --
    I am NOT a number! I am a - oh wait, I'm number 761710. Look! 761710!
    1. Re:I'd pick an alternative by Zareste · · Score: 1

      I'm sorry. When I said 'XML writes out "2"', I meant that it writes "<integer>2</integer>" but /. thought those were tags.

      --
      I am NOT a number! I am a - oh wait, I'm number 761710. Look! 761710!
  18. .NET XML - Relational Mapping by c0d3r · · Score: 0, Offtopic

    In .NET you go to the Server Explorer, choose your database and table from a tree control, and drag the tables directly to a canvas to design a dataset. You can then define relations. Then you drag a data adapter to a web page or Windows form, type the SQL, and visually associate it with the dataset. It's all done visually except for the SQL, and even that can be generated.

    1. Re:.NET XML - Relational Mapping by Anonymous Coward · · Score: 0

      Fascinating. Pity it has nothing to do with the issue at hand.

  19. Proverb by Anonymous Coward · · Score: 5, Funny
    I once had a problem.
    I thought: "Oh, I know: I'll just use XML!"

    Now I had two problems.
    1. Re:Proverb by Anonymous Coward · · Score: 0

      I like the proverb. What was it originally?

    2. Re:Proverb by TheLink · · Score: 1

      That's fine if the two problems are easier to solve.

      But an ex-colleague seemed to take months to solve a parser issue. The library used was barfing on some stuff, or something like that.

      I suppose everything is fine if the other end actually sends you proper XML instead of broken XML... My guess is my ex-colleague had to deal with a broken case.

      I'd have used Perl instead of Java (which my colleague was using).

      Aside: It seems more common for the perl module writers to actually use the stuff they write, and use them for real world cases. Whereas lots of Java stuff seems to be written by people just to meet a spec/requirement given by their project managers.

      --
    3. Re:Proverb by Anonymous Coward · · Score: 0

      It was for regexes, and said by Jamie Zawinski. Lisp people like Zawinski tend to use parsers where a perl person would use a regex and fuck up.

  20. Where will Tamino be in 5 years? by Omega1045 · · Score: 1
    Seriously, where will Tamino be in 5 years? Microsoft has a pretty farking good track record with SQL Server as a good DBMS (at least since SQL 97, IMHO) that they have continued to develop and support. MS has paid a lot of PhDs and programmers to come up with some fairly complicated stuff that works. And it is fast. And it has XML integrated via MS SQL Server, or you can write your own via ASP & IIS or C# (or any language for that matter). Getting XML into and out of SQL Server cannot be a whole lot more work than what SQL Server has to do internally to put its data over the network in its own tabular data format for consumption by clients.

    I currently work with Oracle, and all communication, server to server and server to client, is done in XML. The server XML and networking code is written in C and Tcl. Converting the requested data into XML is not really the slow part. The slow part is pushing very verbose data across a network. Tamino will have this same problem.

    If you are not happy with SQL Server for whatever reason, choose something else like Oracle, DB2, or friggin MySQL! They can all do XML. I know that Oracle and SQL server both have a lot of built-in XML support.

    --

    Great ideas often receive violent opposition from mediocre minds. - Albert Einstein

  21. Relational-friendly text alternative by Tablizer · · Score: 3, Interesting

    In case anybody is interested, here are some suggestions for making a more relational-friendly alternative to XML. Here is a wiki topic.

    Another potential problem is that existing RDBMS tend to be strong-typed. However, "dynamic relational" is not out of the question. Just because current RDBMS are strong-typed and have "static schemas" does not mean that is the only way to do it. There is a distinction between limits of implementations and limits of relational theory.

  22. XML,SQL,XML Query, Databases by Ankh · · Score: 4, Informative

    There seem to be a lot of confused comments on this, but hey, it's slashdot :-)

    If you mostly deal with the sort of data for which relational databases are generally optimised, you'll probably not be very interested in XML solutions, as they are solving problems you don't have.

    If you routinely get questions like "how often is part 1976 mentioned in the same repair procedure as part 2001?" or "which of our 150,000 documents have chapters containing five or more subsections any of which does not yet have a summary?" then the XML approach becomes more interesting.

    In my book on XML databases (1999 so I don't recommend going out and getting a copy today) I talked about using a hybrid system, with metadata picked out of XML whenever a changed version is stored (e.g. you might use a CVS commit script) and stored in a relational database.
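
    A minimal sketch of that hybrid idea, assuming invented markup (<procedure>, <step>, <pn>) and a hypothetical part_mentions table. The XML stays the master copy; a small indexing function, of the kind a commit script might call, pulls the part numbers out and keeps a relational side table up to date.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical documents; in practice these would be files under version control.
DOCS = {
    "proc-a.xml": "<procedure><step>Remove <pn>1976</pn> and fit <pn>2001</pn>.</step></procedure>",
    "proc-b.xml": "<procedure><step>Inspect <pn>1976</pn>.</step></procedure>",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part_mentions (doc TEXT, part_number TEXT, PRIMARY KEY (doc, part_number))")

def index_document(name, xml_text):
    """Re-extract the part numbers for one document whenever it changes."""
    root = ET.fromstring(xml_text)
    parts = {pn.text for pn in root.iter("pn")}
    conn.execute("DELETE FROM part_mentions WHERE doc = ?", (name,))
    conn.executemany(
        "INSERT INTO part_mentions VALUES (?, ?)", [(name, p) for p in parts])

for name, text in DOCS.items():
    index_document(name, text)

# Which documents mention both part 1976 and part 2001?
both = [r[0] for r in conn.execute(
    "SELECT a.doc FROM part_mentions a JOIN part_mentions b ON a.doc = b.doc "
    "WHERE a.part_number = '1976' AND b.part_number = '2001'")]
print(both)
```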

    With a relational database you have a lot of flexibility to change your queries but the data representation has to be static. Even changing the type of a column can be difficult in an RDBMS.

    Queries may be a little harder with the XML system, but the data storage is more flexible and you have native knowledge of sequence and hierarchy that are traditionally absent using SQL.

    More recent versions of SQL have added some XML support, understanding the different sorts of queries that people typically run against such very different sorts of data. There has been a lot of research over the past 30 or 40 years (hierarchical databases predate the relational model) on hierarchy, sequence and the sort of irregularity that RDBMS people call semistructured data and the rest of us call XML :-)

    XML Query is a query language designed to run over both relational and XML-native data sources (and others, for that matter) and to be optimized very efficiently, so that people like IBM (makers of DB2), Oracle, BEA, Software AG and others can have efficient implementations. There's also standards work on how to embed XML Query expressions in SQL.

    The public XML Query Web page is at www.w3.org/XML/Query and lists quite a large number of implementations. Software AG have participated in the XML Query development.

    You might like to look at the XML Query use case document and see how close the examples map to your own situation.

    Disclaimer: I work for the W3C, participate in the XML Query Working Group, and maintain the XML Query Web page. But it sounded like it's the sort of information you were looking for.

    I can't comment on the quality of Tamino, as I have not used it, but I will also note that if you stick to openly-defined standard query languages wherever you can, there's a good chance you could move to a different implementation if you needed to with relatively little cost. This is similar to SQL, of course.

    There was lots of hype around XML, but that doesn't mean it's all false, nor that it was all true. XML is a good way to interchange structured, hierarchical information, but it probably won't cure acne :-)

    Liam

    [slashdot::Ankh -- Liam Quin, W3C XML Activity Lead]

    --
    Live barefoot!
    free engravings/woodcuts
    1. Re:XML,SQL,XML Query, Databases by flockofseagulls · · Score: 2, Interesting
      Ankh wrote:
      If you mostly deal with the sort of data for which relational databases are generally optimised, you'll probably not be very interested in XML solutions, as they are solving problems you don't have.
      That sounds like it means something, but I don't think it does. The examples that follow, "how often is part 1976 mentioned in the same repair procedure as part 2001?" or "which of our 150,000 documents have chapters containing five or more subsections any of which does not yet have a summary?" seem more like full text search problems (something Google would be good at) more than problems XML would be good at. If the data is structured it can be stored in and queried from a relational database. If it's not structured simply searching the text will work better than marking it up as XML.
      With a relational database you have a lot of flexibility to change your queries but the data representation has to be static. Even changing the type of a column can be difficult in an RDBMS.
      Huh? What do you mean by data representation? The database schema? Or how the data is physically stored? In a relational database the schema can be changed at any time, and a correctly-designed schema can almost always be changed without affecting queries or existing application code. Changing the type isn't difficult at all, assuming the data in the column is compatible with the new type. Or do you mean that changing a column's type will affect application code? How does XML make that easier? By making everything a string? Please explain what you mean by RDBMSs requiring static data representation, or how changing the type of a column can be difficult. All of the relational DBMSs I've used (Oracle, SQL Server, Sybase, MySQL) support changing column types and schemas very easily.
    2. Re:XML,SQL,XML Query, Databases by Ankh · · Score: 1

      If I expand the example to find all occurrences of part 1976 (ignoring dates, of course).... it becomes clearer, sorry if I was too terse.

      As for data that fits well into the relational model and data that doesn't, consider trying to do precise queries on mixed content data, in which text and markup is interleaved. The most common approaches in the past to this were either to store the entire mixed content (e.g. a paragraph) as a single blob or long text column or to split it up into separate items.

      If you store a paragraph in a blob you're stuck, and have to retrieve the whole thing and do the query client-side.

      Stored procedures can help, of course, and that's where built-in XPath and XQuery engines start to shine.

      If you "shred" the paragraph into separate items... since it's a tuple of unpredictable length (a list) you end up using a column, with lots of yummy joins to put it all back together, but then you can ask questions like, "to which versions of the motherboard do these three words apply, the domestic version or the military ruggedized version?" for example.
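
      A rough sketch of what "shredding" mixed content can look like, assuming a made-up <step> element and a hypothetical (sequence, tag, text) row layout. It handles one level of nesting only, and the sequence number is what lets the pieces be put back together in order.

```python
import xml.etree.ElementTree as ET

def shred(elem):
    """Flatten one mixed-content element into (sequence, tag, text) rows.

    Text that sits directly inside the element is tagged "#text"; each child
    element becomes one row holding all of its text. One level deep only.
    """
    rows = []
    if elem.text:
        rows.append((len(rows), "#text", elem.text))
    for child in elem:
        rows.append((len(rows), child.tag, "".join(child.itertext())))
        if child.tail:
            rows.append((len(rows), "#text", child.tail))
    return rows

step = ET.fromstring("<step>Take one <pn>1984</pn> and fit it.</step>")
rows = shred(step)
for row in rows:
    print(row)

# Reassembly depends entirely on the sequence column.
rebuilt = "".join(text for _, _, text in sorted(rows))
print(rebuilt)
```

      Stored in a table, each of those rows would also need a key identifying the paragraph it came from, which is where the joins come in.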

      If it all sounds like full text stuff, that might be because documents have been sufficiently intractable for relational databases in the past that people have had to rely on doing vague searches based on which words appear in them.

      Even though I was the author of a text retrieval package a long time ago, it's clear that these systems don't meet people's real needs, just as CSV files don't remove the need for relational databases.

      At most businesses the majority of information that is used on a day-to-day basis is in memos, reports, electronic mail, letters to and from customers, quotes and other documents, as well as product documentation and specifications.

      Managing more of this data can be very productive, if done with care.

      Best,

      Liam

      --
      Live barefoot!
      free engravings/woodcuts
    3. Re:XML,SQL,XML Query, Databases by flockofseagulls · · Score: 1

      That explanation makes more sense, though you didn't defend your earlier comment about the alleged difficulty of changing a column type in an RDBMS.

      It's easy to demonstrate that any hierarchical or nested data (such as a document marked up with XML) can be stored in a relational schema: the relational model is a superset of the hierarchical and network models (with much better integrity). So the question isn't can an XML database do something an RDBMS can't do, but rather does it make sense to manipulate and query XML marked-up data directly in some circumstances.

      Clearly documents marked up with XML have useful applications independent of relational databases; the two technologies don't really overlap except in arguments like this one.

      I think RDBMS and XML markup can be complementary. In the example you gave, for example, I would think that storing the documents in XML has many benefits, but structuring some of the elements in an RDBMS for query purposes would add a lot of performance, and perhaps allow a level of integrity enforcement that XML does not. If part numbers and applications, for example, were stored in a relational schema with the document names they applied to, one could query the RDBMS (fast) to identify the specific XML marked-up documents one was interested in, for further querying or formatting.

      Whenever this XML database topic comes up it's framed as either RDBMS or XML. But relational databases solve different problems than XML markup, and the fact that some people have ignored relational technology and tried to re-invent it in terms of XML does not mean that XML is a data storage and retrieval technology. Likewise relational databases are not mark-up or full-text tools.

    4. Re:XML,SQL,XML Query, Databases by Unordained · · Score: 1

      If you routinely get questions like "how often is part 1976 mentioned in the same repair procedure as part 2001?" or "which of our 150,000 documents have chapters containing five or more subsections any of which does not yet have a summary?" then the XML approach becomes more interesting.

      select count(distinct A_repair_proc_parts.repair_proc_fk) from repair_proc_parts A_repair_proc_parts inner join repair_proc_parts B_repair_proc_parts on A_repair_proc_parts.part_number = '1976' and B_repair_proc_parts.part_number = '2001' and A_repair_proc_parts.repair_proc_fk = B_repair_proc_parts.repair_proc_fk;

      create view bob0 as
      select chapter_subsections.id, chapter_subsections.document_chapter_fk, count(subsection_summaries.id) summary_count from chapter_subsections left join subsection_summaries on subsection_summaries.chapter_subsection_fk = chapter_subsections.id group by chapter_subsections.id, chapter_subsections.document_chapter_fk;

      create view bob1 as
      select document_chapters.id, document_chapters.document_fk, min(bob0.summary_count) min_subsec_summary_count, count(distinct chapter_subsections.id) subsec_count from document_chapters left join bob0 on bob0.document_chapter_fk = document_chapters.id left join chapter_subsections on chapter_subsections.document_chapter_fk = document_chapters.id group by document_chapters.id, document_chapters.document_fk;

      select distinct documents.id from documents inner join bob1 on bob1.document_fk = documents.id and bob1.min_subsec_summary_count = 0 and bob1.subsec_count > 4

      I just woke up, so I expect some bugs, particularly as I didn't test any of it. But I still fail to see how this was supposed to be hard, and how XQuery really makes it any better. Sure, I could code the above as a C or perl or whatever program running through data (PLAN NATURAL), but why?

      The first one would be easier if RDBMSs came with a DIVIDE BY clause as suggested by Date/Darwen/Pascal. It's perfectly appropriate for that sort of query, and quite relational. The second is just a nasty request, and anyone who does SQL much is probably accustomed to that. Users come up with the darndest requests sometimes. Also note that I assumed certain relation layouts, perhaps I assumed incorrectly. (Documents, chapters, subsections, and summaries would, most likely, just all be in one table. At that point, see my table names as views which select from some single table, DOCUMENT_CHUNKS where, say, DOCUMENT_CHUNKS.TYPE = various values for each view.)
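
      Without a DIVIDE BY clause, one common workaround for "procedures that mention every part in this set" is GROUP BY with HAVING COUNT(DISTINCT ...). The sketch below runs it in SQLite through Python's sqlite3 module; it reuses the repair_proc_parts layout assumed above with made-up sample rows, and is one possible formulation rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repair_proc_parts (repair_proc_fk INTEGER, part_number TEXT)")
conn.executemany("INSERT INTO repair_proc_parts VALUES (?, ?)", [
    (1, "1976"), (1, "2001"),               # mentions both parts
    (2, "1976"),                            # mentions only one
    (3, "1976"), (3, "2001"), (3, "2001"),  # both, with a repeated mention
])

wanted = ["1976", "2001"]
placeholders = ",".join("?" for _ in wanted)

# Keep a procedure only if the number of distinct wanted parts it mentions
# equals the size of the wanted set.
procs = [r[0] for r in conn.execute(
    f"SELECT repair_proc_fk FROM repair_proc_parts "
    f"WHERE part_number IN ({placeholders}) "
    "GROUP BY repair_proc_fk "
    "HAVING COUNT(DISTINCT part_number) = ? "
    "ORDER BY repair_proc_fk",
    (*wanted, len(wanted)))]
print(procs)
```

      The same query works unchanged for three or more part numbers, which the self-join version does not.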

      Non-normalized data is the result of people not caring enough / being lazy / not getting the information they should have to do their job. It's not an incurable disease.

    5. Re:XML,SQL,XML Query, Databases by Ankh · · Score: 1

      On changing column types, although most RDBMSs let you do it, and in many cases do so without requiring you to delete and recreate the table, a conflicting type can be a problem. But I am not trying to attack relational databases-- I use them too :-) Rather, I'm trying to illustrate some differences in approach.

      The people working on XML Query are not ignoring the history of relational databases. Heck, the language is edited by the co-inventor of SQL itself, and the Working Group chairs have been involved with (and edited) the ISO SQL spec, SQL/MM, SQLX, the JDBC and others.

      We also have people who have been involved with document management systems, with XML and SGML documents and with markup theory, in some cases for more than 20 years (before SGML was formally published).

      You're right that if part numbers were fished out of the documents and stored in their own column you could query them quickly, and this is what I meant by suggesting extracting metadata from the documents and storing it in relational databases, for example using the Perl DBI or a Java SAX filter with JDBC.

      The neat thing about XML Query is that you don't have to do the extracting in the same way. You can get native RDBMS performance for XPath expressions, and also do a whole bunch of stuff like cross-database joins between relational and non-relational data sources that are generally the province of "middleware" products.

      It's not for everyone, but I think it's not entirely useless.

      Best,

      Liam

      And you're right that XML is not a data retrieval technology and SQL not a markup tool.

      --
      Live barefoot!
      free engravings/woodcuts
    6. Re:XML,SQL,XML Query, Databases by flockofseagulls · · Score: 1
      ankh wrote:
      On changing column types, although most RDBMSs let you do it, and in many cases do so without requiring you to delete and recreate the table, a conflicting type can be a problem.
      Any time the domain (type, range of allowable values) changes significantly, the database and the applications that use it may face complex changes. In a lot of cases -- changing a column from BYTE to INT, or INT to FLOAT, or even DATE to VARCHAR -- the RDBMS will convert the underlying data. In other cases the DBA has to create a new column, convert the data from the old column somehow, then delete the old column and change the name of the new one. Changes that are type-compatible or type-convertible are easy; changes that are not easily convertible are of course hard, but that's not a unique problem with relational databases. And simply making everything a string, as XML does, is not a great solution; it just pushes the problem of type enforcement into the application, and loses valuable metadata (the domain of the column) in the process. Of course there's nothing preventing someone making every column in their relational schema a big VARCHAR, though that prevents the RDBMS from doing any type validations.
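
      A sketch of the harder case, where the data has to be converted by hand. It uses SQLite through Python's sqlite3 module and a hypothetical readings table, and it rebuilds the table rather than adding and dropping a column, because older SQLite builds do not support dropping a column. Other engines offer their own ALTER syntax, so treat the statements as illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical starting point: dates stored as text.
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, taken_on TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)", [(1, "2004-03-01"), (2, "2004-03-15")])

# 1. Create the replacement with the new column type.
conn.execute("CREATE TABLE readings_new (id INTEGER PRIMARY KEY, taken_on INTEGER)")
# 2. Convert the old values explicitly (here: '2004-03-01' -> 20040301).
conn.execute(
    "INSERT INTO readings_new "
    "SELECT id, CAST(REPLACE(taken_on, '-', '') AS INTEGER) FROM readings")
# 3. Swap the new table in under the old name.
conn.execute("DROP TABLE readings")
conn.execute("ALTER TABLE readings_new RENAME TO readings")
conn.commit()

converted = conn.execute("SELECT id, taken_on FROM readings ORDER BY id").fetchall()
print(converted)
```

      The conversion rule in step 2 is the part no DBMS can guess, whatever the storage model.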
      The people working on XML Query are not ignoring the history of relational databases. Heck, the language is edited by the co-inventor of SQL itself, and the Working Group chairs have been involved with (and edited) the ISO SQL spec, SQL/MM, SQLX, the JDBC and others.
      I assume you are referring to Don Chamberlin. If you haven't seen it you may want to read this article from DBAzine.com:
      If You Liked SQL, You'll Love XQUERY.

      There are several more good articles on XML databases and relational databases in general at the Database Debunkings site.
    7. Re:XML,SQL,XML Query, Databases by Ankh · · Score: 1

      I think there's a big divide here. Imagine that you had 100,000 documents with multiple schemas, and that the schemas change over time, and that you're not in control of the schemas. Example: Air Transport Association specification for documentation (I was involved in a $40m document management project in that area once. A single repair manual is tens and sometimes hundreds of thousands of pages, and the average technician needs to consult nine of them for each repair, in a sequence mandated by FAA regulations).

      Your use of SQL here will probably (I didn't check the details) work fine, as long as you mine the information correctly out of the XML/SGML documents, and reassemble the document correctly when they need to be edited. I've used such a system in fact. It worked, but it was slow (e.g. 45 minutes to copy a document on a high-end SPARC system at the time) because of all the joins needed to reassemble all the parts of the document.

      Saying "just use normalised data" doesn't really cut it -- how do you normalise an HTML paragraph? What do you do when you need to look into phrase level markup ("<step>Take one <part><pn>1984</pn><desc>size twelve wiper bush</desc></part> and...") when you don't control the form of the XML data you receive? Yes, split it up into a (step, "Take one", ....) etc. list, and store that in the database, but the queries then start to get much more complex than the ones you've outlined.

      Well, I don't think I'm going to convince you :-) so I'll stop here.

      --
      Live barefoot!
      free engravings/woodcuts
    8. Re:XML,SQL,XML Query, Databases by Ankh · · Score: 1

      I was referring to Don, yes, amongst others. I'm not going to take the time to argue over the DBAzine article, though. There are things I don't like about SQL too, and things I don't like about XML, even though I had a part in XML's creation.

      XML Query is not defined on the string/text representation of XML, but over instances of a data model, which can be (for example) created by projections of relational data.

      Time will tell how much XQuery will catch on, but I think at this point we're not contributing very much to the original question about Tamino :-)

      Best,

      Liam

      --
      Live barefoot!
      free engravings/woodcuts
    9. Re:XML,SQL,XML Query, Databases by Unordained · · Score: 2, Interesting

      You're not likely to convince me to use a full XML database, no.

      However, we should consider the viability of storing what you and others describe as unstructured documents in blobs with server-side operations available to you. Just because you're going to have some XML values (that's what they are) in your database doesn't mean the whole thing needs to be XML, nor does it mean you should have to do all operations client-side because you're using a relational database. What it does mean is that if you're determined to have XML values, you should have XML functions that match them. Nothing about the relational model prevents you from having this sort of complexity available to you, most vendors have just been slow to provide tools. A lot more could (and should) be done in the area of functional indexing so you don't have to "take things apart" in order to index them, too. I shouldn't have to create a separate "words_used" table to do full-text indexing on an attribute. To be fair, the relational model also doesn't say you have to break things down into small fields; I think people often get confused about this. RDBMSs usually only come with basic datatypes defined (integer, text, date/time, etc.) but it's perfectly acceptable to have field types of "list of integer" or "set of text" or "mapping of text to pair of integer and string" (yes, I generally code in C++, so STL structures come to mind). Having a field type of "XML stuff" is also acceptable.
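
      As a rough sketch of "XML values with XML functions that match them", the following uses Python's bundled SQLite and registers one user-defined function so that a query can look inside an XML column. The table name `manual_step` and the function name `xml_first` are assumptions for illustration, and SQLite stands in here for whatever RDBMS you actually run.

```python
import sqlite3
import xml.etree.ElementTree as ET

def xml_first(doc, path):
    """Return the text of the first node matching an ElementTree path, or None."""
    node = ET.fromstring(doc).find(path)
    return None if node is None else node.text

conn = sqlite3.connect(":memory:")
# Register the Python function so that SQL statements can call it by name.
conn.create_function("xml_first", 2, xml_first)
conn.execute("CREATE TABLE manual_step (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO manual_step (body) VALUES (?)",
    [
        ("<step>Take one <part><pn>1984</pn></part></step>",),
        ("<step>Fit <part><pn>2001</pn></part></step>",),
    ],
)

# The XML stays in one column; the query reaches into it through the function.
hits = conn.execute(
    "SELECT id FROM manual_step WHERE xml_first(body, './/pn') = ?", ("1984",)
).fetchall()
print(hits)
```

      Without an index on the extracted value this parses every row, which is exactly why functional indexing matters.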

      The key element here is, however, the claim that you don't control the format of the data you're receiving. Yes, you can use XML-only tools on your documents because they're all known to be XML documents. But if you truly don't control the input, shouldn't you also have to deal with PDFs, TXTs, TIFFs, etc.? The point is that you do control your input, you have a baseline spec to deal with. In fact, you might have more: you might require all XML documents to have, more or less, the same structure. Do you? If you do, then that's an extra assumption you can use to your advantage. Every time you make such an assumption, you're working toward normalizing your data.

      Abstractly, it would be just as appropriate to require all documents (particularly in the case of repair manuals where there are obvious patterns) to be in a very specific format, relational even. Why not? You've got them all using XML, and that's not necessarily the easiest thing to deal with -- in fact its "model" (if you can call it that) is far from simple, with a lot of gotchas (difference between putting data in a tag's attribute vs. putting it between tags.) You can't query what you don't have; just because manuals are in XML format doesn't mean you can ask "how long will this procedure take" and have it calculate the sum(step.time_required) for you, unless you actually have the data normalized. And if you've got it normalized, then the argument for XML ("it's not normalized and can't be") falls apart. The only reason you can ask how often certain part numbers are mentioned together in the same procedure is, specifically, that you know how and where to find part numbers (not just numbers of any sort) in the repair procedure. To do so, you've got to have assumptions about your document. Assumptions lead to normalization in a rather straightforward manner.
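
      For illustration, once the steps are in a table the "how long will this procedure take" question is a one-line aggregate. The schema below (a `step` table with `procedure_id`, `seq` and `time_required` in minutes) and the sample figures are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE step ("
    " procedure_id INTEGER, seq INTEGER, time_required INTEGER)"
)
# Made-up sample data: three steps for procedure 1, one for procedure 2.
conn.executemany(
    "INSERT INTO step VALUES (?, ?, ?)",
    [(1, 1, 5), (1, 2, 20), (1, 3, 10), (2, 1, 45)],
)

total = conn.execute(
    "SELECT SUM(time_required) FROM step WHERE procedure_id = ?", (1,)
).fetchone()[0]
print(total)
```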

      But to be clear to those who might think us confused: XML is physical (file format), relational is logical (data model). Speed is physical, features are logical. I've seen fast SQL and I've seen dog-slow SQL, just as I've seen fast and slow sorting algorithms, memory-management algorithms, etc. Speed, in general, is easy to improve. What's not easy to improve is a feature-set when you've locked yourself in. And that's why the relational model is important: it's logically, mathematically proven. Its operations are well-defined.

      I'm not sure what we'd try to convince each other of at this point. Pretty much just talking past each other. As you say, convincing seems unlikely.

  23. relational good, but SQL SUX by Anonymous Coward · · Score: 0

    dude, stay with sql.

    I like relational, but as a relational language, SQL sucks. It is the COBOL of relational query languages. Let's move on and find a better relational language. Maybe if SQL didn't suck so much, then people would not be playing with so much XML. SQL gives people a legit reason to complain about existing relational systems. Let's remove that wart.

  24. I remember Software AG's presentation by Canthros · · Score: 1

    I sat in on one over a year ago. Thank heaven it was all technical people.

    We ruled it out because of expense, IIRC. Looked like a really nice product, if you had big bucks to spend on a document management system (which is what we were after). I did not get the impression that it was any sort of replacement for a proper RDBMS--speed was achieved, from what I remember, by storing the data hierarchically. And not abstracting any relational features on top of that.

    There was another product out of Canada called TextML that we looked at. Significantly less expensive, but at least as limited. And one company or the other had some sort of beef with certain portions of the XQuery rec that was in place at the time, so they refused to implement parts of it until it was done their way (ISTR it was full-text searching, but am not sure). I distinctly recall at least one very irritated reply from one of their folks about how they weren't going to implement it because it was a bad idea or something (time obscures details, sadly), which prompted some digging, from which it emerged that they had submitted an alternative proposal.

    Anyway, they looked like nice solutions to storing and accessing XML documents, but a terrible solution for storing or retrieving data in XML, because they were both significantly geared toward text searching and CMS-type stuff. In the end, we used neither. Our need was not so pressing, and our budget was.

    --
    Canthros
    1. Re:I remember Software AG's presentation by Ankh · · Score: 1

      The XQuery Working Group is indeed still working on text retrieval (and also on update facilities so that you can change the data).

      Full text was split off so that the work didn't add any more delays to the main specifications. There are implementations of drafts, but they should be considered very early.

      There are open source XML Query implementations too, of course.

      Liam

      --
      Live barefoot!
      free engravings/woodcuts
  25. OT -- sig by mooingyak · · Score: 1

    I almost suggested adding

    if ( 0 ) { printf("enough"); }

    But then I realized I was on the wrong verse. Can't think of anything for "I set it up".

    --
    William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
  26. Don't bother with an XML "database" by Randolpho · · Score: 3, Insightful

    There are two possible reasons you're using XML to transport your data from one database to the other.

    The first is that you just heard XML is a great way to transport data, and decided to use it.

    The second is that you're using the XML for more than just transporting data from one database to another; you're using it at some point with your application.

    In either case, the bottom line is that XML is not good for you. If your data fits in a relational database, you should USE RELATIONAL MEANS TO ACCESS YOUR DATA. Don't use that nifty new XML reader to access your data. It's not nearly as fast or flexible as basic SQL; it's actually much more trouble than it's worth.

    If you're just transporting data from one relational database server to another, use a flat file, or better yet raw SQL dumps. If you're accessing the data with an application, use SQL or the underlying API.

    The only reason you *ever* need to use an XML database is when your data doesn't fit into a standard relational schema. In fact, if you try to fit standard data into an XML database, you're much more likely to end up with a ton of overhead, both in storage and speed.

    Fortunately, non-relational data is extremely rare. So rare, in fact, that I've yet to see a non-contrived-proof-of-concept "real life" example.

    --
    "Times have not become more violent. They have just become more televised."
    -Marilyn Manson
    1. Re:Don't bother with an XML "database" by platypus · · Score: 1

      If you're just transporting data from one relational database server to another, use a flat file, or better yet raw SQL dumps.

      You are narrowing down the aspect of transporting data too much. Most data transport is not between db servers (with the same db schemas) but between application servers, which might use relational databases as their backend to store data - but in totally different schemas.

      So, you want to exchange data not between the database backends - whose db structure you might not even have under control, there are a lot of applications out there which totally abstract away the database and create/maintain their own structure in the database - but between the app servers, where the knowledge about the data semantics is and where you want to implement the mapping.
      And if you are looking for a sane way to exchange data between the applications, XML can be very handy, especially if you don't control all of the involved applications. That way you can very strictly specify the message structure with XML Schemas, and implementations in the different applications can be verified to a large extent without having to connect the applications, just by using the Schemas.

    2. Re:Don't bother with an XML "database" by Anonymous Coward · · Score: 0

      > Fortunately, non-relational data is extremely rare.
      What about Word documents? Or Excel spreadsheets? Or PowerPoints?
      If you define "non-relational data" as "non-normalized data stored in an RDBMS", you are probably correct. However, if you define "non-relational data" as "data not stored in a database", you are probably totally wrong.

  27. eXist XML Database by pajama · · Score: 1

    I have used eXist XML:DB.
    It supports XPath and XQuery, give it a try:

    http://www.exist-db.org/

  28. Combination by infohord · · Score: 1

    I have been playing for some time with the idea of "best of both worlds". With some applications, especially document oriented applications, there are benefits to working in XML directly. One of the largest drawbacks is search and query. I have played with the idea of storing most of the data in XML and using an RDBMS to index it. Often we can meet 90% (arbitrary) of searches with just a few well defined fields. I am exploring running each file through a SAX parser, or even simpler string regular expression scripts, on the way to the file system and extracting a few key fields to an RDBMS. Then I would execute my queries against a combination of the RDBMS and the file system. Anyways, in direct answer to your question, no I have not used that software.
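
    A minimal sketch of that idea in Python, using the standard `xml.sax` and `sqlite3` modules. The element names (`title`, `author`), the `doc_index` table and the sample path are assumptions made up for the example; a real setup would also write the file to disk, which is left out here.

```python
import sqlite3
import xml.sax

class KeyFieldHandler(xml.sax.ContentHandler):
    """Collect the text of a few chosen elements while streaming a document."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.fields = {}
        self._current = None
        self._buffer = []

    def startElement(self, name, attrs):
        if name in self.wanted:
            self._current = name
            self._buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so buffer until the end tag.
        if self._current is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self._current:
            self.fields[name] = "".join(self._buffer).strip()
            self._current = None

def index_document(conn, path, xml_bytes):
    """Extract the key fields and store them next to the file's path."""
    handler = KeyFieldHandler(["title", "author"])
    xml.sax.parseString(xml_bytes, handler)
    conn.execute(
        "INSERT INTO doc_index (path, title, author) VALUES (?, ?, ?)",
        (path, handler.fields.get("title"), handler.fields.get("author")),
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_index (path TEXT, title TEXT, author TEXT)")
index_document(
    conn,
    "docs/a.xml",
    b"<doc><title>Wiper bush</title><author>Ann</author><body>...</body></doc>",
)

# Search the small index, then open only the files it points at.
found = conn.execute(
    "SELECT path FROM doc_index WHERE author = ?", ("Ann",)
).fetchall()
print(found)
```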

  29. Probably a stupid question, but... by ArtStone · · Score: 3, Insightful

    When dealing with XML, you need a DTD that defines the data contained in the XML expression in order to parse the string into meaningful data structures (right?)

    When an "XML database" is changed, is the data prior to the change left in its old XML format pointing to the original DTD, or does it require conversion of all existing data? How can the data be accessed while that conversion is going on?

    How would the method of implementing a schema change be communicated to other places which have already archived copies of an old XML data entity? DTD only defines current state information - it doesn't communicate "If XYZ = 1 in DTD.v1 then set XYZ2 to "A" and set new field ABC to "foo" for DTD.v2". Each iteration of change would become increasingly more complex unless the data is converted.
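
    To illustrate, the conversion rule from the paragraph above has to live in code somewhere, because the DTD cannot carry it. Here is a small Python sketch of that one rule; the `<record>` wrapper and the `version` attribute are assumptions added for the example, and every later schema version would need another function like this one.

```python
import xml.etree.ElementTree as ET

def migrate_v1_to_v2(record):
    """Apply the example rule: XYZ = 1 in v1 becomes XYZ2 = "A" plus ABC = "foo"."""
    if record.get("version") != "1":
        return record  # not a v1 record; leave it untouched
    xyz = record.find("XYZ")
    if xyz is not None and xyz.text == "1":
        record.remove(xyz)
        ET.SubElement(record, "XYZ2").text = "A"
        ET.SubElement(record, "ABC").text = "foo"
    record.set("version", "2")
    return record

old = ET.fromstring('<record version="1"><XYZ>1</XYZ></record>')
new = migrate_v1_to_v2(old)
as_text = ET.tostring(new, encoding="unicode")
print(as_text)
```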

    This is not to say that the same issues don't exist with SQL or relational databases - but just abstracting the organization of the data doesn't mean that your problems are solved.

    Lately, I've been using MySQL - and the developers have some curious ideas about the "real world". Even the most trivial changes to the database schema require MySQL to copy and rebuild the entire table... like adding a new index or adding a new field at the end of the table. When tables start having millions of rows, that means this becomes a much less attractive product.

    The rationale for doing things this way had two reasons - first, it was the easiest way to implement schema changes. Second, "People should never be changing data schemas in a production environment".

    Oh, really? When did we regress to the idea that databases can go down overnight in order to back them up and to implement schema changes?

    --
    Final 2006 "Proof of Global Warming" US Hurricane Count -> 0
  30. half of one / six a dozen of another by Doc+Ruby · · Score: 2, Insightful

    The SQL DB doesn't store its data internally as "SQL". It's stored internally in some proprietary binary format. Which is optimized for the peculiar performance profile of that RDBMS. Relational DBs use different algorithms for working with their data, and the data is stored with either redundancy or precomputed values, depending on the unique algorithms. From which they derive their higher performance. SQL is just a high-level (more "human") language interface between programmers and the DB engine. Which was specified in such a way that it's not interchangeable across different DBs, partially because it does not specify a schema description which can be packaged with the data to be decoded with the context of that schema.

    XML is designed to package schema info with the data exchanged between DB instances. It's higher level, more verbose, and not optimized for data processing (except for the import/export). So you'd better be absolutely certain that your overall system performance is bottlenecked by your interchange processing performance, more than it will be bottlenecked by the "XML-native" DB processing XML data, which isn't optimized for performance.

    --
    make install -not war

  31. dude you must be a mediocre database programmer by Anonymous Coward · · Score: 0

    When words like caching, views and snapshots don't show up in your speech.

  32. Ananova was built on Tamino by munkinut · · Score: 2, Informative

    I worked on the http://www.ananova.com/ website, which was originally built on Tamino. Tamino couldn't handle the load and was a nightmare to admin at the time. Doubtless SoftwareAG will have fixed the lack of backup and restore tools by now. Not soon enough for us to migrate the whole thing onto Oracle shortly after release though.

    --
    re-invent wheels ... you never know
  33. Project 90% XML based by golgoth14 · · Score: 3, Informative
    I'm working on a project using a native XML database, Java JAXB and the Mozilla platform.
    Currently I'm using exist-db.org and it works fine, but I have some performance problems when I want to sort data.
    I have tested Ipedo, TextML, dbXML, XHive, ...
    TextML was the fastest, but it doesn't support XQuery.
    Ipedo was the fastest of those with XQuery support, although for an old version of XQuery. I think it's the best product because it provides an RDBMS bridge to query with SQL and XQuery, and some other features like XViews.
    My application uses XQuery/XSLT to read data and JAXB to check and execute business methods before storing.
    I think the main problems with native XML databases are performance and the lack of transaction support (there is only document locking).
    But, the advantages are:

    • Powerful query language XQuery
    • No code to modify data directly in the database.
    • Database export as XML documents, modify with "notepad" and re-import.
    • Easy and quick test data editing. Don't need to use SQL and Java to insert the test data.
    • Easy database deployment without DBA !!!
    • Power(full) Full Text searching
    I think native XML databases are best used when your application needs to manipulate a lot of text data,
    as in CRM, groupware, administrative applications, full-text and contextual searching, ...

    You should try at least one native XML database in your life, to compare it with RDBMS and object databases and form your own opinion.
  34. use RDF by Anonymous Coward · · Score: 0

    I've done some research for myself on xml databases. My conclusion is that they do not scale very well.
    Once your xml files get to be megabytes in size, the whole xpath tree query 'concept' doesn't work anymore. My best bet would be the xmldb from sleepycat (berkeley db).
    Because xml is tree structured it's very complex to work with efficiently. I would suggest using an RDF 'triplestore' DB. In RDF everything is a triple (a flat format). This means it's easier to manipulate, because your db is nothing more than a list of triples.
    Also while RDF is not a pure XML technology, you can serialize it to XML. It's a W3C recommendation, and there's plenty of software/frameworks around that do work and perform and are stable (I use redland myself http://librdf.org).
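
    To show what "nothing more than a list of triples" means, here is a toy Python sketch using plain tuples. It does not use Redland or any other real RDF library, and the identifiers (`part:1984`, `ex:uses` and so on) are invented; an actual triplestore adds indexes and a query language on top of the same shape.

```python
# Each fact is a (subject, predicate, object) triple; the store is a flat list.
triples = [
    ("part:1984", "rdf:type", "ex:Part"),
    ("part:1984", "ex:description", "size twelve wiper bush"),
    ("step:7", "ex:uses", "part:1984"),
]

def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Everything known about one part, and every step that uses it.
about_part = match(triples, s="part:1984")
users = [t[0] for t in match(triples, p="ex:uses", o="part:1984")]
print(about_part)
print(users)
```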

  35. Where's your bottleneck? by PizzaFace · · Score: 2, Insightful

    You say you are concerned about the overhead of wrapping and unwrapping XML, so you are considering using a database that keeps everything in XML all the time. I think you are trying to solve the wrong problem.

    Have you timed the job of wrapping/unwrapping XML? My guess is that on modern hardware, that task is trivial. Bandwidth is a more common bottleneck for XML data transfers, and that problem is usually mitigated by compressing the XML before transfer. But I never heard anyone complain about a CPU taking too much time to extract the data from XML.
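
    A quick way to check this on your own data is to time the parse and measure how well the XML compresses. The sketch below uses a synthetic document (10,000 made-up rows) and Python's standard library; the numbers it prints depend entirely on your hardware and your real documents, so treat it as a measuring template only.

```python
import time
import zlib
import xml.etree.ElementTree as ET

# Build a synthetic document; 10,000 rows is an arbitrary size for illustration.
root = ET.Element("rows")
for i in range(10_000):
    row = ET.SubElement(root, "row", id=str(i))
    ET.SubElement(row, "name").text = f"item {i}"
payload = ET.tostring(root)

# Time the "unwrapping" step: turning the serialized bytes back into a tree.
start = time.perf_counter()
parsed = ET.fromstring(payload)
parse_seconds = time.perf_counter() - start

# Compress the serialized form, as one would before sending it over the wire.
compressed = zlib.compress(payload)
ratio = len(compressed) / len(payload)

print(f"parse time: {parse_seconds:.4f} s")
print(f"size: {len(payload)} bytes, compressed to {ratio:.0%} of original")
```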

    If your application queries the data selectively, you will probably find that the difference in query-processing time, between a traditional SQL database and a native XML database, more than makes up for any difference in format-conversion time.

    Let your database use its own, efficient, optimized internal data formats. XML is much more suitable for data transfer than for data manipulation.

  36. If all you want is to store the XML data by Ulrich+Hobelmann · · Score: 0

    Why don't you just use files and a http/ftp/webdav or whatever server?

    Don't make things more complicated than you have to!

  37. Choose wisely by drsDobbs · · Score: 1

    I think xml databases, or xml extensions to relational databases, have their uses but they are (of course) not the ultimate answer to all our problems. In specific situations they certainly live up to the hype, in others not, but that's also the case for relational databases.

    I think it is vital to make a distinction between the actual storage and the way to access it, the 'interface'. With Oracle (since 9iR2) one can store the xml as a clob, as an XMLType, as object types that are based on your xml complex types, but also (traditionally) in a relational model. To access data one can use different strategies: the more or less traditional way using SQLXML (maybe as an xml view) to present the relational data as xml, or using XPath or XQuery. It is also possible to present your xml data in a relational manner. All the options have their pros and cons. But I want to stress it again: there is not a single answer, it depends on the situation. If it is only for storage, it may be a very viable option to store the xml as one big clob. For extensive searching it may be better to use the xsd based XMLType. But for retrieving relational data the use of an xml view may be the better solution.

    In my current project we use Oracle (9iR2) for centrally storing our xsd's and xslt's. Our applications can retrieve them via standard http calls and with a webdav client like XML Spy we can easily maintain them. Heck, in this case we are not even interested in how they are actually stored.

  38. Another "I agree" by Anonymous Coward · · Score: 1, Interesting

    Just as background - I work in Application Integration and spend much of my time dealing with moving data to and from RDBMS systems (Oracle, SQL Server, etc) and various external formats (EDI, flat file, XML, etc.)

    It sounds like you are in danger of changing the original data store (the relational database) in order to preserve a data transfer mechanism (XML). This is probably a bad idea.

    Why is the data in the database to begin with? Is it the database for some other business application? Probably - be careful not to adversely impact that other application by changing its database.

    Why is the data being moved as XML? Do you have control of both ends of the transfer or does some outside entity control the data format?

    Be careful not to solve the wrong problem. Is the _primary_ business requirement to move data between two business systems, or is it to store and transfer XML data? It almost always is the former, even with an externally driven need to use XML to transfer data. In which case: leave the database alone and deal with the problems integrating the data transfer (XML or otherwise) with the database.

    And the parent post is right - XML is a bit of a fetish right now. This leads to it getting mis-applied to a lot of problems.

  39. sql server 2005 supports xml as internal datatype by Anonymous Coward · · Score: 0

    sql server 2005 supports xml documents as an internal data type that can be selected, inserted, updated, deleted, and indexed. The indexes on xml include all the nodes / attributes of the document, in order. When you query attributes / nodes out of the xml document, they are returned in the order in which they occurred in the document.

    It's really VERY fast. It handles xml documents as fast as any other standard data type such as varchar, int etc.

  40. No by iamacat · · Score: 1

    Whatever else XML is, it's not a good representation of data for queries and transformation. Different pieces of data don't have a fixed parent-child or one-to-many relationship in real life tasks and so shouldn't be stored as a tree.

    Store your data in a well-researched normalized, relational form and format your query results as XML if you like. How is the progress on binary XML to avoid killing your network and CPU on your thin client?
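
    A minimal sketch of "store relationally, format the query results as XML", using Python's bundled SQLite and ElementTree. The `customer` and `orders` tables, their columns and the output element names are all invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 9.5), (11, 1, 20.0)]
)

def customer_as_xml(conn, customer_id):
    """Query the normalized tables and wrap the result as one XML document."""
    name = conn.execute(
        "SELECT name FROM customer WHERE id = ?", (customer_id,)
    ).fetchone()[0]
    root = ET.Element("customer", id=str(customer_id))
    ET.SubElement(root, "name").text = name
    orders = conn.execute(
        "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    for order_id, total in orders:
        ET.SubElement(root, "order", id=str(order_id), total=str(total))
    return ET.tostring(root, encoding="unicode")

xml_text = customer_as_xml(conn, 1)
print(xml_text)
```

    The tree shape is chosen at export time, per consumer, rather than being baked into the storage.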

  41. It's not only a "text file format" by scrutinizzzer · · Score: 1

    For instance, eXist DB (http://exist-db.org) serializes XML data as binary DOM tree. So, it doesn't need to parse text XML data all the time. One more thing is Cocoon (cocoon.apache.org) which converts XML to a pipe of SAX event processors.

  42. Misinformed Story by oliverthered · · Score: 1

    The current solution exchanges data via XML, but the data itself is stored in a SQL Server database. There is a concern about the overhead involved with wrapping and unwrapping the XML to get the data in and out of a relational database.

    1: you can always put the XML in a text field if you want to.

    2: I would be far more concerned about using raw XML: it's not RIFF, so you can't go to a single point without parsing the whole file; it's not indexed, so you have to search the whole file for the data you need; etc.

    3: If the data was in XML to start with you would probably need to use XSL to transform the data, which because of 2 is far slower than just wrapping the relational database data into XML.

    Basically you want a back-end that's good for searching and retrieval (e.g. a relational database) that can export data in a portable format, e.g. XML.

    A friend of mine used to have to process tons of GIS data sent in XML format. Processing the XML format was the slow bit; everything else (like dishing up datasets based on the data) was nice and fast. The company held several TB's of data, so I would say it represents the scalability of XML.

    BTW, I am all for XML and all against proprietary data formats that do the same thing but worse (e.g. /etc/fstab)

    --
    thank God the internet isn't a human right.
  43. Objectivity or Caché? by mosel-saar-ruwer · · Score: 1

    If your current relational database schema is either 1) small flat files or 2) a few big tables with most/all of the data stored in "blob" columns (i.e. blobs, clobs, byte arrays, or big varchars), you might be a candidate for an XML database. I'd get two experienced DBA's to agree there was no realistic way to normalize the data, first, but that's me.

    We're doing "scientific" computing, and we're finding that classical "SQL/RDBMSs" just don't cut the mustard:

    CLASSICAL RDBMSs: Essentially "ASCII" languages [SQL, VB, Java], with the idea of "typing" thrown in as an afterthought
    SCIENTIFIC: Needs very strong typing not found in ASCII languages like "SQL" [e.g. Intel/AMD 96-bit doubles, Altivec 128-bit doubles, Sparc 128-bit doubles, LabVIEW 128-bit timestamps, etc]

    CLASSICAL RDBMSs: Very 32-bit in nature. Examples: SQL blobs max out at 2^32 bytes, Java can't take 64-bit longs as array counters, etc.
    SCIENTIFIC: HUGE datasets; easily break the 2^32 barrier; 64-bit language support a must

    Classical SQL/RDBMSs are fine as long as your data is very ASCII-ish and very short [e.g. first name, last name, street address, zip code, phone number, SSN, etc]. But as soon as you start dealing with very large datasets containing strongly typed data, you're SOL with a classical RDBMS.

    Anyway, two systems we had been thinking about were Objectivity and Caché. Any thoughts?

    Obviously it would be nice if we could get the standard functionality you'd expect from a mature RDBMS package: Seamless backup to a failsafe server, seamless integration with a tape backup system like BackupExec or ArcServe, seamless integration with an industry standard authentication service like ActiveDirectory or Novell Directory Services, etc.

    Thanks for any advice you can offer!

    1. Re:Objectivity or Caché? by rossifer · · Score: 1

      Your requirements are outside of my experience. My first thought was that you could probably add the needed column types to PostgreSQL and Oracle without too much trouble, but you've still got other problems.

      Usually, you select an RDBMS over other means of data storage when one of the requirements is to allow future users to ask currently unknown questions quickly. With the data sizes you're talking about, I don't think that any common means of db optimization will allow for particularly fast queries and I suspect that that requirement isn't applicable to the raw data.

      When I did scientific computing at Fermilab in the early '90's, we left the raw data on tape and created a whole series of databases for analyzed results. We used a 64-bit physical/virtual addressing system to generate references back to the taped data blocks, but pretty much left the raw data alone after processing. At that point, 2.4TB was considered an unmanageable quantity of data, but we did pretty good...

      Back to your problems: I've never looked at the two systems you mention, but I would suggest rethinking your data management approach at the same time you are thinking about vendors. If you want blobs over 4GB, what you want is to store metadata in the RDBMS and the data block on disk as a file. The reason most RDBMS's set the blob limit to 32kB (not 32MB) by default is to discourage mass storage of untyped data in the database.
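
      A small sketch of the "metadata in the RDBMS, data block on disk" layout, in Python with SQLite. The `data_block` table, its columns and the file name are assumptions for illustration; the 1 KB payload stands in for a block that could be far larger than any blob limit.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path

def store_block(conn, data_dir, name, payload):
    """Write the payload to disk and record its path, size and hash in the db."""
    path = Path(data_dir) / name
    path.write_bytes(payload)
    conn.execute(
        "INSERT INTO data_block (path, size_bytes, sha256) VALUES (?, ?, ?)",
        (str(path), len(payload), hashlib.sha256(payload).hexdigest()),
    )
    return path

conn = sqlite3.connect(":memory:")
# SQLite integers are 64-bit, so size_bytes can describe files well past 4GB.
conn.execute(
    "CREATE TABLE data_block (path TEXT PRIMARY KEY, size_bytes INTEGER, sha256 TEXT)"
)

with tempfile.TemporaryDirectory() as data_dir:
    store_block(conn, data_dir, "run-001.bin", b"\x00" * 1024)
    # Query the metadata first, then open only the file it points at.
    path, size = conn.execute(
        "SELECT path, size_bytes FROM data_block"
    ).fetchone()
    on_disk = Path(path).read_bytes()

print(size, len(on_disk))
```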

      Sorry I couldn't help more.

      Regards,
      Ross

  44. file systems by mosel-saar-ruwer · · Score: 1

    Back to your problems: I've never looked at the two systems you mention, but I would suggest rethinking your data management approach at the same time you are thinking about vendors. If you want blobs over 4GB, what you want is to store metadata in the RDBMS and the data block on disk as a file. The reason most RDBMS's set the blob limit to 32kB (not 32MB) by default is to discourage mass storage of untyped data in the database.

    Right - I don't have any problem with a classical RDBMS serving as not much more than a front end to a file system [although, at that point, you're essentially dealing with little more than what M$FT has been touting as "WinFS" lo these many years].

    The problem is that, as far as I know, there is no standard [non-proprietary] way to store a "file pointer" in a classical RDBMS package. That gets back to the ASCII-ish nature of classical RDBMSs and their access languages [SQL and VB in particular]: They're great if all you have is very ASCII-ish [poorly typed] data [e.g. names, street addresses, zip codes, phone numbers, SSNs, etc.], but the minute you throw something at them that needs to be strongly typed [e.g. "what follows is a pointer to a file in the NTFS file system that lives on the D: drive" or "what follows is a pointer to a symbolic link to a file that was mounted from an NFS share on a server that resides across an FDDI line to a server farm in Lower Gondwanaland"], then you're just screwed.

    With the data sizes you're talking about, I don't think that any common means of db optimization will allow for particularly fast queries and I suspect that that requirement isn't applicable to the raw data. When I did scientific computing at Fermilab in the early '90's, we left the raw data on tape and created a whole series of databases for analyzed results. We used a 64-bit physical/virtual addressing system to generate references back to the taped data blocks, but pretty much left the raw data alone after processing.

    Nowadays, everybody is bumping up against the 2^32 barrier.

    For instance, the new standard in audio recording is 24-bit [i.e. 3 byte] samples at 192K samples per second. But then a single channel comes in at

    (3 bytes per sample) X (192,000 samples per second) X (60 seconds per minute) X (60 minutes per hour) = more than 2GB per hour
    And that's just one channel for one hour; if you're doing something like 5+1 [surround sound] audio, you're looking at more than 12GB per hour [and God forbid that your studio session should go ten or twelve hours at a time].
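
    The arithmetic above, spelled out as a short Python check. The constant names are mine; the figures are the ones given in the text, with "5+1" taken to mean six channels.

```python
BYTES_PER_SAMPLE = 3            # 24-bit samples
SAMPLES_PER_SECOND = 192_000
SECONDS_PER_HOUR = 60 * 60
SURROUND_CHANNELS = 6           # "5+1" surround

per_channel_hour = BYTES_PER_SAMPLE * SAMPLES_PER_SECOND * SECONDS_PER_HOUR
surround_hour = per_channel_hour * SURROUND_CHANNELS

print(per_channel_hour)          # bytes for one channel, one hour
print(surround_hour)             # bytes for six channels, one hour
print(surround_hour > 2**32)     # does one hour of surround pass the 2^32 mark?
```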

    Now SQL blobs max out at 4GB [2^32 bytes], and Java is an equally 32-bit language. For instance, nothing like the following will work in even the most recent versions of Java:

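    public class SixtyFourBit
    {
        public static void main (String [] theCommandLineArgument)
        {
            long theLong = 1;
            theLong <<= 32;
            theLong += 1; // 2^32 + 1, beyond any 32-bit count

            // Does not compile: Java array lengths must be int, not long.
            long [] theLongArray = new long[theLong];

            for(long i = 0; i < theLong; i++)
            {
                // Does not compile either: array indexes must be int.
                theLongArray[i] = i;
            }
        }
    }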

    So if you want to have any hope of getting at data that is inherently 64-bit in nature, then you need something like a very recent version of C++ [at least as recent as the 1998 standard], or C# itself.

    PS: You mention some time at Fermilab; I think they're using Objectivity at CERN, which is what caught my attention in the first place.

  45. Obligatory Windows Server 2003 commercial quote by game+kid · · Score: 1

    "XML what?"

    --
    You can hold down the "B" button for continuous firing.
  46. Two things by Anonymous Coward · · Score: 0

    1. You can download Tamino from this website to trial it: http://www.xmlstarterkit.com/
    2. There is support on this website: http://forums.tamino.com/ I recommend that you take a look at these pages, then download the trial version and try it for your needs. Maybe it will be just what you need, maybe not. Spend a few hours to find out!