Slashdot Mirror


XML and Transcoding - How Would You Do It?

morzel asks a doozy: "XML is one of those words everybody's talking about, yet no one really knows how to use it in specific applications or server technologies. At the Apache XML Project, some work is being done on integrating XML/XSL in the server itself, but personally I like IBM's idea of a transcoder in between a range of (XML) servers and a range of clients. But... how can it be done?" (More)

"Suppose you have to develop an on-line application, and you'd want to go with XML on the server side, and everyday browsers on the client side. Portable platforms like Palm and WAP-enabled phones will probably be a client platform that is being used frequently.
What tools -open source or commercial- are available to accomplish this?

The elements of the system are:

  • XML Enabled Database system: Data is retrieved by the transcoder using HTTP or your favorite protocol
  • Transcoding gateway: should translate the XML data using XSL (or another way) to a form readable by the client. The exact translation or the XSL to use can be set by the server (included in the XML source), or be detected by the gateway.
  • Browsers of all colours and kinds.
A typical usage of this system would be the publishing of an on-line application without having to bother with client troubles except for writing the XSLs. I do web development, and the amount of work that goes into making sure every platform works as it's supposed to is way too much in comparison to the functionality of the system. Especially when exotic clients like PDAs and WAP mobile phones are requested as client platforms (e.g.: a sales follow-up app), the burden of getting everything working and having a UI that does the job is a real nightmare...

XML is the wave of the future, that's for sure... But what tools are available to actually incorporate XML in a system that can do all things we poor webdesigners dream of?

All suggestions welcome! "
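The gateway step in the question (use the stylesheet the server names, otherwise detect one from the client) can be sketched roughly as follows. This is a minimal illustration, not any product's behaviour: the stylesheet file names, the `classify_client` heuristics, and the `server_hint` parameter are all invented for the example. The only real-world detail is `text/vnd.wap.wml`, the MIME type WAP clients advertise for WML.

```python
from typing import Optional

# Hypothetical mapping from client family to stylesheet; file names are invented.
STYLESHEETS = {
    "wap": "to-wml.xsl",
    "palm": "to-compact-html.xsl",
    "desktop": "to-html.xsl",
}


def classify_client(user_agent: str, accept: str = "") -> str:
    """Guess the client family from request headers (deliberately rough)."""
    ua = user_agent.lower()
    if "text/vnd.wap.wml" in accept.lower() or "wap" in ua:
        return "wap"
    if "palm" in ua:
        return "palm"
    return "desktop"


def pick_stylesheet(user_agent: str, accept: str = "",
                    server_hint: Optional[str] = None) -> str:
    """Prefer a stylesheet named by the server; otherwise detect at the gateway."""
    if server_hint:
        return server_hint
    return STYLESHEETS[classify_client(user_agent, accept)]
```

A real gateway would then run the chosen stylesheet through an XSLT processor. That step is left out here because Python's standard library does not include one.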

15 of 139 comments

  1. Standard formats needed... by pb · · Score: 3

    Lisp has been doing this stuff forever. Maybe it'd be a good idea to look into the formats that expert systems use to exchange data; I bet they're pretty generic.

    Of course, that won't happen, we'll all make our own stripped-down, human-readable versions, with big gaping flaws, until someone either standardizes it, or hides something nasty and binary with a GUI and dominates the market (*hint* I wonder who wants to use XML and "open standards"....) So let's try to come up with a real open format now, instead. :)
    ---
    pb Reply or e-mail; don't vaguely moderate.

  2. Mino XML parser by ChipX86 · · Score: 3

    Well, this is kind of a shameless plug, but I'm developing an XML parser at http://mino.portaldesign.net. It is LGPL. The library can be used in any program, and the parser that comes with it can be used for converting XML files to HTML on-the-fly.

    I'm working on XSL support (so people can easily say what XML tags should become in HTML), so that should be done in the (hopefully) near future. For now, feel free to download the latest alpha and play with it.

    In the near future, I plan to have support for databases, CSS, XSL (as mentioned above), and a few other XML-related technologies.

    People familiar with C/C++ should easily be able to write custom modules for converting from XML to HTML using the library by looking at the examples in xmlhandlers/. Anyone want to help develop this?

  3. On the browser by Gleef · · Score: 3

    Ideally, browsers should develop to the point where they understand XML as well as HTML and XSL as well as CSS. There has been significant effort to do this in the Mozilla browser, the XML/CSS combo works quite well, and the person developing an XSLT (XSL Transformations) engine for Mozilla is talking about having something useful around May. Similarly, Internet Explorer 5.0 has a base understanding of XML (styled with CSS), and surely plugins for decent XML/XSL encoding for IE are likely to appear soon after Netscape shows that it's a feature people demand.

    In the meantime, there are some Java Servlets out there to do the transformation on the server side. The server will grab the XML and XSL file, do transformations, and output HTML (or whatever format) to the client. I haven't played with them enough to recommend one as being particularly better, but there's some handy stuff out there.

    ----
    Open mind, insert foot.
  4. XML FAQ by jkorty · · Score: 4

    The XML FAQ is here.

  5. XML and XSLT are the way to go by __donald_ball__ · · Score: 3

    Hiya. I'm one of the authors on the cocoon project and I admit my biases upfront. I think, and many of you seem to agree, that the web publishing industry (more generally, the electronic information publishing industry) is in desperate need of a standard way of separating (and mixing) content and design. XML (a generic tree description language) and XSLT (a generic tree merging and transformation language) offer a very elegant way of accomplishing that goal. The cocoon project is currently focused mainly on two goals: creating (and implementing) a standard way to create XML fragments dynamically, and determining (and implementing) the best way to maintain a site back-ended by XML and XSLT. I encourage brave developers to come check it out - the basic stuff (XML+XSLT -> HTML) works very well, the more elaborate stuff (SQL,LDAP,POP3 -> XML+XSLT -> HTML) is coming along very well, and we're playing with a very interesting take on the whole *SP paradigm called XSP - I was personally highly skeptical at first but am beginning to see the light.

    As far as IBM's product goes - once you drill down into the technical details, it looks very much like cocoon. Interestingly enough, some of the closed source components that IBM's product relies on were donated a few months back to jump start the xml.apache.org site (namely, the XML4J parser and the Lotus XSLT processor). The main thing that IBM seems to be offering here is its 'transcoder' technology - which may be interesting and certainly bears investigation, but for my money, you're better off checking out (and having a voice in the development of) the open source apache projects.

  6. i'm workin' on it, dammit. by Uberdog · · Score: 3

    xml rocks. every piece of online information should be in xml. usability on the web is horrible right now. the fact that search engines and yahoo-style directories are the main entrances to the web is horrific. the fact that google can't find me a single page on gkrellm (a kick-ass system monitor for linux) pisses me off to no end when i'm bored with my current skin. with everything in xml the extraction of data would be much simpler and therefore the interfaces to the web would be much more effective.

    the current problem is that

    1. lots of people know what xml is, but don't really know what to do with it.
    2. the processing of xml data at this point is very intense. rendering an xml web page (or add in the scaling of images, too, and call it transcoding as ibm does) takes a lot of work on the server side and there's not currently a way for it to be rendered on the client-side (browsers don't support this yet).

    i'm working on a solution and need help...so it's actually pretty smooth that this article came out in /. at this point.

    in a huge blow to problems #1 and #2 above (as well as quite a few others), i am initiating the creation of Uberbia, the most open source of web sites. the backend is zope, which is a tres cool open source web application environment which can conveniently output its internal data as xml. what this allows is for information to be created in zope and stored in zope's native db format and served up as web pages (for instance) quickly, but then also output as xml. problem #2 solved. and when browsers can handle the xml...shove it out that way.

    zope also allows for information to be very easily created and shared. this is one of the main goals of Uberbia.

    the idea for Uberbia was born out of the fact that the Open Source community has been living in an environment of relatively closed content management on the internet. Sure, one could create a web page and post a HOWTO they just wrote. And then post a message to a relevant mailing list letting everyone know that resource is available. And then submit the HOWTO to the LDP and wait for it to be approved and posted on the LDP page. Uberbia will remove a lot of this hassle and allow the Open Source community to easily create and manage its content. and the data will go into an xml-aware application. problem #1 solved, at least for the Open Source community. well, okay...so i'm still workin' on it, but it'll get solved, dammit.

    on trying to figure out what i was talking about, Ethan (a friend and to-be-developer of Uberbia) wrote:

    sounds to me like you want to build an open-content information space. am I totally off-base? Bring "source" up to the next level of abstraction? Collaborative environments of information?

    yup. he gets it. but the possibilities that arise from having such a body of contributors and open content in xml are insane. for example, imagine turning on a "newbie" feature in Uberbia that automagically inserted links to the proper entry in the jargon file for every word that was defined there. not difficult with zope and the data in xml

    so, essentially i'm responding to this ask slashdot question by calling out for help with an open source project that wants to solve this problem and others. some work has been done, but there's a lot more to do. sourceforge is graciously both hosting the development of this and hosting the project itself. if you are interested at all in the development of something like this or have some really smooth-ass ideas, let me know or join the mailing list.

    i hope some of that made sense.

    word, Uberdog

  7. You're looking at the problem the wrong way by X · · Score: 3

    I think you're not looking at the problem the right way. Typical application development breaks things up into domains, or layers. These usually include a persistence domain (your database), a business logic domain, an application domain, and a presentation domain.

    XML really doesn't change any of the domains EXCEPT the presentation domain. You don't need an XML enabled DB, as you NEVER want to have the outside world talking directly to your DB. XML (combined with HTTP or whatever else) is one way of presenting your application. The various transforms that you would do using XSL are just "aspects" of the same presentation. So this doesn't completely change the way you build applications, just how you do your presentation.

    I've written more than a few apps that were available both as GUI applications and web servers. Both versions shared the same code base up until the last layer.
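As a rough sketch of "the same code base up until the last layer": the business function below returns plain data, and only the two small renderers know about output formats. The order data, element names, and function names are made up for illustration.

```python
from html import escape
from xml.etree import ElementTree as ET


def open_orders(customer):
    """Business logic layer: returns plain data, knows nothing about output."""
    # Hard-coded sample rows standing in for a real persistence layer.
    orders = [
        {"customer": "acme", "item": "widgets", "qty": 12},
        {"customer": "acme", "item": "gears", "qty": 3},
        {"customer": "globex", "item": "sprockets", "qty": 7},
    ]
    return [o for o in orders if o["customer"] == customer]


def render_xml(orders):
    """Presentation layer, XML flavour (element names are invented)."""
    root = ET.Element("orders")
    for o in orders:
        ET.SubElement(root, "order", item=o["item"], qty=str(o["qty"]))
    return ET.tostring(root, encoding="unicode")


def render_html(orders):
    """Presentation layer, HTML flavour, built from the same data."""
    rows = "".join(f"<li>{escape(o['item'])}: {o['qty']}</li>" for o in orders)
    return f"<ul>{rows}</ul>"
```

Adding a WML or GUI front end would mean adding another renderer, with `open_orders` left untouched.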

    As far as what you need to do an XML system, I think it's a lot like an existing HTML system. An HTML system needs a database server, an app server, and a web server. The web server is normally scripting enabled so you can do handy transforms with the raw data.

    With XML, it's basically the same concept, except your "XML server" needs to be using XSL to script transforms of the XML data. What we currently don't have is a very good way of doing this. Ideally you'd actually want the CLIENT to do the transforms as the XML data is usually much terser than whatever the XSL will generate. However, nobody trusts the clients to do this, so you might as well go with the XSL engine on the server.

    --
    sigs are a waste of space
  8. Some examples... by evlist · · Score: 3
    Hi,
    <quote>
    But what tools are available to actually incorporate XML in a system that can do all things we poor webdesigners dream of?
    </quote>

    There are many tools available to build such a system.

    To mention only Open Source projects, I could suggest using Apache JSERV with Apache Cocoon as a framework, Castor or Quick to bind XML data to Java objects, and an OODBMS like ozone or an RDBMS like PostgreSQL.

    These are my favorites ;)

    They are very powerful and highly flexible, but the price to pay is that they are rather complex to use, that you need time to get up to speed with them, and that you lose focus on the core techniques behind them.

    To try to get a good understanding of these core techniques, I have set up some simple examples showing how one can bind XML documents into Java objects, store these objects in an OODBMS, and use them in an XSLT sheet, either in standalone mode or as a servlet.

    These examples are available on our web at http://downloads.dyomedea.com/java/ and a mailing list has been created to exchange and discuss such basic tips.

    Hope this helps.

    Eric van der Vlist

  9. Re:Beware XSL by anthonyclark · · Score: 3

    Looking at any non-trivial XSL stylesheets, you can see what a generally bad idea it is. My advice would be to use a real programming language with DOM bindings.

    I wouldn't write off XSL on the strength of that article at xml.com...

    When I first looked at XSL some months ago, I thought that it would be a messy and difficult language. I was wrong. XSL, IMHO, is the right solution for translating XML into pretty much anything. Yes, it does have a steep initial learning curve (much like our favourite OS :-) but once that is out of the way, you understand why the language is so useful. Why does it look so unwieldy? Because it's a "dialect" of XML. (Which I think is a good thing - it shows how flexible XML is) Typical XSL is as simple as saying "if you encounter this XML element, do this with it." Editing XSL text is really quite easy with the correct syntax highlighting. (TextPad is a good editor under windows)
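The "if you encounter this XML element, do this with it" idea can be imitated in a few lines. To be clear, this is not XSLT and does not use an XSLT processor. It is a Python stand-in with one rule per element name, and the element names (`article`, `title`, `para`, `em`) are invented for the example.

```python
from html import escape
from xml.etree import ElementTree as ET

# One "template" per element name: given the element and its already
# transformed content, return the output for that element.
RULES = {
    "article": lambda el, body: f"<html><body>{body}</body></html>",
    "title": lambda el, body: f"<h1>{body}</h1>",
    "para": lambda el, body: f"<p>{body}</p>",
    "em": lambda el, body: f"<i>{body}</i>",
}


def transform(el):
    """Transform children first, then apply the rule matching this element."""
    parts = [escape(el.text or "")]
    for child in el:
        parts.append(transform(child))
        parts.append(escape(child.tail or ""))
    body = "".join(parts)
    rule = RULES.get(el.tag)
    # Elements without a rule just pass their transformed content through.
    return rule(el, body) if rule else body


doc = ET.fromstring(
    "<article><title>Hi</title><para>XSL is <em>fine</em>.</para></article>"
)
print(transform(doc))
```

Real XSLT adds pattern matching on paths, modes, and much more, but the shape of a stylesheet is the same: a set of rules keyed on what gets matched.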

    As for non-trivial XSL stylesheets? On our project, we have written XSL to transform our XML data into binary outputs. The stylesheets used ran into tens of thousands of lines! I think that qualifies for non-trivial in anyone's book. I admit that the XSL is difficult to read, but show me any source that is easy to read when >10k lines...

    XSL as a complete solution? No. Even in a relatively simple XML to HTML documentation tool I wrote, I called the XSL from a JavaScript app that handled things like file access and other helper functions. This was under Win2k, using the built in script engine to call the XSL via COM. (yes, even MS gets things right sometimes) The point is that XSL is better for transforming XML than trying to use a DOM-manipulating language binding...

    On another note, why does everyone assume that XML is solely for exchanging data on the web/net? I've used it for documentation, log files, test cases, application persistence and application exchange formats. It's a lot more useful and flexible than people think.

    --
    ----- Documentation is worth it just to be able to answer all your mail with 'RTFM' - Alan Cox.
  10. XML Script by rjb · · Score: 3

    You might like to check out this page. One of the things they have is an interpreter (X-Tract) that reads a template (written in XML!) and performs pretty much arbitrary transformations on XML input data based on this template. Looks pretty cool and simple to use. X-Tract is free for download. Funny I didn't find any info on license terms though.

    I tried doing some very simple stuff with the Linux version, and the only complaints I have are:

    • fetching the input data via HTTP doesn't seem to work (as it should according to the docs)
    • when I tried calling it from a CGI it freaked out, seems that env variables override explicit XML Script commands in the template -- not what one would expect. Fixed it by clearing the environment
    • the docs, though pretty exhaustive, are not very reader-friendly (to me)
  11. XML and MetaHTML by hqm · · Score: 3

    You should take a look at MetaHTML (www.metahtml.com), a sort of macro-like programming language designed to emit HTML (it was developed before XML was invented). It was developed by Brian Fox and myself when we had a company called Universal Access (ua.com). MetaHTML is superior in some ways to XSL, because it is more of a general-purpose programming language, yet its evaluator does a lot of the work of parsing XML syntax expressions. We used to use it to do many XML-ish things, such as generating the MetaHTML documentation automatically from a structured representation in the database.

    MetaHTML has also been under GNU public license since about 1996.

  12. Re:Grr.. by PigleT · · Score: 3

    Well that's unfortunate. A very quick trip straight to the Web Consortium shows their pages on XML straight up, complete with links to the XML FAQ and of course, just what you always wanted, the XML 1.0 Spec. If that's not an adequate definition, read the source for your favourite parser!

    --
    ~Tim
    --
    .|` Clouds cross the black moonlight,
    Rushing on down to the circle of the turn
  13. We already do this. Our website is live. by evilandi · · Score: 4

    I work for AssureSoft whose AssureWeb website is live (work out the URL for yourself, it's not obscure but we don't want to be slashdotted). The site provides financial information to subscribers. You have to have a username and password to get the full range of services- we dole out passwords free to British independent financial advisors.

    Our first XML-based service is a quotations system which allows users to get a quote for a pension or mortgage from a wide range of companies in real time (typically 5-20 secs).

    Why we needed XML

    Our problem was that each company had a slightly different way of asking for customer details. We decided to create an XML data type definition, now adopted as industry standard by UK financial standards body Origo. This standard means that we can present pretty much the same input form, with a few optional extras, for any financial product.

    The main use of XML is in passing the input data from our web server to the companies' quotes servers.

    Layer 1: Client Browser
    Layer 2: AssureWeb server
    Layer 3: Company Quotes server

    The XML goes back and forth between layers 2 and 3. We compile standard CGI GET/POST client requests into XML on the webserver and fire them at the quotes server. The quotes server fires back a response as XML again, and we parse this and present it to the client as a standard HTML web page. There is no XML on the client side.
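The round trip between layers 2 and 3 might look something like the sketch below. The element names (`quoteRequest`, `provider`, `monthlyPremium`) are placeholders invented for this example and are not taken from the Origo standard. The sketch also assumes every form field name is already a valid XML element name.

```python
from html import escape
from urllib.parse import parse_qs
from xml.etree import ElementTree as ET


def form_to_xml(query_string):
    """Layer 2, outbound: turn CGI-style form data into an XML request."""
    fields = parse_qs(query_string)
    root = ET.Element("quoteRequest")  # hypothetical root element
    for name, values in sorted(fields.items()):
        # Assumes field names are valid XML names; real code must check.
        ET.SubElement(root, name).text = values[0]
    return ET.tostring(root, encoding="unicode")


def xml_to_html(response_xml):
    """Layer 2, inbound: parse the XML reply and present it as plain HTML."""
    root = ET.fromstring(response_xml)
    provider = escape(root.findtext("provider", default="unknown"))
    premium = escape(root.findtext("monthlyPremium", default="n/a"))
    return f"<p>{provider} quotes {premium} per month.</p>"
```

The browser only ever sees the output of `xml_to_html`, which matches the point above that there is no XML on the client side.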

    Provided the company quotes server conforms to our XML standard, we can use that server for quotes. Adding new products or companies becomes a lot easier- typically we can go from scratch to beta with a new product within days. Previously it would have taken many months to write and test each individual product. XML allows us to re-use both code and input/output standards to a level never seen before.

    Our next step will be a comparative quotes service. Users will be able to enter one set of data, and fire it at multiple companies. They will then get back multiple quotations, from which they can select the best based on their criteria. Effectively we will be having multiple concurrent layer 3 transactions.
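One way the planned fan-out could work, sketched with stand-in servers instead of real network calls. `fake_quote_server` and the premium figures are invented, and sorting by premium is just one assumed selection criterion.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_quote_server(company, premium):
    """Build a stand-in for one company's quotes server (no network involved)."""
    def get_quote(request_xml):
        return {"company": company, "premium": premium}
    return get_quote


def compare_quotes(request_xml, servers):
    """Send the same request to every server concurrently; cheapest first."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        quotes = list(pool.map(lambda server: server(request_xml), servers))
    return sorted(quotes, key=lambda q: q["premium"])
```

Because every server accepts the same request document, the fan-out needs no per-company code. That is the payoff of agreeing on one XML format.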

    --
    Andrew Oakley - www.aoakley.com
  14. A small warning... by tgd · · Score: 3

    A small warning for those thinking about moving down the XML/XSL route who haven't done any testing on it:

    It's slow. VERY slow.

    Most XSL implementations have significant performance and scalability issues as compared to more common custom technology for producing dynamic web pages.

    There's no argument that it's a better technology, but I've known several commercial web sites that have spent considerable resources developing XML/XSL implementations, only to roll back the technology when they discovered they needed four or five times the number of servers to be able to use it.

    Anyone know of any top-tier sites that are actually using the technology?

  15. Why? by Matts · · Score: 4

    Mind if I ask why you're doing this? XML parsers are off-the-shelf free commodity tools now.

    Spend your time working with those tools (XML4C, expat, rxp to name a few) to create higher level tools. Don't re-implement an XML parser - I can guarantee you it will be full of obscure bugs where you didn't understand the spec, didn't understand how to cope with character encodings, or just did something wrong. This stuff, despite the XML spec suggesting that a graduate could write a parser in a matter of weeks, is hard, and experienced people (such as James Clark) have put out excellent products for all to use under non-restrictive licences. There's even an LGPL parser already out there called libxml (ships with gnome).
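To illustrate building on an existing parser instead of writing one: Python's standard library ships a binding to expat as `xml.parsers.expat`. The two helper functions below are invented for this sketch. The parser does the hard work, including rejecting input that is not well-formed.

```python
import xml.parsers.expat


def element_names(xml_text):
    """Collect element names in document order, letting expat do the parsing."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return names


def is_well_formed(xml_text):
    """Return False if expat rejects the document."""
    try:
        element_names(xml_text)
        return True
    except xml.parsers.expat.ExpatError:
        return False
```

For example, `element_names("<a><b/><c>x</c></a>")` yields the names `a`, `b`, `c`, while `is_well_formed("<a><b></a>")` is false because of the mismatched tag.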

    If you don't believe you'll create a broken parser, see the recent XML conformance tests on XML.com.

    I'd also love to see you move from a non-working XML parser to something supporting XSL "in the near future". I appreciate your enthusiasm, but the XPath spec has some tough little nuts to crack (I know - I'm cracking them right now) and then implementing XSLT from an 80-odd page spec - wow - good luck to you!

    (I'm not trying to poo-poo your project, but so many people start working on stuff that's already being worked on in the open-source community that it's just wasted effort).

    --

    Matt. Want XML + Apache + Stylesheets? Get AxKit.