Pretty Printing From An XML File?

Posted by timothy on Thursday October 14, 2004 @10:45AM from the xml-is-be-all-end-all-except dept.

Omega1045 writes "Where I work we are developing a new product that receives an XML document (on a W2k workstation), and we need to format and print said document. We are currently using XSLT + CSS to build a cool little HTML page out of the XML, then use a browser to print out the HTML. However, while HTML is a nice format for display, it is not a nice format for printing. We have messed around with the idea of spitting out Rich Text with XSLT. However, Rich Text is confusing and quite frankly sucks. We are looking for a (free if possible) format that we can translate our XML document into via XSLT, and print. The best idea we have at this point is to translate into a Word or OpenOffice XML schema document, and use one of those applications to print. Other ideas?"

13 of 65 comments

Postscript? by Hanji · 2004-10-14 10:48 · Score: 3, Interesting

I'm not actually familiar with the details of PostScript at all, but it certainly seems a logical format to consider if printing things is your concern.

--
A Minesweeper clone that doesn't suck
FOP by pi_rules · 2004-10-14 10:50 · Score: 4, Informative

Apache FOP Homepage

Very powerful if you ask me. I used it on a project back in 2000-2001 and was pleased with how it turned out at the time. I'm sure the current product is much, much better than it was back then.
Try PDF by Mastos · 2004-10-14 10:50 · Score: 4, Informative

I have a similar problem that I solve through the use of XSLT and XSL-FO. Use XSLT to transform the XML into XSL-FO. Then, use Apache FOP to render the XSL-FO into PDF.
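
To give an idea of what the XSL-FO step has to produce, here is a minimal sketch. It is an assumption-laden illustration, not the poster's stylesheet: it builds the FO tree directly in Python instead of with XSLT, and the input vocabulary (`document`, `title`, `item`) and the function name `document_to_fo` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard XSL-FO namespace; registering the prefix keeps the output readable.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def document_to_fo(xml_text):
    """Turn a hypothetical <document><title/><item/>...</document> into XSL-FO."""
    src = ET.fromstring(xml_text)

    root = ET.Element(fo("root"))

    # One A4 page layout with 2cm margins and a single body region.
    layout = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(layout, fo("simple-page-master"), {
        "master-name": "A4",
        "page-height": "29.7cm",
        "page-width": "21cm",
        "margin-top": "2cm",
        "margin-bottom": "2cm",
        "margin-left": "2cm",
        "margin-right": "2cm",
    })
    ET.SubElement(page, fo("region-body"))

    # All content flows into the body region of that page layout.
    seq = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})

    title = ET.SubElement(flow, fo("block"),
                          {"font-size": "16pt", "font-weight": "bold"})
    title.text = src.findtext("title", default="")

    # One fo:block per <item> in the source document.
    for item in src.iter("item"):
        block = ET.SubElement(flow, fo("block"))
        block.text = item.text

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = ("<document><title>Order 42</title>"
              "<item>Widget</item><item>Gadget</item></document>")
    print(document_to_fo(sample))
```

Saved to a file such as `out.fo`, the result could then be rendered with something like `fop -fo out.fo -pdf out.pdf`, assuming FOP's launcher script is on the PATH.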

Another variation is to transform your XML into an HTML subset, then use a standard XSLT to transform the HTML into XSL-FO. A similar technique is used by Aurigadoc to create all sorts of output formats using an XML source.
XSLT-FO by JumpSuit+Boy · 2004-10-14 10:54 · Score: 4, Informative

http://xml.apache.org/fop/
http://www.cranesoftwrights.com/training/ has a book about how to do this that was created using XSLT-FO

There are also payware implementations of XSLT-FO. Basically it is the extension to XSLT for print.

--
Oh really?
Consider YesLogic Prince by PornMaster · 2004-10-14 10:54 · Score: 3, Informative

Prince is a batch formatter for converting XML into PDF and PostScript by applying Cascading Style Sheets (CSS). Unlike other formatters, Prince prints any XML vocabulary without relying on proprietary markup

--
500GB of disk, 5TB of transfer, $5.95/mo
LaTeX by Asgard · 2004-10-14 10:59 · Score: 4, Insightful

Generate a LaTeX document file, compile it using pdfLaTeX and print. Or, use normal LaTeX and print directly from it, depending on whether .dvi files offend you.
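
A rough sketch of the "generate a LaTeX file" step follows. The catch with this route is escaping: characters such as `&`, `%` and `_` are special in LaTeX and must be escaped before XML text is dropped into the document. The input vocabulary (`document`, `title`, `item`) and the function names are hypothetical, and the script uses plain Python rather than XSLT.

```python
import xml.etree.ElementTree as ET

# Characters that are special in LaTeX text mode and their escaped forms.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape LaTeX specials one character at a time (avoids double escaping)."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def document_to_latex(xml_text):
    """Render a hypothetical <document><title/><item/>...</document> as LaTeX."""
    src = ET.fromstring(xml_text)
    lines = [r"\documentclass{article}", r"\begin{document}"]

    title = src.findtext("title", default="")
    lines.append(r"\section*{" + latex_escape(title) + "}")

    # An itemize environment without any \item is an error, so emit it
    # only when the source actually contains items.
    items = [latex_escape(item.text or "") for item in src.iter("item")]
    if items:
        lines.append(r"\begin{itemize}")
        lines.extend(r"\item " + text for text in items)
        lines.append(r"\end{itemize}")

    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = ("<document><title>Q3 &amp; Q4</title>"
              "<item>50% off</item><item>part_no 7</item></document>")
    print(document_to_latex(sample))
```

Written to `out.tex`, the output can be compiled with `pdflatex out.tex`.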
Docbook by ptaff · 2004-10-14 11:13 · Score: 3, Informative

Another XML-based format is DocBook, which originally was SGML-based but now has an XML DTD too. From this format you can output to ps, pdf, rtf and plenty of other formats.

You could also hack one of the DocBook XSL stylesheets (using XSLT? would be pretty!) to make it parse your own format.

Feel ready to own one or many Tux Stickers?
XMLPDF by WasterDave · 2004-10-14 11:32 · Score: 3, Informative

Never quite sure what the hell it does myself, but a few people here swear by it:

http://www.xmlpdf.com/

Cheers,
Dave

--
I write a blog now, you should be afraid.
You're almost there... by BladeMelbourne · 2004-10-14 11:35 · Score: 4, Insightful

Having been in the same situation before, this is what I suggest...

Take the XML and the XSL and transform it into 100% valid XHTML. HTML 4 is deprecated, the standard will not be updated. XHTML 1.0 is 5 years old already - start to use it.

Use CSS - pay attention to
@media screen,print
{ /*Styles for browser and printer*/
}
@media screen
{ /*Styles for browser only*/
}
@media print
{ /*Styles for printer only*/
}

If it doesn't print well, you probably need to refresh your CSS here: http://www.w3.org/style

Good luck.
Listen to me ;-) by cookiepus · 2004-10-14 12:45 · Score: 3, Interesting

I've had to do just this, actually... here's the setup. Don't ask me why certain things were the way they were, certainly you can improve. I inherited some of this. But it worked...

First, we had a bunch of product data in a MS SQL server db. We had a Java (I think) task that nightly dumped an XML file (one per product) based on the DB.

Then, we applied an XSLT transformation to each XML to produce the static HTML page for that day (static both to reduce server load and optimize Google's searching of it, since Google didn't/doesn't like dynamic content)

Then we wanted to produce a printed catalogue, so rather than printing pages, I made an XSLT that transformed the XML not into HTML but into XSL-FO. FOP is some Java shit from Apache that takes XSL-FO files and spits out a PDF.
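
A nightly batch like this could be driven by a small script along the following lines. This is only a sketch under assumptions: the directory layout, the stylesheet name `catalogue.xsl` and the function names are hypothetical, and it relies on FOP's command-line launcher accepting `-xml`, `-xsl` and `-pdf`, which runs the XSLT and renders the PDF in one step.

```python
from pathlib import Path


def fop_command(xml_path, stylesheet, out_dir):
    """Build the FOP command line that turns one product XML file into a PDF."""
    xml_path = Path(xml_path)
    pdf_path = Path(out_dir) / (xml_path.stem + ".pdf")
    return [
        "fop",
        "-xml", str(xml_path),    # source document (one per product)
        "-xsl", str(stylesheet),  # XSLT that produces XSL-FO
        "-pdf", str(pdf_path),    # rendered output
    ]


def nightly_commands(xml_dir, stylesheet, out_dir):
    """One FOP command per *.xml file in xml_dir, in a stable order."""
    return [
        fop_command(path, stylesheet, out_dir)
        for path in sorted(Path(xml_dir).glob("*.xml"))
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in nightly_commands("products", "catalogue.xsl", "pdf"):
        print(" ".join(cmd))
```

To actually run the batch, each command list could be passed to `subprocess.run(cmd, check=True)` so that a failed conversion stops the job instead of passing silently.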

Obviously I don't remember details, but it worked.

I had the idea to generate the PDFs not just for the printed catalogues but also as "printable version" for each HTML page. So both PDFs and HTMLs were generated nightly. Yeah it took a while but it was cool.

It also served to improve our PageRank because (1) the PDFs made it look like we had twice as much content and because (2) Google gave higher weightings to PDFs (at the time, anyway)

And, it was easy.

--
Ecce Europa - Web Design for Business
ASCII by Rie+Beam · 2004-10-14 13:05 · Score: 3, Insightful

You'd be surprised what a little coloring and some ASCII artwork can do.
Me Too by KevMar · 2004-10-14 15:08 · Score: 3, Insightful

We had this problem once, but worse.

When I started with my current employer, we had a very complicated PDFing process. Every night a transfer workstation would copy data files locally from a backup of the production server. A Pervasive driver was loaded to read the dat files. Access would import the data from Pervasive and run a report that was saved as an RTF file. It was then opened in Word, where a macro would PDF the document and close. The PDF was then copied to the web server for the users to download.

What a mess and a nightmare to debug. It would work for a few months and then, at seemingly random times, it would crash horribly for several days in a row.

When it did break, I felt like I wasted a lot of time tracking down ghost problems. In my slow days I rewrote it.

It now pulls read-only data from the production server with that Pervasive driver into an XML file. It then applies an XSL transform, passes the result to the FOP processor, and places the output directly on the web server.

A process that took an hour to run now finishes in 2 minutes. It is quick enough that we run it every 20 minutes. FOP was quick to set up, and the examples are like a blueprint and easy to figure out.

I have never had a problem with the new implementation, and the end users saw no impact and were unaware of the change.

I would recommend using a FOP processor to my friends.

--
Im a gamer, not a grammer major. This post is full of spelling and grammer mistakes.
Re:Try Docbook by CRCulver · 2004-10-16 07:55 · Score: 3, Informative

LaTeX doesn't do Unicode, you'll have to translate those characters.

Actually, LaTeX does do Unicode, and quite well. You just have to install Dominique Unruh's unicode package if your LaTeX distribution doesn't already ship it. I've used it for over a year to typeset documents with lots of Old Church Slavonic, Greek, and Hebrew, easily mixing scripts in one document and being free to keep all foreign scripts in UTF-8.

If you would like to see some examples, check out my two tutorials for LaTeX for philologists (which I still work on and update from time to time).