Slashdot Mirror


Fulfilling the Promise of XML-based Office Suites?

brentlaminack asks: "Almost a year ago Tim Bray of XML fame said 'when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.' Now that MS has dropped the ball on the XML Office front, and StarOffice has fulfilled its XML promise, where are all those 'wonderful new things?' Is anybody out there writing Perl/Java/whatever programs to take advantage of StarOffice XML? Could this be an opportunity for Free/Open/Libre software to leapfrog MS Office in real productivity as XML proponents have promised all along?" What kinds of new and wonderful things can you come up with?

432 comments

  1. XML... by ewombatnet · · Score: 5, Insightful

    I think one of the main problems with the embedding of XML architecture into office productivity software is unfortunately the end user. I mean, how long have programmes like MS Word had "document properties" contained in them, and how many people are actually using them? I'm currently working on a project to retrieve documents across a company's backed-up data from the past 10 years, and there is very very little metadata available for us to do any searching on. Unless the embedded XML contained within office suites is brought more "to the fore" and in the face of users, instead of being a behind the scenes 'option', people just are not going to use it.

    1. Re:XML... by Anonymous Coward · · Score: 0

      You mean like when OOo supports tables that are more than a page in height?

    2. Re:XML... by Trolling4Dollars · · Score: 5, Insightful

      There are two ways to look at this. One way is to make the assumption that the problem lies with the user and the other is that the problem lies within the computer. Even though computers have gotten easier to use, they aren't really easy at all for the average user. The barriers to ease of use are plenty:

      -Feature overload (many features that users will never use)
      -PCs are incredibly complex because they are so flexible and can do so many things.
      -User interfaces are pretty poorly designed and don't seem to be getting any better.
      -Humans don't "interface" well

      If the mode of interacting with computers was like interacting with another person, they would be considerably easier to use. I often joke with my wife that *I* am the ultimate user interface. If you think about it, the best interface for the average user would be a very human-like avatar. Yes, this interface would suck for someone like me (a real computer user), but that's not who it would be targeted at.

      Getting back to the XML subject, these same problems are what keep it from gaining any ground with the average user. The average user still doesn't "get" electronic documents. That's why they always resort to printing them out on paper. To be sure, there are times when a document SHOULD be printed on paper, but that's only really about 20% of the time. The other 80% a document is much better to keep in electronic format. With XML, so much the better. But if the average user has trouble understanding even a basic text file, the ultra-documents that XML can lead to will be completely bewildering. How do we solve this? I've argued this before over and over again: we need new input devices and now I will extend that to new output devices. If we had more variety with the output device, XML documents would be the next "great thing". The XML document has arrived too soon. If we had electronic paper that XML docs could be loaded into, there would be a revolution. It will happen, just not yet. And when it does happen, look for some big corporation to be backing something that looks a lot like XML, but it will have a different more friendly name and will be claimed as innovative.

    3. Re:XML... by chiasmus1 · · Score: 5, Insightful
      The important thing about XML is not the end users. As an end user I could care less about the formation of the document as long as I knew I would always have an application that could read the document.

      With XML documents, if the file format is well known, there will be filters for it. Major Office Suites will support well known file formats. If the file format is not as well known, but it is simple XML, there are high chances that smaller applications will also have filters for it.

      I like to write web software and I was discouraged when I discovered that I could not find a Perl library to create OpenOffice.org files, so I created one of my own. Granted it is not the best library, and is probably full of bugs, but it was easy to create and the research was painless. It does the job I made it for and I use it.

      Compare that to the time when at work my boss asked me to take a Pick Basic binary database file and extract the data from it. I had to play around a while to figure out which bytes meant what and how to get the information out.

      XML not only makes creation easy, but makes reverse engineering trivial. XML is not for the end users, it is for the developers who do not have the time to sit and read the 500 pages of the file format spec.
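
      To illustrate how little code that kind of poking around takes, here is a minimal sketch in Python (not the Perl library mentioned above). It assumes only that the document is a zip package holding a content.xml whose paragraph elements have the local name "p"; the namespace URIs in the demo package are made up for the example, and both function names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def paragraphs_from_package(data: bytes) -> list[str]:
    """Return the text of each paragraph element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        root = ET.fromstring(package.read("content.xml"))
    paragraphs = []
    for element in root.iter():
        # Drop the "{namespace}" prefix so the sketch does not depend on
        # any particular namespace URI (an assumption, not the real schema).
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "p":
            paragraphs.append("".join(element.itertext()))
    return paragraphs


def make_demo_package() -> bytes:
    """Build a tiny stand-in package in memory (illustrative content only)."""
    content = (
        '<office:document-content xmlns:office="urn:example:office" '
        'xmlns:text="urn:example:text"><office:body>'
        "<text:p>Hello</text:p>"
        "<text:p>from <text:span>XML</text:span></text:p>"
        "</office:body></office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    print(paragraphs_from_package(make_demo_package()))
```

      Matching on the local name keeps the sketch short; a real tool would check the namespace as well.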

    4. Re:XML... by rgigger · · Score: 3, Insightful

      I just had a thought. What I really want to do is generate some sort of office documents on the web. That way I can make word processing documents, spreadsheets, charts, graphs etc that my clients can download. Now I would love to just generate Open Office XML files and have them use those. The problem with that is that none of my clients use Open Office and they are not going to for the foreseeable future.

      Here however is my super cool idea that I just came up with:

      An open office server. If open office can export to MS Office Formats what's to stop me from doing the following (other than time).

      1) create my templates in open office XML format
      2) extract the parts of open office that import from the OO XML format to its internal format, and export to MS Office format.
      3) Create a PHP extension (or maybe apache module) to expose this functionality to my web apps.
      4) Insert dynamic database driven content into my OO XML templates, convert them to MS Office format and stream them out to a client.

      Maybe not the product of an ideal world but given the fact that MS Office is both closed and ubiquitous this seems to be a great way to leverage the capabilities of open office in handling XML and MS Office import/export.
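
      Step 4 (dropping dynamic content into a template) could be sketched roughly like this in Python. The {{name}} placeholder syntax and the function name fill_template are assumptions made up for this example, and the conversion to MS Office format is not shown at all; the sketch only rewrites content.xml inside the zip package and copies every other member through unchanged.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_template(template: bytes, values: dict[str, str]) -> bytes:
    """Return a copy of the package with {{key}} placeholders filled in."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(template)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                text = data.decode("utf-8")
                for key, value in values.items():
                    # Escape &, < and > so database content cannot break the XML.
                    text = text.replace("{{" + key + "}}", escape(value))
                data = text.encode("utf-8")
            # Every other member is copied as-is.
            dst.writestr(item.filename, data)
    return out.getvalue()
```

      One caveat: a placeholder typed in the word processor may end up split across several XML elements, in which case a plain string replace like this will not find it.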

    5. Re:XML... by ReelOddeeo · · Score: 3, Informative

      Once you learn how to do it, it is definitely possible from, say, a Java program to connect to a running OOo (OpenOffice.org==OOo) make it open a document and re-save it in Word format. You can even make OOo do this without flashing anything on the screen.

      There is a definite learning curve. You need to learn Uno.

      IMHO, despite the learning, this would be way easier than trying to extract the parts of code you need from OOo and building a "converter" program. Maybe I say this because I have spent the time learning Uno and can now program OOo functionality from multiple languages, and how to integrate it into a web server like this seems obvious to me.

      I have personally programmed OOo to do things from: OOo-Basic, Java, Python and MS Visual FoxPro. I know from postings from others that it is most definitely possible to use Delphi and VB.

      Just as an example of what can be done, I built a Maze generator in java. You can run the maze generator on a different computer. Even a different OS. It connects to a running OOo, and then creates a multi page drawing of complex mazes. (You can get it at www.OOoMacros.org or at www.OOoExtras.org.)

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
    6. Re:XML... by Daengbo · · Score: 1

      The problem is that OOo now only does this kind of thing in graphical mode. Look at the trouble someone had getting automatic PDF generation of OOo or Word documents: create a macro; automatically check a directory every n minutes for new files; open the new file; print to PDF; and delete the old file. He needed a dedicated Xsession on tty9 logged in to a dummy user all the time to accomplish this. It should be easier than that.
      Look for printall.sxw here for more details.
      These missing perl programs need to start appearing to make this easier (and no, I don't do perl).
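
      For what it's worth, the "check a directory every n minutes" half of that workflow is the easy part. A minimal sketch in Python, where convert is a hypothetical callback standing in for whatever actually drives OOo or prints to PDF (that part is the hard bit and is not shown):

```python
import time
from pathlib import Path


def process_new_files(inbox, convert, suffixes=(".sxw", ".doc")):
    """Convert and then delete every matching file currently in inbox."""
    handled = []
    for path in sorted(Path(inbox).iterdir()):
        if path.is_file() and path.suffix.lower() in suffixes:
            convert(path)   # hypothetical callback; the real work happens here
            path.unlink()   # delete the old file once it has been converted
            handled.append(path.name)
    return handled


def watch(inbox, convert, interval_seconds=60):
    """Poll the directory forever, sleeping between passes."""
    while True:
        process_new_files(inbox, convert)
        time.sleep(interval_seconds)
```

      Files with other suffixes are left alone, so the inbox can be shared with other things.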

    7. Re:XML... by Anonymous Coward · · Score: 0

      "Unless the embedded XML contained within office suites is brought more "to the fore" and in the face of users, instead of being a behind the scenes 'option', people just are not going to use it"

      In the face of users? yeah, right, when I have to write a doc quickly at work, I'm really thinking more about how someone will search for it in 10 years time.

      We should be making things easier for the user, not the computer.

    8. Re:XML... by Anonymous Coward · · Score: 0

      I don't think you quite get it. XML file formats are never well-known, unless a DTD is published. Any particular XML file format is a particular file format. You can discuss XML file formats and drop the "XML" from the discussion without losing much content.

      I don't think you quite get it. File formats are never well-known, unless a DTD is published. Any particular file format is a particular file format. You can discuss file formats and drop the "" from the discussion without losing much content. :)

      Fact is you just need to sit down and read that 500 page file spec or you'll still get it WRONG and if the program that's meant to process that file format is not sufficiently robust, you'll still crash it. The fact that it's arranged markup-style rather than with binary flags just makes it a bit more bulky and convenient to alter, but at the end of the day there's files that fit the spec, and files that don't.

      XML benefits come down to this:
      1) it's slightly easier to recover from a file that is slightly out of spec as you can skip to the next markup code you know (but according to the standard, it's illegal to do so, one must terminate with error).

      2) it's slightly easier to modify by hand.

      some will tell you XSLT is a benefit, but in fact XSLT is just a translator document, you can use it on any 2 file formats if you can figure out the transform.

    9. Re:XML... by JacobKreutzfeld · · Score: 1
      Until word processors stop encouraging people to do WYSIAYG (what you see is all you get) content creation, you will never be able to get useful content out of the document. MS Office and most others cause you to spend your time fiddling the look-n-feel, setting some lines in bigger fonts, or making other sections in italics. Instead, the author should specify text-sections as DocumentTitle or BookTitle and let the word processor format it. Then, XML would have useful tags denoting semantically meaningful areas of text; seeing junk like FONT BIGGER WIGGLY ORANGE makes a mockery of markup. TeX and even HTML try to encourage concept-based markup; WYSIAYG does not.

      Of course there's no reason they couldn't replace those font, style, justification, color, and "effects" buttons with ones for title, paragraph, quote, and so on. Or more abstract ones like parent, seealso, etc. But I can't see that happen: everyone likes to dork around with the format instead of concentrating on the content -- form over function. Sigh.

    10. Re:XML... by Anonymous Coward · · Score: 0

      I think you and your wife are forgetting about the true "ultimate user interface", my friend and yours, Clippy.

    11. Re:XML... by Anonymous Coward · · Score: 0

      I think you and your wife are forgetting about the true, "ultimate user interface", my friend and yours, Clippy.

    12. Re:XML... by Cragen · · Score: 1

      And the higher up the food-chain you go, the worse it is. The user just wants something with a remote control, I think. The "customer is always right". I say: Give what they want until they explode! :)

    13. Re:XML... by Anonymous Coward · · Score: 0

      I mean, how long have programmes like MS Word had "document properties" contained in them, and how many people are actually using them? I'm currently working on a project to retrieve documents across a company's backed-up data from the past 10 years, and there is very very little metadata available for us to do any searching on.

      Perhaps part of the solution to this lies in the computer. Humans don't want to spend a lot of time adding metadata that no one will ever see, and we know that we can figure out the importance of a document from its context (title, location, etc.), or expect not to go looking for 10-year-old documents. If the software can identify key aspects of the document while it's being prepared and add XML metadata transparent to the user, then the computer becomes empowered magically to find documents based not on their name.doc, but from a more general description (eg: letters to Fred).

      Hey, it can't be that hard, right: Clippy already figures out some context.

    14. Re:XML... by Trolling4Dollars · · Score: 0
      The user just wants something with a remote control, I think.

      And mind reading as well. Have to wait a long time for that...

    15. Re:XML... by nepheles · · Score: 1

      A suggestion.

      File-type inheritance. XML is probably ideal. Yes, you'd need to change the spec. But just disregard that for a while. This doesn't have to be XML.

      GIF, JPEG, PNG, PSD, TIFF, BMP... all are image formats. All display an image. They have slightly different purposes (TIFF for excellent quality, JPEG for effective compression), but the point is that they share the same purpose, mostly. Why not an XML 'Image' DTD that defines the interface for interacting with an image. So I can write a utility in Perl, Java, C - whatever - to read images. They don't all use the same format, they all extend it, adding functionality (the JPEG type would have the ability to set the compression level in the image). So you don't lose any functionality -- it's there if you want it.

      There could be a sub-interface on the Image type -- the VectorImage. I can also write a program which handles VectorImages -- maybe Illustrator, SVG, and Freehand documents. But you can still treat all these as pure images, if you like.
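
      As an analogy only, here is the same inheritance idea written as Python classes rather than as an XML DTD. All the class and method names are made up for the example; the point is just that a tool written against Image works on every subtype, while format-specific extras (like JPEG quality) remain available to code that wants them.

```python
from abc import ABC, abstractmethod


class Image(ABC):
    """The shared interface: anything that is an image has a size."""

    @abstractmethod
    def size(self):
        """Return (width, height)."""


class JpegImage(Image):
    """Extends Image with a format-specific extra: a quality setting."""

    def __init__(self, width, height, quality=75):
        self._size = (width, height)
        self.quality = quality

    def size(self):
        return self._size


class VectorImage(Image):
    """Sub-interface: vector images also expose their shapes."""

    @abstractmethod
    def shapes(self):
        """Return the list of shapes in the drawing."""


class SvgImage(VectorImage):
    def __init__(self, width, height, shapes):
        self._size = (width, height)
        self._shapes = list(shapes)

    def size(self):
        return self._size

    def shapes(self):
        return self._shapes


def describe(image):
    """A generic utility: it only relies on the base Image interface."""
    width, height = image.size()
    return f"{type(image).__name__} {width}x{height}"
```

      describe() never needs to know about quality or shapes, which is the "treat them all as pure images" part of the suggestion.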

      This has probably been thought of before. What do people think?

      --
      ((lambda x ((x))) (lambda x ((x))))
    16. Re:XML... by rgigger · · Score: 1

      Is it possible to access this through PHP?

    17. Re:XML... by bhtooefr · · Score: 1

      Umm, I call BS. I have OOo 1.1RC3, and I made a 2 page long table a couple days ago.

    18. Re:XML... by BSD+Yoda · · Score: 1
      And mind reading as well. Have to wait a long time for that...

      Maybe not....

    19. Re:XML... by Dever · · Score: 1

      " And when it does happen, look for some big corporation to be backing something that looks a lot like XML, but it will have a different more friendly name and will be claimed as innovative. "


      Yeah, and it will come from Redmond and be proprietary, and it'll break every time you try to do something not exactly covered in the included Office-annoy-you-to-death-Avatar tutorial.

      Did I mention that it will overshadow whatever well implemented version OSS may (who knows) have released, through sheer force of product lock in.

      Yippee

      --
      - I'd prefer not to.
    20. Re:XML... by ReelOddeeo · · Score: 1

      I do not know of a PHP bridge to UNO. There is no reason, in theory, why one could not be developed.

      On the other hand, from PHP you could execute Python scripts, Java programs, or do an XML-RPC type function call to a running Python or Java based service on the same or different machine.

      For example, suppose your web program wanted to accept an Excel spreadsheet in order to process some data from it. You wanted to use OOo's excellent ability to decipher Excel into something you can deal with.

      One approach is that you make an RPC or SOAP call to another daemon (in Java or Python) which then uses OOo to open the Excel doc and re-save it in some more accessible form. (Maybe just a plain OOo document, which is just zipped XML.) Multiple web service threads calling the Excel-to-OOo converter process would need to be queued up within the converter service, since you can only call OOo to convert one doc at a time.
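
      The queueing part of that approach might look something like the following Python sketch. The convert callable is a placeholder for whatever actually talks to OOo (no UNO calls are shown, and the class name ConversionQueue is made up for the example); the only thing demonstrated is that many caller threads can submit work while a single worker handles one document at a time.

```python
import queue
import threading


class ConversionQueue:
    """Serialize conversion requests through a single worker thread."""

    def __init__(self, convert):
        self._convert = convert      # placeholder for the real OOo-backed call
        self._jobs = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def submit(self, document):
        """Block the calling thread until its document has been converted."""
        job = {"document": document, "done": threading.Event(),
               "result": None, "error": None}
        self._jobs.put(job)
        job["done"].wait()
        if job["error"] is not None:
            raise job["error"]
        return job["result"]

    def _run(self):
        # Only this thread ever calls convert, so conversions never overlap.
        while True:
            job = self._jobs.get()
            try:
                job["result"] = self._convert(job["document"])
            except Exception as exc:
                job["error"] = exc
            finally:
                job["done"].set()
```

      Each web service thread would call submit() and simply wait its turn; a failed conversion is re-raised in the thread that asked for it.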

      Another general approach, a batch approach, is that the web page merely accepts the Excel submission and then returns a "Thank you for submitting your crap to us." message on the web page. The Excel file is just saved into some submission folder. A separate process is awakened (I'm not specifying how, maybe by the PHP page?) and processes each Excel file in turn. The processing could consist of opening the Excel doc in OOo, and directly accessing the contents of the cells via OOo's UNO API, and then updating some database records. If there are problems with the submitted spreadsheet, a response can be e-mailed back to the submitter.

      Similarly you could build a web page that offers a service of turning Word Docs into PDF's, or other formats, based on OOo. Go to the OOoMacros.org site and see the Document Converter I wrote last weekend. (Just a macro driven wizard-like front end to using a small subset of OOo's import/export filters.)

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
    21. Re:XML... by rgigger · · Score: 1

      Thanks! That is all very, very helpful. I don't know if I will ever actually need this but if I do I'll have a much better grasp of where to begin.

  2. Re:I hate XML on Mac's by Anonymous Coward · · Score: 0

    um... don't titanium powerbooks come with 802.11 built in?

  3. standardization by Unregistered · · Score: 4, Insightful

    one missing thing is standardization across OSS. When abiword (and koffice?) support oo files, then we might see more of this. Also, i personally can't think of a use offhand that oo.org can't already do. Once people begin to find uses for this, then more people will actually try to write scripts to take advantage of XML.

    1. Re:standardization by chill · · Score: 5, Informative

      The next major release of KOffice is supposed to adopt the OO file formats as their own standard.

      --
      Learning HOW to think is more important than learning WHAT to think.
    2. Re:standardization by Surak · · Score: 1

      It's already happening. KOffice, and I believe Abiword have already promised support for the OOo XML format. Corel, though not OSS, is on the bandwagon too.

    3. Re:standardization by jdray · · Score: 0

      Are you new here? You must be. Nice to see a new face.

      --
      The Spoon
      Updated 6/28/2011
    4. Re:standardization by Gortbusters.org · · Score: 0

      brokencomputer doesn't get online much..

      --
      --------
      Free your mind.
    5. Re:standardization by Anonymous Coward · · Score: 0

      Good point, but trying to get people to adopt the format is difficult. I work with a program called Bible Works that has a built-in editor allowing me to deal (and type) in Hebrew/Greek/English with a hot-key switch. The feature is nice, but the format is just a souped-up RTF file with some added formatting that increases the file size (and isn't transferable to another editor). I told them to think about adopting OOo's format, but all they could say was, "How can we use someone else's format?" I think they're starting to get it, but man it was like talking to a wall for a while.

    6. Re:standardization by Anonymous Coward · · Score: 0

      Obviously Abiword shouldn't have to play catchup with OO.org's format - but they want a common file suite, so OO.org's format was submitted to OASIS for standardisation a while ago and everyone is awaiting the results of that.

    7. Re:standardization by rmohr02 · · Score: 1

      And AbiWord already supports them, just not as the default. See http://developers.slashdot.org/comments.pl?sid=76340&cid=6816160.

    8. Re:standardization by Anonymous Coward · · Score: 0

      Talking to christians is often like that.

  4. anything that will translate manager speak? by hattig · · Score: 5, Funny

    Maybe a script to de-buzzword meaningless missives from above?

    E.g., "We wish to engender a positive business atmosphere" => "Free beer at lunchtime"

    1. Re:anything that will translate manager speak? by cptgrudge · · Score: 3, Funny
      Maybe a script to de-buzzword meaningless missives from above?

      Not a script, but perhaps a free (as in beer) Word plugin? Bullfighter

      --
      Qualitas edurus commercium, nullus penitus net rimor, nullus deus beneficium
    2. Re:anything that will translate manager speak? by bigdavex · · Score: 4, Funny

      #!/usr/bin/perl
      print "We're doing more layoffs and getting more bonuses.";

      --
      -Dave
    3. Re:anything that will translate manager speak? by Anonymous Coward · · Score: 0

      Did you know there has been a similar filter for political speeches in UNIX-like OSes for years?

      You can find it at '/dev/null'

    4. Re:anything that will translate manager speak? by winkydink · · Score: 1

      #!/usr/bin/perl
      print "Those who are staying have already been notified.";

      --

      "I'd rather be a lightning rod than a seismometer." -Ken Kesey

  5. Well... by Otter · · Score: 4, Informative
    ...when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.

    Well, I'm taking a break right now from generating new Excel graphs by copying old ones and changing the source data, which isn't so bad, and those fucking error bars, which is. Oh, and the scatter plot points are superimposed so you can't click on the back ones.

    So if I could do a find&replace on a flat file, I'd have been done an hour ago.

    Other than that, no, I can't imagine either. VBA exists now and it's not like we're all flying around with wings and harps.

    1. Re:Well... by BlueGecko · · Score: 4, Funny
      VBA exists now and it's not like we're all flying around with wings and harps.
      True, but after extensive work with VBA, I grew these sharp red horns and a big red tail with a spike on the end...
    2. Re:Well... by croddy · · Score: 4, Insightful
      MS won't stand for an XML file format -- it's human-readable. the last thing MS wants is for their file format to be easily convertible and transformable. it's a pity, because switching Office files to XML would quickly make them insanely useful.

      imagine you write an outline in word. file -> export as -> presentation... or in access you select some rows and export to a spreadsheet. this is where staroffice stands to beat them.

      but MS Office derives its profitability from incompatibility -- you have to use their products to get full use of their file format. so using MS Office will necessarily sacrifice this functionality.

    3. Re:Well... by Anonymous Coward · · Score: 0

      I did a "save as" in Office 2003, and it saved a .xml file that was pretty readable--as in I could read the text that was in the original .doc file. What else do you need?

    4. Re:Well... by Gaijin42 · · Score: 0, Troll

      Office has an XML format already. It's pretty complex, so I don't know how "human readable" it is, but very computer readable.

    5. Re:Well... by Anonymous Coward · · Score: 1, Insightful

      An explanation for what the tags mean? Sure, Office saves out XML. But god knows what exactly the tags mean. MS sure doesn't document them fully. Usually the most you can recover is the unformatted text...

      I'd rather a fully documented binary format to undocumented XML, personally.

      Also, last I checked, MS was still saving out in "MSXML", and scattering weird [blah..] constructs throughout the XML.

    6. Re:Well... by yerricde · · Score: 1

      What else do we need to support WordML? We need a schema so that we can interpret the other data in the text such as what parts are paragraphs (equivalent to HTML's <p>) and what parts are the outline (equivalent to HTML's <h1> through <h6>) and then write some sort of XSLT filter to convert documents from WordML to HTML, DocBookML, or some other widely recognized XML application. Unfortunately, it appears Microsoft has not licensed its XML schemas for free redistribution.

      --
      Will I retire or break 10K?
    7. Re:Well... by Anonymous Coward · · Score: 2, Funny

      I have not heard of BSD for a long time ... is it dead or what?

    8. Re:Well... by Chokolad · · Score: 1

      > Also, last I checked, MS was still saving out in "MSXML", and scattering weird [blah..] constructs throughout the XML.

      Can you be a little bit more specific? It looks like totally valid XML to me. I am curious what kinds of weird extensions to XML MS put into Word XML format.

    9. Re:Well... by woozlewuzzle · · Score: 1, Flamebait

      I thought the whole idea of XML was that it was self-documenting? So what will be the next big thing to save the world?

    10. Re:Well... by croddy · · Score: 1
      what I mean is that "The" Microsoft Word .doc format should *be* XML -- the full-featured default file format, possibly zipped or gzipped. not with cryptic stuff scattered inside the tags, but human readable...

      *that's* what XML is for. it's a format for describing structured documents that's both human-readable and machine-readable. fine, it can "Save As" XML -- but if I decompress a StarOffice document, I can look at the XML, without any documentation, and basically understand how the formatting code is working.

    11. Re:Well... by YrWrstNtmr · · Score: 2, Insightful

      imagine you write an outline in word. file -> export as -> presentation... or in access you select some rows and export to a spreadsheet. this is where staroffice stands to beat them.

      This is what Office does (rather) well. Use an xls as a data source for an MDB, a word doc, and a presentation, all at the same time. Or link database info to a remote presentation.

      And while Office prefers Office, you CAN link to and from bare text files. Whether delimited or fixed length.

      Way back with Office95 we were pulling backend data off a UNIX box into a VB/Access frontend. Seamless to the user.

    12. Re:Well... by perlchild · · Score: 2, Interesting

      The fact that the format is XML doesn't say anything about the format being "open". That's why Microsoft was proposing XML to standard bodies, and trademarking DTDs and Schemas...
      What other people in the thread want is for Microsoft to give us 100% of the schema, and so far Microsoft has shown zero will to do so without legislation compelling it to. 100% of the schema would allow Corel and/or IBM to feature-copy 100% of Microsoft's Office features, and they certainly will of course say that legislation to force them to give away their competitive advantage would be anti-american.
      Someone with a different agenda would probably say such a thing would have provided a better, more balanced punishment to Microsoft's monopoly than the minimal slap on the wrist they had.
      I personally think they should have been made to refund 50% of the purchase price of all Windows licenses, as half of the value was created by "Everyone else is using it" and that advantage was gained through illegal monopoly, and very creative enforcement of copyright laws. But that's neither here nor there.

    13. Re:Well... by Citizen+of+Earth · · Score: 1

      Everybody loves open standards except for the 800-lb. gorilla of any industry.

    14. Re:Well... by pjrc · · Score: 1
      imagine you write an outline in word. file -> export as -> presentation...

      Not hard to imagine, since Office has supported this for many years, using OLE, COM, Active-X, and proprietary formats.

      XML may be many things, but this has existed for a very long time without XML.

      or in access you select some rows and export to a spreadsheet. this is where staroffice stands to beat them.

      An even better example. Imagine that, being able to export from Access to Excel. Who woulda' thought? Certainly not all the MS Office users who've been doing this for many years.

      This is an even better example of why Star Office and OpenOffice.org will overtake MS Office, as Sun only now bundles a cripple-ware database app, and OpenOffice has none at all.

    15. Re:Well... by Upphew · · Score: 0

      You BSDard!

    16. Re:Well... by TonyGreene · · Score: 1

      Well, I'm taking a break right now from generating new Excel graphs by copying old ones and changing the source data, which isn't so bad, and those fucking error bars, which is. Oh, and the scatter plot points are superimposed so you can't click on the back ones.

      So if I could do a find&replace on a flat file, I'd have been done an hour ago.


      Gnumeric uses an XML format. Maybe you could open a copy of the XLS in Gnumeric, then save in Gnumeric native format (XML) and run your search/replace on that. Then open and save back to XLS and see how it looks.

    17. Re:Well... by Korgan · · Score: 2, Funny

      Ahhh... so you got MSOffice to run on WINE in a BSD environment then? ;-)

    18. Re:Well... by Anonymous Coward · · Score: 0

      AFAIK, it is totally valid XML. It's just not all the meaning of the document formatting is expressed in XML - some of it is in constructs like

      Totally valid XML, and totally incomprehensible unless you can understand the undocumented [] instructions.

    19. Re:Well... by Anonymous Coward · · Score: 0

      Bah.

      example was <tag attribute="[insert complicated code here...]" >

    20. Re:Well... by poot_rootbeer · · Score: 1

      MS won't stand for an XML file format

      And yet, high-end versions of Office 2003 are going to have XML support. So whether they WANT an XML file format or not, they're getting pressure from the market to deliver one. They're not so important that they can afford to make a product that doesn't meet enterprise customers' needs.

  6. Forget it.... by vivek7006 · · Score: 0, Redundant

    "Is anybody out there writing Perl/Java/whatever programs to take advantage of StarOffice XML"

    I am sure there are many. But Microsoft will continue using proprietary formats for MS-Office. Why will they open their format and loose all the market share?

    1. Re:Forget it.... by GreyWolf3000 · · Score: 1

      There is nothing "loose" about Microsoft's market share. They want to tighten it if anything; why would they want to lose it?

      --
      Slashdot: Where people pretend to be twice as smart as they really are by behaving like children.
  7. XML is no silver bullet by PissingInTheWind · · Score: 0, Flamebait

    And also it sucks to work with.

    I still can't understand why people invest so much time and money into that half-assed idea that is XML.

    Better alternatives have existed for a long time.

    --

    A message from the system administrator: 'I've upped my priority. Now up yours.'
    1. Re:XML is no silver bullet by Anonymous Coward · · Score: 0

      Better alternatives have existed for a long time.

      Pointers please?

    2. Re:XML is no silver bullet by Anonymous Coward · · Score: 0

      tex. latex.

    3. Re:XML is no silver bullet by Anonymous Coward · · Score: 0

      lisp s-expressions

    4. Re:XML is no silver bullet by Anonymous Coward · · Score: 0

      please note I don't actually believe this, I was just parroting what at least a few other slashdotters have said.

    5. Re:XML is no silver bullet by Anonymous Coward · · Score: 0
      To any of the lisp-heads out there: the world doesn't give a shit about a superior language that **NOBODY** uses! If more than 2% of the people who know XML and the various XML technologies (XSD/XSL/XPATH...) knew Lisp, then maybe you would have something to talk about.

      Until then, STFU!!!!!

    6. Re:XML is no silver bullet by Anonymous Coward · · Score: 0

      Better alternatives have existed for a long time.

      Pointers please?

      tex. latex.


      Not just for publishing, but as a general-purpose, glue, data-interchange format that XML is evolving to be.

    7. Re:XML is no silver bullet by axxackall · · Score: 1
      Better alternatives have existed for a long time.

      Do you mean S-exps? I agree.

      --

      Less is more !
  8. Nope by OeLeWaPpErKe · · Score: 1

    They lied ;-p

    Seriously though, koffice will use the exact same fileformat as staroffice. Is that wonderful enough for you ?

    1. Re:Nope by Anonymous Coward · · Score: 0

      **SHOCK** You mean that Microsoft has LIED??? Why would Microsoft undermine their years of positive goodwill and accepting attitude that has engendered from their honesty and truthfulness in the past? **LIED** ??? Why, with Microsoft lying what else could be wrong with the world? You mean, like, maybe Santa Claus isn't real? The tooth fairy? What other bedrocks of my life will be destroyed?

    2. Re:Nope by OeLeWaPpErKe · · Score: 1

      Well can't help you with Microsoft there, but on the Santa Claus issue I am somewhat of a pragmatist.

      As long as they bring me presents they (obviously) exist.

  9. Not a big innovation by Doug+Merritt · · Score: 5, Interesting
    documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented

    This is just a return to part of what made Unix so powerful in the first place: text formats that can be manipulated by the whole suite of command line tools. "Those who don't understand Unix are doomed to re-invent it, poorly" (Henry Spencer).

    Back in the 70s we used nroff/troff for document formatting, producing in some cases professional-quality camera-ready books...but the source code was easily fed to spell checkers, formatting-command-strippers, sort, wc, etc etc etc.
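
    The kind of formatting-command-stripper mentioned above is small enough to sketch. What follows is only a rough illustration, not how any real tool worked: the function names and the sample text are made up, and it handles just control lines and simple \fX font escapes, nothing like the full troff escape syntax.

```python
import re

def strip_troff(source):
    """Return the plain text of a troff-style document, dropping formatting."""
    kept = []
    for line in source.splitlines():
        # Requests and macro calls start with a control character (. or ').
        if line.startswith((".", "'")):
            continue
        # Drop simple inline font escapes such as \fB, \fI, \fP.
        kept.append(re.sub(r"\\f[A-Za-z0-9]", "", line))
    return "\n".join(kept)

def word_count(text):
    """Count whitespace-separated words, like a bare-bones wc -w."""
    return len(text.split())

# Hypothetical sample document, invented for this sketch.
sample = (
    ".SH INTRODUCTION\n"
    "Text formats are \\fBeasy\\fP to filter.\n"
    ".PP\n"
    "Pipelines do the rest.\n"
)

plain = strip_troff(sample)
print(plain)
print(word_count(plain))
```

    Because the output is plain text again, it can go straight on to a spell checker, sort, or anything else that reads lines.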

    XML is ok...not bad as a meta-format...but it's not some kind of new magic; it's just more of the same as what we always used to do.

    The great step forward is moving away from the crud that happened in the middle: proprietary underdocumented binary formats that couldn't be fed to filter pipelines.

    In this case, moving backwards is progress. But expecting something amazing to be invented is a bit much; it was already invented a long time ago.

    P.S. pet peeve...people credit Knuth (admittedly an amazing guy for the Art of Computer Programming) for reinventing typesetting with TeX. Now, TeX is nicer than nroff/troff in multiple ways, but it's worse in some others (TeX is not set up for command line filters!), and in any case is only an incremental improvement, not a revolution over the older Unix tools. Credit is not properly being given.

    --
    Professional Wild-Eyed Visionary
    1. Re:Not a big innovation by Anonymous Coward · · Score: 3, Insightful

      Now, TeX is nicer than nroff/troff in multiple ways, but it's worse in some others (TeX is not set up for command line filters!), and in any case is only an incremental improvement, not a revolution over the older Unix tools. Credit is not properly being given.

      I see your point. But have you tried doing mathematical formulas in groff? In (La)TeX they're a breeze (relative to just about everything else out there). Right tools for the right job I guess.

    2. Re:Not a big innovation by brentlaminack · · Score: 2, Interesting

      I'll agree on TeX. I remember the day I gave up on it. I attended a lecture by Knuth himself on abstract graph theory. Guess what he used to generate his overhead transparencies with? Colored felt-tipped markers. Here is the great Knuth himself, the creator of TeX with near-infinite computing resources available, and he hand-draws equations with felt-tipped markers!! At that moment, I knew TeX was dead.

    3. Re:Not a big innovation by pigscanfly.ca · · Score: 2, Insightful

      What do you have against TeX?
      TeX is god [ok maybe not $DIETY god , but fairly high up there] .
      TeX, along with latex, allows me to do wonderful things with documents generating into multiple formats. Although I have had some eps integration problems (who knew plot utils used some funky ass default font that no one has ever heard of before) it was my fault for not checking to make sure that I had the right fonts installed. TeX is wonderful for typesetting, it puts the control back in the user.

    4. Re:Not a big innovation by kfg · · Score: 5, Insightful

      The great man himself gave you a clue to great wisdom. Not everyone has that chance.

      And you blew it, Grasshopper.

      The lesson was, "The right tool for the job."

      Sometimes the right tool, despite all the modern technological advances, is still a rock.

      KFG

    5. Re:Not a big innovation by Anonymous Coward · · Score: 0

      XML is not a step backwards. Yes, you're generally in the right direction - since XML is a textual format, it more easily lends itself to a small and synergistic toolset. My point is, this is not a "step backwards".

      XML lets you build a much richer data structure than we could in the 80's with what amounted to tab-separated arrays. *THAT* is a step forward. If you think of the adage ALGORITHMS+DATA_STRUCTURES=PROGRAMS, with richer DATA_STRUCTURES that lets you have richer PROGRAMS. eh?

      - David

    6. Re:Not a big innovation by Anonymous Coward · · Score: 0

      use eqn to produce the groff code :)

    7. Re:Not a big innovation by sharkey · · Score: 4, Funny
      Sometimes the right tool, despite all the modern technological advances, is still a rock.

      When all you have is a rock, everything looks like Bill Gates' head.

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    8. Re:Not a big innovation by nihilogos · · Score: 1

      More likely is that he prepared the lecture in 20 minutes. Why waste time typesetting a lecture for pimply undergrads?

      --
      :wq
    9. Re:Not a big innovation by shellbeach · · Score: 1
      [ok maybe not $DIETY god , but fairly high up there].

      That'd be what Jenny Craig worships, right??

    10. Re:Not a big innovation by Anonymous Coward · · Score: 0

      I think that you and he mean different things by "mathematical formulas". If you're writing an actual math paper, no way is eqn going to do shit for you. I wrote all my Signals and Systems papers in TeX; no way would eqn have been able to handle it.

    11. Re:Not a big innovation by Anonymous Coward · · Score: 0

      Just because someone came up with a new whiz-bang gadget does not mean it is better. They made digital watches, but most people go with analog. When and if someone conceives a use (the fabled killer app) it will be used. Otherwise it is just another overkill feature in a large pile that is the modern word processor.

    12. Re:Not a big innovation by Anonymous Coward · · Score: 0

      "If you're up there, please save me, Superman!"

    13. Re:Not a big innovation by 0x0d0a · · Score: 1

      I use LaTeX for writing my documents. I rather like the approach of being able to write documents in emacs with raw text.

      That being said, LaTeX is grotty as hell and sucks in many ways. It produces nice output, but the syntax is fairly obfuscated, it's a lousy programming language, it gets unreadable quickly for some fairly common elements of documents, some very common characters need to be escaped, there's a lot of redundant functionality, it produces PS/PDF files that don't have extractable/searchable text (I'll buy into this if the goal is just to beat native PS kerning), it's a damned pain to use pdflatex and latex simultaneously (slightly different feature sets).

      It'd be nice if someone made a much easier to use, cleaner typesetting language, but it'd probably be a tremendous amount of work.

    14. Re:Not a big innovation by Rogerborg · · Score: 1

      >Why waste time typesetting a lecture for pimply undergrads?

      So that they'll do as you say, rather than (not) doing as you (don't) do? If you won't eat your own dogfood, why expect anyone else to?

      --
      If you were blocking sigs, you wouldn't have to read this.
    15. Re:Not a big innovation by Anonymous Coward · · Score: 0

      sometimes there just aren't enough rocks.

    16. Re:Not a big innovation by Anonymous Coward · · Score: 0

      But as the earlier article pointed out, TeX and its dialects do not work particularly well with the unixy tool-chain paradigm.

      On the other hand, I have had very good results preparing documents in DocBook/XML, using a hand-crufted Perl script to translate the subset of DocBook that I use to LaTeX, and use LaTeX as the formatting engine, the endpoint of the tool chain. This effectively separates the typesetting tool from the document structure; I can use standard tools to transform the same DocBook XML to HTML, for example. And it is easy to imagine other XML applications that can accomplish tasks like outlining. In one case, I used an ad-hoc XML application to crack out all the elements in my document to assist in the preparation of a glossary.
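
      For anyone curious what a glossary helper like that might look like, here is a minimal sketch in Python rather than Perl. It is not the script described above. It assumes un-namespaced DocBook 4 style markup in which glossary candidates are wrapped in firstterm elements; the function name and the sample document are invented for illustration.

```python
import xml.etree.ElementTree as ET

def collect_terms(docbook_xml, tag="firstterm"):
    """Return the sorted, de-duplicated text of every <tag> element."""
    root = ET.fromstring(docbook_xml)
    terms = set()
    for elem in root.iter(tag):
        # itertext() also picks up text inside nested inline markup.
        text = "".join(elem.itertext()).strip()
        if text:
            terms.add(text)
    return sorted(terms, key=str.lower)

# Hypothetical sample document, invented for this sketch.
sample = """<article>
  <para>A <firstterm>pipeline</firstterm> chains filters.</para>
  <para>A <firstterm>filter</firstterm> reads text; a
  <firstterm>pipeline</firstterm> joins them.</para>
</article>"""

for term in collect_terms(sample):
    print(term)
```

      A DocBook 5 document puts its elements in a namespace, so the tag would need to be namespace-qualified there.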

      LaTeX still occupies a preeminent place as a formatter - but XML has huge potential as document markup that can be manipulated by standard open-source tools.

    17. Re:Not a big innovation by John+Allsup · · Score: 1

      The biggest problem is the amount of overcomplication and obscure workarounds in the inner workings of LaTeX in order to get things working as LaTeX wants them. This makes it hell to change what happens if the standard stuff and packages don't do what you want them to. (e.g. having labels and references store and retrieve extra metadata, doing layout of theorem+proof environments slightly differently, etc.)

      Separation of the programming stuff from the document markup stuff from the document appearance stuff is an essential missing bit of LaTeX.

      (And yes... I do (pure) maths, wrote my MSci and MPhil theses in LaTeX and am currently doing my PhD thesis in LaTeX. I've done various bits of hacking around to get it to do what I want, but have generally got tired of doing so.)

      --
      John_Chalisque
    18. Re:Not a big innovation by MattRog · · Score: 1

      to quote Fabian Pascal:
      The problem is that those who say 'right tool for the job' are usually those who have no clue what the job is -- they only know tools and how to apply them mechanically.

      Which means the vast majority of practitioners.

      --

      Thanks,
      --
      Matt
    19. Re:Not a big innovation by Anonymous Coward · · Score: 0

      In the 80s I could use Lisp. Hell, I can use lisp now. XML is just business computing slowly grasping at ancient academic stuff. XML is Lisp, reimplemented badly. Again.

    20. Re:Not a big innovation by Sunnan · · Score: 1

      That's the beauty of unix pipes, though - someone could write a kick-ass replacement for eqn and still keep the rest of the workflow.

    21. Re:Not a big innovation by nihilogos · · Score: 1

      So that they'll do as you say, rather than (not) doing as you (don't) do? If you won't eat your own dogfood, why expect anyone else to?

      That doesn't even make sense.

      --
      :wq
    22. Re:Not a big innovation by Rogerborg · · Score: 1

      >>So that they'll do as you say, rather than (not) doing as you (don't) do? If you won't eat your own dogfood, why expect anyone else to?

      >That doesn't even make sense.

      It doesn't not make (no) sense.

      --
      If you were blocking sigs, you wouldn't have to read this.
    23. Re:Not a big innovation by Anonymous Coward · · Score: 0

      Or maybe you knew that he didn't have time to put together a formal presentation with transparencies, etc., and was just lecturing pretty much off-the-cuff from his vast knowledge?

      -Tim, the AC Poster Child

    24. Re:Not a big innovation by nihilogos · · Score: 1

      > It doesn't not make (no) sense.

      I subscribe to the principle of the excluded middle.

      --
      :wq
    25. Re:Not a big innovation by Anonymous Coward · · Score: 0

      Typesetting equations with groff is a breeze using the eqn macros. I find this aspect much more intuitive than LaTeX - and the results look good.

      Where groff comes in a bit 2nd best is on the macro language. The groff macro style is a bit archaic and not as easy to work with as LaTeX. Where it shines is the small footprint and the filter oriented processing - makes it an ideal tool for report generating off databases etc.

    26. Re:Not a big innovation by Tony-A · · Score: 1

      P.S. pet peeve...people credit Knuth (admittedly an amazing guy for the Art of Computer Programming) for reinventing typesetting with TeX.

      not reinventing typesetting, but inventing non-buggy typesetting.

      And I'm still waiting for volume 4.
      (Probably the only case where the "of what" needn't be specified;)

    27. Re:Not a big innovation by Tony-A · · Score: 1

      XML is Lisp, reimplemented badly. Again.

      But the parentheses are marked, so the COBOL programmers can keep up with them ;-)

    28. Re:Not a big innovation by Doug+Merritt · · Score: 1
      not reinventing typesetting, but inventing non-buggy typesetting.

      I think you must mean that TeX improved on the design of older tools like troff. For instance, few thought that the 2-character limit for command/macro names was ideal, so TeX wins there, among other places.

      But like I said, such design improvements are incremental improvements, not revolutions.

      Or did you literally mean "buggy"? Which bugs in nroff/troff/tbl/eqn/etc are you referring to?

      --
      Professional Wild-Eyed Visionary
    29. Re:Not a big innovation by Tony-A · · Score: 1

      Which bugs in nroff/troff/tbl/eqn/etc are you referring to?
      I neither know nor care. My not knowing does not make anything bug-free. My knowledge is essentially that Knuth got diverted from finishing the Art of Computer Programming because he got hung up in something to do with typesetting.

      General Rule: All programs past a certain size have bugs.
      Footnote: With the possible exception of something written by Knuth.
      Observation: It may be possible to create bug-free programs, but it's probably much more trouble than it's worth.

      Note: Having a bug does not imply that you can get it to exhibit buggy behavior. In many cases it takes two bugs to get together before you can notice anything. I've even seen a triple.

      Constructing a nuclear power facility is "design improvements" over a caveman piling rocks on top of each other to make some sort of shelter.

    30. Re:Not a big innovation by Doug+Merritt · · Score: 1
      Ah. I misunderstood your original phrase about bugs, that's all. Sure, all good points.

      As for Knuth's bug free software, I've always felt torn about that, because certainly it is admirable, and I would like to say we should all aspire to write bug-free code, but yeah, Knuth's shining example unfortunately does make it questionable whether it is really worth that level of effort.

      Especially because the side-effect seems to be that Knuth's code is required to be single-author, and cannot be enhanced... e.g. I can't remove his extraordinarily annoying TeX progress messages. (Which I could claim are bugs of the design-defect sort, but obviously Knuth disagrees.)

      There may be a path to bug-free software, but admirable as Knuth's efforts are, they don't point to a method that can be realistically followed in general.

      So I'd say the whole issue about bug-free typesetting is a minor side-issue, and return to my insistence that TeX was an incremental improvement. Troff and nroff and groff are all usable in practice, whether they have bugs (probably) or not.

      I didn't see anyone mention it, but his claim to fame with Metafont is stronger than with TeX, as far as I know.

      --
      Professional Wild-Eyed Visionary
    31. Re:Not a big innovation by Tony-A · · Score: 1

      return to my insistence that TeX was an incremental improvement.

      Oh yes, in the larger sense, that's all it can be.

      Systems evolve. This can happen even if the people and the infrastructure do not change in the slightest. If you have a perfect model of the world, the world will take that into cognizance and change itself, thereby rendering your perfect model wrong! I think that while it's impossible or not worthwhile to eliminate all bugs, it's both possible and worthwhile to ensure that the consequences of bugs are not all out of proportion to the causes. Lots of stuff seems to work as if a fender-bender in Dallas completely stops traffic in Boston. There has to be a better way than that. Me, I like the ability to throw untested stuff into a production system with impunity. About half the time I get it right without having to think.

  10. Yes and no by Cranx · · Score: 1

    OpenOffice documents are, ironically, not as desirable to automate the production of as PDF documents, I think.

    With XML libraries maturing at their current rate, and transformation schemes abounding (XSL, scripting, etc.), I think that XML being the format of any word processing document format is simply less in-demand these days. Those that need to can certainly build OpenOffice documents quite easily, but I think most people are generating HTML, man pages and PS/PDF documents from source DocBook or simple YAML sources.

    In a nutshell, it's not that OpenOffice isn't living up to the hype, it's that so much is crashing down on Microsoft Office in so many different ways that looking only at OpenOffice will not give you the whole story.

  11. MS Office is required by generic-man · · Score: 3, Insightful

    XML is not a selling point for an office suite. Users expect a good user interface and an easy migration. OpenOffice is not there yet. Its help assistant spawns 1024x768 help windows to say as little as "I have automatically capitalized the first letter of your sentence." It has no integrated PIM software to unseat Microsoft Outlook. It has no easy migration path for the millions of users who open documents with useful macros and scripts. OpenOffice has no drop-in replacement for Microsoft Access-driven applications; primitive as Access is, many companies use it to develop simple database applications that would need to be recreated from scratch in another suite.

    At this point in time, there's no reason to switch from Microsoft Office to another office suite simply because this new suite uses XML. XML is best suited as a tool for the back-end developer, not an excuse to migrate to a product that has so many rough edges in its current form.

    --
    For more information, click here.
    1. Re:MS Office is required by karmavore · · Score: 1

      Well there is an obvious solution for the help. They could try an animated staple.

      --
      Speech: Free
      Beer: $699.00
    2. Re:MS Office is required by mAineAc · · Score: 1

      Except that MS Office is kludgy and bloated. It is totally annoying to use. I find Star Office much more productive and it has database tools with it.

    3. Re:MS Office is required by aminorex · · Score: 1

      OpenOffice is not a replacement for MSOffice.
      StarOffice is.

      --
      -I like my women like I like my tea: green-
    4. Re:MS Office is required by Apreche · · Score: 1

      I agree with you that good user interface for an office suite is key. I use OpenOffice, but it isn't the greatest. The reason I use it is because it is as close as I can get to what I want, with or without paying money.

      I know I'm not alone when I say I think all these office suites are too bloated and full of crap. Sure, there are lots of lite word processors out there, but wordpad doesn't cut it either. I can't seem to find a word processor that has all the stuff I need and none of the stuff I don't.

      As for XML, the real advantage is that any old application can modify your files. Let's say I'm playing a game and I change the controls. The game can dynamically change the help files, which can be in the XML format of my word processor. That's the kind of stuff that we need.
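
      A toy sketch of that idea, in Python. The help-file layout here (a help root with binding elements) is entirely made up to keep the example short, and so are the function and variable names. A real OpenOffice or StarOffice document is a zip archive whose content.xml uses namespaced elements, so an actual tool would have to unpack it and address those elements instead.

```python
import xml.etree.ElementTree as ET

def update_binding(help_xml, action, new_key):
    """Rewrite the key shown for one action in a (hypothetical) help file."""
    root = ET.fromstring(help_xml)
    for entry in root.iter("binding"):
        if entry.get("action") == action:
            entry.text = new_key
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")

# Hypothetical help file, invented for this sketch.
help_doc = (
    '<help>'
    '<binding action="jump">Space</binding>'
    '<binding action="fire">Ctrl</binding>'
    '</help>'
)

updated = update_binding(help_doc, "jump", "W")
print(updated)
```

      The point is only that the game needs nothing from the word processor beyond a documented text format and a stock XML library.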

      --
      The GeekNights podcast is going strong. Listen!
    5. Re:MS Office is required by Malcontent · · Score: 1

      "It has no easy migration path for the millions of users who open documents with useful macros and scripts."

      First of all this is a really bad idea. If you are opening up documents with scripts in them stop right now.

      Secondly this can never happen. MS has patents and copyrights on VBA. Open source people will never be allowed to implement that functionality. At best they can implement their own scripting which they have. Unlike MS though they let you choose your own language.

      "At this point in time, there's no reason to switch from Microsoft Office to another office suite simply because this new suite uses XML"

      It looks like you will never switch no matter what. You set criteria for switching so high that the open source developers would be sued or jailed if they tried to please people like you.

      --

      War is necrophilia.

    6. Re:MS Office is required by shyster · · Score: 1
      I can't seem to find a word processor that has all the stuff I need and none of the stuff I don't.

      That's because you need different stuff than I do, for example. Some MS exec once said that even though Word users only use 10% of the features, they all use a different 10%.

      Solution? Modularity. We should be able to pick and choose which options to install. Office gets close with Windows Installer technologies and install on 1st use or disabled selections, but fails to remove all the clutter of the removed features...in effect only saving cheap and abundant disk space.

    7. Re:MS Office is required by Nefarious+Wheel · · Score: 1

      Not all macros are virii. There are some very large, very expensive 3rd party document management suites that rely on them, and without those products thousands of sysadmins are condemned to explain to millions of public servants why they shouldn't store essential documents on their C drive.

      --
      Do not mock my vision of impractical footwear
    8. Re:MS Office is required by ljavelin · · Score: 1

      MS-Office isn't required for me. I switched to OpenOffice just last week when I finally converted my desktop PC to Linux from Windows 2000.

      No, OO isn't perfect. And not all MS-Office documents convert perfectly (I use call-outs in PPT a lot :-( ).

      But it is good and useful. I approve.

    9. Re:MS Office is required by Anonymous Coward · · Score: 0

      No, VIM is. And EMACS is a replacement for VIM.

    10. Re:MS Office is required by shellbeach · · Score: 1
      OpenOffice is not there yet. Its help assistant spawns 1024x768 help windows to say as little as "I have automatically capitalized the first letter of your sentence." It has no integrated PIM software to unseat Microsoft Outlook. It has no easy migration path for the millions of users who open documents with useful macros and scripts. OpenOffice has no drop-in replacement for Microsoft Access-driven applications; primitive as Access is, many companies use it to develop simple database applications that would need to be recreated from scratch in another suite.

      All very true. But for those of us who don't want to use Outlook, don't need Access, and don't want to pay several hundred dollars it's not a bad substitute. There's a long way to go yet before OOo is better than MSOffice. But who said it needed to be better when the vast majority of users don't need anything more than what it provides?

      Getting back on topic, I agree with you in part on the value (or lack thereof) of an XML based format. But it's conceivable that companies could use an open XML format to automate tasks - thus giving them a good reason to switch. Which was basically what this story was asking about ... :)

    11. Re:MS Office is required by An+Onerous+Coward · · Score: 3, Funny

      So that's what OpenOffice has been missing all this time. I knew there was something a bit off about it, but I could never put my finger on it.

      The answer, my friends, is an integrated E-mail/Calendar suite. Integrated right into OpenOffice. This is what will finally drive a stake through Microsoft's undead heart.

      Integrated E-mail. Integrated Calendaring. Right in the office suite. All integrated and everything. You all know you want it. Now go, my toiling minions! Build! Build, I say!

      --

      You want the truthiness? You can't handle the truthiness!

    12. Re:MS Office is required by Malcontent · · Score: 1

      Lots of people willingly lock themselves to one vendor. The people you speak of have willingly led themselves down a dark and expensive path. Every two years not only do they have to upgrade office but also have to upgrade the app written in office macros due to slight differences in the VBA. In all honesty there is no hope for companies with CIOs that stupid.

      Having said all that none of what you say in your post addresses the real issue. I will say it again. Open source developers are not allowed to duplicate that functionality. They can get arrested or sued. They can and have implemented IP free languages in open office. Your stupid ass CIO is not able to extricate himself or his company from the clutches of MS but their competitor just might. I suspect the competitor which is able to shift vendors will save money and eventually win in the marketplace.

      That's the way capitalism works. You are not supposed to be a slave to your vendors, it's supposed to go the other way.

      --

      War is necrophilia.

    13. Re:MS Office is required by Anonymous Coward · · Score: 0
      It has no integrated PIM software to unseat Microsoft Outlook.

      Oh for crying out loud. Just because Microsoft decided to bundle Outlook with MS Office doesn't mean an office suite requires a PIM in order to be defined as such. What does an email client have to do with typing letters or editing spreadsheets? Anyone?

    14. Re:MS Office is required by Anonymous Coward · · Score: 0
      I find Star Office much more productive


      Personally I haven't used Star Office so I wouldn't know.

      and it has database tools with it.


      Obviously you haven't used MS Office, so I'm surprised that you know.

      By the way, both Excel & Access are "database tools" and pretty good ones...in my opinion...if you're dealing with relatively small databases.
    15. Re:MS Office is required by Nefarious+Wheel · · Score: 1
      Ah, such insight. I can assure you that the Bayesian modifier of path-darkness is given very little weight against the cost of support in a large, enterprise-level COTS-based solution, although expensive-ness certainly is.

      I am not talking about a few blocks of VBA script hacked in a back room, I am talking about multi-million dollar applications from vendors who build the support component (i.e. paying programmer's wages for upgrade work) into the solution up-front, sold to customers for whom spending 40% or more on apps support is considered a bargain compared to the large multiplier placed against not having a solution in place.

      In the overall context of a commercial solution, the spreadsheet rules. Lots of effort goes into examining the pros and cons of each selection under the heading of due-diligence. This means paying for the opinions of engineers and strategists to weigh the value of this aspect of support costs. Scripting a MS app with VBA or adding a few macros is changing a small piece of a pre-existing package in a fairly non-invasive way, as opposed to an OS-approach, which allows change to all of it. Is this necessarily the worst way to achieve the lowest total cost of ownership for a package? On the other hand, how many wheels do they have to pay to re-invent before an open-source variant becomes no longer economically viable? The market will decide, as you imply -- e.g. as soon as a better, cheaper, well-packaged Open Source document management system appears on the market you can be sure it will get equal air play in the bid responses. In that context, whether or not Open Source efforts can duplicate the functionality is commercially irrelevant. I know of some deep and complex DMS' that are entirely J2EE except for the MS-doc & macro bits, which are considered presentation-layer only & not worth the effort to worry about. Does this mean they're pandering to Microsoft? I don't think so.

      I am not an apologist for Microsoft, neither am I a believer in the sanctity of Open Source. Code is code, and this is a competitive world. Get over it.

      There are lots of excellent reasons for moving to open source. Emotional appeal isn't one of them.

      --
      Do not mock my vision of impractical footwear
    16. Re:MS Office is required by Malcontent · · Score: 2, Insightful

      Reading your post I get the impression that you are unaware of some important facts. For example you seem to think that open office is not scriptable. I can assure you that it is. It can be scripted in Java or basic (not VBA) or even python. Both the Open office variant of BASIC and Python are open source languages and Java is available from many vendors. What open office can't do is to implement VBA which could get them arrested or sued. BTW most people who have knowledge of multiple programming languages seem to agree that python and Java are vastly superior to VBA.

      Now getting back to the point which you seem intent on ignoring.

      No matter how much you spend, no matter how complex your application if as a result of buying something you become a slave to your vendor then you lose. Vendor lock is bad for business. It's doubly bad if your competition is not also locked in.

      In business you have to control your vendors, you have to play them against each other, you have to constantly keep them on their toes by threatening to drop them and move on to somebody else. A business has to kowtow to clients and has to bully their vendors, not the other way around.

      Your CIO is locked by Microsoft. He can't leave, if he threatens to leave MS will laugh in his face, audit him and then charge him double just for fun.

      I shudder to think what CIO thought that a multi million dollar application built on office macros was a good deal though. When I hear of stuff like this I wish there was a law that forced you to tell me what company he works for. I would like to know so that I don't have any stock in a company that is being managed so badly. A CIO who is that clueless is probably making bad decisions left and right and obviously the CEO or the board of directors are too stupid to call him on it.

      --

      War is necrophilia.

    17. Re:MS Office is required by Nefarious+Wheel · · Score: 1
      Sigh. No, I am not assuming Open Office is not scriptable. Yes, I speak multiple languages, picked up a few in 20 years as a systems programmer. Yes, I prefer Python to VBA. Yes, I agree vendor lock-in is bad, having spent the last ten years as an architect and development manager.

      By the way, when you refer to vendor lock-in do you mean some vendors and not others? Would you include small vendors who write mods to open source apps? It's quite easy to be locked in by them, too -- been there, done that, ate the shirt, that problem has a lineage dating back to the days when software was distributed as Cobol source. Everyone could read it, didn't help because nobody wanted the job of trying to save somebody else's crap code.

      I was referring to certain macros that revector the "save as" options into SAN instead of allowing an army of clerks to write their own filing system. Distasteful to me, godsend to large organisations. And I'm not speaking for any CIO in particular, so I'm afraid I can't give you a single target upon which to vent your wrath. Customers of document management systems are usually organisations constrained by law to have their docs carefully controlled, and that means governments, so you're a stockholder whether you want to be or not.

      Actually, large organisations have more clout with Microsoft than you might believe, and are not as locked in as you think. Think of Telstra, or the government of India.

      --
      Do not mock my vision of impractical footwear
    18. Re:MS Office is required by scrytch · · Score: 1

      > It has no integrated PIM software to unseat Microsoft Outlook

      When Outlook manages to integrate, let me know. As of OL2K (I have not yet used XP, most people are still using 2K), it still has no VBA support, just VBS with a weird forms editor that acts like nothing in any other MS app. It can embed IE, but not any other app, even though IE can. It cannot embed the word or excel *app* in the outlook window -- let alone a clipping, which makes the outlook bar a glorified start menu. When someone attaches a word doc, it'd sure be nice if I could simply convert it to inline automatically, that'd be integration. Dynamically creating a flap on my outlook bar per page ala acrobat would be gravy.

      Jeez, they haven't even managed to integrate IE into Explorer as seamlessly as Konqueror, let alone their office apps. MS's interoperability hasn't really progressed much since Office 4.x on Windows 3.1.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
    19. Re:MS Office is required by General+Cluster · · Score: 1

      I won't name specific companies, but I can promise you that just about every large investment bank on Wall Street has million-dollar VBA apps. I have built many of them. These banks have thousands of Office users, and every one of those employees performs repetitive tasks with the aid of VBA. The savings from the productivity can go up to the tens of millions.

      The CIOs weigh the measurable and significant productivity increase against the EXTREMELY unlikely scenario that the financial industry will migrate to Open office in the next three years and they invest.

      These CIOs are not intimidated by any means. They have no problem choosing Java over C#, or Oracle over SQL Server, but in the office space (which they watch carefully) they do not yet see a clear alternative to Office.

    20. Re:MS Office is required by Anonymous Coward · · Score: 0

      When someone attaches a word doc, it'd sure be nice if I could simply convert it to inline automatically, that'd be integration. Dynamically creating a flap on my outlook bar per page ala acrobat would be gravy
      ^^^^^^^^^^^^^^^^^^^^^

      No, this is dumb. Because of viruses, macro and other types, we are forced to make people open attachments instead of being in-line in the client. *Think* of the consequences of allowing this to be done again.

    21. Re:MS Office is required by Sepper · · Score: 1

      Not exactly...

      What is missing is an Access-like program. Something that lets you use a database and generate formatted documents as output, WITHOUT having to program anything (except maybe SQL queries)...

      I can't remember how many times I saw Access in use... and could not name an alternative that would do the job well....

      --
      I live in Soviet Canuckistan you insensitive clod!
    22. Re:MS Office is required by scrytch · · Score: 1

      I knew someone was going to chime in with this "all automation is bad" business.

      Who said it had to execute macros? If it simply pulled the doc out of the file to preview it, and never executed any macros, a preview would be safer than opening it. I haven't seen a macro virus in the wild for years anyway, and it's not like people really hesitate to open attachments even after the hugely verbose warning popup.

      Secondly, it's not as if it's not *already* executing arbitrary code when you have auto-preview on.

      --
      I've finally had it: until slashdot gets article moderation, I am not coming back.
    23. Re:MS Office is required by Malcontent · · Score: 1

      "By the way, when you refer to vendor lock-in do you mean some vendors and not others? "

      All vendor lock is bad. That is why you must strive to have access to the code of everything in your enterprise. Whether that's done with open source or by actually purchasing it does not matter. Vendors routinely drop products, get bought and sold, change strategies or simply die without warning. In any case you are left holding the bag unless you have access to the source code.

      "Actually, large organisations have more clout with Microsoft than you might believe, and are not as locked in as you think. Think of Telstra, or the government of India."

      I believe Telstra is one of the first companies to sign up for the new Sun "Java Desktop" Linux distribution. India is clearly heading down the Linux path. Apparently they both agree with me.

      --

      War is necrophilia.

  12. Apache module by codepunk · · Score: 5, Interesting

    I sure would like an Apache module that can apply CSS to and display native OpenOffice and StarOffice documents.

    --


    Got Code?
    1. Re:Apache module by mo · · Score: 1

      I believe somebody sells the XSLT scripts to transform OO docs to html and then you can use axkit or cocoon on the server-side.

    2. Re:Apache module by luisdom · · Score: 1

      You almost have it:
      OO has an XSL that renders OO to xhtml.
      So, you only have to write a module that decompresses the .xml files and applies the XSL (for example, with Apache Xalan).
      I'm not sure how the XSL handles the multiplicity of files (content.xml, styles.xml...)

    3. Re:Apache module by panoplos · · Score: 1

      Not that they have the transformations for OO.org XML to HTML, but a framework does exist that would make this procedure trivial at most.
      Have a look at the Apache Cocoon project.

  13. The wonderful things by tlianza · · Score: 1, Troll
    where are all those 'wonderful new things?'
    Ha, the real question is - where is that "huge universe" of documents? That's the input that will eventually spur on such innovations. As long as there are a total of about 6 documents worldwide written in StarOffice format, I wouldn't hold my breath for neat tools to slice and dice them.

    (yes, I said 6... I am exaggerating)

    1. Re:The wonderful things by The_Dougster · · Score: 1
      Ok... mellow... mellow...

      Sorry, I didn't mean to be so darn vicious, but that just ticked me off for some reason. Your comment is both untrue and inflammatory, and it's like you are beating on a baby. Pick on somebody your own size. OO is spanky new and generally used only at home, except in the rare enlightened companies which never caved in to MS strong-arm tactics and still run Unix or the like.

      OO is better than MS Office and it always will be. Any gift given out of charity is much more meaningful than a similar thing which you bought. If I destroyed something your Father gave you, then offered to "buy you a replacement" how would you feel?

      --
      Clickety Click ...
    2. Re:The wonderful things by shyster · · Score: 1
      If I destroyed something your Father gave you, then offered to "buy you a replacement" how would you feel?

      Well, I don't think you could buy a replacement for the rights to life, liberty or the pursuit of happiness (or property)...those are the things my "Father" gave to me. So I would be very sad indeed.

    3. Re:The wonderful things by The_Dougster · · Score: 1
      Well, I don't think you could buy a replacement for the rights to life, liberty or the pursuit of happiness (or property)...those are the things my "Father" gave to me. So I would be very sad indeed.

      Well said, and I agree, although I wasn't intending to get that deep here. Even so, these are the kinds of feelings that get invoked. A gift, which is essentially what Open Office is, is almost always more meaningful and valuable than a purchased item, even if the purchased item is supposedly superior (which in this case it clearly is not).

      Hundreds and thousands of people have given their valuable time and effort, donating the time from their lives, the one thing that each of us has in so limited a supply, to make this wonderful Open Office suite, so that poor students and, well, anybody can have a first-class little computer system without paying tribute to the MS monster that murdered all the other little guys like WordPerfect and Lotus 1-2-3 so that they could be the only game in town.

      Microsoft:

      • Bought DOS to begin with!
      • Stole IBM OS/2 technology and made Windows 3.0 and NT
      • Blatantly copied Apple MacOS in a sucky way
      • Crushed all competitors mercilessly and now stifle newcomers
      • Were responsible for thousands of buggy commercial programs
      • Have caused untold trillions of dollars of lost productivity
      • Make my Mom have to deal with stupid viruses

      I've watched them claw their way to the top over the last fifteen years. You aren't going to get me to ever say one good thing about them until they are gone. They are like some kind of organized crime group and I for one will not deal with them ever.
      --
      Clickety Click ...
  14. PHP Script that generated reports by brandonp · · Score: 5, Interesting

    I created a PHP script a few months ago that allowed a client to upload StarOffice templates for company documents. Then the script automatically generated documents by pulling data from a database and inserting it into the StarOffice document.

    Was really easy, StarOffice documents are zipped files that contain the XML files. I just unzip'ed the file, inserted the appropriate data into the content.xml file and zipped it back up.

    I was absolutely amazed by how easy the StarOffice files were to work with. I'm really excited about the possibilities that are in store for us, especially ones that are better than my little hack.
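
    A minimal sketch of that unzip / edit / re-zip round trip, in Python rather than PHP. The function name fill_template and the |key| placeholder convention are assumptions for illustration, not details of the original script. It also assumes each placeholder sits in one unbroken run of text inside content.xml; a placeholder typed in the word processor can end up split across XML elements, in which case a plain string replace will miss it.

```python
import zipfile
from xml.sax.saxutils import escape


def fill_template(template_path, output_path, values):
    """Copy a zipped XML document, replacing |key| placeholders in content.xml.

    `values` maps placeholder names to replacement strings. Every other
    member of the archive (styles.xml, meta.xml, ...) is copied unchanged.
    """
    with zipfile.ZipFile(template_path) as src, \
            zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                text = data.decode("utf-8")
                for key, value in values.items():
                    # escape() keeps &, < and > from breaking the XML
                    text = text.replace("|" + key + "|", escape(value))
                data = text.encode("utf-8")
            # Passing the original ZipInfo keeps each member's name and
            # compression method as they were in the template.
            dst.writestr(item, data)
```

    Something like fill_template("invoice.sxw", "invoice-0042.sxw", {"customer": "Acme & Co"}) would then produce a filled-in copy that opens like any other document (the file names here are made up).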

    Brandon Petersen

    1. Re:PHP Script that generated reports by hattig · · Score: 2, Interesting

      Sounds cool. Now is there a command line tool that can take said resultant XML file and create a PDF from it?

      (would be great for certain automated server applications where there is no display, etc, and running StarOffice isn't an option because you want it automated)

    2. Re:PHP Script that generated reports by brandonp · · Score: 1

      That is what I was wishing for. I settled for creating PDFs in PHP using FPDF. If only there was a command line tool to create the PDFs from an OpenOffice file, I could maintain the appearance of the customer's templates.

      Instead, I have to create a class file to generate pdf reports with FPDF, taking all the freedom from the client to update their own reports without my help.

      You make a very good point.

    3. Re:PHP Script that generated reports by awtbfb · · Score: 2, Funny


      It would be nice to not be constantly pestered about TPS Reports. Now where's my red stapler...

    4. Re:PHP Script that generated reports by shellbeach · · Score: 1

      Is there a reason why you can't just use OOo's "export as PDF" function?

    5. Re:PHP Script that generated reports by Daengbo · · Score: 1

      In order to use OOo to do automatic PDF export, you need a dummy user, an extra X session on tty8, and a cron script, all with OOo open all the time. Seems kind of a waste, huh?

    6. Re:PHP Script that generated reports by goodEvans · · Score: 1

      Thank you

      Thank you, thank you, thank you, thank you.

      Yours was the first of many messages on this theme that I saw, and you just made my life a whole lot easier. I had one of those "if we did this, then we could do this, then this, then this..." epiphanies.

      Thank You!

    7. Re:PHP Script that generated reports by Anonymous Coward · · Score: 0

      Use Adobe Distiller or write an OO macro; check this post on the OO dev list.

    8. Re:PHP Script that generated reports by hattig · · Score: 1

      Will OOo run/install with no X server installed?

      (Even StarOffice: version 7 has PDF export built in. What I have in mind is a StarOffice "tools" install that includes the required libraries and command line applications, such as so2pdf, and that can be invoked on a server without X11 installed, without installing the whole application.)

    9. Re:PHP Script that generated reports by Anonymous Coward · · Score: 0

      Check the OO archives. According to a report you can at least install OO without X11 running, whether you need the libraries is not clear. If you can get OO to print without X you essentially have an OOxml to ps converter and converting ps to pdf is fairly trivial. (AFAIK, this is how OO pdf output actually works.)
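
      The ps-to-pdf half of that is easy to script. A small sketch in Python, assuming Ghostscript's ps2pdf wrapper is installed and on the PATH; the function names are made up for illustration, and getting OO to print the PostScript in the first place is not covered here.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_path, pdf_path=None):
    """Build the ps2pdf argument list; the output defaults to <input>.pdf."""
    ps_path = Path(ps_path)
    pdf_path = Path(pdf_path) if pdf_path else ps_path.with_suffix(".pdf")
    return ["ps2pdf", str(ps_path), str(pdf_path)]


def convert(ps_path, pdf_path=None):
    """Run ps2pdf and return the path of the PDF it was asked to write."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf (Ghostscript) not found on PATH")
    cmd = ps2pdf_command(ps_path, pdf_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return cmd[-1]
```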

    10. Re:PHP Script that generated reports by Rogerborg · · Score: 1

      You had an epiphany that you can use an XML parser to produce some really ugly, kludgy CSS/HTML that still won't look like what the XML schema describes, rather than passing the DTD to the client and letting it display it properly?

      --
      If you were blocking sigs, you wouldn't have to read this.
    11. Re:PHP Script that generated reports by Omni-Cognate · · Score: 1

      I'm no expert on this kind of thing, but that sounds nothing like what the original poster described. The server wasn't generating HTML, it was generating StarOffice documents. The user would upload a staroffice document, and the server would use it as a template to generate new documents by inserting data into the appropriate places, which was easy to do because staroffice documents are just zipped up XML files. Once the server had inserted the data, the server would zip the files up again and hand out a *staroffice document*. I see no CSS or HTML anywhere in this process.

      --

      "The Milliard Gargantubrain? A mere abacus - mention it not."

    12. Re:PHP Script that generated reports by goodEvans · · Score: 1

      No, I had an epiphany that I could create a set of standard documents in a new directory when someone sets up a new job, and put job details in them at the same time.

    13. Re:PHP Script that generated reports by Anonymous Coward · · Score: 0

      Exactly. I needed to generate a word processing document of CD Labels pre-filled with text specific to the user. With StarOffice, this was a completely trivial exercise. Unzip, search-and-replace, zip, done.

      I was quite impressed with what StarOffice had done with their XML file format. I didn't need to refer to any documentation, it was all quite obvious what went where. Hint: OOo lets you save XML with and without whitespace (indenting and line breaks).

      By the way, XML isn't the only text-based file format, RTF is textual, too. But XML is a lot easier to reverse engineer; there's nothing to learn, you just read what's there.

  15. Yes, Standardised Financial Reports by jechonias · · Score: 5, Interesting

    The biggest dream that the financial world has ever had with an XML concept has been the concept of standardised financial reports.

    Imagine a world where any financial (Excel-based or otherwise) report from any public company can be compared with any other company report and we can all be sure of how the figures were calculated and what they mean.

    AND they are fully comparable. And fully importable into any financial package. No longer is any one company dependent on one financial package. Come to think of it there is no way the vendors of such products will ever allow this to happen!!!

    http://www.xbrl.org/

    jech

  16. Command line rendering by pirodude · · Score: 4, Interesting

    If there was a way to render out the open office/star office documents on the command line it would explode in the reporting area. Being able to have the end user making a really nice template and have a perl script fill it then pass it off to a pdf or printer is key.

    1. Re:Command line rendering by Coryoth · · Score: 1
      Being able to have the end user making a really nice template and have a perl script fill it then pass it off to a pdf or printer is key.


      As other people have commented elsewhere, this was one reason nroff and TeX were so cool - you wrote a basic template and it was very easy to have a script fill in the fields - then voila postscript ready to go to the printer. I don't use nroff/troff, so I don't know if it has a PDF output these days (though I imagine it does). Certainly TeX does.


      I did always wonder why no one made a nice frontend for generating TeX templates etc. for exactly this purpose - let the end users design the templates instead of a TeX guru...


      Jedidiah

    2. Re:Command line rendering by rmohr02 · · Score: 1
      If there was a way to render out the open office/star office documents on the command line it would explode in the reporting area.
      Try 'unzip document.sxw;cat content.xml'.
  17. Reporting is a great use of OOo's XML format. by Gravatite · · Score: 5, Interesting

    My team & I just got done building some billing software for one of our customers, and OpenOffice.org's XML based documents turned out to be perfect for generating reports. Our customer is able to open up the document and change the formatting of any report at will, and then we have some Ruby code on the backend that parses the XML document, fills in all the real data from the database and then uses the CLI interface to OpenOffice to render the document as postscript. It was a quick easy way to get powerful report generation with a format that non-technical people could edit that required just a little bit of glue code on the backend, and it's the XML format that made it all possible.

  18. Difficult by iMMo · · Score: 2, Interesting

    I did take some time and decompress a StarOffice document -- I was attempting to write a couple of modules for manipulating StarDraw images to create dynamic flowcharts.

    It took some time to get up to speed, as the compressed XML is split across four different files (content, meta data, settings and styles). Mostly, I was concerned with modifying the content document.

    Each of the documents is written with space in mind, and for the document I was dealing with, the content was 20K on a single line. I had to process the XML just so I could understand the physical structure. Once that was done, it really wasn't that difficult to manipulate the doc by hand, re-zip the content and open in StarOffice.
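
    For just reading that single 20K line, a few lines of Python will do. This is a sketch using only the standard library; pretty_content is a made-up name, and the indented output is meant for inspection only, since the added whitespace should not be written back into the document.

```python
import zipfile
from xml.dom import minidom


def pretty_content(path, member="content.xml"):
    """Return one XML member of a zipped document as indented text."""
    with zipfile.ZipFile(path) as archive:
        raw = archive.read(member)
    # parseString accepts the raw bytes; toprettyxml re-indents the tree
    return minidom.parseString(raw).toprettyxml(indent="  ")
```

    Calling print(pretty_content("flowchart.sxd")) (file name made up) shows the structure one element per line; pass member="styles.xml" to look at one of the other files.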

    (Unfortunately, I didn't have the time to even start, much less complete, the modules. Damn day job).

    1. Re:Difficult by WayneConrad · · Score: 1

      I worked with Gravatite on this. He did the part where we got OpenOffice to print from the command line; I did the XML munging. It would have been hard, but of course we cheated :)

      First we configured OpenOffice so that it saved our report templates in friendlier XML format, not all on one line (uncheck Tools -> Options -> Load/Save -> General/Size Optimization for XML Format).

      Next, we didn't actually parse the XML. We unzipped the zips, then did a simple regex search & replace to get data into the report. For example, we might replace "|date|" with "2003-03-01". For the parts of the report that had a variable number of rows, we used an OpenOffice table, which makes a pretty regular structure; that makes it very easy to find a row and duplicate it without actually having to parse the XML for real.

      Then we'd zip it all back up and feed it to OpenOffice to print.

      Cheating is good for you (tastes good, too).
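
      Roughly what that cheating looks like, sketched in Python rather than our actual code. The |name| placeholder style follows the "|date|" example above; the function names and the table:table-row pattern are assumptions for illustration. The non-greedy regex relies on the table being flat, so a table nested inside a row would confuse it.

```python
import re
from xml.sax.saxutils import escape

# One table row, from its opening tag to the first closing tag after it.
ROW_RE = re.compile(
    r"<table:table-row(?:\s[^>]*)?>.*?</table:table-row>", re.DOTALL)


def fill(text, values):
    """Replace each |name| with values[name]; unknown names are left alone."""
    def substitute(match):
        value = values.get(match.group(1), match.group(0))
        return escape(str(value))
    return re.sub(r"\|(\w+)\|", substitute, text)


def expand_rows(xml, marker, records):
    """Duplicate the first table row containing `marker`, once per record."""
    for match in ROW_RE.finditer(xml):
        row = match.group(0)
        if marker in row:
            filled = "".join(fill(row, record) for record in records)
            return xml[:match.start()] + filled + xml[match.end():]
    return xml  # no template row found; leave the document untouched
```

      Run expand_rows over the variable-length parts first, then fill over the whole string for the one-off fields like the date.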

  19. True WYSIWYG HTML editor by Delirium+Tremens · · Score: 2, Interesting

    XML developers and Web designers are now able to work on some XML-to-HTML transformer that matches closely what the average office user is spending his time creating with the WYSIWYG Writer program. This could be a nice alternative to Frontpage, for example.
    Of course, OpenOffice 1.1 already comes with a nice HTML tool, but that doesn't stop anyone from trying to do better.

    1. Re:True WYSIWYG HTML editor by delta407 · · Score: 2, Interesting
      are now able to work on some XML-to-HTML transformer that matches closely what the average office user is spending his time creating
      The guys at Typo3 have done exactly this. They wrote an extension that takes a normal Office 2003 XML document (like this one) and displays it as normal HTML (like this). The resulting HTML is subject to the same rules as all of the other HTML produced by Typo3, which means the appearance of everything can still be changed by modifying a template.

      Typo3 has always been feature-rich (though terribly complex), and an XML-based document interchange system that can handle documents made in common word processors is a very useful feature indeed.
  20. Automatic Generation of Pretty Reports by pjack76 · · Score: 5, Interesting
    You know, with charts and graphs and your corporate logo on them. The charts and graphs are populated from a database somewhere. Suitable for your board report.

    I bring it up because my organization paid Crystal reports $10,000 to be able to do this. If I could have written a little perl script that connects to the database and emits an OpenOffice doc, then I could have saved the organization ten thousand dollars, and saved myself a world of pain. (The only thing more evil than Crystal Reports is crystal meth.)

    You might be wondering why I wouldn't just use HTML and some library that automatically creates chart PNG images -- the reason is we have to email the report to our board members because they're demanding like that. So we use Crystal to generate pretty PDFs with all the charts. We also let the board members log into our system to generate their own reports via the web, which they can then email to the group.

    So having an XML-based document format for this would be wonderful, especially if OpenOffice would provide a command-line utility for converting from OO format to PDF.

    --

    Wow, a lucrative publishing contract! I don't have to be evil anymore. --Meteor

    1. Re:Automatic Generation of Pretty Reports by footNipple · · Score: 1
      (The only thing more evil than Crystal Reports is crystal meth.)

      LOL...Well I guess I'm off to prison if the crystal reports lab in my basement ever gets busted

    2. Re:Automatic Generation of Pretty Reports by Mournblade · · Score: 1

      I don't know about command line, but I do know that the newer versions of OO.o sport an export-to-pdf function. I'd be surprised if it couldn't be accessed through the command line somehow.

    3. Re:Automatic Generation of Pretty Reports by mabhatter654 · · Score: 2, Insightful

      Only problem is that it doesn't carry over any metadata: hyperlinks, bookmarks, etc. It's just a cold rip of the pages. That limits its usefulness because you can't do anything with the resultant PDF [i.e. HR manual, reports, manuals], just look at it. That's severely limiting for corporate use.

    4. Re:Automatic Generation of Pretty Reports by merlin_jim · · Score: 4, Funny

      The only thing more evil than Crystal Reports is crystal meth

      Funny you should mention that... I'm at work right now (10:00 PM local time; been here since 9:00 AM) for that very reason! And I'll give you a hint, I've never touched crystal meth

      --
      I am disrespectful to dirt! Can you see that I am serious?!
    5. Re:Automatic Generation of Pretty Reports by Coryoth · · Score: 1
      So having an XML-based document format for this would be wonderful, especially if OpenOffice would provide a command-line utility for converting from OO format to PDF.


      If all you want to do is get to PDF then LaTeX and pdfLaTeX are not a bad method for doing it. Certainly the hyperref package makes producing fully linked PDF documents a breeze, and in pdfLaTeX inclusion of PNG graphics is also very easy.


      The distinct downside is that ultimately you want to write your own documentclass in TeX/LaTeX for the reports (thus alleviating a lot of the hassle of trying to force a basic LaTeX documentclass to do formatting it wasn't intended to do). This can create very simple to generate and very professional looking output - but it is a bitch to write, and you really want to learn your TeX to do it right. It would be a "write once" deal though if you do it right.


      If OpenOffice had a nice easy way to do this sort of thing, so you didn't have to be a TeX guru to set it all up, then certainly it would be great.


      Jedidiah

    6. Re:Automatic Generation of Pretty Reports by Ugmo · · Score: 2, Interesting

      I used to make PDFs with Perl scripts from database reports. I made HTML documents from the database queries and then used HTML2PS to make PostScript files. I could make PDFs from the PostScript files; see GSView, it comes with a script, ps2pdf. The results were mailed to interested parties.
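
      The first step of that pipeline, sketched in Python instead of Perl. The sqlite3 connection, the function name, and the bare-bones table markup are all assumptions for illustration, not what my scripts looked like. The returned HTML would be saved to a file and then handed to html2ps and ps2pdf as described above.

```python
import sqlite3
from html import escape


def query_to_html(conn, sql, title):
    """Run a query and return its result set as a simple HTML table page."""
    cur = conn.execute(sql)
    headers = [col[0] for col in cur.description]  # column names
    lines = [
        "<html><head><title>%s</title></head><body>" % escape(title),
        '<table border="1">',
        "<tr>" + "".join("<th>%s</th>" % escape(h) for h in headers) + "</tr>",
    ]
    for row in cur:
        cells = "".join("<td>%s</td>" % escape(str(v)) for v in row)
        lines.append("<tr>" + cells + "</tr>")
    lines += ["</table>", "</body></html>"]
    return "\n".join(lines)
```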

      I made use of "Programming Web Graphics with Perl and GNU Software" O'Reilly Book and some extra research on the Web. It was mostly a pretty print of lots of HTML tables as PDF's + text.

      Some customers demanded Word docs.
      I tried using RTF to produce Word doc files and found it was easier to output HTML and put a .doc extension on the file. I found MS-Word will automatically open it up and it will look nice.

      I did not output graphs. You could try using Gnuplot to output graphs in PostScript. A little cutting and pasting of the PostScript files (tables and text from HTML2PS in one file, graphs from gnuplot in another), paste them together with Perl and turn the whole thing into a PDF (html2ps then ps2pdf), should produce something, though I do not know if it would duplicate your Crystal Reports.

    7. Re:Automatic Generation of Pretty Reports by otprof · · Score: 1
      The distinct downside is that ultimately you want to write your own documentclass in TeX/LaTeX for the reports (thus alleviating a lot of the hassle of trying to force a basic LaTeX documentclass to do formatting it wasn't intended to do). This can create very simple to generate and very professional looking output - but it is a bitch to write, and you really want to learn your TeX to do it right. It would be a "write once" deal though if you do it right.

      I don't think you'd necessarily need to write your own document class, though it would be nice to have. Instead of that, I'd use the Memoir class, developed by the excellent Peter Wilson. At CTAN, check out macros/latex/contrib/memoir/.

      I used it to uglify my dissertation, breaking almost every classical rule of typesetting in the process of satisfying our "style" guideline.

      Memoir is VERY configurable, and the best documented LaTeX package I've ever worked with.

    8. Re:Automatic Generation of Pretty Reports by Deusy · · Score: 1

      Actually, Crystal Reports is easy once you know how. Finding out that 'how' takes forever.

      The trick is to start at the table which defines the initial ordering.

      So, for an example of a report on client accounts, sorted by client, then by year:

      Client -> Account -> Year

      It took me 3 days to do my first report. It's not taken more than 20 minutes to do one since.

      --

      Free Gamer - Free games list and commentary

    9. Re:Automatic Generation of Pretty Reports by merlin_jim · · Score: 1

      The trick is to start at the table which defines the initial ordering.

      No, the trick is getting a dll that likes your particular version of the .NET framework, interoperates with the other developer's, and doesn't throw license errors every few minutes...

      --
      I am disrespectful to dirt! Can you see that I am serious?!
    10. Re:Automatic Generation of Pretty Reports by BigBadBri · · Score: 1
      May I suggest some crystal meth - it'll have your synapses scintillating and sorting out Crystal Reports in no time!

      ;)

      --
      oh brave new world, that has such people in it!
  21. Well, I would be... by Coryoth · · Score: 1

    But I don't use office suites. I have plenty of perl scripts to play with, reformat (so to speak - convert articles to slides etc.), and produce LaTeX, which has been a readily available option for years. I'm sure if I end up using StarOffice or OpenOffice.org then I may well be inclined to produce useful scripts for those - in the meantime though I'm quite happy with what I've got.

    Jedidiah.

  22. XML by clinko · · Score: 1

    I like the idea of XML but I can't find one good source of all the XML data lists.

    SOMEONE FOR THE LOVE OF GOD give me a list of XML sites so I can actually finish my app that uses it (hence the "it's still beta" in my sig for about 2 years now)

    later

    1. Re:XML by Doug+Merritt · · Score: 1
      I can't find one good source of all the XML data lists....give me a list of XML sites so I can actually finish my app

      Hmm? I know XML, but I have absolutely no idea what you are talking about. If you really want people to answer your plea, I suggest you be a lot more specific about what you are asking for.

      --
      Professional Wild-Eyed Visionary
  23. Bias by Peaker · · Score: 1

    Offtopic, true.
    But what's this bias people have for the inferior Perl? More and more people realize that Python is superior in almost every possible way...

    1. Re:Bias by Anonymous Coward · · Score: 1, Insightful

      er...? Python superior?

      Any language where white space is important to determining the blocking structure (e.g. Make leaps to mind) is badly broken. You don't want to totally ignore white space (FORTRAN leaps to mind) but you don't want the number of spaces/tabs before a statement to indicate anything significant.

      Of course, any programming that looks like line noise (e.g. APL or TECO) is also badly broken. Since Perl can look like line noise, I think this applies.

      Java ... now, there's a great language.

      - David

    2. Re:Bias by Teflik · · Score: 1
      Any language where white space is important to determining the blocking structure (e.g. Make leaps to mind) is badly broken... you don't want the number of spaces/tabs before a statement to indicate anything significant.
      Uh... why not?
    3. Re:Bias by Smallpond · · Score: 1

      Yeah, yeah, and Beta was better than VHS. Get over it.

      You can take my Perl away when you pry it from my cold, dead keyboard.

    4. Re:Bias by Anonymous Coward · · Score: 0

      (strikes the match)

      (lights the fuse)

      (runs behind a boulder)

      (5... 4... 3... 2... 1...)

      Perl fan: Perl's text-processing is vastly superior to Python. What are you smoking?

      (begun, this flame war has)

    5. Re:Bias by Peaker · · Score: 1

      Any language where white space is important to determining the blocking structure (e.g. Make leaps to mind) is badly broken. You don't want to totally ignore white space (FORTRAN leaps to mind) but you don't want the number of spaces/tabs before a statement to indicate anything significant.

      Huh? Why?
      All programmers indent their code. Those that don't aren't even programmers. Why do programmers indent their code? So that humans are able to read it. Why do programmers place braces? So that the compiler would be able to read it.

      Python just lets you indent for the humans and take care of the compiler at the same time.

      Also, it makes sure that the way the code looks is synchronized with what it does. While indentation is insignificant to many compilers, it is very significant to human readers, causing a discrepancy between the program the programmer reads and the one that the compiler reads.
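
      A minimal sketch of that point, with a made-up function name: the indentation a human reads is the same thing the interpreter uses to decide which statements belong to which block.

```python
def sum_of_evens(numbers):
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n   # indented under the if: runs only for even n
    return total         # dedented to function level: runs once, after the loop


print(sum_of_evens([1, 2, 3, 4]))  # prints 6
```

      Moving the return one level deeper would change both what the reader sees and what the program does, so the two cannot drift apart.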

      The "It uses whitespace - it sucks" claim about Python is quite weak, baseless and used by rather ignorant people who never really tried using it. Having used significant whitespace for a while, I can say I'll never look back.

      Java ... now, there's a great language.

      I must say I haven't used Java a lot myself, but any language that combines poor performance with static typing is a hybrid of the worst of C++ and Smalltalk (The quote "Java has the simplicity of C++ and the blazing speed of Smalltalk" comes to mind).

      Java is overly "pure" about its OO constructs (Everything must be in a class, etc) while being too weak on its functional constructs (No lexical scoping and closure-like code).

      Pretty much any Java program narrows down to a much much simpler and smaller Python program.

  24. Word to RTF to XML to HTML by PeterHammer · · Score: 5, Interesting

    At my company, once a failed startup with new life under the wings of a huge corporate parent, we have been using a homebrewed Web publishing system that takes Word 2000 or XP documents, saves them in RTF format, then uses a utility created by Majix to transform the document to XML. From there we use Perl and some XSL to get the document into XHTML combined with some JSP to produce documents that we deploy on our production environment. The good part: the system was entirely free of license fees (other than Office and Windows, of course). The bad: it was a pain in the behind to get all the parts together.

    The steps to produce valid XML from Word are the biggest hack I have ever been a part of as an engineer. We had to write a custom VB DLL we run inside (what else) an IIS server which takes the documents uploaded by authors, then saves the documents as RTF. Control is then handed over to Tomcat, which takes the RTF and uses some custom classes that make Majix a server to transform the documents into XML. All in all we had to use VB, VBA, Java, JSP; two separate server configurations (IIS and Tomcat) and a bunch of really ugly glue to stitch all the parts together.

    I for one, and I am sure I speak for my entire team, would love a solution which saves us this ugly kludge.

    1. Re:Word to RTF to XML to HTML by codepunk · · Score: 1

      If building what you describe today I would make it short work and rock solid by using BIE Business Integration Engine to glue it together.

      --


      Got Code?
    2. Re:Word to RTF to XML to HTML by Anonymous Coward · · Score: 0

      If you're already automating Word using VBA, why not just save as HTML rather than RTF?

    3. Re:Word to RTF to XML to HTML by PeterHammer · · Score: 1

      Because good old Word-generated HTML is about as ugly as it gets, and it is not XHTML to boot. It would be too much to try and anticipate every piece of garbage they insert. Majix has a nice little DTD that its documents conform to, so once it gets to the XML it is an easy job for Perl (to get rid of angled quotes and other obnoxious Unicode) and XSL to get to XHTML and application logic.
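
      The quote cleanup step might look roughly like the sketch below. It is written in Python rather than the Perl actually used, and the character list and the name clean_text are assumptions for illustration, not the real script.

```python
# Map common typographic ("smart") punctuation to plain ASCII equivalents.
# This table is an illustrative assumption, not an exhaustive list.
SMART_PUNCTUATION = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2026": "...",  # horizontal ellipsis
})


def clean_text(text):
    """Return text with the mapped characters replaced; everything else is untouched."""
    return text.translate(SMART_PUNCTUATION)
```

      Running this over the text content before the XSL step keeps the stylesheet free of character-level special cases.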

    4. Re:Word to RTF to XML to HTML by PeterHammer · · Score: 1

      But you still need to write custom code to plug Majix and Word (though I assume someone has that around) into BIE.

      It would still be a lot easier if Word generated XML in the first place.

    5. Re:Word to RTF to XML to HTML by Anonymous Coward · · Score: 0
      Web Interface > Logictran > Docbook XML > XSL-T > HTML

      Cross platform, your language of choice, no Java or ASP necessarily involved.

    6. Re:Word to RTF to XML to HTML by Anonymous Coward · · Score: 0

      That's crazy. Why not use FrameMaker? The licensing is probably less than MS Word, and, translation to XML is easy enough (and from thence, anything). If you don't want to futz around with writing your own XML DTDs and the corresponding FrameMaker EDDs, it supports DocBook XML out of the box.

  25. XML aside, the PDF support rocks by Anonymous Coward · · Score: 0

    I just tried out the RC for OpenOffice 1.1 and it rocks. It would be nice if OpenOffice text document generated links for index and table. It's probably just user error on my part.

  26. The more things change... by Billy+the+Mountain · · Score: 1

    XML (in word processors, at least) is nothing really new. Remember WordPerfect? It had a feature called "Reveal Codes" which when activated displayed the underlying "markup" behind the document. One could argue that this was a primitive XML format. I argue that while it was great and all, such an accessible format worked well but didn't inspire great advances in unimaginable new ideas.

    BTM

    --
    That was the turning point of my life--I went from negative zero to positive zero.
    1. Re:The more things change... by NumLk · · Score: 2, Insightful

      I fondly do remember WordPerfect's Reveal Codes feature. While this is more a reflection on the simplistic nature of WordPerfect (and other word processors of the day), being able to see all of the formatting codes as they appeared in a document was great help when trying to format a document to look a certain way, but have it turn out completely different. Also, if I remember correctly, you could even type in the codes exactly where you wanted them to appear.

      --
      Children in the backseats don't cause accidents. Accidents in the back seats cause children.
    2. Re:The more things change... by Anonymous Coward · · Score: 0

      no, but it sure did make my secretaries MUCH more productive than what they are today. excel has got to be one of the buggiest pieces of crap I've ever come across. go ahead and try cross-linking dozens of xls files attaching to various spreadsheets in each document. god forbid if you need to do a sumif across something like this.

      funny, but star office hasn't presented the same problems...

  27. Two Things... by Serapth · · Score: 2, Interesting

    First Off
    Microsoft did not drop the ball with XML. Microsoft disappointed the Slashdot crowd by not going completely open... geee...... big shock there. Microsoft maintains dominance of their office suite by controlling the file formats behind it. Opening that up without reason would be absolutely stupid from a business point of view. Granted, it's an unpopular stance, but that doesn't make it any less true. MS played along with the XML game to be able to use XML as a buzzword... and in some ways, they truly have embraced XML... just not in their holy cash cow called Office. Take a look at Visual Studio (dot) Net, and you will see how strongly MS has in fact embraced XML.

    Secondly...
    XML is perhaps one of the most overhyped technologies ever. Self-describing datatypes are nothing new. The only really remarkable thing about XML is how embraced by the industry it was. In all honesty... the difference between XML and CSV files really isn't that significant. Granted... XML is far beyond anything a CSV ever did, but they all present the same result. In the current work environment I am in, all our enterprise systems support input/output now via CSV. In addition, I'm in the auto industry, so the whole hype of Webservices+XML really isn't that special either. Right now, they have ANX and EDI... granted... XML + Web Services would be much more straightforward... but in 20+ years of evolution... has it really come that far?

    Sorry for the anti-status-quo opinion, but I can't help but believe that XML is way overhyped. Useful... sure... but definitely overhyped!

    1. Re:Two Things... by Deep+Esophagus · · Score: 1

      I'm with you on this one. I have been a database applications developer for 18 years, and XML is the pits. Why does the industry hate the ISAM format so much? You can do more with a single line of dBase/Clipper/FoxPro than you can with a page and a half of SQL statements. Try searching by last name + date of birth on a 200,000 element census database in under 2 seconds with an enormous, clumsy XML file and see how far you get. A pox on XML... long live Wayne Ratliff.

    2. Re:Two Things... by Anonymous Coward · · Score: 1, Insightful

      There's a huge difference between XML and CSV-type files. There's a huge range of stuff you can do in XML that is impossible in CSV-type files.

      Specifically, XML allows some really interesting data structuring plus validation that's really powerful (DTDs and Schemas).
      - David

    3. Re:Two Things... by TummyX · · Score: 4, Insightful

      What are you talking about?

      CSV? LOL.

      Does CSV have a transformation language (XSLT)?
      Does CSV have an easy to use parser & object model (SAX, DOM)?
      Does CSV have an in document addressing language (XPATH)?
      Does CSV have a standard way of supporting hierarchical data?

      Just cause you think it's overhyped doesn't mean it isn't worth every bit of that hype. I've been using XML since 1998. I shudder when I think about the pre-XML days.

    4. Re:Two Things... by Serapth · · Score: 2, Interesting

      I think you misunderstand me here. You say you shudder to think of the pre-XML days... well, the pre-XML days, well... they were CSV.

      Now... the thing is, many of the things you have mentioned are already expressed by relational databases, which is generally what the CSV file is generated from in a batch based system. In a lot of ways, that stuff already existed... just not in the file format, but in the process of creating said file format!

      Don't get me wrong, I'm not saying that XML is shitty... I'm just saying that XML is way overhyped. For a replacement for a system that has been around for 20 years... is XML really that special? Does XML really grant us that much beyond what CSV and good databases behind the scenes already provide??? The proprietary nature of XML schemas really doesn't make the standard all that open, now does it? It has the capacity of being an open standard, but on the whole, you often need to know the format in advance... how is that much different from what CSVs and standard batch outputs have already presented?

      XML is not much of a step forward really... it is a step forward no doubt... but perhaps the best solution would be to make the data self enacting. Namely, couple the logic to the data... so that code and data can exist as one.

    5. Re:Two Things... by aminorex · · Score: 1

      Actually, it is under-hyped.

      I did a hypometric regression on late 90s
      technologies, and XML was 27% below the
      weighted mean.

      --
      -I like my women like I like my tea: green-
    6. Re:Two Things... by jefu · · Score: 1
      the difference between XML and CSV files really isn't that significant
      Granted... XML is far beyond anything a CSV ever did

      I'm confused. Is the difference not that significant or is XML far beyond...?

    7. Re:Two Things... by TummyX · · Score: 2, Insightful


      Does XML really grant us that much beyond what CSV and good databases behind the scenes already provide???


      Yes because XML fits in places where databases aren't even worth considering. If you think XML is a replacement for relational databases then you're a bit lost IMO.

      How many generic CSV parsers are there? Are the fields (tabs?) self describing?

      Think of an OS and applications today and the various files they use. Think of configuration files, shortcut files, bookmark files, document files, project files etc. Think of all those files that have until recently all been stored in proprietary, hard to interpret and sometimes buggy binary files.

      Yeesh.

      XML is a huge step forward.

    8. Re:Two Things... by p_tweak · · Score: 1

      Does CSV have an easy to use parser & object model?

      • CSV
        strtok - The manpage is 1 page long...
      • XML
        - Hummmm... PLEASE tell me where I can find a library that requires less than 20 pages reading to use.
    9. Re:Two Things... by ReelOddeeo · · Score: 1

      In all honesty... the difference between XML and CSV files really isn't that significant.

      Um..., wrong.

      Where do I even begin with the problems with CSV?

      First, I suppose, no two vendors even agree what CSV is other than it has something to do with commas. Are values enclosed in quotes? Single or double? What about if a quote is in a value? No quotes? Okay, then what about if a comma appears in a value? Use an escape character? Which one? There are different conventions for escaping quotes. Suppose the value I want to enclose is 12" Drill (i.e. twelve inch drill), an inventory item.

      "12"" Drill", "AX-1234"

      Or maybe...

      "12\" Drill", "AX-1234"

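      A small sketch of those two conventions using Python's standard csv module. The helper names write_row and read_row are made up for this example; the point is only that the same record serializes differently and reads back wrongly when the reader assumes the other convention.

```python
import csv
import io

row = ['12" Drill', "AX-1234"]


def write_row(**dialect_options):
    """Serialize the sample row with the given csv dialect options."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n", **dialect_options).writerow(row)
    return buffer.getvalue()


def read_row(text, **dialect_options):
    """Parse a single CSV line with the given csv dialect options."""
    return next(csv.reader(io.StringIO(text), **dialect_options))


# Convention 1: an embedded quote is doubled.
doubled = write_row(quoting=csv.QUOTE_ALL)
# Convention 2: an embedded quote is backslash-escaped.
escaped = write_row(quoting=csv.QUOTE_ALL, doublequote=False, escapechar="\\")

# A reader expecting doubled quotes does not recover the original record
# from the backslash-escaped line; a reader told about the escape does.
mismatched = read_row(escaped)
matched = read_row(escaped, doublequote=False, escapechar="\\")
```

      Nothing in the file itself says which convention was used, so producer and consumer have to agree out of band.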

      Next, CSV only supports "flat" data. A table. With XML, I can send you not only a header record, but detail as well. I can send you a bunch of Purchase Orders. Each PO has header fields like PO Number, Delivery Date, Vendor ID, but also has detail lines such as the order lines. Try that in CSV without having to come up with some kludge like a record type indicator and re-using fields.
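
      As a rough illustration of the header-plus-detail idea, here is a sketch using Python's xml.etree.ElementTree. The element and attribute names (purchaseOrders, purchaseOrder, line, vendorId and so on) and the sample data are invented for the example, not taken from any real purchase order schema.

```python
import xml.etree.ElementTree as ET


def build_batch(orders):
    """Serialize a list of purchase orders, each with header fields and order lines."""
    batch = ET.Element("purchaseOrders")
    for order in orders:
        po = ET.SubElement(
            batch, "purchaseOrder",
            number=order["number"],
            vendorId=order["vendor"],
            deliveryDate=order["delivery"],
        )
        for sku, qty, description in order["lines"]:
            line = ET.SubElement(po, "line", sku=sku, qty=str(qty))
            line.text = description
    return ET.tostring(batch, encoding="unicode")


def read_batch(xml_text):
    """Return {PO number: [(sku, qty, description), ...]} from the serialized batch."""
    result = {}
    for po in ET.fromstring(xml_text).findall("purchaseOrder"):
        result[po.get("number")] = [
            (line.get("sku"), int(line.get("qty")), line.text)
            for line in po.findall("line")
        ]
    return result


orders = [
    {"number": "PO-1", "vendor": "V-42", "delivery": "2003-10-01",
     "lines": [("AX-1234", 2, '12" Drill'), ("BZ-9", 10, "Drill bits")]},
    {"number": "PO-2", "vendor": "V-7", "delivery": "2003-10-15",
     "lines": [("AX-1234", 1, '12" Drill')]},
]
xml_text = build_batch(orders)
```

      Each order carries its own lines as child elements, so no record type indicator or field reuse is needed, and the embedded quote in the description needs no special handling.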

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
    10. Re:Two Things... by TummyX · · Score: 1

      strtok? please. you might as well just give scanf as an example. what a joke.


      Hummmm... PLEASE tell me where I can find a library that requires less than 20 pages reading


      I think any programmer worth his salary can learn how to use XML by reading a short tutorial.

      I mean, it's not that hard. It's even easier if you're using a language like Python or Ruby.

    11. Re:Two Things... by yuri+benjamin · · Score: 1

      Think of an OS and applications today and the various files they use. Think of configuration files, shortcut files, bookmark files, document files, project files etc. Think of all those files that have until recently all been stored in proprietary, hard to interpret and sometimes buggy binary files.

      For most of computer history, config files were text files.
      I'm not just talking about /etc/*, what about *.INI that Windows used before the registry file was invented?

      --
      You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
    12. Re:Two Things... by TummyX · · Score: 1

      Ahem that's not the point. They were text files with a certain format. If you wanted to parse/read/write to the files you had to either go thru the supplied APIs (Win32 GetProfileString & family) or manually parse the format yourself.

      Look at X's config files -- they're a different format from inetd's config files which are a different format from samba's config files. You'd have to write a different parser for each config file. With XML, it doesn't matter that the schemas are different, you can still parse, read and write to those files easily and with one common API.
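
      A sketch of the "one common API" claim, assuming two made-up XML config layouts (the real X, inetd and samba files are not XML). The same generic function walks both without knowing either schema.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Return (path, text) pairs for every leaf element, whatever the schema."""
    pairs = []

    def walk(element, path):
        here = f"{path}/{element.tag}"
        children = list(element)
        if not children:
            pairs.append((here, (element.text or "").strip()))
        for child in children:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return pairs


# Two hypothetical config files with unrelated schemas.
display_cfg = "<display><screen><depth>24</depth></screen></display>"
share_cfg = "<shares><share><name>docs</name><path>/srv/docs</path></share></shares>"

display_settings = flatten(display_cfg)
share_settings = flatten(share_cfg)
```

      Interpreting the values still requires knowing what each schema means, but tokenizing, escaping and nesting are handled once by the parser.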

    13. Re:Two Things... by horza · · Score: 1

      TummyX writes:
      Does CSV have a transformation language (XSLT)?
      Does CSV have an easy to use parser & object model (SAX, DOM)?
      Does CSV have an in document addressing language (XPATH)?
      Does CSV have a standard way of supporting hierarchical data?


      You can't get any easier than parsing CSV! Even the most basic languages can do it in a couple of lines. And the object model (a two dimensional array) is pretty well understood by everyone. For transformation, normally a simple loop with some basic logic will suffice.

      We do data exports to various companies, and 90% of them prefer CSV to XML because they can dump it straight into a table in a relational database.

      I like XML, but it's the right tool for certain jobs, as is CSV.

      Serapth writes:
      XML is not much of a step forward really... it is a step forward no doubt... but perhaps the best solution would be to make the data self enacting. Namely, couple the logic to the data... so that code and data can exist as one.

      How is this different to an XML document with PHP embedded (or Perl/Python if that way inclined)?

      Phillip.

    14. Re:Two Things... by 16K+Ram+Pack · · Score: 1
      So where is the schema definition standard for a .CSV file that is an industry standard and defines for the producer and consumer the rules governing the data format of fields?

      As for dumping into a relational database, certainly Oracle, DB2 and SQL Server have tools for processing XML input and output, and I think there's limited support in MySQL.

      CSV for me is what you use when you don't have XML as an option (like antiquated systems that haven't been updated).

    15. Re:Two Things... by Shimbo · · Score: 1

      You can't get any easier than parsing CSV! Even the most basic languages can do it in a couple of lines.

      Yes. And it'll break on stuff with embedded commas. Or maybe it will work with one CSV syntax and not with another (yes, there is more than one CSV syntax). Or get the escaping rules wrong.

      Sure, all this could be solved fairly easily. XML just gives you a rock solid specification and a set of widely available libraries. The slight added complexity in parsing does mean that folks tend to use an off the shelf parser, rather than knocking up one that will break at half a dozen edge cases.
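
      To make the edge case concrete, here is a short sketch comparing a "couple of lines" parser with an off-the-shelf one (Python's standard csv module). The sample record is invented for the example.

```python
import csv
import io

# One record with three fields: a name containing a comma,
# a description containing a quote, and a part number.
line = '"Smith, John","12"" Drill",AX-1234'

# The quick approach: split on every comma.
naive = line.split(",")

# The library approach: a parser that knows about quoting.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)    # 4 pieces, quotes still attached
print(len(parsed), parsed)  # 3 fields, quoting resolved
```

      The naive version returns four pieces instead of three fields and leaves the quote characters in place, which is the kind of breakage described above.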

      Whatever the technical merits (it surely is overhyped), it does have one main advantage: people are talking about interoperability again and writing schemas for their own community. That people are talking is much more important than the syntax they use.

    16. Re:Two Things... by poot_rootbeer · · Score: 1

      Does CSV have a transformation language (XSLT)?

      sed

      Does CSV have an easy to use parser & object model (SAX, DOM)?

      awk

      Does CSV have an in document addressing language (XPATH)?

      perl

      Does CSV have a standard way of supporting hierarchical data?

      Okay, I'll give you that one.

    17. Re:Two Things... by sean23007 · · Score: 1

      Just cause you think it's overhyped doesn't mean it isn't worth every bit of that hype.

      Isn't that the definition of "overhyped"?

      --

      Lack of eloquence does not denote lack of intelligence, though they often coincide.
    18. Re:Two Things... by TummyX · · Score: 1

      I would have thought that "overhyped" meant that it isn't worth all the hype. Anything less would mean that it was "underhyped". Maybe.

  28. This is the XML killer app by Anonymous Coward · · Score: 0

    Someone should use this new-fangled XML thingy to make a universal markup language that people could use to define and deliver structured data to any application in a standard consistent way.

    1. Re:This is the XML killer app by Doug+Merritt · · Score: 1
      Someone should use this new-fangled XML thingy to make a universal markup language that people could use to define and deliver structured data to any application in a standard consistent way.

      Are you joking? That's what XML is for. Except instead of "a universal markup language" it's "a set of"... XML is the meta-language used to create a horde of universal languages, one per application area.

      E.g. one for chemistry, one for math, etc.

      If you think that there should only be one such language, rather than one per application area, well, the problem is it would be too big; you'd have to support markup for choreographing dancers, markup for orchestral music, markup for the 3D displays that will come on the market in the year 2015, markup for history timelines... it would be an infinite language.

      Thus XML as a meta-language. It allows an infinite number of sub-languages to be created, as needed. That's probably as good as it gets.

      XML has good and bad points, but it's important to understand that this is its true inner nature, before getting into the tradeoffs.

      --
      Professional Wild-Eyed Visionary
  29. Resumes by Anonymous Coward · · Score: 0

    How about making an XSL style sheet for resumes in OpenOffice?

    "tags" like Name, address, education, jobs, skills. Then break them down...
    example: education -- uniName, gradYear, gpa, major, minor, ....

    If the things are in standard drop-down boxes like "heading 1" "heading 2" "normal" , etc.. are now...

  30. Ease of XML Document Formats by DJ+Rubbie · · Score: 4, Interesting

    XML does make it extremely easy to create documents on the fly, whether a plain old document or a slideshow presentation, all it needs is some template XML, original text, and some programming language to put it together.

    I wrote a song lyric storage system using PHP and MySQL, and I had the idea to have it be able to be put onto a slideshow to teach it to a group of people (or whatever). With the XML format provided by OpenOffice.org, I was able to quickly put it together and show it off, impressing quite a few people in the process. Of course, those people think Word/PowerPoint run the world, and the file format is all but a mystery to them. Hence having something generated on the fly via a webpage has its cool factor, and not to mention it was a good chance to introduce this free word processing suite to them. Also a good chance to tell them that if I were to rely on ASP/PowerPoint it would have cost much, much more.
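
    The general approach (template XML plus text plus a bit of code) can be sketched as follows. This is in Python rather than PHP, and the slide markup is a deliberately simplified, hypothetical format, not the actual OpenOffice.org Impress schema.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical slide markup, NOT the real OpenOffice.org file format.
SLIDE_TEMPLATE = Template("<slide><title>$title</title><body>$body</body></slide>")


def build_slideshow(title, verses):
    """Fill one slide template per verse and wrap the result in a root element."""
    slides = [
        # escape() protects characters such as & and < inside the lyrics
        SLIDE_TEMPLATE.substitute(title=escape(title), body=escape(verse))
        for verse in verses
    ]
    return "<slideshow>" + "".join(slides) + "</slideshow>"


show = build_slideshow("Rock & Roll", ["Verse one <intro>", "Verse two"])
root = ET.fromstring(show)  # parses cleanly, so the output is well-formed
```

    With the real format, the template would be taken from a document saved by the office suite itself, with placeholders put where the text goes.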

    Open document format is the way to go in the future, because it definitely allows interoperability.

    --
    Please direct all bug reports to /dev/null
    1. Re:Ease of XML Document Formats by Serapth · · Score: 1

      Of course, those people think Word/PowerPoint run the world, and the file format is all but a mystery to them. Hence having something generated on the fly via a webpage has its cool factor, and not to mention it was a good chance to introduce this free word processing suite to them.

      One thing that everyone seems to forget (or is unaware of...) is that Microsoft provides APIs for the creation of Office documents via code or script as well. In the past, due to the bean-counters' dependence on Excel spreadsheets, I have had to generate XLS files from a web interface. Trust me, if you are willing to work within the Wintel platform using MS tools... it's very very very easy to do!

    2. Re:Ease of XML Document Formats by Anonymous Coward · · Score: 0
      From parent's parent
      ... if I were to rely on ASP/PowerPoint it would have cost much, much more.
      'nuff said.
    3. Re:Ease of XML Document Formats by codepunk · · Score: 1

      And having done this myself you are correct it is fairly easy to do, but it is damn unstable and it scales for shit...now go back to hacking on your hello world vb program.

      --


      Got Code?
    4. Re:Ease of XML Document Formats by Serapth · · Score: 1

      Depends on how you do it... basing an enterprise system around Excel is hacky as shit... I'll give you that... Might as well build your system from popsicle sticks and bubblegum. That said... exporting output to Excel, if written correctly, is a perfectly fine thing to do.

      Hey... personally I hate Excel itself, but then again, I'm not a bean counter. But meeting customer requirements is perhaps the most important thing you do. As an analyst... it's your job to contrast proper solutions vs customer requirements. You don't often get to make all the decisions yourself... and if you are making all the decisions yourself, you probably aren't serving your customer properly!

      Idealism is one thing... reality is something completely different!

    5. Re:Ease of XML Document Formats by Serapth · · Score: 1

      Actually... in my case, it was a ColdFusion based system... and I haven't a clue how you fit PowerPoint into all of this. As to "It would have cost much much more"... gee... I would have loved to have seen how... Most enterprise type environments already have either A) bought licenses for all vital employees requiring MS Office or B) have an MS site license.

      If the original poster was talking about development costs... I would love to see numbers to back that up. I hate to say it, but for the most part, Windows based programming seems to be as affordable as it gets... with the exception of the initial compiler costs. Now... if you are in an enterprise environment, where $1000 is too much for development tools... or, even more appropriate, if $3000 for an MSDN license is too much, perhaps it's time to polish your resume... since your $50,000 per year or greater salary must seem awfully expensive to them!!!!!!!

    6. Re:Ease of XML Document Formats by aminorex · · Score: 1

      The sole reason why my employer dropped my MSDN
      universal is that the money was going to Microsoft.
      Now I get an Alienware laptop every year instead,
      so I'm MUCH happier.

      --
      -I like my women like I like my tea: green-
    7. Re:Ease of XML Document Formats by Serapth · · Score: 1

      That's rather strange, as most corporate CIO types are thoroughly in bed with Microsoft. Sending money to Microsoft doesn't normally bother them... well... much more than sending money to anybody that is...

      That said... I wish my company bought me a sweet ass gaming laptop every year as well :)

    8. Re:Ease of XML Document Formats by 1010011010 · · Score: 1

      Having suffered through an application (written by a contractor hired by now ex-employees) that used Word's API to generate files, I can say this:

      Word sucks. Its API sucks. It is slow as shit. Hours to generate reports because of all the retarded back-and-forth with Word, when an HTML version (or even an RTF version -- RTF is just text) of the same report gets generated in seconds.

      XLS file from a web interface? Here ya go:

      <table/> -- add content as needed.
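
      Spelled out a little, the trick relies on Excel being able to open a plain HTML table as a worksheet. A minimal sketch, with a made-up function name and sample data:

```python
from html import escape


def rows_to_html_table(rows):
    """Render rows of cells as a bare HTML table that a spreadsheet can open."""
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"


report = rows_to_html_table([["Item", "Qty"], ['12" Drill', 2]])
```

      How the page is served to the browser so that it opens in the spreadsheet is left out here and depends on the web setup.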

      --
      Napster-to-go says "Fill and refill your compatible MP3 player", which is a lie. It's not MP3. It's WMA with DRM.
  31. Web-Document Templates; Charts; Presentations... by Anonymous Coward · · Score: 2, Insightful

    There are many uses, besides simply having a format that multiple programs can open. Besides, when new features are added to the format, the older software could ignore those tags, somewhat like HTML has been doing. Then you get the ability to still open newer variations on the format. Not to mention make it easier to convert between them, and add an XSLT to an older app to "update" it to support the newer format better.
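
    A sketch of the "older software ignores newer tags" idea, using invented tag names (doc, para, sparkle): a reader written for version 1 of a format keeps working on a version 2 document by skipping elements it does not recognize.

```python
import xml.etree.ElementTree as ET


def render_paragraphs(xml_text):
    """A 'version 1' reader: it understands <para> and skips everything else."""
    paragraphs = []
    for child in ET.fromstring(xml_text):
        if child.tag == "para":
            paragraphs.append("".join(child.itertext()))
        # Any other tag (for example one added by a newer version) is ignored.
    return paragraphs


# A hypothetical 'version 2' document with a tag the old reader has never seen.
newer_doc = (
    "<doc><para>Hello</para>"
    "<sparkle level='9'>shiny</sparkle>"
    "<para>World</para></doc>"
)
text = render_paragraphs(newer_doc)
```

    The old reader loses the new feature but still opens the file, which is roughly how browsers have treated unknown HTML tags.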

    few off the top of my head:
    online services generating template documents; such as online resume creating websites.

    Draw charts in a GOOD charting program instead of the crap these office programs have.

    Generate presentations from outlines or databases, create videos from presentation files

    For the small-time database software, the database could be imported into other database software, or converted to SQL or be translated into just about anything.

  32. Is it just me... by cca93014 · · Score: 3, Funny
    or is XML good for the following things:
    • moderately useful at providing a very basic cross platform information transport.
    • very useful when being mentioned by PHBs in meetings with CEO/Investors in an attempt to look knowledgeable, bleeding edge and worthy of their job/salary
    • exceedingly useful when being mentioned by stock analysts to pump a company

    I mean, come on. It's just a standardised file format. That's all it is, OK?
    1. Re:Is it just me... by wellFormedEntity · · Score: 1

      It's also good for the following:

      • Generating documentation in multiple formats from a single source document. Need online help and printed help, but don't want to change documents in two places when the program changes? Enter XML.
      • Controlling input for documents. Slap a form on the front, and get structured output out the back. Easy.
      • Change tracking and document management. Much easier to track changes when a file is stored in a text (not binary) format. You can even use the same source control as the programmers (ClearCase), or you could graduate to the clusterfuck known as Documentum.

      and that's just off the top of my head. It's an enterprise solution to an enterprise problem, not meant for Joe Average on the desktop. But if you are managing thousands of pages of constantly changing documents edited by multiple users, XML is worth the hype.

  33. Re:I hate XML on Mac's by Anonymous Coward · · Score: 0

    Yeah it's flamebait... I couldn't resist...

    You sir are an ID10T! Cutting shielding, removing the drive, bending the case! I would sue the crap out of you if you F*cked up my PowerBook!

    I've installed plenty of Airport cards into Mac laptops and yup, it's a bit of a pain; but if you can't get it installed without mucking it up then you are a complete hack.

    Besides you should have read the installation document! It's extremely clear and has lots of pictures and even videos for half-wits like you!

    Customer Installable Parts Reference:
    http://www.info.apple.com/usen/cip/index.html

  34. staroffice has xml?? by tucolino · · Score: 0

    I downloaded StarOffice 7 yesterday. However, I cannot save or open XML documents (no option). Also, a friend of mine got a preview of the upcoming Office 2003 and that one did save as XML. Of course, I could only open the XML and view the tags, but that was about it. No other word processor was able to view the document (StarOffice, OpenOffice, Abiword). Tuco

    1. Re:staroffice has xml?? by denny_d · · Score: 1

      yeh, that's a definite limitation of OO... I think I posted that RFE years ago... the data is compressed... you have to uncompress it to see the xml... for a simple one word document there are 5 xml files in the compressed file...
      content
      manifest
      meta
      settings
      styles
      I haven't bothered with creating an xsd or dtd but the files can be viewed as xml... but clearly not as easily as it should IMHO.

    2. Re:staroffice has xml?? by La+Camiseta · · Score: 1

      StarOffice's and OO's formats are XML, always. It's just that they zip them up afterwards to save on space. Just feed the files through any unzip program (you may have to change the file endings, but you usually don't), and you can get at the styles and the raw XML data.
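
      The same unzip step can be done from a script. The sketch below uses Python's standard zipfile module on a stand-in archive built in memory, so it runs without a real document. The stand-in's XML content is invented, and the helper names are assumptions for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_xml_members(archive_bytes):
    """Return the sorted names of the XML files inside a zipped document."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return sorted(name for name in archive.namelist() if name.endswith(".xml"))


def read_root_tag(archive_bytes, member="content.xml"):
    """Parse one member of the archive and return its root element's tag."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return ET.fromstring(archive.read(member)).tag


def make_demo_archive():
    """Build a tiny stand-in archive; the XML inside is made up for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", "<document><p>Hello</p></document>")
        archive.writestr("meta.xml", "<meta/>")
        archive.writestr("styles.xml", "<styles/>")
    return buffer.getvalue()


demo = make_demo_archive()
members = list_xml_members(demo)
root_tag = read_root_tag(demo)
```

      For a real file, zipfile.ZipFile can be given the path of the saved document directly instead of an in-memory buffer.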

  35. XSLT Stylesheets by connsmythe96 · · Score: 1

    Here's someone who actually did do something with these. This proof-of-concept shows that you can easily convert the xml files to a browser-readable format.

    --
    if(!cool) exit(-1);
    1. Re:XSLT Stylesheets by connsmythe96 · · Score: 1

      Errr, the page he links to as an example is not there anymore (he's moved servers). So you can't actually SEE the results. But the process and the description are still valid.

      --
      if(!cool) exit(-1);
  36. lyx/latex by sewagemaster · · Score: 1

    why dont they just build openoffice from latex/lyx? i just apt-get'ed it yesterday and it seems to have everything i need for documentation....

    1. Re:lyx/latex by Daengbo · · Score: 1

      Lyx, at least, still has problems with Asian input. Getting Thai to work for me is an ongoing project. I won't give up, though, because we are translating and updating "Grokking the Gimp" into Thai, and it should be in LaTeX.

  37. Agreed.. by msimm · · Score: 4, Insightful

    And before anyone tries to point out the cost/open source issue: In business that doesn't mean squat. Trying to sell something for free is the wrong attitude; businesses don't want to rely on good will. Kudos to all the dual-licensed projects out there that have learned how to play both sides of the fence.

    --
    Quack, quack.
    1. Re:Agreed.. by William+Baric · · Score: 1

      And before anyone tries to point out the cost/open source issue: In business that doesn't mean squat.

      I agree for the open source part but for the cost issue I have one word for you : piracy.

    2. Re:Agreed.. by msimm · · Score: 1

      Kind of goes along with my point. I mean why pirate commercial software if you could use open source without exposing yourself to unnecessary risk?

      --
      Quack, quack.
    3. Re:Agreed.. by William+Baric · · Score: 1

      They don't switch to OpenOffice because the bottom line is the only thing that matters. Switching from pirated versions of MS Office to a free OpenOffice costs money. I charge only $35CDN an hour for OpenOffice (installation and document conversion), I offer 2 days of free on-site training and a special 16 hours (in 4-hour blocks) of support for $320CDN (mainly for psychological reasons). It's not much, but when you add the cost of a temporary productivity loss you end up with a bottom line of $5000 to $20,000 for 20 employees (depending on their job, their salary, their ability to learn and, most importantly, their ego). Sure, in the long run switching to OpenOffice is a good choice for a lot of small businesses, but there are very few "decision makers" who care about the long term, particularly when a pirated MS Office is "free".

    4. Re:Agreed.. by msimm · · Score: 1

      Pirated copies of Office are popular for all the regular reasons.

      1) Well supported through traditional (read: idiot proof) channels.
      2) Reasonably well designed UI and functionality.
      3) Client and customer familiarity.
      4) Expected longevity (Microsoft will be around for a while).
      5) If something *really* goes wrong, you've got a company to blame (why would the average PHB go out on a limb?).

      The value may be there in other places for OS, but businesses will continue to be more conservative and rely on old and proven (relatively) methods.

      --
      Quack, quack.
    5. Re:Agreed.. by William+Baric · · Score: 1

      Well supported through traditional (read: idiot proof) channels.

      People *think* it's well supported but is it? When XP came out I had some problems with the combination Windows XP, Office 2000 and NT 4 server. Calling Microsoft was a waste of time (as usual) and I had to wait about 3 months before the problem was solved.

      I don't know for big business, but for small business, support means the tech guy (me)... which means support for OpenOffice is as good as the one for MS Office. I agree that right now there's only a few people who can offer support for OpenOffice but here in Montreal it's not that hard to find and it doesn't cost more than MS Office support.

      This idea of support is only an illusion.

      Reasonably well designed UI and functionality.

      There's a few minor differences between MSO and OOo but nothing extraordinary. It's certainly not a reason for using pirated copies of MSO.

      Client and customer familiarity.

      Not sure what you mean by that. I agree most PHB have a sheep mentality (which is not surprising as they don't know squat about computers) and that's a problem for OpenOffice. But most of my clients have trust in me so it's not THE problem.

      Expected longevity (Microsoft will be around for a while).

      Do you expect OOo or Linux to die next year? Once again it's a perception problem not something real and PHB could understand this if they were forced to think.

      If something *really* goes wrong, you've got a company to blame

      Yes and in my case I'm the one who gets blamed. But let's be honest... even when all those macro viruses were a serious problem did you see any PHB blaming Microsoft? The blame game works when you blame one of your colleagues but not when you blame a company that YOU chose.

      The value may be there in other places for OS, but businesses will continue to be more conservative and rely on old and proven (relatively) methods.

      As far as I know Linux is OS... and there are a lot of businesses using Linux servers. You're right about the "old and proven methods" (this is what I call the sheep mentality) but if tomorrow all pirated copies of MSO stop working, you can be sure a lot of people would switch to OOo. Most PHB who use pirated copies of MSO will come to the conclusion that all those reasons are not worth the price of a licence.

  38. OMFG someone with sense by DrSkwid · · Score: 4, Funny

    Ron Minnich at lanl described this one also (though we weren't talking about XML)

    -----
    You want to make your way in the CS field? Simple. Calculate rough time of
    amnesia (hell, 10 years is plenty, probably 10 months is plenty), go to
    the dusty archives, dig out something fun, and go for it.

    It's worked for many people, and it can work for you.
    ----
    if you must

    So get ready for all the gee whizzery now the new kids have "found" plain text.

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    1. Re:OMFG someone with sense by Anonymous Coward · · Score: 0

      Certainly true. What are web services and thin clients but a resurrection of time sharing with dumb terminals?

    2. Re:OMFG someone with sense by 2short · · Score: 1

      But it need not be looked at so cynically. Sure, a lot of amazing new things are really just finding things that worked well in the past. But the key is finding the right things, and doing them in ways that take advantage of advances made in between.

      People using "plain text" found that they needed to express more of the structure of the data, and invented various ways to do this, but they lost the advantages of plain text because they used different markup strategies, or different binary formats. So some people realized they could get most of the advantages of plain text, plus the advantages of markup, if they could get people to standardize on a particular markup strategy. Hence XML.

      XML really is nicer for many things than "plain text". If by "plain text" you mean really plain text (no markup), then XML is more expressive. If you mean text using some markup scheme, then it is nicer because it's standardized. If I never have to figure another one-off markup scheme, that alone will qualify as "gee whizzery".

  39. You must manage, force use of limited metadata by g8orade · · Score: 4, Interesting

    I helped spec out a document management metadata database 18 months ago for an engineering firm that wanted to catalog its files. They started out wanting just to categorize their CAD drawings, then decided to include all types of project files.

    Our solution was a tcl front end that forced the entry of a minimal amount of metadata *during file creation,* to be picked from preset categories and subcategories. We also provided for free text entry but that was to be used only after the other fields.

    The points are
    a) The general metadata categories were known; the engineering tasks weren't new.
    b) No one is going to go back after the fact and enter the metadata. You have to integrate its entry into the new file work procedure.
    c) It's got to be as easy as file/new in a GUI.
    d) Its utility has got to be very very apparent when juxtaposed with a subdirectory / filename scheme.

    1. Re:You must manage, force use of limited metadata by yuri+benjamin · · Score: 1

      You've already done what I suggested in an earlier post.

      I should have read more of this thread. You posted that 4 hours before I did.

      Thanks for that - at least now I know my suggestion wasn't way off the mark.

      --
      You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
  40. It's called troff by DrSkwid · · Score: 2, Informative

    and we've had it since most /.ers were born

    then there was postscript

    now XML

    whee, I have candyfloss in my hair

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
  41. Command Line Ski11z0rz by Eberlin · · Score: 1

    A bit like the PDF to Text command line stuff that already exists. Lots of power there if it can be tapped.

    Office document gets parsed by a script, the images extracted and run through mogrify for scaling and branding, then the text gets translated to xhtml for posting on a DB-driven site somewhere.
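
    A rough Python sketch of that pipeline, minus the mogrify step, using only the standard library. Treat the details as assumptions: it matches any element whose local name is "p" instead of hard-coding a namespace, it guesses that embedded images live under a "Pictures/" folder in the archive, and the sample archive and its "urn:example" namespaces are invented for the demo.

```python
import html
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def paragraphs(content_xml):
    """Return the text of every element whose local name is 'p'."""
    root = ET.fromstring(content_xml)
    return ["".join(el.itertext()) for el in root.iter()
            if local_name(el.tag) == "p"]


def to_xhtml(paras):
    """Wrap each paragraph in <p> tags, escaping markup characters."""
    body = "\n".join("<p>%s</p>" % html.escape(p) for p in paras)
    return "<html><body>\n%s\n</body></html>" % body


def extract(source):
    """Return (xhtml_text, {image_name: image_bytes}) for a zipped document."""
    with zipfile.ZipFile(source) as archive:
        text = paragraphs(archive.read("content.xml"))
        images = {name: archive.read(name) for name in archive.namelist()
                  if name.startswith("Pictures/") and not name.endswith("/")}
    return to_xhtml(text), images


def make_sample():
    """Build a hypothetical stand-in document in memory for the demo."""
    content = ('<office:document xmlns:office="urn:example:office" '
               'xmlns:text="urn:example:text">'
               '<text:p>Q3 &amp; Q4 results</text:p>'
               '<text:p>See <text:span>attached</text:span> chart.</text:p>'
               '</office:document>')
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
        archive.writestr("Pictures/logo.png", b"not-a-real-png")
    return buffer


if __name__ == "__main__":
    xhtml, images = extract(make_sample())
    print(xhtml)
    print(sorted(images))
```

    The image bytes could then be written out and handed to whatever scaling/branding tool you like before the XHTML goes into the database.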

    Even better -- a BOFH can scan through the network of shared documents, catalog any and all confidential information, grep them for anything particularly interesting, and maybe post a few names into alt.social.deviants or whatnot. All from a small script instead of half a day wading through mundane memos and accounting info. That's efficiency!

  42. Bullshit - I use OO all the time at home. by The_Dougster · · Score: 1

    How in the hell am I going to use MS Office in Debian Linux? When I need to print out an envelope or mailing label or little letter, I fire up OO and get the damn job done. Its great, and it doesn't have all the idiotic quirks that MS Office has which presumes that I am some kind of moron like you who needs my dick held for me every time I need to take a piss. Get back under the bridge, you evil MS ass-troll.

    --
    Clickety Click ...
  43. How about XML to troff by DrSkwid · · Score: 0, Flamebait

    and go full fucking circle

    --
    There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
    1. Re:How about XML to troff by p_tweak · · Score: 1

      Dude, I think you don't get enough Milk.

  44. haha by Anonymous Coward · · Score: 0

    obviously, if you haven't noticed, the entire "new world" promise is an empty promise made by some stupid overzealous open source freak who thought it made a difference. First of all, if you need dynamic shit, you don't use a staroffice or MS Word document to spit out dynamic content, you use HTML like the smart people do. Secondly, why the fuck would you need to have open standards for word processing? There is absolutely no good reason -- only the anti-MS zealot who says "competition!" . But really when you think about it, there is absolutely no good reason in the world to need competition for word processing formats; it's the frontend that you need, it's the god damn features like line spacing and other aesthetically specific needs. If you want an open format, look into HTML you dumb shits.

  45. Microsoft Dropped the Ball? by Carnage4Life · · Score: 5, Interesting
    Now that MS has dropped the ball on the XML Office front,
    I'm curious, how did Microsoft drop the ball with respect to other XML-based Office suites? The linked article points to a report that the ability to import user-defined XML formats into a form that can be understood by the primary Office products is an Enterprise feature. However loading or saving documents using a default XML format is in the base versions of Office and in fact was in the last version of Office given that Excel had a documented XML Spreadsheet Format.

    Is anybody out there writing Perl/Java/whatever programs to take advantage of StarOffice XML?
    Not me but I am writing C# apps that make use of Excel's XML format. I wrote about using XSLT on the Excel XMLSS format in my blog a few months ago when I had to update date values in certain columns. I also posted the XSLT stylesheet.

    Disclaimer: I work on the XML team at Microsoft but not directly with Microsoft Office.
    1. Re:Microsoft Dropped the Ball? by Anonymous Coward · · Score: 0

      well goodie for you. are you just the M$ weenie man of the year. how much fucking money have to spent on buggy, bloated, and in-secure microsoft products?

      one word you: *SUCKER*

    2. Re:Microsoft Dropped the Ball? by Cid+Highwind · · Score: 1

      I'm curious, how did Microsoft drop the ball with respect to other XML-based Office suites?

      A few months ago there was an article posted here about the upcoming Word2k3 xml file format. According to that, the XML-based format did not have all the information normally present in a Word2k .doc. There was either information lost when saving to XML, or some of it was still in a proprietary binary format. (I don't remember which, and I can't find the link right now, sorry...) If either of those is true of the release version, I would say that MS has dropped the ball on XML, or at least missed the point. XML is (well, should be) about making exchanging information between different platforms easy, not just another buzzword to get the PHBs to pony up $500/license for the latest version of Office!

      Disclaimer: I work on the XML team at Microsoft but not directly with Microsoft Office.
      Hint: Never ever tell anyone on slashdot you work for Microsoft. :D

      --
      0 1 - just my two bits
    3. Re:Microsoft Dropped the Ball? by YouAreATool · · Score: 4, Informative

      At this point, people should realize /. articles are mostly fretards talking out their ass. I too read this article, thinking: wft? As I am writing this comment, I'm looking at my (beta) Word 2003 file save dialog and an example XML doc I just made. It round-trips all formatting and junk in the XML format. It has a "save data only" checkbox in the saveas dialog, and can support xsl transforms (you supply the xsl) on export. If I cared, I think I could make it export OpenOffice format pretty easily. The high-fidelity XML file has a lot of junk, but it's all XML.

    4. Re:Microsoft Dropped the Ball? by Anonymous Coward · · Score: 0
      I'm curious, how did Microsoft drop the ball with respect to other XML-based Office suites?
      Presumably leaving OASIS's standardisation of an XML format.

      Microsoft are big enough to do it on their own so I don't know about 'dropping the ball' but everyone else is moving to standardisation.

    5. Re:Microsoft Dropped the Ball? by Anonymous Coward · · Score: 1, Interesting

      That article

    6. Re:Microsoft Dropped the Ball? by Anonymous Coward · · Score: 0

      The point is: you are using a BETA version of Word 2003. OpenOffice and StarOffice have been out for quite some time now, support XML fully, were developed with XML as a core requirement (not some new industry-requested feature like MS Office bloatware), and have already been revised, tweaked, and re-tested a couple of times. Your Word 2003 is still in a beta-test stage, which means it's far behind the "competition." I can't explain to you what pain and suffering all of us 'typical' Word users go through at my office because you're too far removed from reality. You really think Word 2003 is good? Most likely it's just more bloat than the last version with a couple buzzword technologies thrown in to make it look important to Joe Schmoe User.

    7. Re:Microsoft Dropped the Ball? by 4of12 · · Score: 1

      I'm curious, how did Microsoft drop the ball with respect to other XML-based Office suites?

      Possibly by keeping the XSD proprietary?

      You and others with unfettered access to those schemas through the latest MS applications like Office, Excel and C# are quite able to run the XML through any XSLT as much as you like. You can even use openly published XSLT to transform Word XML into other, more open XML based formats, as long as you buy the application and access to MS schema.

      What's more problematic is whether others, without access to XSD, will be able to make a Word document saved as XML be able to look the way Word presents it. Or, to edit those XML files with anything but an MS application.

      This is analogous to the long-standing problems with the .doc "standard": Strictly, RTF is documented, but its value is disputed: presentation rules for RTF by Word are not completely documented publicly and different versions of Word can change how a document appears.

      --
      "Provided by the management for your protection."
  46. Why bother? by lurker412 · · Score: 1

    If Microsoft is successful at deploying its DRM scheme, then interoperability will likely go out the window, no pun intended. Just as planned?

  47. XML for Office is overrated by tjstork · · Score: 0, Flamebait


    It's an overrated system with way too many features and having it be scriptable should be held up as proof of that.

    --
    This is my sig.
  48. Yup, people are by amblin · · Score: 4, Informative

    Take a look at AxKit's OpenOffice filter.

  49. I use OOo you insensitive clod by Anonymous Coward · · Score: 0

    perhaps a free (as in beer) Word plugin?

    I use OpenOffice.org suite for Windows instead of Microsoft Word for Windows, you Insensitive Clod(tm).

  50. XML and MS Office by Mr.+Ophidian+Jones · · Score: 3, Informative

    I guess there's XML and there's XML and getting between them is not necessarily easy.

    Microsoft made a big deal about the most recent versions of Office writing out XML, but that was because XML was a buzzword, sounded as if it might be more open than ".doc", and was essentially a selling point.

    From what I've read, people have been underwhelmed with the XML coming out.

    If only a similar set of transformations could be developed for OpenOffice to import and export the XML of the latest version of Microsoft Office. From what I understand, the schema is not documented and the formatting and rendering rules for documents are still kept a private affair, just as it has been for .doc files.

    You're still locked-in, dude!

  51. Working on it. by FooMasterZero · · Score: 1

    I am working on the ability to read the Solver files and import them into database via JDBC

    so be patient :-P

  52. Zope CMF + StarOffice by Anonymous Coward · · Score: 0

    There are plugins ("products" in Zope-speak) that let you save star/open office documents to a zope server, and automatically make them into content for your web site and integrate with content management workflow (if you have one).

    Just like... oh.... Microsoft sharepoint portal server and Microsoft Office...

    Only infinitely cheaper....

    Now, I'm not too keen on Zope (I HATE its OODBMS - why not just use a relational backend? The relational model can do everything OO does, and more, then again Zope APE might make the point irrelevant...), but the content management framework is pretty sweet, anyway.

  53. That quote only works for MS Office by Publicus · · Score: 1

    And it would pan out, too, if MS didn't drop the ball.

    If MS didn't drop the ball, we'd have offices full of non-IT people creating XML documents without realizing it. A mass of structured data would build and become grist for the mill that is the office geek.

    Unless OpenOffice/StarOffice has some huge market share that I'm not aware of, I'm not expecting to see any remarkable perl scripts for parsing office docs soon.

    --

    My Karma was at 49, then they switched to words. All that work for nothing!

  54. "cost", not "costed" by Anonymous Coward · · Score: 0
    1. Re:"cost", not "costed" by mabhatter654 · · Score: 2, Insightful

      Hey SlashLords! I humblely request We need a "-2 GrammerNazi" to get rid of these!

  55. Docbook XML OOo Filters by Evangelion · · Score: 2, Interesting

    I've been using these XSLT OOo <-> Docbook-XML filters for a little while.

    They work pretty well (if you can manage to get them installed with the broken install instructions) but only for a limited subset of Docbook. There's no support for the programlisting tag, and lists are currently broken.

    If anyone out there has superior XSLT kung fu, getting those two things working would be most appreciated : )

    (I know the basics, but I don't yet have time at work to justify it. Maybe if this project gets done on time...)

  56. Useful Scripts for XML by sankeld · · Score: 0

    What kinds of new and wonderful things can you come up with?

    rm *.xml

    1. Re:Useful Scripts for XML by Anonymous Coward · · Score: 0

      Really. That's an un-funny troll. Kick it up a notch next time.

  57. It's the presentation, not the format by Alan · · Score: 1

    The issue with MS Office files has been more with the ability to present them back to the user the same, not with reading the file. Various programs have been able to "read" (grab text from) MS Office formats for ages; the issue is that no one has been able to write a word processor that shows a moderately complex document/spreadsheet/powerpoint back to the user closely enough to the original. Don't get me wrong, some are close, but if you're tweaking your fonts and whatnot for, say, an investor, you don't want OOo to go and convert everything so that it mucks up the tables and converts all the fonts to 12pt Arial.

    For a programmer or geek, or even someone just using it (OOo or similar) as a word processor to write letters to mom, not a big deal. But in a corporate environment, it's gotta be exact. At work (in the education industry, and therefore with lots of Macs) everyone uses PDFs, but in the non-Mac world, it's .doc/.xls/.ppt that is the "standard".

  58. Office Automation by merlin_jim · · Score: 3, Interesting

    Well I don't know about Free/Open/Libre or XML development for Office... but I do know about the proprietary APIs Microsoft distributes for Office.

    If you wanna give them a try sometime, assuming you got Windows, VB5+, and Office installed... just add Office to your references (try Microsoft Office in the Project References menu) and give it a whirl. It's fairly easy to program in if you've used Office... most of the concepts that make for a good Office user translate directly into programming concepts for the Office object model.

    And yet Office Automation programmers are in scarce supply.

    Microsoft even offers a cert specifically for Office Automation programmers!

    But I haven't seen too many well written Office applications. My speculation is that it's not for lack of tools, but that it's for lack of concepts. Other than the obvious reporting needs that any large organization has, are there any compelling reasons to spend an afternoon coding an office application?

    I think it is this lack of compelling reasons, and not a lack of easy-to-use programming tools that causes the lack of good free open add-ins...

    --
    I am disrespectful to dirt! Can you see that I am serious?!
    1. Re:Office Automation by ReelOddeeo · · Score: 1

      Got OpenOffice.org? Want to see something more amusing than boring business reports?

      Well, I've programmed a working Digital Clock and also a Calculator as an OOo Drawing.

      See here: Digital Clock and Calculator .

      For something completely different see this.

      Danny's Draw Power Tools .

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
    2. Re:Office Automation by merlin_jim · · Score: 1

      Want to see something more amusing than boring business reports?

      No offense, but who cares? Fulfilling business needs pays the bills. I post asking about compelling uses and you give me a clock? I mean really. The goal here is to meet the needs of people who pay for technology. If I want to buy a clock I go to Walmart, not my friendly neighborhood consulting firm...

      --
      I am disrespectful to dirt! Can you see that I am serious?!
    3. Re:Office Automation by ReelOddeeo · · Score: 1

      I'll tell you who cares. If you were trying to learn to program OOo which has a non-trivial API, then you might care. My experience on OOoForum suggests that people care very much about having working example programs. You might notice that the folder the links led to were titled "Examples".

      When I first started learning OOo's API, there was a scarcity of examples and documentation. It was very difficult to learn. I frequent OOoForum.org and spend a great deal of time answering questions, providing working code snippets to perform useful -- gasp -- business tasks. I figure the more people who learn to program OOo, the more examples there are, the more documentation and HOWTO's are written, the more successful OOo will become.

      One of the recognized values in MS Office is the automation. The ability to build end-to-end automated business systems using the office components.

      Obviously, all this technology doesn't exist for our personal amusement. It all developed around the needs of business and the military.

      At least the military was not so short sighted that they would refuse to invest in the development of computers and other basic research, like DARPA or the Internet. Much of business, as your post so clearly illustrates, is focused only on the immediate gratification of profits, blind to the several intermediate steps it takes to get to a larger goal.

      Of course someone who just uses free software to help themselves profit with no interest in contributing to a community, probably wouldn't care.

      --

      Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
  59. Wouldn't client side be better? by yerricde · · Score: 1

    "Apache module"? Can't XML-supporting web browsers use some sort of XSLT filter and do this displaying on the client side?

    --
    Will I retire or break 10K?
    1. Re:Wouldn't client side be better? by Anonymous Coward · · Score: 0

      Yeah, but why rely on client support? (limiting it to Mozilla1+ and IE5.5+).

    2. Re:Wouldn't client side be better? by Rogerborg · · Score: 1

      Pah, don't talk sense. It's far sexier to abuse X/HTML by shoehorning an XML schema into it. Don't you know anything about the challenge of abusing the wrong tool for the wrong job? ;-P

      --
      If you were blocking sigs, you wouldn't have to read this.
  60. Actually, WordPerfect has supported XML for years by Karl_D_Schroeder · · Score: 2, Interesting

    ...Of course, not very well--but it's pretty easy to compile, say, the Docbook 4.1 DTD in WordPerfect and edit moderately complicated documents. Or import... The limitations are that it uses its own formatting system, rather than XSLT; and it uses DTDs instead of schemas, because the technology derives from SGML (which WordPerfect also supports). Arguably, WordPerfect has better support than any of the alternatives within the word processing space (i.e. discounting pure editors such as EMACS).

    --
    Author of Permanence and Ventus, co-author of The Claus Effect and The Complete Idiot's Guide to Publishing SF.
  61. Searching for files with *.sxw in the name... by Chordonblue · · Score: 1

    "There were 2469 documents found. Did you find what you wanted?"

    I should say we have.

    Linden Hall School converted completely to OOo and StarOffice two years ago and haven't looked back since.

    Maybe you should consider your rising taxes and the cost of MS product before blindly recommending our schools continue using it, eh?

    --
    "...Well, there's egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam..."
  62. Putting the cart before the horse... by EricTheGreen · · Score: 4, Insightful

    Bemoaning the lack of XML-based magic goodness in corporate document processing assumes that a corporate document base exists which a) follows predictable content and structural patterns to allow automated processing, and b) is structured and rigorous enough to do meaningful processing against, an assumption which frankly doesn't hold water in too many places.

    For most of the office document world (at least the world I work with regularly), most documents are unique in both structure and content and I as a programmer can make only the most basic of assumptions regarding what a program can expect to find within the content bundle. Sure the XML gives me a nice set of rules to rely on for breaking the document into parts and reading it in. But it doesn't do a whole lot to ensure that, say, two spreadsheets follow similar content assignment conventions. Most places can't get two managers to agree on the form and structure of a basic memo, or even get the same individual to repeatedly use a consistent structure in all his/her business communications.

    Most organizations need to work on a few things before this type of processing will be useful in the large. Two particular areas would be: a) consistent use of metadata within document definitions to facilitate querying and filtering, and b) more sophisticated use of template functionality beyond just ensuring every page has the same graphic in its header.

  63. No, I'm New Here by New+Here · · Score: 0

    No, I'm New Here

    1. Re:No, I'm New Here by New+Here+too · · Score: 1

      I'm New Here too.

      What a strange day!

  64. Assumption by m00nun1t · · Score: 2, Interesting

    I still don't get this thing about MS dropping the ball. I've played with Office 2003, and the XML features in particular (mostly Word & Infopath, not the other programs) and I think they are quite well done.

    Word has two different modes. One is where you can save an ordinary Word document in an XML format. This is the one /. goes on about mostly. Yes, it's pretty ugly XML, but you are trying to represent non-structured data in a structured format - of course it's going to be ugly. But it is documented & there is a publicly available XSLT from Microsoft to work with it. The other mode is to import an XSD and tag up the document as you like. You can save this in "rich" mode (with all the Office formatting - unstructured again) or "clean" mode in which the XML is as pure as your XSD is.

    InfoPath simply rocks. Where else can you create an end-user-friendly UI that outputs clean XML (with XHTML islands if you choose), submits directly into a web service, and takes only a few minutes to build from start to end (for a simple form, of course)?

    I just don't get it. Seems like mindless MS bashing to me.

  65. Stopgap solution until CSS3 by yerricde · · Score: 1

    If you want an open format, look into HTML

    Print comes on pages. Few if any HTML viewers support the CSS extensions for paged media. Until CSS3 support becomes widespread, word processing programs' data formats fill the gap.

    I won't answer the rest of the troll.

    --
    Will I retire or break 10K?
  66. Creating docs from PHP by steveoc · · Score: 1

    I have an accounting system for SME's using Apache/PHP/MySQL intranet model.

    I am currently adding OOffice classes so that the accounting system can generate nicely formatted invoices and other customer related documents by generating an sxw document with the correct letterhead and layout. Fairly simple and effective, but I would not bother trying this without XML.

    The client can edit their standard template for each document, and PHP just fills in the blanks.
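
    To illustrate the "fills in the blanks" idea, here is a small sketch in Python (not the PHP classes described above, which aren't shown). Everything specific is an assumption: the {{name}} placeholder syntax, the fill_template helper, and the in-memory sample template are invented for the example. It copies every member of the template archive unchanged except content.xml, where placeholders are swapped for XML-escaped values.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_template(template, output, values):
    """Copy a zipped template to output, filling {{key}} slots in content.xml."""
    with zipfile.ZipFile(template) as src, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == "content.xml":
                text = data.decode("utf-8")
                for key, value in values.items():
                    # Escape &, < and > so the result stays well-formed XML.
                    text = text.replace("{{%s}}" % key, escape(str(value)))
                data = text.encode("utf-8")
            dst.writestr(name, data)


def make_template():
    """Build a hypothetical invoice template in memory for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "content.xml",
            "<doc><p>Invoice for {{customer}}: {{total}}</p></doc>")
        archive.writestr("styles.xml", "<styles/>")
    return buffer


if __name__ == "__main__":
    result = io.BytesIO()
    fill_template(make_template(), result,
                  {"customer": "Smith & Sons", "total": "120.50"})
    with zipfile.ZipFile(result) as archive:
        print(archive.read("content.xml").decode("utf-8"))
```

    One caveat with plain string replacement: a placeholder typed into a word processor can end up split across several XML elements (for example after a formatting change mid-word), in which case it won't match, so a real version would probably want a more robust marker.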

    Another one is for the debtors and creditors aged balances to generate an OOffice spreadsheet, complete with formulas, for projecting cashflow. I have yet to see any accounting software provide cashflow budgeting as simple and effective as a spreadsheet - so spreadsheet generation it is.

    Anyone else developing PHP functions to read/write OO docs ?? If so, we should create a sourceforge project and collaborate.

  67. The two stages we haven't reached yet by Anonymous+Brave+Guy · · Score: 4, Insightful

    The parent post is right on the money here.

    Right now, I don't want flashy, XML-driven power apps. I'd settle for a word processor where I can produce my document with minimal fuss and good quality results. Apparently the vast majority of other word processor users agree with me, because I don't see any big uptake of ueber-powerful macro systems, manipulation tools based on super-flexible file formats, or any of the other much-promised stuff.

    The simple truth is that usability is nowhere near the point where these facilities add value yet. Before you can develop powerful extra tools, you have to get the basics right:

    • a clean but powerful UI (no, this is not impossible)
    • good basic navigation and editing capabilities
    • good basic structure and formatting controls
    • good basic tools (spell check, word count and mail merge would probably do for a very large subset of WP users).

    These are essential for a serious document preparation system, yet no currently popular WP, commercial or free, even comes close to doing them all well. The serious people universally use either DTP packages or typesetting systems, and there's a reason for that.

    When we reach the stage where a word processor can do these things well, without the user ignoring stylesheets because they're too awkward, having to look up the help every time they do a mail merge or finding that limitations in the document structure support prevent you doing what you want to at all in a non-trivial document, then we'll be getting to the stage where more powerful "workflow" tools might be of real benefit.

    The second stage, of course, is developing the tools to create those workflow tools, and making them sufficiently usable themselves that people actually take advantage of the advanced capabilities. Right now, we have some awesome-sounding automation tools available, but who really uses them? Not many people, IME. Much of the problem is that the automation tools themselves are, like the applications within which they live, simply too much effort to bother with.

    Give me a usable basic WP and usable tools to automate it (XML-based or otherwise) and I will move the document creation world. Until then, don't call us...

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    1. Re:The two stages we haven't reached yet by Pfhreakaz0id · · Score: 4, Interesting

      I agree with you SOOO much. Often times, it seems applications are written by programmers/computer geeks FOR computer geeks. I work on a workflow-based web application (it uses Oracle Workflow). We recently completely redid the app to do away with the Oracle-generated web pages for "notifications" (stages in the workflow), building our own and sending messages to the engine via API. Why? Our users just didn't "get" the workflow concepts and we had to design vastly more complicated UI that had pictures, etc.

      And yet we met with massive resistance from the other IT groups... "Why are you doing that, workflow does that", "that's a training issue" (code phrase for 'the users are stupid'), "don't you know how to say no?" and (getting to your central point) "you've dumbed it down. Your application doesn't have any of the powerful search, etc, features the workflow web interface has" (never mind NO ONE used these things).

      I think it was a piece from Douglas Adams who told a story of someone he knew using Word who wanted all the junk removed from Word's menus that he didn't use. He showed him how to remove menu items through customization and he ended up with just Open, Save, Bold, Italic, Print and Spell check.

    2. Re:The two stages we haven't reached yet by mcdesign · · Score: 2, Insightful

      Word processing programs are still far too stuck on the typewriter way of doing things. They will never improve until they ditch that metaphor. Page layout programs have a much better approach. If you want to put that text box 10.123mm from the top of the page, that is just fine in a page layout program. If you want to overlap your text boxes, fine as well. Many Word users seem to be spending far too much time wrestling with the Word way of doing things rather than getting on and producing the document.

    3. Re:The two stages we haven't reached yet by WoTG · · Score: 2, Insightful

      I generally agree. However, I am one of those users who at one time or another uses a lot of those weird features that "nobody ever uses". Macros, comparing documents, embedding stuff, mail merges, etc. I just did a quick browse of my Word 2000 menu bar, and the only things that I don't recall using are various wizards like auto-summarize, auto-format, and letter-wizard. The thing is, that I don't think I'm that unique in using a wide swath of features. True, most of the time 90% of the features are not used in a particular document, but over the course of 20, 30, or 50 documents, a whole lot of features are used.

      One idea that I've been thinking about lately, is having 2 or 3 basic modes of operation; something like Novice, Intermediate, and Expert. And make it VERY obvious how to switch between modes. In Novice mode, lock down all the toolbars, don't auto-hide menu options (not that I care for that feature at all!), maybe make the help features come up quicker(?). For the other modes, let varying amounts of the features get displayed.

      Eventually everyone would end up in Expert mode, but it would be a nice and gradual transition. This doesn't have to be too hard to set up... theoretically, someone could probably create an add-on to MS-Office or other suites to customize the appearance...

    4. Re:The two stages we haven't reached yet by Anonymous Coward · · Score: 0

      While people may claim that VBA is ultra-powerful, let me say that I use EMACS and have programmed that to the n-th degree to suit my needs.

      Call me strange, but I personally find the complex LISP language easier to program in than visual basic. I don't think Microsoft take programmers seriously, just as they don't take users seriously. No one is taken seriously!

      Consider this: if you're a real programmer, would YOU work for Microsoft??

    5. Re:The two stages we haven't reached yet by horza · · Score: 1

      I think it was a piece from Douglas Adams who told a story of someone he knew using word who wanted all the junk removed from Word's menus that he didn't use. He showed him how to remove menu items thru customization and he ended up with just Open, Save, Bold, Italic, Print and Spell check

      This is why I like Abiword, it's so simple and does the job without all the clutter. The first thing I do with a new browser or email client is rip out all the options to a bare minimum. How about a menu option for wordprocessors which contains: Simple menu, Editing Stage, Power User. The first option would be similar to the menu you mentioned, and I think the preferred option for many!

      Phillip.

    6. Re:The two stages we haven't reached yet by TheToon · · Score: 1

      You brought back memories of the DeScribe wordprocessor... originally on OS/2 (later also on Windows) it provided you with many page layout features. When you started with a "blank" sheet, you were actually in a text-frame.

      That's a word processor I miss...

      --
      //TheToon
    7. Re:The two stages we haven't reached yet by Thomas+Miconi · · Score: 1

      Open, Save, Bold, Italic, Print and Spell check

      I suppose the funny part was that he forgot "Close" ? :-)

      Thomas Miconi-

    8. Re:The two stages we haven't reached yet by Sri+Lumpa · · Score: 2, Funny

      "Open, Save, Bold, Italic, Print and Spell check"

      "I suppose the funny part was that he forgot "Close" ? :-)"

      Nah, it's Word, it's got the automatic shutdown feature (otherwise known as crashing).

      --
      "The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
    9. Re:The two stages we haven't reached yet by Kazin · · Score: 1

      DeScribe is one of the few software packages I've ever purchased. Very much worth it, I used it exclusively for years.

      (The other OS/2 software I bought includes Dualstor, Watcom C 10.0, and Zoc)

    10. Re:The two stages we haven't reached yet by Anonymous Coward · · Score: 0

      You should take up a command line editor.

    11. Re:The two stages we haven't reached yet by TheToon · · Score: 1

      eFTP, KWQ, PMMail, PMView, PMJPEG, ObjectDesktop, GalCiv, RSJ CD Writer... Several of these are available on Windows too (PMView, PMMail, ObjectDesktop, GalCiv, RSJ) ... and PMView could pop up on Linux too in a year or two. Worth waiting for imho.

      Sorry about this walk down memory lane, but I couldn't help it.

      --
      //TheToon
    12. Re:The two stages we haven't reached yet by BSD+Yoda · · Score: 1

      I had a DEC Rainbow.

      I've always waited for an appropriate place to post that on /. I may never have a better opportunity.

    13. Re:The two stages we haven't reached yet by Anonymous Coward · · Score: 0

      there should be no menus! only keyboard shortcuts.

  68. xml - pdf by jefu · · Score: 2, Interesting
    XML to PDF can be done with XSLT outputting XSL-FO and then an FO to PDF translator such as FOP.

    That probably sounds icky and scary, but should not be all that hard.

    I don't know what the formats are, but there's a whole pile of flexibility in XSL and FOP so building a very accurate version could take some fiddling. But producing a close approximation is probably very straightforward.
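
    A rough sketch of the first half of that pipeline. The Python standard library has no XSLT engine, so this swaps the stylesheet for plain ElementTree code that emits a minimal XSL-FO document; the input element name "para" and the page dimensions are assumptions made up for the example. The output would then be handed to an FO processor to get the PDF.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Qualify a tag name with the XSL-FO namespace."""
    return "{%s}%s" % (FO_NS, tag)


def to_xsl_fo(source_xml):
    """Turn every <para> in the source into an fo:block on a single A4 page layout."""
    src = ET.fromstring(source_xml)

    root = ET.Element(fo("root"))
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(
        masters,
        fo("simple-page-master"),
        {"master-name": "page", "page-height": "297mm",
         "page-width": "210mm", "margin": "20mm"},
    )
    ET.SubElement(page, fo("region-body"))

    seq = ET.SubElement(root, fo("page-sequence"), {"master-reference": "page"})
    flow = ET.SubElement(seq, fo("flow"), {"flow-name": "xsl-region-body"})
    for para in src.iter("para"):
        block = ET.SubElement(flow, fo("block"), {"space-after": "6pt"})
        block.text = para.text
    return ET.tostring(root, encoding="unicode")


SOURCE = "<letter><para>Dear customer,</para><para>Your order has shipped.</para></letter>"
fo_doc = to_xsl_fo(SOURCE)
print(fo_doc)
```

    Getting a close approximation really is this short; matching an office suite's layout exactly is where the fiddling would go.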

  69. Well, this: by Anonymous Coward · · Score: 0

    >> What kinds of new and wonderful things can you come up with?

    Plugins.

    Not just like Mozilla plugins (but, hey, good idea!)... ...more like [The Gimp] ones, including script-fu things.

    Dewd, it's gonna be a party. It will be like Word macros, but, this time, done right.

  70. What would you translate it to? by yerricde · · Score: 1

    A sizable percent of WWW users use IE 6. Most of those who refuse to use IE typically use something with a bit more XSLT-fu than Netscape 4.x.

    Look at the statistics from Google Zeitgeist. Red, blue, and lavender lines indicate IE 6, IE 5.5, and Gecko respectively. Notice that except for IE 5.0 (orange), the three CSS-savvy classes of browsers I mentioned dominate the client side. The lavender isn't very high yet, but it's getting there, behind only IE.

    Solution: Sniff user agents and point IE 5 users at Mozilla Firebird and Windows Update.

    Besides, you seem to suggest some sort of mod_xslt, but what would you translate it to?

    --
    Will I retire or break 10K?
    1. Re:What would you translate it to? by pacc · · Score: 1

      Actually, Microsoft don't make XSLT available to mortals. Upgrading IE to MSXML 4 isn't as easy as it could have been.

      But since there are no differences at all in rendering XML to XHTML clientside or serverside this opens up a lot of possibilities with the right XSL templates. You wouldn't even need to have staroffice installed to print out a document or edit some typo and change the default font.

      This could be as good as TeX, as easy to read as plain text, or at the same time obfuscated by an even worse syntax than TeX and transformed into unreadable code, but at least there is no need to mix in Perl or Java to make it work...

    2. Re:What would you translate it to? by stoborrobots · · Score: 1

      The users who use IE6 probably already use MSOffice. They are not who we are talking about. The point is to support everybody...

    3. Re:What would you translate it to? by stoborrobots · · Score: 1
      And if you didn't know it, you should
      (Use the Preview Button! Check those URLs!)

      the link...

    4. Re:What would you translate it to? by yerricde · · Score: 1

      The users who use IE6 probably already use MSOffice.

      What about the users whose computers came with Internet Explorer but did not come with Microsoft Office (an expensive extra)? Though IE and its MSHTML engine are bundled with Windows, MS Office isn't. Yet.

      The point is to support everybody

      <sarcasm>Then let's all just use .txt files. That way we can support people who don't even have a web browser installed. Better yet, use paper, to cover people who lack a computer.</sarcasm>

      --
      Will I retire or break 10K?
  71. Missing some of the points by evil_roy · · Score: 2, Informative

    Formatting can be handled by whatever.

    The strength is in the meta-data. By using XML the doc can be formatted by anything that can understand it. But formatting is not the point.

    The docs can then be referenced in a relational database - searched, indexed & importantly shared and migrated to other indexing systems or stripped.

    The XML 'magic' is very simple. The use of the data is whatever you want it to be. Do you want to restrict access, provide access, record access, implement version control and X-referencing - then using this technology is for you.

    It has sfa to do with troff/groff/cat/echo/print and everything to do with document collaboration and sharing.

    1. Re:Missing some of the points by Doug+Merritt · · Score: 2, Interesting
      Formatting can be handled by whatever. The strength is in the meta-data.

      True! But it is widely under-appreciated that this can be, and was, done even with troff, and still is today in an important way: the "apropos" command that scans for relevant man pages works by looking at a DB built by searching for semantic tags in the man pages.

      This very handy feature would not be possible if troff just did presentation style.
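
      To make that concrete, here is a small sketch of how such an index could be built. It assumes the common man-page convention of a ".SH NAME" section holding a "name \- description" line; the sample pages and function names are invented for illustration, and real indexers handle many more macro variants.

```python
import re


def name_entry(man_source):
    """Return (name, description) from the NAME section of man page source, or None."""
    lines = man_source.splitlines()
    for i, line in enumerate(lines):
        if re.match(r'\.SH\s+"?NAME"?\s*$', line):
            for following in lines[i + 1:]:
                if following.startswith(".SH"):
                    break  # reached the next section without finding an entry
                if r"\-" in following:
                    name, description = following.split(r"\-", 1)
                    return name.strip(), description.strip()
    return None


def apropos(index, keyword):
    """Return the names whose name or description mentions the keyword."""
    needle = keyword.lower()
    return sorted(
        name for name, desc in index.items()
        if needle in name.lower() or needle in desc.lower()
    )


# Made-up page sources standing in for files under a man directory.
PAGES = [
    "\n".join([".TH LS 1", ".SH NAME", r"ls \- list directory contents", ".SH SYNOPSIS", ".B ls"]),
    "\n".join([".TH CP 1", ".SH NAME", r"cp \- copy files and directories", ".SH SYNOPSIS", ".B cp"]),
]

index = dict(entry for entry in map(name_entry, PAGES) if entry is not None)
print(apropos(index, "copy"))
```

      The section heading is doing the job of a semantic tag here: the indexer never looks at fonts or spacing, only at what the author declared the line to mean.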

      It's true that this is not the main emphasis of troff, and that one is at the mercy of whoever wrote the macro package, etc, but that's true of XML sublanguages too.

      I'm playing devil's advocate. I realize (and posted elsewhere here) that there's a difference in that XML, when used as intended, is supposed to be primarily about semantics, with style as a secondary transformation, whereas it's the other way around with troff...it's intended for presentation, but people have nonetheless done handy things with it at a higher level of abstraction.

      But still, the point is that nothing XML does is brand new. It just represents new industry awareness of some old good ideas.

      Similar to how Java has popularized the 40 year old notion of doing garbage collection, so now people say GC is "new technology". Not at all. Just being more widely used.

      --
      Professional Wild-Eyed Visionary
    2. Re:Missing some of the points by Rogerborg · · Score: 1

      Or put it this way: if you spot a tool being used in a useful but unexpected way, and write another tool that does much the same thing but doesn't call it "abuse", you can shine it up and sell it as the Next Big Thing. I wonder if a lot of it comes down to documentation.

      --
      If you were blocking sigs, you wouldn't have to read this.
  72. Me! Me! by jefu · · Score: 1
    Pay me $10,000 and I'll generate the pretty stuff for you and build you a spiffums-as-all-shit web interface for the dodo-heads to use.

    1. Re:Me! Me! by Rogerborg · · Score: 1

      Friend, for $10,000 , it's barely worth your while to blow your nose with their specifications, let alone design, implement, test and install it.

      --
      If you were blocking sigs, you wouldn't have to read this.
    2. Re:Me! Me! by jefu · · Score: 1

      Firstly I'm a professor type, so $10K is a fair pile of cash to me. And secondly it would be a good way to push myself to learn all the details of FOP. Not that I want to spend a lot of time mucking about in it, just want to play with it once.

    3. Re:Me! Me! by Rogerborg · · Score: 1

      >Firstly I'm a professor type, so $10K is a fair pile of cash to me

      I think we've identified a problem right there. If you're good enough at what you do to teach it, why are you getting paid so little to do it?

      --
      If you were blocking sigs, you wouldn't have to read this.
  73. OO's XML to PS without OO is the missing key by rsd · · Score: 1

    Is there a way to generate a PS file from OO's XML?
    (without using OO)

    I have been looking for this for a while.

    Messing with the OO XML format is not difficult, and if you just
    play with an OO saved file to see what changes in the XML, it is easy
    to create reports from a scripting language (e.g. Perl).

    The problem is that (AFAIK) there is no way to print directly
    (or generate a PDF) without entering OO itself.

    It should not be difficult to write a command line utility to do so,
    if someone who knows the API points to what would have to be done.

    So, IMO, this is the missing key to Office XML perl Heaven!!!

    1. Re:OO's XML to PS without OO is the missing key by wellFormedEntity · · Score: 1

      use apache FOP.

      it's not without its bugs, but it works

      http://xml.apache.org/fop/
  74. You missed the point by spotteddog · · Score: 2, Interesting

    I don't want an "Office Suite" shoved down my throat. I want to use the graphing tool I think is best, I want my favorite email app, I want to use the word processor I like, and the spreadsheet I like, etc. I want to be free to try the newest software without converting everything I might need in the future. If the "office productivity programs" all used XML file formats, I could interchange files from one app to the next easily. I would NOT be locked into a single vendor's "suite" or programming HELL.

    If the apps were using XML, easy migration would be a given, and programmers could spend time "enhancing" the user interface.

    --
    . there used to be a sig here.....
  75. Can almost feel the envy in you... by Anonymous Coward · · Score: 0

    Nice troll, bet you are still living in your momma's basement. Hope someday you come out to the real world.

  76. Bug tracking and closing and report generation by Anonymous Coward · · Score: 0

    At the company I'm working for, they use as antiquated a system as you can use electronically to track bugs, etc: a spreadsheet. Since I use Linux, my spreadsheet is modified in StarOffice. I have brought in CVS and my development produces .deb files, so my dream (in my oh-so-prodigious free time) is to write a script that parses my CVS commits for debian/changelog entries that say "Closes SCR#xxx" and automatically modifies the bug spreadsheet. Ideally, I'd like to write a bug tracking software program (something not as complex as Bugzilla and more engineering-quality based) to also generate documentation, reports, test procedures, etc. This can pretty much only be done now by modifying XML directly (which is really kinda ugly) or something like an Excel COM object (which I am loath to do, can't do using Linux, but would be relatively easy and generally very cool to do...)
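
    The parsing half of that script could look something like the sketch below. The log text and the bug rows are invented sample data, and a list of dicts stands in for the spreadsheet; writing the result back into the StarOffice file is left out.

```python
import re

# Matches "Closes SCR#123" in any letter case.
CLOSES_RE = re.compile(r"Closes\s+SCR#(\d+)", re.IGNORECASE)


def closed_scrs(log_text):
    """Return the sorted, de-duplicated SCR numbers mentioned as closed."""
    return sorted({int(num) for num in CLOSES_RE.findall(log_text)})


def mark_closed(rows, closed):
    """Return copies of the rows with status set to 'closed' where the SCR matches."""
    wanted = set(closed)
    return [
        dict(row, status="closed") if row["scr"] in wanted else dict(row)
        for row in rows
    ]


SAMPLE_LOG = """\
  * Fix off-by-one in packet parser. Closes SCR#101
  * Update docs.
  * Handle empty config file (closes SCR#87).
  * Follow-up to parser fix. Closes SCR#101
"""

BUGS = [
    {"scr": 87, "status": "open"},
    {"scr": 95, "status": "open"},
    {"scr": 101, "status": "open"},
]

closed = closed_scrs(SAMPLE_LOG)
updated = mark_closed(BUGS, closed)
print(closed)
print([row["scr"] for row in updated if row["status"] == "open"])
```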

  77. Hell.. by Kwil · · Score: 1

    ..it's not even that standardized.

    It's a meta-format, giving you means to *create* a standardized format, once you start communicating with the other people in your industry who might want to use the same standards.

    Not really any different from any other EDI format, except now coders can move their skill set from one corporation to another.

    --

    That Jesus Christ guy is getting some terrible lag... it took him 3 days to respawn! -NJ CoolBreeze

  78. I am too. by Nailer · · Score: 1

    I wanted to save some time documenting servers, so I wrote Accudoc to automatically generate server documentation for (currently Red Hat) Linux systems.

    It's written in shell, and just uses a bunch of shell functions I made to create the documents.

    You can download a copy here if you want. It's open source, and if you're a SysAdmin you might find it useful to produce written reports of servers you manage.

  79. Coincidentally, I was. by Anonymous Coward · · Score: 0

    Over the summer I assisted the editing staff at the local university press with some of the more mundane/trivial aspects of assembling an encyclopedia they're putting out next April. Although all their documents were in a variety of versions of Word, I was still able to whip up some perl scripts to do some text processing; enough to suit my purposes. Although my script crashed on certain documents for no apparent reason.. And it took way too long to code considering how simple it was..

  80. Users Expect ... by jefu · · Score: 1

    MS Office is required because users expect MS Office and will fight like hell the smallest changes in their comfy environment.

    But the right kind of XML editor or even WYSIWYG (though I think most WYSIWYG editors are really WYSIWYGBYTATGSAAAAFTB (what you see is what you get but you've thrown all the good stuff away and added awful formatting to boot)) with a meaningful XML back end (with markup like "to", "to-address", "date", "title", "section", and so on and with domain specific markup as needed) would really change how documents are used, stored and generally manipulated.

    One document style could really be used across an organization (in large companies this sometimes happens - in small to midsize ones rarely). Documents could be indexed meaningfully or even stored with minimal indexing, but fancy XQuery based search capabilities. Layouts could be changed to accommodate different types of paper (and I'm not just thinking letter vs legal or A3 vs A4, but things like letterhead changes etc). Documents could be stored (even transmitted) as the minimal xml markup needed to regenerate that (meaning maybe a couple Kb rather than a dozen or three Mb).

    Documents could leave out boilerplate since large chunks of boilerplate could be inserted with a single tag (<patent-claim-from-hell> could expand to three pages of standard write-once legal nonsense, <sig> could expand to your signature...).
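
    As a sketch of how little machinery the boilerplate idea needs: the stored document carries only empty tags, and the standard text is filled in when the document is rendered. The tag names, the boilerplate table and the sample letter are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical organization-wide boilerplate, keyed by tag name.
BOILERPLATE = {
    "sig": "Yours sincerely, J. Random Author",
    "confidentiality-notice": "This letter is confidential and intended only for the addressee.",
}


def expand(doc_xml):
    """Parse the document and fill every empty boilerplate tag with its standard text."""
    root = ET.fromstring(doc_xml)
    for elem in root.iter():
        is_empty = len(elem) == 0 and not (elem.text or "").strip()
        if elem.tag in BOILERPLATE and is_empty:
            elem.text = BOILERPLATE[elem.tag]
    return root


def render_text(root):
    """Render each top-level element as one line of whitespace-normalized text."""
    return "\n".join(" ".join("".join(child.itertext()).split()) for child in root)


STORED = (
    "<letter>"
    "<to>Acme Ltd</to>"
    "<body>Please find the quote attached.</body>"
    "<confidentiality-notice/>"
    "<sig/>"
    "</letter>"
)

rendered = render_text(expand(STORED))
print(rendered)
```

    The stored form stays a couple of hundred bytes however long the legal text grows, and changing the notice for every past and future letter means editing one table entry.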

    It ain't gunna happen. Users love their Word, they love being able to set up their own ugly-as-shit document layouts, being able to lose documents easily, being able to spend hours tweaking a font here or there instead of doing real work.

    And even the vaguest mention that XML would be good is enough to generate a storm of protest. "Oh, but you can do that in MS Word already." But few people do and XML makes much more possible.

    I sometimes think it would be interesting to make secretaries pay for MS Word themselves (not unreasonable - mechanics usually pay for their own tools) and for the disk space used by its documents.

    1. Re:Users Expect ... by yuri+benjamin · · Score: 2, Interesting

      I've been thinking about a document management system that has an integrated word processor.
      To create a new document users would first be presented with a DE screen asking for some meta-data (perhaps with some mandatory fields) before being dropped into the more familiar wordprocessor gui.

      Someone with admin rights to the document management system might define the fields that go into the initial DE screen.
      Users might have to choose beforehand whether the document will be emailed, faxed or printed (eg for snail-mail), and the document would be "attached" to a client record, along with any replies (eg by email).

      The "save" feature would be replaced by "save draft" and "save final", because once the document is sent to an external party you need to "freeze" the document as a record of what's been sent.
      Maybe some kind of versioning & rollback would be useful too (something more powerful than undo/redo).

      I'd do it myself, of course, but I don't have the time or the skills.
      If I ever see a patent application for this idea, I'll point to the /. archive of this post as prior art (although I'm sure this kind of system is already in use in some organisation somewhere).

      --
      You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
  81. "humbly", not "humblely" by Anonymous Coward · · Score: 1, Funny
  82. Structure Structure Structure by unfortunateson · · Score: 2, Interesting

    Having an XML representation of a Word (MS, Open, whatever) document as a stream is really no more useful to me than RTF: I can parse them both.

    The better part is when you can structure your document. Not just a heading surrounding a bunch o' paragraphs, but a (to use the stuff I have to work with) Research Report contains a Title Page, a Synopsis, an Introduction, Materials Section, etc. You can't put tables and figures on the title page or Introduction, you can in the Synopsis and Materials Section. TOCs and things like that are created as part of rendition, between the Synopsis and Introduction, without the user messing with it.

    Now even more than storing those sections (which would, in the HTML world, be DIVs and SPANs), I want control over the UI: disable that table button in the title page, even down to where bold and italics can be used.

    Office 2003 has some facility to implement this, but it's kind of awkward -- it's an extension of how their SmartTags work. Generally pretty ugly, to control everything.

    I don't want to use an XML editor, my users know Word, are used to Word Processors, and they cost 1/5 of XML editors, less in bulk licenses.

    I'd be implementing this now, if it weren't for two things: a) I work for a big corporation that never buys into new releases for a couple of years, and 2) they're laying me off -- closing all the facilities in Chicago (sigh).

    --
    Design for Use, not Construction!
  83. Re:PHP - Can u print automagically? by Anonymous Coward · · Score: 0

    I'd really like to send the open office docs straight to the printer for hardcopy output. We've been struggling with that forever.

  84. XML *is* a selling point by takasuz · · Score: 1

    Users always seek a better user interface. And as long as you use an office suite with a proprietary format, you will have little choice.

    With an open standard format, you can pick another office suite with suitable features (including interface of course) at any time. The faster the migration to an open standard, the fewer closed-format documents you will have and the brighter the future.

    I do not say XML is the best open standard. I do not even think that most users care what XML is. What matters is that it is an open standard. It may not be the best but it is easy to convert from one open standard format to another later. What disappointed me most is MS's unwillingness to compete on open ground and to let users choose a better office suite. I would like to say, "let users decide."

    The right time to migrate depends on what a user needs an office suite for. For most tasks, OpenOffice.org is quite sufficient. Its interface is surely going to be further improved, and it is about time for an average user to consider freeing oneself from MS.

  85. XML Scripts.. Not quite yet by Anonymous Coward · · Score: 0

    It seems to be the hot topic. Where are all the neat toys that go with XML format files for office suites? Well, the simple answer is that they are not needed yet. Sure, I love the idea. Even something as simple as using a spreadsheet you can edit in your handy convenient office application to generate dynamic web-based content including graphs and summary datasheets through a php script sounds like a lot of fun, but the honest truth is how often do the end users actually USE XML as their format? You have millions and millions of documents already out there in the WordPerfect and Microsoft Office formats (and believe it or not, some still in Microsoft Works formats) and when you open one of those documents, your application doesn't tell you "this format is old - you should convert it" so it stays as it was.
    Once the bulk of active documents are XML data, the scripts that parse them will become more prolific, and I think most of those scripts will be web based, such as php, perl and java.

  86. Yeah like THAT will ever happen by tjstork · · Score: 1


    Let's see, and, companies are going to make it easy for everyone to compare their prices without adequately describing the subtleties of their value proposition.

    It's not even utopian, it's stupid. The relentless march to standardization for the sake of standardization is the Unix crowd's lemming version of everyone just buying Microsoft. You aren't changing the way of thinking, you just want people to think the same way about your open source stuff rather than Bill Gates's closed source stuff. There's no difference between Torvalds and Gates, except one begs for money and the other earns it.

    --
    This is my sig.
    1. Re:Yeah like THAT will ever happen by otis+wildflower · · Score: 1

      OK, I'll bite..

      Let's see, and, companies are going to make it easy for everyone to compare their prices without adequately describing the subtleties of their value proposition.

      Four words: Securities and Exchange Commission.

      Maybe there isn't enough outrage left from this last corporate corruption cycle, but next time around we may very well see a codified, detailed accounting standard which would mandate structured reporting file standards. Who knows, there may very well be an ISO accounting standard, once the rest of the industrialized world has their own accounting revelations.

      (ObOffTopic: Then again, I thought there'd be enough outrage about the 2000 election that we'd see a push for the abolition of the electoral college, but it's just degenerated into whining and sniping..)

      There's no difference between Torvalds and Gates, except one begs for money and the other earns it.

      More like one works for his money and the other uses his powers as a convicted monopolist to extort it.

    2. Re:Yeah like THAT will ever happen by Loundry · · Score: 1

      push for the abolition of the electoral college

      The Democrats were really mad that they lost, I know. Keep in mind that if the roles were reversed, they would probably be very happy with the electoral college. Democrats and Republicans are about political power over every other issue. For the record, the electoral college functioned exactly as it was intended to: to remove power from population centers. You might want to read Hamilton's writings in Federalist #68.

      More like one works for his money and the other uses his powers as a convicted monopolist to extort it.

      The parent poster was wrong. He claimed that Linus begs for money when, in fact, he is gainfully employed, exchanging value-for-value. Gates willfully uses fraud to make money. He is not a capitalist; instead, he is a con-man.

      --
      I don't make the rules. I just make fun of them.
  87. Translation? by Anonymous Coward · · Score: 0

    Microsoft maintains dominance to their office suite by controlling the file formats behind it. Opening that up, without reason would be absolutely stupid from a business point of view.

    I think you meant to say something like

    "Microsoft maintains their illegal monopoly by controlling the file formats behind it. Opening that up, with good reason, would be the ethical and economically competitive thing to do."

    I don't want MS to open up their standard just because I believe in open standards. I want MS to open up their standard because they have an illegal monopoly and have therefore stolen my money, your money, and the business of better competitors everywhere.

    Let's get that fact straight.

  88. It's all about the parsers. by SuperKendall · · Score: 3, Insightful

    XML can more easily represent complex data structures than CSV, but that's not the main benefit.

    Nope, the real revolution was in creating standardized parsers. I spent many an hour with LEX and YACC churning out parsers for many custom file formats. Even though XML may not seem the most efficient way to represent things, it's great not to have to write a new parser every time we have a new bit of information to represent in a file. It frees you to think about what data you want in a file instead of directing your file contents to things that will be easy to parse.
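
    A small illustration of the point, using the parser that ships with Python: one generic function reads two unrelated, made-up file formats without any grammar being written for either.

```python
import xml.etree.ElementTree as ET


def to_tree(elem):
    """Convert any parsed XML element into plain nested dicts and lists."""
    return {
        "tag": elem.tag,
        "attrs": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        "children": [to_tree(child) for child in elem],
    }


# Two invented formats; neither needed its own lexer or grammar.
invoice = to_tree(ET.fromstring('<invoice id="7"><line qty="2">Widget</line></invoice>'))
config = to_tree(ET.fromstring('<config><server port="8080">example.invalid</server></config>'))

print(invoice["children"][0]["attrs"]["qty"])
print(config["children"][0]["text"])
```

    Adding a new field to either format later is a change to the data, not to the parser.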

    That's why XML is every bit as valuable as it is made out to be, just not for the reasons usually given...

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  89. Crystal sucks monkey's ass by melted · · Score: 1

    Come on, people. I've never seen anything worse than Crystal in my life. The reports are mediocre, and the charts suck so hard, I fail to convey it verbally.

    Anyone who paid $10K for this junk needs immediate psychiatric attention.

    1. Re:Crystal sucks monkey's ass by Jonner · · Score: 1

      I haven't used Crystal Reports to make reports, but I have tried to deal with its DLL hell: it wasn't fun. Like many things in the Windoze world, it involved following arcane, unexplained instructions, guessing at the correct order to register COM components and restarting apps. It was pretty much magic when it worked. Of course, the reports in question were an unsupported add-on from the developer of a proprietary app my employer had bought, so everything was a little dodgy.

  90. We used it by PeterBecker · · Score: 1
    When writing our little personal document management tool based on Apache's Lucene, we wrote an indexer for OOo documents. Two classes: one is shared with the general XML indexer, one does the OOo specific stuff, including the extraction of metadata. In total maybe 200 SLOCs. It should handle all OOo formats if they contain text -- actually the metadata extraction should work even without.
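
    This is not our code (that is Java on top of Lucene), just a rough Python sketch of the same idea for anyone curious. It assumes only that an OOo file is a ZIP archive containing `meta.xml` and `content.xml`; the namespaces in `build_sample` are placeholders, and the functions match tags by local name so they do not depend on the real namespace URIs.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_for_index(ooo_file):
    """Return (metadata dict, body text) for an OOo file path or file object."""
    with zipfile.ZipFile(ooo_file) as archive:
        meta_root = ET.fromstring(archive.read("meta.xml"))
        content_root = ET.fromstring(archive.read("content.xml"))

    # Keep every metadata element that directly carries text.
    metadata = {}
    for element in meta_root.iter():
        text = (element.text or "").strip()
        if text:
            metadata[local_name(element.tag)] = text

    # itertext() walks every text node; whitespace is then normalised.
    body = " ".join(" ".join(content_root.itertext()).split())
    return metadata, body


def build_sample():
    """Build a tiny stand-in archive in memory (placeholder namespaces)."""
    meta = ('<m:document-meta xmlns:m="urn:example:office" '
            'xmlns:dc="urn:example:dc"><m:meta>'
            '<dc:title>Quarterly report</dc:title>'
            '<dc:creator>Peter</dc:creator>'
            '</m:meta></m:document-meta>')
    content = ('<c:document-content xmlns:c="urn:example:office" '
               'xmlns:text="urn:example:text"><c:body>'
               '<text:h>Summary</text:h>'
               '<text:p>Sales went up.</text:p>'
               '</c:body></c:document-content>')
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("meta.xml", meta)
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(extract_for_index(build_sample()))
```

    The returned dictionary and text would then be handed to whatever indexer is in use.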

    The program also indexes Word and Excel files using Apache's POI library. I haven't looked at the size of that, but something makes me think it is a bit bigger than our little hack.

    I know there is much hype around XML and in the end it is only half a syntax. But there are good applications of XML around and I think OOo is one of them.

    Peter

    --
    -- CAUTION: Don't read this posting.
  91. Document Properties Mandatory by AShocka · · Score: 2, Interesting

    You can configure most office suites to display the document properties dialog on save. I'm sure you could also build templates with macros that would check and update these. Yes, it's a real problem and most businesses do not have strategies to address it. It's a document management issue very few address.

    It's a similar problem with web publishing; there is little or no metadata to identify documents. I've always thought that the Dublin Core set would serve as a very good repository for a kind of CVS on the status of documents. Have wanted to build a back end to something like Apache/Cocoon using this model, which would also serve as the data repository for populating both the metadata in the web documents and also all the other data for semantics and accessibility, all done on the fly out of a DC metadata repository.

  92. Simple: use CVS to store document revisions by drbart · · Score: 1

    Being a text format, XML would at least bring documents out of the binary world and allow diffs and things that use diffs, like CVS.

    Imagine actually being able to use source control to track documents!

    Unfortunately OO defaults to compressing the XML into a ZIP archive, which brings us right back to binary.
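
    The compression can be worked around before diffing. A minimal sketch, assuming each revision is a ZIP archive with a `content.xml` inside; `make_archive` and its placeholder markup exist only so the example runs on its own.

```python
import difflib
import io
import zipfile
import xml.dom.minidom


def pretty_content(ooo_file):
    """Return content.xml from an archive as a list of indented lines."""
    with zipfile.ZipFile(ooo_file) as archive:
        raw = archive.read("content.xml")
    # content.xml is often one long line; re-indent it so a line-based
    # diff has something to work with.
    dom = xml.dom.minidom.parseString(raw)
    return dom.toprettyxml(indent="  ").splitlines()


def diff_revisions(old_file, new_file):
    """Unified diff of the XML content of two document revisions."""
    return list(difflib.unified_diff(
        pretty_content(old_file), pretty_content(new_file),
        fromfile="old/content.xml", tofile="new/content.xml", lineterm=""))


def make_archive(paragraph):
    """Demo helper: an in-memory archive with placeholder markup."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml",
                         "<doc><p>Intro</p><p>%s</p></doc>" % paragraph)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    old, new = make_archive("first draft"), make_archive("second draft")
    print("\n".join(diff_revisions(old, new)))
```

    Checking the unpacked `content.xml` into source control instead of the archive would give the same effect at commit time.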

    1. Re:Simple: use CVS to store document revisions by Dilaudid · · Score: 1
      Absolutely. Instead of storing each revision of a file as a whole new binary document, you simply store the changes. Result - it's now easy to see the changes in each doc. In any media or publishing firm they usually have large expensive systems to track changes...

      On another topic - as a desktop developer (working with M$ Office - joy...) - it would be so much easier if docs were stored as text. Don't really care much about the XML side, it would just be fantastic to find out what formatting had been applied to the document when you've been asked to fix it. Just my two cents...

  93. I worked for a shareholder lawsuit firm... by tjstork · · Score: 1


    The Securities and Exchange Commission is essentially powerless. Any stock governance will be pointless until shareholders of companies have real rights.

    Companies are not necessarily designed to be responsive to their shareholders, and, they are not designed to be competitive, and, they do not have to be honest.

    --
    This is my sig.
  94. Open standards for XML forms by mdubinko · · Score: 2, Interesting

    One thing the Open Source office suites don't (yet) have much of an answer for is an XML data collection/management system along the lines of Microsoft Office InfoPath. A natural standard for such applications is W3C XForms.

    Read all about it--fully GFDL and online now--from the O'Reilly book at my site.

    .micah

    --
    --- Learn XForms today: http://xformsinstitute.com
  95. Mozilla by merphant · · Score: 1

    With a bit of polish, Mozilla composer could be a good word processor. It generates XML (xhtml), and it's available for a bunch of platforms. Plus it comes with a web browser and an email client. That's most of an office suite right there. Most non-technical users don't use spreadsheets, databases, or presentation programs anyway. They want word processing, web, email.

  96. Word HTML Cleaner by starvingartist12 · · Score: 1
    The Textism site has a Word HTML Cleaner that seems to do a good and comprehensive job, from previous experience.
    This utility strips proprietary Microsoft tags and artefacts from Word HTML documents, leaving basic formatting and typographic entities intact.
  97. MS isn't the only one with a proprietary format. by Anonymous Coward · · Score: 0

    Why is no one complaining this much about Adobe Acrobat?

  98. XSLT by wwi · · Score: 3, Insightful

    In about .5 hrs, I was able to extract the content from an OpenOffice text document, as well as a presentation, and feed them into other tools. This without trying to read any DTDs. Applying more effort would have yielded more functionality, but I was in a hurry, just trying to get some information out with some hierarchy to it.

    Now, extracting the style is a different challenge, and of course style means different things to different people. But it is simply madness to try to extract content from Word and Powerpoint files for use elsewhere.

    Oh yes, I used Saxon. Nice product.
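
    For anyone without an XSLT processor at hand, roughly the same content-with-hierarchy extraction can be sketched in Python. This is an illustration, not the stylesheet I used. It assumes headings are `h` elements carrying a `level` (or `outline-level`) attribute and paragraphs are `p` elements; the `urn:example:text` namespace in the sample is a placeholder.

```python
import xml.etree.ElementTree as ET


def local_name(name):
    """Strip the '{namespace}' prefix from a tag or attribute name."""
    return name.rsplit("}", 1)[-1]


def outline(content_xml):
    """Yield (level, text) for headings and (None, text) for paragraphs."""
    root = ET.fromstring(content_xml)
    for element in root.iter():
        kind = local_name(element.tag)
        if kind not in ("h", "p"):
            continue
        text = "".join(element.itertext()).strip()
        if not text:
            continue
        level = None
        if kind == "h":
            level = 1  # assumed default when no level attribute is present
            for attr, value in element.attrib.items():
                if local_name(attr) in ("level", "outline-level"):
                    level = int(value)
        yield level, text


def render(content_xml):
    """Indent headings by level and hang paragraphs under the last heading."""
    lines = []
    depth = 0
    for level, text in outline(content_xml):
        if level is not None:
            depth = level
            lines.append("  " * (level - 1) + text)
        else:
            lines.append("  " * depth + "- " + text)
    return "\n".join(lines)


SAMPLE = """<doc xmlns:text="urn:example:text">
  <text:h text:level="1">Results</text:h>
  <text:p>All runs finished.</text:p>
  <text:h text:level="2">Timing</text:h>
  <text:p>Median was <text:span>12</text:span> ms.</text:p>
</doc>"""

if __name__ == "__main__":
    print(render(SAMPLE))
```

    Style information is ignored entirely here, which matches the split between content and style described above.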

  99. XML Could have been great by POds · · Score: 1

    Ahhh bugger... I must have missed the article on MS dumping XML support. I saw XML support as a good thing because the content of the document could be morphed to fit any display the user wanted, like a browser, PDA, phone or some other mystical device.

    I must say, it's less interesting now that MS has dropped it, and I'm not sure the fact that StarOffice has it pleases anyone. I say that because MS Office has the most market share, and enabling XML documents would have allowed better interoperability among other word processors and MS document apps. I was hoping StarOffice, Gnome Office and KDE Office could all contribute to a set of libraries to parse the XML Word documents, which would benefit everyone, but it looks like that will never happen.

    But other than displaying XML data differently, XML also has other advantages. I've read several articles on how it could help with searching documents for specific combinations of text, etc...

    There are so many cool things, more than I can imagine, that could have been done with XML Word documents. The fact that StarOffice is doing it doesn't interest me as much as if MS Word were doing it.

    MS Office is a better product, with much more market share, in my opinion.

    --


    Giving IE users a taste of their own medicine since 2005 - http://pods.-is-a-geek.net/
  100. "any programmer with a Perl script..." by eclecticIO · · Score: 2, Interesting

    "and a bit of intelligence"

    Using an MS Word template, ActiveState Perl, and a number of modules including Win32::OLE, I created a documentation generation system that pulled information from a database and created a Word document with dynamic headers, footers, formatting, content, etc. I used it to create 1000+ password-protected, pre-formatted Word documents that we provided to the client. Anytime the format needed updating or any data needed to be changed, all I had to do was rerun the Perl script rather than update all of those docs.

    I'm not going to say that this was easy by any means, it took quite a bit of research and tweaking to finally get right. XML would, no doubt, make this task easier but I don't necessarily think it is the panacea that will FINALLY permit us to automate docs and reports that need to be generated and shared. My point is that with "a Perl script and a bit of intelligence" document automation is something that can be done now.

  101. Research by Anonymous Coward · · Score: 0

    I'm a doctoral student. The output of my experiments are converted via a messy perl script into OpenOffice XML format. When I'm done running my experiments I simply pop them open and see a table and a graph.

    Doesn't sound like a big deal, but I run hundreds of experiments and this just saves me a hell of a lot of time. I can easily convert these to Excel format so that I can share them with my advisor etc.

  102. I just don't understand... by pjrc · · Score: 1
    ... how exactly did StarOffice fulfill the "XML promise", er... "the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script..."

    Did Sun get all those people using MS Office to convert their documents to StarOffice / OpenOffice XML format, which they can't even use with MS Office ?

    It just doesn't make sense. Maybe it has something to do with Chewbacca ?

    1. Re:I just don't understand... by Zero__Kelvin · · Score: 1


      " ... how exactly did StarOffice fulfill the "XML promise", er... "the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script...""

      The various Open flavors of Office run on Linux and Windows. Furthermore, they can import M$ formats and export them to XML. Therefore anyone with the skill can use a Perl script to automate the entire import/export and post-processing. Understand now?

      --
      Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  103. How about now? by xixax · · Score: 1
    Wow, I defined my plot and then pointed it at a new data frame using R.

    my_plot(my_data)

    Xix.

    --
    "Everything is adjustable, provided you have the right tools"
  104. Migrating file formats by SgtChaireBourne · · Score: 3, Interesting
    I'm currently working on a project to retrieve documents across a company's backed-up data from the past 10 years, and there is very very little metadata available for us to do any searching on.
    Yes, but you can't claim that an absence of metadata is due to a failure to write metadata: I myself used to keep a lot of metadata in my text processing documents and found that if you migrate periodically to new versions of the MS-Word format suite, you will periodically lose the metadata. No errors, no warnings, it's just gone. XML in the MS-Office suites is not going to come to the forefront. Microsoft, an Oasis member, backing out of the Oasis standard shows where they are heading. The misdirection about the schema should remove any doubt.

    On the other hand, OO.o's XML format + schema will be available even to competitors and theoretically beyond the life span of OO.o. One way for OO.o to encourage users to think in a structured way is through style sheets. Style sheets and document templates can save a lot of wasted time and effort. But again, what would people do with the spare productivity if formatting were done in 5 minutes, instead of spending 2 days manually formatting and reformatting various reports and presentations?

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
  105. I'm doing it right now by bertilow · · Score: 2, Interesting

    > Is anybody out there writing Perl/Java/whatever programs to
    > take advantage of StarOffice XML?

    Yes, actually I started doing that yesterday: I'm using Perl and XSLT to build documents in StarOffice XML (or actually OpenOffice.org XML), converting some 500 XHTML pages into one huge OpenOffice.org document. It's amazingly easy!

  106. Re:Docbook XML OOo Filters by JMemmert · · Score: 1
    One application that I developed recently makes, imho, great use of OpenOffice (and thus XML). In an application for knowledge management I've used the following setup:
    • Conversion from Word to OpenOffice. This is currently done in manual mode until I get someone to write me a batch process based on the OO APIs.
    • Conversion to Simplified DocBook with OOo2sDbk. Works perfect for me.
    • Analysis with Lucene to find often / rarely used words
    • Presentation of a subset of these words to the user for definition as important or unimportant for the project
    • Based on the user decision, the documents are connected in a structure remotely similar to a mindmap.
    There are a few more steps possible but these are currently only in planning or not fully implemented, so I'll ignore them here.
    Once the map's done, it's all refinement of the mappings through user interaction, gradually refining the map by adding abstractions (WebSphere here, WebLogic there, abstract to ApplicationServer, etc.) and adding or removing relations, documents, etc.
    The result is a hyperindex of the documentation.

    It's not really revolutionary in that such a thing has never been done before, but I shudder at the thought of doing that with Microsoft Office as a base.
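
    The word-analysis step can be pictured with a small Python stand-in. This is not Lucene and not the code from the project, only a sketch of the idea: count in how many documents each word occurs, then offer the user the words that are neither too rare nor too common. The thresholds, file names and sample texts are made up.

```python
import re
from collections import Counter


def document_frequency(documents):
    """Count in how many documents each lower-cased word appears."""
    counts = Counter()
    for text in documents.values():
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts


def candidate_terms(documents, min_docs=2, max_share=0.8):
    """Words found in at least min_docs documents but not in nearly all."""
    total = len(documents)
    freq = document_frequency(documents)
    return sorted(word for word, n in freq.items()
                  if n >= min_docs and n / total <= max_share)


# Made-up stand-ins for the converted documents.
DOCS = {
    "a.txt": "Deploy the portal on WebSphere",
    "b.txt": "Deploy the billing service on WebLogic",
    "c.txt": "The portal uses the billing service",
    "d.txt": "Backup schedule for the archive",
}

if __name__ == "__main__":
    print(candidate_terms(DOCS))
```

    Words that survive the filter go to the user, who marks them as important or unimportant for the project.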

  107. CSV is standard, XML isn't by Anonymous Coward · · Score: 0

    Just take one example: mail-merging. Every word processor on the planet can import mail-merge data in CSV, especially if you put field names in the first line. CSV is a pain because there's no way of representing fields that contain more than one line and there's no consistency on how to deal with the quotemark within a field.
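
    One widespread convention does cover both cases: wrap the field in double quotes and double any quote mark inside it. Python's `csv` module follows that convention, as the sketch below shows. Whether a particular word processor's mail-merge import accepts the result is a separate question that this example does not settle; the sample row is invented.

```python
import csv
import io

# Invented mail-merge data: one field spans two lines, one contains quotes.
ROWS = [
    ["name", "address", "greeting"],
    ["Ann Lee", "12 High St\nSpringfield", 'Dear "Annie"'],
]


def to_csv(rows):
    """Serialise rows; fields with newlines or quotes get quoted."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    text = to_csv(ROWS)
    print(text)
    print(from_csv(text) == ROWS)
```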

    I had hopes for XML. But no-one's designed an XML-based format (remember, XML isn't a data format, just a basis for designing them) for the most common single data transfer operation. I installed the Word 2003 beta - "XML support", great, I thought, MS will have invented an XML-based data transfer format and everyone can standardize on it. But... nothing. Mailmerge import is still CSV or ODBC.

    So don't knock CSV: it really does have a standard way for transferring tabular data, and XML doesn't.

  108. Goldfarb's Conjecture by RobotWisdom · · Score: 2, Interesting
    People need to wake up to a simple fact-- XML is for databases, not for documents. (I first pointed this out in 1998.)

    The gigantic propaganda campaign about the "wonderful new things" that semantic markup would make possible was always just a masturbatory fantasy by people who'd never implemented anything, encouraged by SGML contractors who saw an opportunity to broaden their target market.

    At the root of this delusion is what I call "Goldfarb's conjecture"-- the claim that document styles are superficial representations of underlying semantics. If Goldfarb were right, then tagging document semantics would be no harder than tagging styles, so this sort-of-works for titles and highlighting.

    But hardly any other semantics have associated styles, so tagging them becomes sheer drudgework for almost no payoff. It's absurd to have to tag every name as a name, every place as a place, etc. This metadata belongs in headers, not as embedded tags.

    So the real outcome of the XML-scam is that the effort to add metadata to webpages has been set back at least five years. What should have been emphasized was META headers for: Yahoo topic-category, DMoz topic-category, list of persons, list of places, list of companies, list of things, dates discussed, document type (eg timeline, image gallery, biography, etc).

    1. Re:Goldfarb's Conjecture by Isofarro · · Score: 2, Insightful
      The gigantic propaganda campaign about the "wonderful new things" that semantic markup would make possible was always just a masturbatory fantasy by people who'd never implemented anything,

      So, what have you implemented that's being used by thousands of businesses across the world? Pot. Kettle. Black, Mr failed AI expert.

      So the real outcome of the XML-scam is that the effort to add metadata to webpages has been set back at least five years.

      Adding metadata to webpages is deceased. It has been for over half a decade (Yes it is 2003 this year). It's a dead donkey, no need to flog it any more.

      What should have been emphasized was META headers for: Yahoo topic-category, DMoz topic-category, list of persons, list of places, list of companies, list of things, dates discussed, document type (eg timeline, image gallery, biography, etc).

      Utterly useless. Listing a series of dates does nothing a simple perl script can extract. Now linking a date to an actual place - now that's something useful. And your above example fails that simple relationship. Screenscraping ain't gonna save you - it's far too brittle for practical real world use.

    2. Re:Goldfarb's Conjecture by Anonymous Coward · · Score: 0
      The gigantic propaganda campaign about the "wonderful new things" that semantic markup would make possible was always just a masturbatory fantasy by people who'd never implemented anything,


      Straight from the lips of the chief masturbating non-implementor.
    3. Re:Goldfarb's Conjecture by RobotWisdom · · Score: 1
      Adding metadata to webpages is deceased

      No, really, it's just, uh... resting.

      Listing a series of dates does nothing a simple perl script can extract.

      I can't imagine why you'd say this. If I'm interested in WW2 I ought to be able to specify a date-range 1939-1945 in my searches.

      (Warning to innocent bystanders-- Isofarro is more-or-less a stalker of mine who generates lame counterarguments for the sake of arguing.)

    4. Re:Goldfarb's Conjecture by MattRog · · Score: 1

      And I, and others, have pointed out XML is a horrible data storage format.

      --

      Thanks,
      --
      Matt
    5. Re:Goldfarb's Conjecture by RobotWisdom · · Score: 1
      And I, and others, have pointed out

      Your link leads me to a site with extra-wide pages in extra-tiny type, and extra-opaque linktext so I didn't know what to try and read.

      XML is a horrible data storage format

      In general, yes. But for small shared databases it's pretty harmless, I imagine...?

      (PS-- did any of the followups in this topic give real examples of nifty working XML apps?)

    6. Re:Goldfarb's Conjecture by Isofarro · · Score: 1
      Listing a series of dates does nothing a simple perl script can extract.

      I can't imagine why you'd say this. If I'm interested in WW2 I ought to be able to specify a date-range 1939-1945 in my searches.

      Events within the above date range would include the birth of Stephen Hawking and Al Pacino - not something related to WW2. Elementary evidence that a list of dates by itself without any related data is useless.

    7. Re:Goldfarb's Conjecture by RobotWisdom · · Score: 1
      a list of dates by itself without any related data is useless

      So, are you a true moron, or just a lying creep?

    8. Re:Goldfarb's Conjecture by Isofarro · · Score: 1
      [In response to a refutation of his argument]: So, are you a true moron, or just a lying creep?

      An ad hominem, a false dichotomy and a fallacy of interrogative presupposition. Now _that's_ a weak argument. Presumably for the sake of continuing an argument that's been refuted? You do yourself no favours acting in this way.

    9. Re:Goldfarb's Conjecture by dublin · · Score: 1

      And I, and others, have pointed out XML is a horrible data storage format.

      Major counterpoint:

      It doesn't matter. The value of XML is not its utility as a data storage format for either databases or documents, but rather its utility as a data communications format.

      In this role, XML (often along with Java) is the modern Lingua Franca of applications in the digital world, and will be with us for years to come, because it solves a very real problem in a relatively easy way. The fact that it's not suitably elegant or shaped for your tastes in no way diminishes its usefulness. Rail on against XML if you want to - the rest of us are using it, warts and all.

      (Lingua Franca is Latin for "Frankish Language" - the mish-mash of European and Mediterranean tongues that was the universal language in ports throughout the known/trading world several hundred years ago, much as "broken English" is today.)

      --
      "The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post
    10. Re:Goldfarb's Conjecture by MattRog · · Score: 1

      Did I talk about 'communications'? No. Read the grandparent -- "XML is for databases".

      Couldn't resist: So if everyone else was jumping off of a bridge, you would, too?

      --

      Thanks,
      --
      Matt
  109. StarOffice and WebDAV by bfandreas · · Score: 1

    My employer and Sun currently have a cooperation up and running. Basically it's storing StarOffice XML data via WebDAV into our XML database. You can search structured documents in the DB and on the publishing side there ain't nothing a Cocoon framework couldn't do. I'm talking about an XML based document repository. No binary data means we're wide open to any kind of server side application.
    Dunno why this solution is marketed by neither Sun nor us, since all you need is our DB and StarOffice out of the box.

    --
    20 minutes into the future
  110. Re:MS isn't the only one with a proprietary format by nagora · · Score: 3, Interesting
    Why is no one complaining this much about Adobe Acrobat?

    Maybe because it's not a closed format, hence all the open-source pdf generation programs.

    Frankly, I'd rather see more PDF generation than XML. If I sit down and spend hours designing a book or report it's more important to know that it will appear as designed than that it can be converted into a mass of raw data and presented in any half-arsed way by someone so primitive that they still think PowerPoint is a pretty good idea.

    TWW

    --
    "Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
  111. Yeah but by Moderation+abuser · · Score: 1

    .TH Who understands troff, or postscript now? .B Maybe if there were nice WYSIWYM troff editors it would be a different matter, with XML you have something easily parsable and transformable with easy availability of good programming language parsing and manipulation libraries, which is a dream to integrate with RDBMSs.

    You want to make troff useful? Write a GUI or TUI editor which uses it as its default format. Then write a set of free to use libraries for use with all of the major languages which make it a doddle to parse, generate, manipulate troff data.

    --
    Government of the people, by corporate executives, for corporate profits.
  112. Creeping featurism and redundancy killed xml by Anonymous Coward · · Score: 0

    There has been such an explosion of DTDs that it's far easier to get the job done the old way than with an XML-based approach.
    Most people don't have the time to learn 50 new ways to reformat their data, let alone learn 50 new api's to process each of those ways when the old way worked just fine.
    Unless xml is applied to a totally new domain, and not just a re-doing of what is already done in a different way, it's pretty much a waste.

    1. Re:Creeping featurism and redundancy killed xml by silverbax · · Score: 1

      Nearly every business application I have worked on in the past year uses XML in a large way. It's usually the most convenient, efficient and cross-language method for transferring pieces of data. Most of the applications I use it for are .NET to Java and vice-versa. Quick and painless implementation.

  113. We use it for migration of legacy data by DollyTheSheep · · Score: 1

    We have an application, that recently underwent a major GUI update. Unfortunately, we had to drop an old and seldom used feature.

    Now older clients would lose information if they had used that feature. So we developed a filter via XSLT (which is now built into our application) to migrate that info to XML, that is, to OOo and M$ Office formats.

  114. DocType! by Letch · · Score: 1
    Is this maybe the kind of thing he was thinking about?

    It's great; you write documentation in XML markup, and then you can run a host of tools to generate PDFs, webpages, text files ... any format you want basically. And everything is taken care of for you; a contents page is generated automatically, all the text formatting (bold, italic, headers) is done for you ...

    I see no reason why someone couldn't write a word processor to edit doctype; instead of applying bold, etc. you would have menus to make selected text "A command line", "A file name", etc.

    I think Sun use it for a lot of their documentation; PHP and others use it for their web documentation.

  115. [OT] Re: to refund 50% of the purchase price by hany · · Score: 1
    I personally think they should have been made to refund 50% of the purchase price of all Windows licenses, as ... that advantage was gained through illegal monopoly ...

    50% is not enough because MS' margin on Office is more than 70%, and if there are users who were forced to buy Office because "others have it" then they should get a 100% refund plus damages (losses caused by working with Office - for example losses caused by viruses?).

    [Note: This is offtopic to "Fulfilling the Promise of XML-based Office Suites?".]

    --
    hany
    1. Re:[OT] Re: to refund 50% of the purchase price by Gaijin42 · · Score: 1

      Nobody was "forced" into buying windows because "others" have it.

      That's like saying the gasoline companies should have to pay money to car owners because they were forced into buying a normal car instead of a diesel car.

      Or that whoever invented SMTP should have to pay everyone in the world that uses email, because they forced us to use SMTP rather than CCmail or something like that.

      There is a synergy involved with people using the same things. People listen to the same music, wear the same clothes, go to the same movies, and yes, even use the same OS and word processor.

      The idea of something being open and standardized is very new. MS cannot be held accountable for not following ideas that didn't exist at the time that they weren't following them.

      Corel wasn't open, Wordstar wasn't open, AppleWorks wasn't open. Nobody was open. Yes, MS is the one that everyone settled on. MS "won". But you can't go back in time to punish them.

      If they aren't open now, and they should be, the market will punish them all on their own.

    2. Re:[OT] Re: to refund 50% of the purchase price by perlchild · · Score: 1

      But the gas companies weren't caught and indicted for illegally preventing other companies from offering alternatives either. Microsoft's monopoly means that they, in ways that are illegal in the spirit of a free market, CREATED for themselves an illegal advantage... and boosted their value. Think of it as insider trading... That's another way people create illegal advantages for themselves... It's not that the value isn't there... It's that the value's "value" has to be assessed in a free market. Anyone playing with the rules should expect to get burned...

  116. VI or EMACS anyone? by Anonymous Coward · · Score: 0

    Hey Linux masters: what about adding yet ANOTHER layer of complexity to the "simple" and "quick" editors every Linux-system on this planet has installed.

    As you now have managed to make it an utterly frustrating experience to edit one fcking .conf with bloated, feature-laden but useless "editors" (some would call them "operating systems") that no one can use without ten man months worth of reading documentation.

    This is my only serious criticism about Linux. Have been using that for quite a year, I still couldn't figure out how to use these damn tools properly. I would BUY a decent simple editor that resembles the plain old dos-edit.com in terms of simplicity and stupidity. So if you like it or not: I AM stupid and therefore I want a stupid editor. I don't want anything else in a text-editor in console mode than a string search mode and some kind of clipboard to copy&paste some lines of text.

    I don't want to do fancy macro-stuff, I don't need to have supercomplex features, I just want to locally edit a stupid .conf in console mode now and then. And yes, I know Google and I used it trying to find my perfect stupid editor but I found none.

    Changing graphic hardware is a royal pita when you find out the driver's not working and you have to reenable the old one in xf86.conf when all you have available is a bloated VI. Call me Joe (L)User if you like, but I don't want to learn VI, I want to replace it. It doesn't matter if VI is installed on a million other *nix-systems or if Linus Torvalds' mother could understand vi, I don't. And I want to scrap this program as I have a personal hate against it.

    I just wanted to tell you that, you vi-pimpz :)

    1. Re:VI or EMACS anyone? by Kredal · · Score: 1

      Use pico. It's installed alongside pine, the mail viewer... It works almost exactly like the dos edit program, and all of the commands you need are listed at the bottom of the screen. Really easy.

      --
      Whoever stated that signature sizes should be limited to one hundred and twenty characters can just go ahead and kiss my
  117. OOo can create/use DBF natively by TonyGreene · · Score: 1

    This is an even better example of why Star Office and OpenOffice.org will overtake MS Office, as Sun only now bundles a cripple-ware database app, and OpenOffice has none at all.

    OpenOffice can create and use DBF files natively, but this functionality is not obvious. You have to create an empty directory to hold the DBF files (the database) then setup that directory as a data source. You can then right-click on "Tables" in the data source navigator and select "New Table Design". It will allow you to design the DBF using an interface similar to MS Access.

    Besides, most desktop "databases" are actually spreadsheets. Most users don't know enough about databases to be able to take advantage of them, even if they had enough data to make it worthwhile to learn.

  118. Re:Word to HTML to XML to HTML by bWareiWare.co.uk · · Score: 3, Informative

    I find the easiest way of getting usable XML out of Word is to use Word's Save as HTML function and then run W3C TidyLib to get rid of all (most) of the M$ crap.

    This leaves you with an HTML-esque document that you can feed to an XSLT and get whatever XML you need.

    I did consider using OO to open the Word documents and save them as XML; however, I had trouble with its API (I also had trouble with automating Word, but here I had plenty of bitter experience to draw on).
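
    To give an idea of what the clean-up step removes, here is a rough Python sketch. It is not Tidy and does not reproduce what Tidy does; it only drops namespaced Office tags (such as `o:p`), comments, and the `class`, `style` and `lang` attributes. It does not make the result well-formed XML either, so it is no substitute for the real tool. The class name and the sample markup are invented.

```python
from html import escape
from html.parser import HTMLParser


class WordHtmlCleaner(HTMLParser):
    """Re-emit HTML without Office-specific tags and attributes."""

    DROP_ATTRS = {"class", "style", "lang"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _open(self, tag, attrs, closer=">"):
        # Namespaced tags such as <o:p> are dropped entirely.
        if ":" in tag:
            return
        kept = [(k, v) for k, v in attrs
                if k not in self.DROP_ATTRS and ":" not in k]
        rendered = "".join(
            ' %s="%s"' % (k, escape(v or "", quote=True)) for k, v in kept)
        self.out.append("<%s%s%s" % (tag, rendered, closer))

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, closer="/>")

    def handle_endtag(self, tag):
        if ":" not in tag:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))

    # Comments are ignored by the base class, so they simply disappear.


def clean(html_text):
    """Return html_text with the Office residue stripped."""
    parser = WordHtmlCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


SAMPLE = ('<p class="MsoNormal" style="margin:0"><span lang="EN-US">'
          'Hello <b>world</b></span><o:p></o:p></p>')

if __name__ == "__main__":
    print(clean(SAMPLE))
```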

  119. Searchable .pdf's - that's the one by Anonymous Coward · · Score: 0

    Our customers generate .pdf's and then, later, have to report on the data therein contained.
    Then, on reporting, they use more cycles to take it (a second time) back from the *RDBMS* paradigm to the business rules paradigm. The .pdf's represent the reportable data already satisfying the business rules. To be able to get to _that_ with xml.... and not have to hire AcroEinstein....

    -b

  120. German government by Anonymous Coward · · Score: 0

    Sorry that I have to post anonymously and be a bit vague about this, but some branches of the German federal government are starting to automate document generation using the OpenOffice/StarOffice APIs.

  121. .xml .pdf would work for me by sporb · · Score: 1

    Our customers generate contracts in .pdf format. Later on, they want to generate reports. These reports go back to the database and (for a second time) transfer from the *RDBMS* paradigm to the user paradigm. If I could get to the data on those contracts (without having to find AcroEinsteins to do it).... -b

  122. Internal format is irrelevant by bWareiWare.co.uk · · Score: 0

    To me there seem to be two reasons to use XML with a word processing application:

    1. To facilitate interchange of documents between different systems.
    2. To allow automatic processing and formatting of documents.

    These have differing requirements and are unlikely to be met by one XML file.

    To facilitate interchange you need to use standards, and the first rule with standards is you CAN'T create your own because you don't like the existing ones. DocBook, HTML, RTF, .DOC have their problems but are a lot more interchangeable than OO's format, which can't yet be opened by anything.

    To facilitate processing and automatic formatting is much more tricky. You really want to extract the schematic structure of the document, not its current formatting. OO's goals don't (yet?) seem to be to create a 'tagless editor' that allows the WYSIWYG editing of true structured XML documents (using your own DTD or Schema).

    This sounds more critical than I mean, as I think OO have made the correct decision in going for a proprietary (even if they now want it to become 'the' standard) document format and concentrating on the needs of the vast majority of users who just want to be able to save and load richly formatted documents.

    If I want to interchange documents then I use RTF, if I want to edit XML then I use an XML editor, if I want to convert a document to XML for further processing then I export it to XHTML (from whichever word processor). I use OO (well, StarOffice) because it is the best word processor, not because somewhere behind the scenes it is using the latest buzz technology.

  123. Actually, I have written a perl script by davidbailey · · Score: 2, Informative

    I recently wrote a Perl script to download multiple congregation church membership directories from our church's website and manipulate them into comma-delimited, tab-delimited, and nicely formatted OpenOffice Calc (spreadsheet) and Writer (word-processor) formats directly from the Perl script. Because the Microsoft formats are closed, I could not output into those formats directly from the script, nor do I feel like reverse engineering the formats to figure out how.

    I then used OpenOffice to save the files as Word and Excel formats for those who don't have access to OpenOffice, but I included a reminder that OpenOffice is free and included a link to the website.

    This would have been impossible without OpenOffice, and I thank them for their work. The final output has headers, footers, special formatting and prints out like a professional document, not roughly formatted text output in courier.

  124. I am doing this in gcompris by Anonymous Coward · · Score: 0

    I am leveraging XML all around in gcompris
    We have inline documentation in XML and it is being translated to HTML or OO with XSLT.

    Look at:
    html version
    oo version

    Yes it is great, yes it works.

  125. writer2latex by Anonymous Coward · · Score: 0

    I like the idea of converting every document readable by OpenOffice.org to LaTeX. writer2latex will do the job but is still beta. Remember that OpenOffice.org also reads Microsoft Word documents.

    PS: KOffice 1.4 will use the openoffice file format by default.

  126. Automatic meta-data generation by gr8_phk · · Score: 1

    One could analyze a document to generate meta-data about it. This could then be fed into "Storage" - the file system with natural language queries. One big problem with Storage would seem to be creating the database, but making it easier to read documents could help.
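
    A minimal Python sketch of the idea, assuming the document is a zip package that holds its text in a member called content.xml (as OpenOffice.org files do). The choice of meta-data (a word count and a few frequent words) and the extract_metadata helper are illustrative only, and the sample package is built in memory rather than read from a real file.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def extract_metadata(package, top_n=3):
    """Return a word count and the most frequent longer words in content.xml."""
    with zipfile.ZipFile(package) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # itertext() yields the character data and skips all markup.
    text = " ".join(root.itertext())
    words = re.findall(r"[A-Za-z']+", text.lower())
    frequent = Counter(w for w in words if len(w) > 3).most_common(top_n)
    return {"word_count": len(words), "keywords": [w for w, _ in frequent]}


# Build a tiny stand-in package in memory so the sketch runs without a real file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "content.xml",
        "<doc><p>Budget report for the hiking club.</p>"
        "<p>The budget covers maps and the club bus.</p></doc>",
    )
buffer.seek(0)
metadata = extract_metadata(buffer)
print(metadata)
```

    Anything smarter (language detection, named entities, summaries) would plug in where the word counting happens.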

  127. Re:PHP - Can u print automagically? by BigBadBri · · Score: 1
    If you've got OO on your server, call

    soffice -headless -p

    It runs OO in the background, prints and exits.
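
    If you are calling it from a script, a small Python sketch of building that command might look like this. The file path is a made-up example, appending the document path as the last argument is an assumption, and build_print_command is a hypothetical helper; only the two flags come from the line above.

```python
import shlex


def build_print_command(document_path, executable="soffice"):
    """Build the argument list for a headless print run (flags from the post above)."""
    return [executable, "-headless", "-p", document_path]


# Hypothetical document path, for illustration only.
command = build_print_command("/tmp/report.sxw")
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```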

    HTH.

    --
    oh brave new world, that has such people in it!
  128. Here's why your wrong by Overly+Critical+Guy · · Score: 2, Informative

    "MS won't stand for an XML file format -- it's human-readable. the last thing MS wants is for their file format to be easily convertible and transformable. it's a pity, because switching Office files to XML would quickly make them insanely useful."

    You people are so biased. Now Office has suddenly "dropped the ball." Of course, that meme will permeate through all Slashbots' thinking, whether or not they've even tried Office 2003.

    Here is a sample XML file. The original message said "This is a <b>test</b> of <b><i><font face="verdana" size="24">XML</font></i></b>."

    NOTE: Slashcode adds random semicolons and other garbage for some reason.

    <?mso-application progid="Word.Document"?>
    <w:wordDocument w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no" xml:space="preserve">
    <o:DocumentProperties>
    <o:Title>This is a test of XML</o:Title>
    <o:Author>Preston Sumner</o:Author>
    <o:LastAuthor>Preston Sumner</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>1</o:TotalTime>
    <o:Created>2003-09-18T15:29:00Z</o:Created>
    <o:LastSaved>2003-09-18T15:30:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Words>3</o:Words>
    <o:Characters>20</o:Characters>
    <o:Company>White Goat Studios</o:Company>
    <o:Lines>1</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:CharactersWithSpaces>22</o:CharactersWithSpaces>
    <o:Version>11.5604</o:Version>
    </o:DocumentProperties>
    <w:fonts>
    <w:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"/>
    <w:font w:name="Verdana">
    <w:panose-1 w:val="020B0604030504040204"/>
    <w:charset w:val="00"/>
    <w:family w:val="Swiss"/>
    <w:pitch w:val="variable"/>
    <w:sig w:usb-0="20000287" w:usb-1="00000000" w:usb-2="00000000" w:usb-3="00000000" w:csb-0="0000019F" w:csb-1="00000000"/>
    </w:font>
    </w:fonts>
    <w:styles>
    <w:versionOfBuiltInStylenames w:val="4"/>
    <w:latentStyles w:defLockedState="off" w:latentStyleCount="156"/>
    <w:style w:type="paragraph" w:default="on" w:styleId="Normal">
    <w:name w:val="Normal"/>
    <w:rPr>
    <wx:font wx:val="Times New Roman"/>
    <w:sz w:val="24"/>
    <w:sz-cs w:val="24"/>
    <w:lang w:val="EN-US" w:fareast="EN-US" w:bidi="AR-SA"/>
    </w:rPr>
    </w:style>
    <w:style w:type="character" w:default="on" w:styleId="DefaultParagraphFont">
    <w:name w:val="Default Paragraph Font"/>
    <w:semiHidden/>
    </w:style>
    </w:styles>
    <w:docPr>
    <w:view w:val="normal"/>
    <w:zoom w:percent="100"/>
    <w:doNotEmbedSystemFonts/>
    <w:proofState w:spelling="clean" w:grammar="clean"/>
    <w:attachedTemplate w:val=""/>
    <w:defaultTabStop w:val="720"/>
    <w:characterSpacingControl w:val="DontCompress"/>
    <w:optimizeForBrowser/>
    <w:validateAgainstSchema/>
    <w:saveInvalidXML w:val="on"/>
    <w:ignoreMixedContent w:val="off"/>
    <w:alwaysShowPlaceholderText w:val="off"/>
    <w:compat>
    <w:breakWrappedTables/>
    <w:snapToGridInCell/>
    <w:wrapTextWithPunct/>
    <w:useAsianBreakRules/>
    <w:useWord2002TableStyleRules/>
    </w:compat>
    </w:docPr>
    <w:body>
    <wx:sect>
    <w:p>
    <w:r>
    <w:t>This is a </w:t>
    </w:r>
    <w:r>
    <w:rPr>
    <w:b/>
    </w:rPr>

    --
    "Sufferin' succotash."
    1. Re:Here's why your wrong by butane_bob2003 · · Score: 1
      Well, only an Office product could handle that initial processing instruction.
      <?mso-application progid="Word.Document"?>

      And where are the o and w namespaces declared? While it's very XML-like, it can't really be parsed by a standard XML parser without some modification. It still looks like this format was adopted because

      A: Microsoft is pretty good at marketing.
      B: Word's old format was krufty. Might as well XMLize it, wouldn't want to be behind the times.

      Really, couldn't they have gone the extra small step to make it readable by other parsers?
      --


      TallGreen CMS hosting
    2. Re:Here's why your wrong by Overly+Critical+Guy · · Score: 1

      The schemas are fully available and documented online at Microsoft's website. I don't know how much more you want them to hold your hand over this.

      I doubt very highly that ANYTHING will convince you that it's not that bad at all. You need Microsoft to have done wrong here.

      --
      "Sufferin' succotash."
    3. Re:Here's why your wrong by butane_bob2003 · · Score: 1
      I never said it was BAD. I don't really believe in the existence of BAD. I think it is BETTER than before, just not what I was looking for. Maybe an
      <?xml version="1.0"?>
      would get me on my way. If I am not mistaken, an XML document requires a version tag to be calling it XML. Otherwise most parsers reject it.
      --


      TallGreen CMS hosting
    4. Re:Here's why your wrong by Froqen · · Score: 1

      The hello world Word-generated XML document I have has both the standard XML version header and namespace declarations.
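
      One way to check the points raised in this sub-thread is to feed small fragments to a stock parser. The Python sketch below uses the standard library's ElementTree; the namespace URI is a made-up placeholder rather than Microsoft's real one, and parses is a hypothetical helper. With this parser, the mso-application processing instruction is accepted, an undeclared w prefix is rejected, and the XML version header is optional.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the standard-library parser accepts the text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


pi = '<?mso-application progid="Word.Document"?>'
# Hypothetical namespace URI, used only to bind the prefix for this check.
ns = 'xmlns:w="urn:example:word"'

undeclared = pi + "<w:wordDocument><w:body/></w:wordDocument>"
declared = pi + "<w:wordDocument " + ns + "><w:body/></w:wordDocument>"
with_xml_decl = '<?xml version="1.0"?>' + declared

print(parses(undeclared))     # prefix w is never bound
print(parses(declared))       # prefix bound, no XML version header
print(parses(with_xml_decl))  # prefix bound, with XML version header
```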

  129. Re:Word to HTML to XML to HTML by oaksey · · Score: 1

    I find that saving as RTF in Word, then opening it in AceHTML and saving as HTML, results in pretty clean HTML.

  130. Better idea to standardize file formats by failedlogic · · Score: 1

    Amongst the Open Source word processor projects, I think KWord and AbiWord should standardize on one file format (OO already has an XML format) or maybe they should all share the same format.

    Over the years of writing essays for University, I've written documents in Word Perfect, Word and Open Office. While I first compose them in plain text and save the final draft in plain text, there's nothing worse than trying to open a document in a different word processor and having - all - the formatting thrown off.

    Since KWord, AbiWord and OO are all open source, it would be nice to have standard file formats. Makes sharing much easier. If not for me (blatant selfishness ... and there's nothing better ;) ) then do it for everyone else. Corporate environments dread not being able to read clients' files which are e-mailed to them.

  131. A year is not enough. by pmz · · Score: 1


    Ask again in three.

  132. Au Contraire by andy_geek · · Score: 1

    Micro$oft didn't "drop the ball." It popped the ball.

    Face it, Ballmer and Gates never had any intention of actually going to an open spec with their office documents: they have too much to lose. They simply wanted to gain some goodwill from the tech community, so they feigned an interest in that direction.

    Sorry, but without Redmond on board and willing to hand over the keys to the store, this idea's another interesting-but-fruitless venture, doomed to fail (or, in this case, doomed to be adopted by such a small percentile of users as to render it useless).

    --
    "Don't matter how New Age you get, old age is gonna kick your ass." - Utah Phillips
  133. Development Environment by MarkKnopfler · · Score: 1

    Well, I for one would really really welcome such a thing. One of the biggest problems I have faced during the development cycle is the PHB insisting that all documents should be in M$ Word and that LaTeX is a strict no-no (for whatever braindead reason). I see a very welcome atmosphere where I write a bunch of Perl scripts to actually generate templates for my requirements, func specs, and design documents, and actually derive one document from another.
    I could also deploy a bunch of tools to actually derive requirement tracing matrices and other metrics for a particular project directly from my documents, and also maintain the documents in CVS. And the PHB gets everything in M$ Word. It's a perfect world!

  134. XML is Proprietary Anyway, So Why Use It? by CyNRG · · Score: 1

    XML allows you to create your own tags in your own format. Just because it's ascii doesn't mean anything. If you don't know what each element is meant to do, then who cares?

    It makes storing it in a database easier, but so what. XML structure is standard, but not the definition extensions.

    Just something new for programmers to do. ;->

  135. Doing it Without StarOffice and XML (sort of) by Anonymous Coward · · Score: 0

    We had a requirement for some of our applications to do "all sorts of wonderful new things" with MSWord documents using a web application.

    We basically had MSWord documents that needed to be uploaded to and downloaded from a web application. In the document itself, we needed some areas to be editable by the user and other areas rendered by the application, like a report. We also needed to extract parts of the document to populate a database. Obviously we could not use a document in MSWord format.

    We also could not use StarOffice because this is an application that will be used across the enterprise. That is potentially tens of thousands of people and MSWord is what is used across the enterprise, like it or not.

    As a result we opted to use RTF documents, which are read and writeable by MSWord. It seemed like a logical choice.

    What we did is mark areas of the document using XML-like tags, using Word's hidden text feature. As long as they typed between the tags, we could extract the text. We also marked other areas with similar tags to be rendered by the web application.

    There are actually two applications that do this: one uses Java and JSP, the other uses ColdFusion. Because RTF is text, we were able to render portions of it like we would HTML, using JSP or CFML. To process the tags we would simply use pattern matching with the regular expression capabilities of either language.

    Does it work? Well, mostly it does. It can be problematic because the RTF that MSWord generates is about as ugly as the HTML it generates. Also, if a user accidentally removes one of the tags, it breaks. However, for the most part it works, I guess.
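
    For anyone curious what the pattern matching amounts to, here is a small sketch in Python rather than Java or ColdFusion. The field tag syntax and the extract_fields helper are invented for illustration (our real markers looked different), and real RTF would also carry control words between the tags that need stripping first.

```python
import re

# Hypothetical marker syntax; the real tag names are not given in this post.
FIELD_PATTERN = re.compile(r'<field name="(\w+)">(.*?)</field>', re.DOTALL)


def extract_fields(document_text):
    """Return {name: text} for every complete marker pair."""
    return {
        name: body.strip()
        for name, body in FIELD_PATTERN.findall(document_text)
    }


sample = (
    '<field name="summary">Quarterly numbers are up.</field> boilerplate '
    '<field name="owner">J. Smith'  # closing tag deleted by the user
)
fields = extract_fields(sample)
print(fields)
```

    The second field has lost its closing tag, so it silently drops out of the result, which is exactly the failure mode described above.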

  136. OT: Electoral college by Prior+Restraint · · Score: 1

    (ObOffTopic: Then again, I thought there'd be enough outrage about the 2000 election that we'd see a push for the abolition of the electoral college, but it's just degenerated into whining and sniping..)

    The only real problem with the electoral college is that most states implement it poorly. Nowhere does it say that the votes have to be "winner take all". There's no constitutional reason why California's 54(?) votes all have to go to one candidate. A more sensible scheme would be that each candidate gets one electoral vote for each district where they win, with the last two going to (say) the candidate who takes the most districts.

    Of course, Republicans and Democrats have no incentive to do that. Under the current scheme, most of the states are assumed to go one way or the other, and they only have to campaign in a few "swing" states. It's a lot more efficient for them, and it makes it all the harder for third-party candidates to get taken seriously ("S/he never got so much as a single electoral vote!").

  137. And agreed.. by msimm · · Score: 1

    I pretty much would have to agree with all your points. I'm not saying the OS stigma is well reasoned (although it is still a little ways from being child's play to find reliable support), and I do believe that open source is finally beginning (and will continue) to find acceptability in the mainstream business place (thanks in part to the large corporations lending credibility and in part to the OS community starting to 'get it').

    With functioning DRM on the horizon and Microsoft's determination to stamp out piracy, it's going to continue to get a whole lot more interesting in the open source community.

    --
    Quack, quack.
  138. Use case of generating OpenOffice.org files by dolmen.fr · · Score: 1

    I've searched for a while how to generate documents from a template and data in a database to publish on paper the schedule of my hikers association. A requirement was to be able to reformat the document with a word processor to tweak the page setup after the extraction (so PDF was not a solution).

    Finally I developed a solution generating OpenOffice.org Writer documents using Java and Velocity (jakarta.apache.org) based templates for the content.xml/styles.xml of the OOo document. The Java code exposes an object model (built from Java classes that handle database extraction) to the Velocity templates.
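
    To give a flavour of the template approach without the Java and Velocity setup, here is a rough Python sketch. Python's string.Template stands in for Velocity, the element names are a simplified fragment rather than a complete content.xml, and the hike data and render_schedule helper are made up.

```python
from string import Template
from xml.sax.saxutils import escape

# Simplified, illustrative fragments; a real content.xml needs namespace
# declarations and more structure than this.
ROW_TEMPLATE = Template("<text:p>$date: $title</text:p>")
DOCUMENT_TEMPLATE = Template("<office:body>$rows</office:body>")


def render_schedule(hikes):
    """Fill the templates from a list of {'date': ..., 'title': ...} rows."""
    rows = "".join(
        ROW_TEMPLATE.substitute(
            date=escape(hike["date"]), title=escape(hike["title"])
        )
        for hike in hikes
    )
    return DOCUMENT_TEMPLATE.substitute(rows=rows)


# Hypothetical data standing in for the database extraction.
hikes = [
    {"date": "2003-10-05", "title": "Ridge walk & picnic"},
    {"date": "2003-10-19", "title": "Lake loop"},
]
fragment = render_schedule(hikes)
print(fragment)
```

    Escaping the data before substitution matters: an unescaped ampersand in a hike title would make the generated XML unreadable by the word processor.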

    For more information you can contact me: dolmen bigfoot com (email).

  139. MS Roadmap for XML and Office by Anonymous Coward · · Score: 0

    The MS Roadmap for XML and Office is here: http://www.microsoft.com/office/using/column21.asp

  140. TeX anyone? by dh003i · · Score: 1

    Every time I hear about XML as some kind of standard, all I can think about is TeX.

    Not only is TeX vastly simpler than XML, and superior to anything else out there in terms of the quality of output it produces, but it is also very compressible.

    Furthermore, TeX is clearly the best piece of software in existence. It is essentially bug-free. Despite the author of TeX offering a reward for any bug found in TeX, no bug has been found in it for a very very very long time.

    Finally, TeX has a superb record of backwards compatibility, and always will. Something written in TeX today will produce the same output 100 years from now, because the TeX engine has been frozen.

  141. Re:XML... User Interfaces by rtb61 · · Score: 1

    To allow a computer to provide a user interface with an interpretive level of interaction, i.e. what it "thinks" you want it to do, requires a considerable amount of power plus an extended period of interaction with the user. Catch 22 - use the computer a lot so that it can "learn" how you want to use it, so that you can use the computer. Not to mention most end users I know don't react well when the computer fails to do what they wanted it to do, regardless of how illogical the instructions they gave it in the first place. How many times have you had to do more than just say you were sorry to your wife for failing to understand what she expected you to do, regardless of what she told you to do? Computers don't need better interfaces, just a credit card so when they muck things up they can buy you dinner, regardless of whose fault it was.

    --
    Chaos - everything, everywhere, everywhen
  142. Pay by jefu · · Score: 1
    Professors in the US don't (in most colleges and universities at least) get paid a whole lot. There are exceptions - mostly those who do the right kind of research and sell the results to industry or who gather honoraria for lots of speechifying.

    You don't get paid well for teaching. Or for learning or doing. And industry looks on time spent teaching as wasted time even though I (personally) could easily code circles around many industry geeks.