Slashdot Mirror


Docvert 3.0 Lessens Reliance On Microsoft Office

An anonymous reader writes "After 10 months of development, Docvert 3.0 was released today. This open source web service converts DOC files to OASIS OpenDocument 1.0, and then to HTML, RSS, or any XML format. Try the ODF demo or download the source and install it on your own box. Version 3.0 comes with an MS Word plugin, FTP/WebDAV upload, and an in-browser document editor."

22 of 108 comments

  1. It promises to be an interesting battle by 0racle · · Score: 3, Insightful

    Ya, I'm on the edge of my seat. It will get adopted as a standard or it won't. Office will use it either way and anyone wanting to interoperate with Office will have to try to implement it as well.

    --
    "I use a Mac because I'm just better than you are."
    1. Re:It promises to be an interesting battle by Zaiff+Urgulbunger · · Score: 5, Insightful

      All true, but if it does get adopted as a standard, then MS can use this to ensure the continued use of MS Office by government agencies around the globe. If it doesn't get adopted, MS will be under pressure to provide a supported, native ODF format.

    2. Re:It promises to be an interesting battle by truthsearch · · Score: 4, Insightful

      In my opinion there are two reasons Microsoft is trying to create their own standard: PR and government contracts. The PR aspect is obvious. The US government is Microsoft's largest customer (by far) and also the most likely to demand open document standards. Other governments will likely do the same long before corporations demand it. So Microsoft needs to have their own standard which they implement first in order to get the contracts.

      They don't have to implement it correctly. They can claim support for a standard for years without actually following it (e.g. CSS, Kerberos, etc.) and still get the contracts. They were actually involved in creating some CSS standards and still didn't follow them.

      It's all about the money. Get the big contracts and nothing else matters.

    3. Re:It promises to be an interesting battle by mrchaotica · · Score: 3, Informative
      Can anybody implement for free?

      No, because bits of it are patented (especially the "legacy compatibility" parts that basically just say "emulate old versions of Office").

      Can MS get fined for saying they support the standard when in fact their software actually does not (a la Java, CSS, HTML, Kerberos, and others)?

      In this case it won't matter, because the OOXML "standard" is effectively defined as "whatever MS Office does." In other words, MS basically documented Office's behavior down to the smallest detail, and submitted it to ECMA and now ISO.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    4. Re:It promises to be an interesting battle by FireFury03 · · Score: 3, Informative

      In other words, MS basically documented Office's behavior down to the smallest detail

      They didn't even do that. A lot of the document states that when you encounter certain tags you will emulate an Office bug, but never specifies the details of that bug because that is "beyond the scope of the document". So even if you have the standards document, you can't fully implement the standard without getting all the old versions of Office and reverse engineering their behavior.

  2. I originally read OOXML ... by alispguru · · Score: 5, Funny

    as:

    Object
    Oriented
    X
    M
    L

    and whimpered at the thought...

    --

    To a Lisp hacker, XML is S-expressions in drag.
    1. Re:I originally read OOXML ... by Knuckles · · Score: 3, Insightful

      That is exactly what I thought at first as well!

      And I wonder how you could. Even just reading the /. blurb makes it clear that the "standard" as proposed is non-implementable.

      --
      "When I first heard Daydream Nation it quite frankly scared the living shit out of me." -- Matthew Stearns
  3. Please recommend compliance validation tools by statusbar · · Score: 3, Interesting

    One of the things that bugs me is these 'enormous specifications' that are inconsistent. What we need is not just a document, but the tools necessary to verify a generated file. Not just for valid XML, but for all the little microsofty-bits hidden inside.
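    To make the "not just for valid XML" point concrete, here is a minimal sketch of only the first, shallowest layer of such a tool. It relies on the fact that both ODF and OOXML files are ZIP packages of XML parts; the function names check_package and make_package are made up for illustration, and nothing here checks schemas or application behavior, which is the hard part being asked for.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def check_package(data: bytes) -> dict:
    """Map each XML part that fails to parse to its parser error message."""
    problems = {}
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        for name in package.namelist():
            if not name.endswith((".xml", ".rels")):
                continue  # skip images and other binary parts
            try:
                ET.fromstring(package.read(name))
            except ET.ParseError as exc:
                problems[name] = str(exc)
    return problems


def make_package(parts: dict) -> bytes:
    """Build an in-memory ZIP package from a name -> text mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in parts.items():
            package.writestr(name, text)
    return buffer.getvalue()


if __name__ == "__main__":
    broken = make_package({"content.xml": "<doc><p>hi</doc>"})
    for part, error in check_package(broken).items():
        print(part, "->", error)
```

    An empty result only means every XML part is well-formed. Schema validation and the undocumented rendering behavior would each need their own checks on top of this.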

    --jeffk++

    --
    ipv6 is my vpn
    1. Re:Please recommend compliance validation tools by Zaiff+Urgulbunger · · Score: 4, Insightful

      Microsoft isn't doing this for you silly! The whole intent is likely that it is *hard* for anyone to implement.

  4. Describing exceptions doesn't make a standard. by splutty · · Score: 4, Insightful

    Despite what Microsoft thinks and how they've been acting in the past with all their 'standards', describing all the exceptions doesn't make something a standard. Describing them in the context of a non-standardized environment makes it even less so.

    Although I'm quite sure that Microsoft really doesn't give a and will push this through as 'their' standard that everyone else will have to adhere to to be able to do anything with Mickyshaft generated content anyway.

    Whether ISO approves of this or not is inconsequential, the only thing that matters is that M$ can now say: Look, we proposed a standard, it's not our fault 'they' think it's not good enough.

    --
    Coz eternity my friend, is a long *ing time.
    1. Re:Describing exceptions doesn't make a standard. by Anonymous Coward · · Score: 4, Insightful

      Whether ISO approves of this or not is inconsequential, the only thing that matters is that M$ can now say: Look, we proposed a standard, it's not our fault 'they' think it's not good enough.
      It matters to governments, who are coming under increasing pressure to rationalize their MS Office upgrade cycle (and why they're not getting out, via standards)

      But yeah it doesn't matter much to the private sector / industry.
  5. Deja Vu Docvert by ei4anb · · Score: 5, Interesting
    Way back before the web I worked in a Unix shop that was a development lab for a big multinational. Head office kept sending us e-mail with large MS Word attachments. We got tired of having to go down to the library, where we kept the only PeeCee in the department, just to see what was in the attachment.

    I solved the issue by writing a program that ran on a Windows PC (an old one that had been discarded and was gathering dust in the closet) that received SMTP mail, detached the Word attachment, started up Microsoft's Word Viewer to read the attachment, then "printed" it to a file in PDF format and finally SMTP mailed it back to the sender.

    From then on all we had to do was forward the email to the robot and wait for a readable version to bounce back. As I used Microsoft's own Word Viewer there were no problems whenever a new version of Word came out, I just downloaded the latest viewer :-)
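
    For anyone wanting to rebuild a robot like this today, the mail-handling ends can be sketched with Python's standard email package. This is not the original program; the function names extract_word_attachments and build_reply are invented for illustration, and the conversion step in the middle (driving a viewer to "print" to PDF) is deliberately left out.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_word_attachments(raw: bytes) -> list:
    """Return (filename, bytes) pairs for every .doc attachment in a raw message."""
    message = message_from_bytes(raw, policy=policy.default)
    found = []
    for part in message.iter_attachments():
        name = part.get_filename() or ""
        if name.lower().endswith(".doc"):
            found.append((name, part.get_payload(decode=True)))
    return found


def build_reply(sender: str, robot_address: str, name: str, pdf: bytes) -> EmailMessage:
    """Build the message that carries the converted file back to the sender."""
    reply = EmailMessage()
    reply["To"] = sender
    reply["From"] = robot_address
    reply["Subject"] = "Converted: " + name
    reply.set_content("Converted document attached.")
    reply.add_attachment(pdf, maintype="application", subtype="pdf",
                         filename=name + ".pdf")
    return reply
```

    Receiving and sending the messages over SMTP, and producing the PDF bytes passed to build_reply, would sit around these two functions.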

  6. And the sad thing is... by Durkheim · · Score: 5, Insightful

    ...Some people think it's fine that way. A friend of mine, quite pro-MS, told me that all those little strange things in the specification were normal to have for backwards compatibility, and that reading the specification was a waste of time. Instead, he directed me towards a preview of MS Office 2007. Because for him, as for many more, what's important is the final product, the cuteness of the buttons, the way it works and displays its own format. Why bother using a free program that displays Word documents badly, when Office is already perfect, huh? I feel so misunderstood sometimes. What makes me sad is that they don't see the use of a clear straight-to-the-point format. Maybe only geeks can be horrified by this one.

    1. Re:And the sad thing is... by westlake · · Score: 3, Insightful
      I feel so misunderstood sometimes. What makes me sad is that they don't see the use of a clear straight-to-the-point format. Maybe only geeks can be horrified by this one.

      The user cares only for the document he sees in print or on screen. The internal structure of the file interests him not at all.

  7. Divy it up? by plopez · · Score: 3, Insightful

    I wonder if you could get 60 people to review 100 pages each (or divide up chapters or sections in some logical manner). That may be feasible in 1 month. At least the glaring problems would be flagged. I have no idea how to organize this however.....

    --
    putting the 'B' in LGBTQ+
    1. Re:Divy it up? by Dragged+Down+by+the · · Score: 3, Funny

      Why, you could use Office 2007's Auto-Collaborate-Review feature, of course!

  8. Re:Open XML is a transliteration by TheRaven64 · · Score: 5, Insightful
    The design requirement of Microsoft's XML format was (obviously) that it be possible to convert existing Word documents to it without any loss. In order to do this, there must be a one-to-one mapping between the .DOC semantics and the OpenXML semantics.

    The second design requirement was that the spec be developed and released quickly, before ODF had time to gain much traction. Between these two objectives, it's hardly surprising that it ended up the way it did...

    --
    I am TheRaven on Soylent News
  9. Open Source community debugs MS code by RichMan · · Score: 3, Funny

    So it looks like the Open Source community is now debugging Microsoft's document format. I am sure Microsoft does not itself know what is going on in here half the time and much of this document was generated by code scrapers looking for structures and interfaces.

    Congrats to the world community but they should really submit a bill to Microsoft.

  10. Yeah, that's a Microsoft product alright by Master+of+Transhuman · · Score: 3, Funny

    "additional Microsoft technology that must be emulated (but is not covered by the Microsoft patent pledge); elements that can't be implemented without Microsoft technical assistance; dependencies on Windows itself; mandatory bugs; and more. And then there's also the fact that OOXML heavily overlaps ODF -- a platform-independent, already-adopted ISO/IEC."

    Pretty much like everything they do.

    Wait - where are the virus APIs? Did they leave those out?

    Naah...

    Gotta be there somewhere. Keep looking.

    --
    Richard Steven Hack - This sig is TOO GODDAMN SHORT TO DO ANYTHING USEFUL WITH! MORONS!
  11. Re:What, like... (Oops, forgot, no xml tags.) by Mythrix · · Score: 5, Funny


    <microsoft_word_document>
    (Content of .doc file)
    </microsoft_word_document>

  12. More info @ groklaw by mario64 · · Score: 5, Informative

    Check out the Groklaw article "Searching for Openness in Microsoft's OOXML and Finding Contradictions" for further comments. The article also has links to a couple of wiki pages with further comments.

  13. Re:Open XML is a transliteration by 99BottlesOfBeerInMyF · · Score: 4, Insightful

    That's the reason for all the "render like WordPerfect 5.x" options that people have complained about, because they have to allow people to convert to the XML format and then convert back without reducing the document to an unreadable mess.

    There is no reason I know of why the XML format cannot support all the features of Word and round trip without relying on nasty hacks like this; it just takes more work. The problem with "Open"XML that I've seen is that they concentrate entirely on supporting only the features of .doc files and their interactions with other programs, to the exclusion of anything else. Rather than "render like WP 5.x" you need to define how WP 5.x renders that feature, then incorporate it into your conversion script in a way that makes sense in general for documents.

    The whole format is built upon the assumption that only MS and Word will be using it and it is not designed to abstract word processing documents in general, but to kowtow to the eccentricities of Word.

    The alternative is to not support roundtripping and then wait for slashdot headlines like "Users find that the new Office XML format mangles their documents".

    No, the alternative is to do it right and build hacks like the ones you mention into the import and export routines, rather than embedding them, without any definition, into the format.