Slashdot Mirror


Is the New Microsoft Office Really Open?

joesklein asks: "From CNET, there is an article about the new Microsoft Office 11. In summary 'Microsoft says it's opening its Office desktop software by adding support for XML--a move that should help companies free up access to shared information. But there's a catch: It has yet to disclose the underlying XML dialect.' Could this be grounds for another anti-trust suit against Microsoft?"

241 of 485 comments

  1. sure it is! by Anonymous Coward · · Score: 5, Funny

    it supports .DOC, the de facto standard for documents. What's this XML you're talking about?

  2. That's still to be seen... by Eric+Damron · · Score: 2, Interesting

    "In summary 'Microsoft says it's opening its Office desktop software by adding support for XML--a move that should help companies free up access to shared information."

    Are we talking about true standard XML, or is Microsoft going to "embrace and extend" it?

    --
    The race isn't always to the swift... but that's the way to bet!
    1. Re:That's still to be seen... by C.+Mattix · · Score: 2, Insightful

      Aren't you supposed to "extend" it....
      eXtensible Markup Language...

      Just my $.02

    2. Re:That's still to be seen... by JebusIsLord · · Score: 3, Informative

      No because the dtd and/or namespace will have to be referenced in plain text in the xml document. so, even if they use absurdly complex element names, they have to use a valid dtd or namespace uri which can be easily referenced, or it just ain't xml at all. Also you aren't allowed to put binary data in an xml document, but even if they did reference their dtd by memory address for instance, it's an easy task to just read that address. In conclusion they would have to break xml pretty hard-core in order to make their doc types proprietary. Besides, then what would be the point of going xml in the first place?

      --
      Jeremy
    3. Re:That's still to be seen... by ftobin · · Score: 4, Insightful

      Besides, then what would be the point of going xml in the first place?

      The same point that most technical decisions are based on. Buzzword compliance.

    4. Re:That's still to be seen... by Anonymous Coward · · Score: 2, Interesting

      MIME-encoded binary data, on the other hand, is perfectly happy in a ForeignData XML tag...and MS already ships a product that does exactly that.

    5. Re:That's still to be seen... by EnVisiCrypt · · Score: 3, Informative

      The hell you can't put binary data in an XML document. As long as it's base64 encoded you can put anything in there.

      --


      *everything* is Orwellian to cats.
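
A minimal Python sketch of the base64 point, using only the standard library. The element names `document` and `blob` are hypothetical and do not reflect any real Office schema. The result is perfectly parseable XML whose payload is still opaque unless you know what the bytes mean.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary binary payload, including bytes that could never appear raw in XML text.
payload = bytes([0x00, 0x01, 0xFF, 0xFE]) + b"not text"

# Hypothetical element names, chosen only for illustration.
root = ET.Element("document")
blob = ET.SubElement(root, "blob", encoding="base64")
blob.text = base64.b64encode(payload).decode("ascii")

# Serialize to a string, as an application saving a file would.
xml_text = ET.tostring(root, encoding="unicode")

# A consumer parses the XML as usual and decodes the element text.
parsed = ET.fromstring(xml_text)
recovered = base64.b64decode(parsed.find("blob").text)
print(recovered == payload)
```

The parser is happy either way; whether anyone else can interpret `recovered` is a separate question.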
    6. Re:That's still to be seen... by Eryq · · Score: 3, Insightful

      First, you don't have to reference a DTD to produce valid XML. SAX/DOM parsers will work just fine on a document without a DTD.

      Second, you can have "binary" data in an XML document. Just base64 encode it.

      Third: the point of going to XML if you're just going to produce a mess? Simple. You get to claim openness. Most PHBs probably don't know the difference between truly structured, stable, "open" XML, and syntactically-correct but semantically-useless XML.

      --
      I'm a bloodsucking fiend! Look at my outfit!
    7. Re:That's still to be seen... by mccalli · · Score: 2
      First, you don't have to reference a DTD to produce valid XML. SAX/DOM parsers will work just fine on a document without a DTD.

      You certainly do have to reference either a DTD or a schema. I'm aware that most parser implementations will operate on documents without them, but that doesn't make the original documents valid.

      Cheers,
      Ian

    8. Re:That's still to be seen... by 9jack9 · · Score: 5, Insightful
      But they can make it so massively complex that it is very difficult to implement interoperability with foreign tools, but that it is somehow much easier to implement with MS-centric tools.

      The registry in Windows NT/2000/XP is sort of like that. It makes a lot more sense from a Microsoft-centric viewpoint than it does from a non-Microsoft-centric viewpoint. Now that it's been around so long, there are lots of ways to get at registry data (for instance, using Perl modules). But when the registry was new, the only way to do it was through the Microsoft API, and until many people had gone through the pain of encapsulating the MS API, the pain of accessing the registry from a non-MS-centric toolset was high.

      So maybe the XML format will be like that. If you're Linux-centric, for instance, the threshold of pain for accessing Word XML docs will be fairly high, but if you're Microsoft-centric, with all of their tools, code-snippets, documents, etc., then it won't be nearly as painful.

      This way MS gets to claim interoperability, make Word data easily accessible to MS-centric solutions, but put a damper on non-MS-centric solutions.

    9. Re:That's still to be seen... by Fnkmaster · · Score: 2

      You do know that the namespace URI is just that - it's a unique identifier for a namespace, NOT a URL that you can dereference to find anything. The topic of schema URIs (i.e. targetNamespaces) has been debated a million times and it's 100% possible to have a document that validates against a schema or DTD that is identified with a URI at which the DTD or schema CANNOT be downloaded, and which can be stored locally on a machine for validation purposes (the parser uses an internal map to correlate the URI with the schema/DTD).
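
A short Python sketch of that point, assuming the standard-library ElementTree parser. The namespace URI below is hypothetical and sits on the reserved `.invalid` domain, so nothing could be downloaded from it even if a parser tried.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; it is only ever compared as a string.
NS = "http://schemas.example.invalid/wordprocessing/v1"

doc = f'<w:doc xmlns:w="{NS}"><w:p>Hello</w:p></w:doc>'

# Parsing succeeds without any network access: the URI is just an identifier.
root = ET.fromstring(doc)
print(root.tag)

# Elements are addressed by "{namespace-uri}localname".
para = root.find(f"{{{NS}}}p")
print(para.text)
```

The URI disambiguates element names and nothing more; it tells you nothing about what the elements mean.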

    10. Re:That's still to be seen... by MrResistor · · Score: 5, Insightful

      No because the dtd and/or namespace will have to be referenced in plain text in the xml document. so, even if they use absurdly complex element names, they have to use a valid dtd or namespace uri which can be easily referenced

      I think an analogy to Frontpage is appropriate here. Sure, it produces HTML, but the result just doesn't look right unless it's viewed in IE. Maybe the dtd is referenced, but encrypted or otherwise proprietary. Maybe MSXMLVIEWER (whatever it may be called) doesn't need the reference to be in plain text.

      There are any number of things MS could do to ensure that the document just doesn't look right in other viewers. Since formatting is the whole point of XML, people will use MSXMLVIEWER and whatever it reads will be the de facto XML standard, just like whatever IE renders is the de facto HTML standard.

      or it just ain't xml at all.

      While technically correct, the point is sadly irrelevant. As long as MS is effectively a monopoly XML will be whatever they say it is, for the majority of people.

      Also you aren't allowed to put binary data in an xml document

      Not true. It's recommended that you don't put binary in an XML document, but nothing prevents you from doing so. This is exactly what will give MS the ability to hijack the standard.

      In conclusion they would have to break xml pretty hard-core in order to make their doc types proprietary.

      Only in spirit, I'm afraid, but that will likely be enough.

      Besides, then what would be the point of going xml in the first place?

      To make documents searchable. This is an ability which is extremely valuable to anyone who has a large amount of information they need to access. The upshot is that the actual content will likely be plain text, though important markups may not be. Sadly, format is more important than content for a lot of people.

      Of course, most people won't use the XML format at all, since it won't be the default.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    11. Re:That's still to be seen... by 4of12 · · Score: 2

      Heh, I've been thinking the same thing all along...

      <displayhintobject>
      982a2eba7a88a04d7b1132042d3f649b5fcd
      f8136ebcd3d700008f6fe2698df90feecfbe387c1551
      </displayhintobject>
      --
      "Provided by the management for your protection."
    12. Re:That's still to be seen... by JebusIsLord · · Score: 2

      Actually he's right, you can have valid xml without a dtd or schema, it's just completely open and impossible to validate, which means I find it most unlikely MS would go that way because usually dtd-less documents are extremely simple.

      --
      Jeremy
    13. Re:That's still to be seen... by JebusIsLord · · Score: 2

      Sorry, I mistyped. You can have binary DATA but you can't for instance have binary-encrypted elements (tags, attributes etc.). So the document must remain parsable by a text viewer. They can't for instance put xmlns:0x16f53ea4 or something when referencing the namespace.

      --
      Jeremy
    14. Re:That's still to be seen... by mccalli · · Score: 2
      Actually he's right, you can have valid xml without a dtd or schema

      No - you can have well-formed XML. You can't have valid XML.

      Cheers,
      Ian
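
The well-formed versus valid distinction can be sketched in Python. This assumes the standard-library ElementTree parser, which checks well-formedness only and never validates against a DTD or schema; checking validity would need a validating parser (lxml is one such option). The sample documents are made up.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """True if the text parses as XML; says nothing about validity."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# No DTD or schema anywhere, yet a parser accepts it: well-formed, not valid.
no_dtd = "<memo><to>Ian</to><body>No DTD here.</body></memo>"

# Overlapping tags break well-formedness, so this is not XML at all.
bad_nesting = "<memo><to>Ian</memo></to>"

print(is_well_formed(no_dtd))
print(is_well_formed(bad_nesting))
```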

    15. Re:That's still to be seen... by CondeZer0 · · Score: 5, Insightful

      How does this misinformed crap get moderated up?

      As some others have pointed out:

      1) You don't need a DTD or Schema to have XML
      2) The url used in a namespace declaration doesn't need to correspond to a real document
      3) Even in case the document used a DTD or Schema, that DTD or Schema were available, and the document actually validated against it, you still don't know what the hell the tags mean; the DTD or Schema are just syntactical (and grammatical?) rules, and don't tell you how to interpret the tags or attributes.
      4) You can always include binary data in an XML document(ie., base64 encoded)
      5) The point of using XML is Buzzword compliance and *perceived* openness

      There are more reasons why XML not necessarily = openness. But these are more than enough.

      XML means nothing; it's just a way to define languages. It's like a charset: just because I have a document that is ASCII doesn't mean that I understand what is written in it if I don't know the meaning of the words that are in it (e.g., just because you know the name of each letter doesn't mean that you know the meaning of "lkasdertunxsjd", right?)

      Even if a language is in XML, you still need to *document it* to be able to *understand* it.

      Sorry if I was a bit rough, but I'm sick of people that assume that because something is in XML it's automatically open. That is one of the biggest myths the XML buzz-wagon is based on, and it is spread by people that don't really understand what XML is.

      Please, before you post to /. make sure you know what you are talking about.

      Best wishes

      \\Uriel

      --
      "When in doubt, use brute force." Ken Thompson
    16. Re:That's still to be seen... by jonadab · · Score: 2

      > I think an analogy to Frontpage is appropriate here. Sure,
      > it produces HTML

      No, it doesn't. It produces something that looks vaguely similar
      to HTML, perhaps, but HTML it is not. You look at a FrontPage
      document's source closely, and you see a mishmash of deprecated
      HTML3 markup, newer markup that didn't exist in HTML3 but was
      introduced later, plus the occasional attribute that never
      existed in _any_ version of HTML, thrown in for good measure.

      It is only because of the long-standing practice of browsers since
      Mosaic (possibly before) to ignore any tag or attribute they don't
      understand that a FrontPage document will display at all in any
      browser. (This is fun to try sometime: make up a tag, completely
      out of thin air, and use it in a webpage, and see how various
      browsers handle the page.)

      <voice id="Linus" rate="slow">I pronounce Linux as Linux</voice>
      Any browser will display the quote as if the voice tags weren't
      there at all -- does that make it HTML?
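
The "ignore what you don't understand" behaviour can be imitated with a short Python sketch, assuming the standard-library `html.parser` module. The `TextOnly` class and `visible_text` helper are hypothetical names; the parser keeps character data and silently skips every tag, known or made up.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and ignores all tags, real or invented."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup: str) -> str:
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


sample = '<p><voice id="Linus" rate="slow">I pronounce Linux as Linux</voice></p>'
print(visible_text(sample))
```

The invented `voice` tag causes no error, which is exactly why tolerant rendering says nothing about whether the input was HTML.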

      --
      Cut that out, or I will ship you to Norilsk in a box.
    17. Re:That's still to be seen... by NoMoreNicksLeft · · Score: 2

      Actually, it would be to break XML, as they have tried to do with other "standards".

    18. Re:That's still to be seen... by jonadab · · Score: 2

      I've worked up an even better demonstration

      --
      Cut that out, or I will ship you to Norilsk in a box.
    19. Re:That's still to be seen... by jonadab · · Score: 2

      > You don't need a DTD or Schema to have XML

      You can have well-formed XML without them, but there must be a
      DTD or Schema in order to have _valid_ XML.

      > The url used in a namespace declaration doesn't need to
      > correspond to a real document

      Or, more to the point, the document at that URL can be an inside
      joke from the movie Ghostbusters, rather than having any actual
      declarations. (Those of you who think I am kidding on this point
      have never tried to access the document that the XUL namespace
      declaration points to.) This, however, is not really important.

      > Even in case the document used a DTD or Schema
      To be valid it has to... anyway, even if it doesn't, there
      is one implicit.

      > that DTD or Schema were available
      The availability of the DTD or Schema[1] is really not important.
      It would be easy enough to write a program that analyses documents
      that are known to be valid and keeps track of which tags contain
      data, and which ones contain PCDATA, and which other tags they
      have nested in them. Analyse enough documents, and you have a
      subset of the original DTD that's good enough for creating
      documents that are guaranteed to be compatible and can use all
      the features used by the documents you analysed.
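
A rough Python sketch of such an analyser, assuming the standard-library ElementTree parser and made-up sample documents. The `infer_structure` function is hypothetical; it records only which child tags and direct text were observed under each tag, ignoring attributes, ordering, and text that follows a child element.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def infer_structure(documents):
    """Return {tag: (sorted child tags, has_text)} observed across the documents."""
    children = defaultdict(set)
    has_text = defaultdict(bool)
    for text in documents:
        for elem in ET.fromstring(text).iter():
            # Record every child tag seen directly under this tag.
            children[elem.tag].update(child.tag for child in elem)
            # Record whether this tag ever held non-whitespace leading text.
            has_text[elem.tag] |= bool(elem.text and elem.text.strip())
    return {tag: (sorted(kids), has_text[tag]) for tag, kids in children.items()}


samples = [
    "<doc><title>A</title><p>one <b>bold</b></p></doc>",
    "<doc><p>two</p><note/></doc>",
]
model = infer_structure(samples)
for tag, (kids, text) in sorted(model.items()):
    print(tag, kids, text)
```

The more documents are fed in, the closer the observed model gets to the subset of the real grammar that those documents exercise.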

      > you still don't know what the hell the tags mean

      Of all the points you made, this is the important one. XML is
      by its very nature a very flexible standard. It's not like HTML
      where a formal standard specifies that <p> is a paragraph and
      that it is a block-level element with certain amounts of white
      space top and bottom and so on and so forth. The tags and
      attributes and the format can be interpreted in whatever way
      the application sees fit.

      In practice, that means another word-processing app can with
      relative ease use the same format in such a way that tools for
      searching and indexing will work on documents created by both apps,
      and it means that if you open a Word document in whatever other app
      that uses that format you can make minor changes (such as wording
      changes) and save it, and when Word opens it again it won't be
      munged (assuming the other app does things in a sane manner that
      preserves whatever markup it doesn't understand). But it does NOT
      mean that the doc will necessarily look the same in the other app
      as it does in Word.
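
The "minor wording change without munging" case might look like this in Python, assuming the standard-library ElementTree parser. The `fancy` element and its `mode` attribute are hypothetical stand-ins for markup the editing tool does not understand. ElementTree keeps unknown elements and attributes on a round trip, though by default it drops comments and processing instructions.

```python
import xml.etree.ElementTree as ET

# "fancy" and its attribute stand in for markup this tool knows nothing about.
original = '<doc><p style="x"><fancy mode="7">Helo</fancy> world</p></doc>'

root = ET.fromstring(original)

# Fix a typo in the text without interpreting any of the surrounding markup.
for elem in root.iter():
    if elem.text and "Helo" in elem.text:
        elem.text = elem.text.replace("Helo", "Hello")

saved = ET.tostring(root, encoding="unicode")
print(saved)
```

The unknown markup survives the edit, but nothing here knows how `fancy` is supposed to look on screen.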

      [1] And when did "schema" become singular, anyhow?

      --
      Cut that out, or I will ship you to Norilsk in a box.
    20. Re:That's still to be seen... by jonadab · · Score: 2

      > Both are originally ancient Greek, not Latin as one might think.

      But alpha is a plural suffix in Greek, too... neuter nom/acc...

      > The plural of schema is schemata

      Oh, duh, I see it now; it's third declension, and the a isn't a
      suffix at all; the root ends in t, which drops off in the nominative
      singular where there's no ending. Why didn't I see that before?

      I learned something today.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    21. Re:That's still to be seen... by jonadab · · Score: 2

      > I bet you didn't know you can still format XML tags with CSS

      Yes, I did know; that's why in the demo I wrote this:
      > (Presumably, this is so the rendering engine for HTML and XHTML
      > can share a lot of information with the one for general XML.)

      But in theory, if we were being strictly specification-compliant,
      that would only work in XML. The demo is served as text/html and
      does not have an xml version declaration (one of those funny things
      with the question marks beginning and end before the doctype (which
      also isn't there in the demo)). So it ought to be treated as HTML
      (or SGML), not XML. In theory.

      --
      Cut that out, or I will ship you to Norilsk in a box.
    22. Re:That's still to be seen... by MrResistor · · Score: 2

      That is exactly the point of my analogy. Thank you for noticing it.

      I predict that MS will do similar things with XML.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    23. Re:That's still to be seen... by MrResistor · · Score: 2

      You must have a very simple site layout, then.

      --
      Under capitalism man exploits man. Under communism it's the other way around.
  3. LOL by Boss,+Pointy+Haired · · Score: 4, Funny

    Well if the way Microsoft Word saves out as HTML is anything to go by, then concise it most definitely will not be.

    1. Re:LOL by Anonymous Coward · · Score: 5, Funny



      <head>
      <META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
      charset=3Dus-ascii">

      <meta name=3DGenerator content=3D"Microsoft Word 10 (filtered)">

      <style>
      <!-- /* Font Definitions */
      @font-face
      {font-family:Tahoma;
      panose-1:2 11 6 4 3 5 4 4 2 4;} /* Style Definitions */
      p.MsoNormal, li.MsoNormal, div.MsoNormal
      {margin:0in;
      margin-bottom:.0001pt;
      font-size:12.0pt;
      font-family:"Times New Roman";}
      a:link, span.MsoHyperlink
      {color:blue;
      text-decoration:underline;}
      a:visited, span.MsoHyperlinkFollowed
      {color:purple;
      text-decoration:underline;}
      span.emailstyle17
      {font-family:Arial;
      color:windowtext;}
      span.emailstyle18
      {font-family:Arial;
      color:navy;}
      span.EmailStyle19
      {font-family:Arial;
      color:navy;}
      @page Section1
      {size:8.5in 11.0in;
      margin:1.0in 1.25in 1.0in 1.25in;}
      div.Section1
      {page:Section1;}
      -->
      </style>

      </head>

      <body lang=3DEN-US link=3Dblue vlink=3Dpurple>

      <div class=3DSection1>

      <p class=3DMsoNormal><font size=3D2 color=3Dnavy face=3DArial><span =
      style=3D'font-size:
      10.0pt;font-family:Arial;color:navy'>

      I agree.

      </span></font></p>

    2. Re:LOL by commodoresloat · · Score: 5, Interesting
      Or anything close to "standard." The best we can hope for is code that is recognized as valid, and I wouldn't hold my breath for that either. I've seen HTML like the following come out of Word:

      <B><A HREF="http://whatever.org"> Link </B></A>.

      I'm not kidding, either. Seems like an easy thing to avoid in an HTML generator. Validator routinely reports hundreds of coding errors in simple short documents generated by Word. Ugh. What really sucks is when you're working on a web page for someone and cleaning out all the crap that Word generates, then at the last minute they send you the same document with some minor errors corrected.... and all the same major errors generated by Word. Fun.
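
Overlapping tags like that are easy to detect mechanically. Below is a minimal Python sketch, assuming the standard-library `html.parser` module; `NestingChecker` and `find_misnested` are hypothetical names. It is a simplification: void elements written without a slash, such as `<br>`, would be reported as unclosed mismatches, so this is not a real validator.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Tracks open tags on a stack and records end tags that close out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []  # (closing tag seen, tag that was actually innermost)

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            return
        self.errors.append((tag, self.stack[-1] if self.stack else None))
        if tag in self.stack:
            # Drop the most recent matching open tag so checking can continue.
            idx = len(self.stack) - 1 - self.stack[::-1].index(tag)
            del self.stack[idx]


def find_misnested(markup: str):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors


word_output = '<B><A HREF="http://whatever.org"> Link </B></A>.'
print(find_misnested(word_output))
```

The parser lowercases tag names, so the reported pair says that `b` was closed while `a` was still the innermost open element.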

    3. Re:LOL by Wolfier · · Score: 2

      How about

      SGFoYSwgaWYgeW91IHJlYWxseSBhcmUgdHJ5aW5nIHRvIGRlY29kZSB0aGlzLCB5b3UgaGF2ZSB0b28gbXVjaCB0aW1lIG9uIHlvdXIgaGFuZHMh

    4. Re:LOL by Wolfier · · Score: 5, Funny
      <?xml version="1.0" encoding="base-64?>
      <!doctype MS_WORD
      <!ELEMENT WORD_DATA>
      ]>
      <WORD_DATA>SGFoYSwgaWYgeW91IHJlYWxseSBhcmUgdHJ5aW5nIHRvIGRlY29kZSB0aGlzLCB5b3UgaGF2ZSB0b28gbXVjaCB0aW1lIG9uIHlvdXIgaGFuZHMh<WORD_DATA>
      </xml>
    5. Re:LOL by JebusIsLord · · Score: 2

      wow you only used 2 elements and no dtd, and it's STILL not well formed. congratulations.

      <?xml version="1.0"?>
      <!DOCTYPE doc SYSTEM "http://www.microsoft.com/xml">
      <doc xmlns="www.microsoft.com/xml" xml:lang="en">
      <data>
      asdfafs%65356FG653$5#@$%6Asdtkasdt@%@#$%@#$%245
      </data>
      </doc>

      Would be more in line for all you paranoids out there.

      --
      Jeremy
    6. Re:LOL by loconet · · Score: 4, Informative

      I know exactly what you mean. Word spits out complete garbage when it converts .doc => .html . Microsoft attempted to address this issue by releasing an HTML filter plugin that you can install and cleans up the html word spits out. It does clean up the html but it's still kinda messy.

      --
      [alk]
    7. Re:LOL by Mike+Schiraldi · · Score: 3, Informative

      Dude: mmencode -u
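
For anyone without `mmencode` handy, roughly the same decode can be sketched in Python with the standard library. The string is the one posted above in this thread; stray spaces are included deliberately and stripped first, since pasted base64 often picks up whitespace in transit.

```python
import base64

# The string from the thread, with stray spaces included on purpose.
posted = (
    "SGFoYSwgaWYgeW91IHJlYWxseSBhcmUgdHJ5aW5nIHRvIGRl Y2 9kZSB0aGlzLCB5b3UgaGF2"
    "ZSB0b28gbXVjaCB0aW1lIG9uIHlv dXIgaGFuZHMh"
)

# Remove all whitespace, then decode strictly so any other junk raises an error.
compact = "".join(posted.split())
decoded = base64.b64decode(compact, validate=True).decode("ascii")
print(decoded)
```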

    8. Re:LOL by Wolfier · · Score: 3, Funny

      damn, what happened? I was just trying to type random characters

    9. Re:LOL by Kragg · · Score: 2

      Whew, thanks. I nearly panicked when my SAX parser threw an ELEMENT_DECLARATION_MISSING exception as I parsed the parent.

      --
      If you can't see this, click here to enable sigs.
    10. Re:LOL by realdpk · · Score: 2

      yeah me too. phew! i was like, trippin. OMG OMG, my SAX parser!

      XML 4 EVER

    11. Re:LOL by smyle · · Score: 2

      The demoroniser is your friend.

      --

      Sleep is just a poor substitute for caffeine, anyway. -Bob Lehmann

  4. Reverse Engineer by timothy_m_smith · · Score: 2

    At least with XML it will not be very long until many software companies and projects reverse engineer the XML. I suppose they could put some weird binary or encrypted data in the files, but that would defeat the purpose of XML.

    1. Re:Reverse Engineer by Mandi+Walls · · Score: 2
      maybe they'll do the opposite of the .doc format as it is now: encrypt the actual data of the document but let the xml tags hang out in text.

      running "strings" on a .doc xml file would dump just the tags.

      that would be funny.

      --mandi

    2. Re:Reverse Engineer by Phroggy · · Score: 5, Insightful

      I suppose they could put some weird binary or encrypted data in the files, but that would defeat the purpose of XML.

      The purpose of XML is to have buzzword compliance, and this doesn't defeat that.

      (Of course that's not the purpose most other people use XML for, but we're talking about Microsoft.)

      --
      $x='S24;r)>63/* h@<5+oZ)32"5cz';$me='phroggy'x$];
      $x=~y+ -xz+\0-Tx+;print$_^chop$me for split'',$x;
    3. Re:Reverse Engineer by Anonymous Coward · · Score: 3, Interesting

      No, of course MS wouldn't put the data in weird binary or encrypted format in their XML output formats ... like they did with Visio 2002's XML output (http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=OiH2rn9nCHA.1808%40TK2MSFTNGP10&rnum=3&prev=/groups%3Fq%3Dxml%2Bvisio%2Bmime%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26scoring%3Dd) where they put all the really important stuff needed for interoperability in ForeignData elements.

      HINT: if you see MS use the phrase "full fidelity" when they talk about their new Office's XML output then you can be sure they're not giving you the data interoperability/portability you thought XML output was going to give you.

  5. Defaults by Snoe · · Score: 5, Insightful

    RTF has been in office for years and it is an open, portable standard readable on many platforms and with many programs. The problem is that Microsoft chooses to retain their obfuscated binary format as the default save type for documents.

    If the XML files Office produces are not made the default save type, or if the XML merely encapsulates large portions of binary code, it will not matter one lick that Office can save these XML documents, because the majority of people will be stuck on the default, unreadable formats.

    1. Re:Defaults by C.+Mattix · · Score: 3, Insightful

      Exactly. And as the maker of a software product it is their prerogative as to what the default value is. I would hate to have the government telling me what the default values for things should be. If the users don't use an open standard type, yet they are given the opportunity to, then it is no longer the software manufacturer's fault.

    2. Re:Defaults by Planesdragon · · Score: 3, Insightful

      RTF has been in office for years and it is an open, portable standard readable on many platforms and with many programs.

      Obviously you haven't tried it. RTF has gotten more complaints from users than raw Word docs do!

      Replace "RTF" with "HTML" and you've got a winner, though.

      The problem is that Microsoft chooses to retain their obfuscated binary format as the default save type for documents.

      It's not "obfuscated" so much as it's "optimized." The whole idea seems to be for Word to save as quickly as possible--which the doc file is best at for Word for some reason, probably because it's derived from how the program structures documents, and not how some document spec says documents should be handled.

      If the XML files Office produces are not made the default save type, or if the XML merely encapsulates large portions of binary code, it will not matter one lick that Office can save these XML documents, because the majority of people will be stuck on the default, unreadable formats.

      1: It's HIGHLY unlikely that MS's XML implementation will be unnecessary binary code. They have a doc-to-HTML converter already, and the XML converter will probably just be an update of that.

      2: You CAN change the default Office save format to RTF, HTML, old_doc_version, or just about any random 'save as' converter you have! (The only major feature I saw missing was the MHTML format.)

    3. Re:Defaults by EisPick · · Score: 5, Insightful

      It's not "obfuscated" so much as it's "optimized." The whole idea seems to be for Word to save as quickly as possible--which the doc file is best at for Word for some reason, probably because it's derived from how the program structures documents, and not how some document spec says documents should be handled.

      In an era of 2+ GHz computers with 7200+ rpm hard drives, it seems odd that Microsoft would be unable to write an application that can quickly save and open text files that, on average, run well under 50 kilobytes.

    4. Re:Defaults by dubious9 · · Score: 2

      Yes, make a table and some lists in RTF and then open it up in a text editor. RTF is as verbose as it possibly could be.

      Also, Microsoft doesn't say exactly how it interprets it (i.e. whether this tag has to be before that tag, whether you can say just border instead of border-top, bottom, left, right), so I wouldn't exactly call it an open standard. RTF viewers/writers are very hard to implement.

      --
      Why, o why must the sky fall when I've learned to fly?
    5. Re:Defaults by MadAhab · · Score: 5, Insightful
      You are goddamned fucking lucky that the government tells you what the default values for things should be. That's what the government is there for, mostly; to tell you that the default value for a building is to have a fire exit and that it may not be locked. And without standards, there is no interchangeability of parts. And without that, every consumer and customer gets assraped by manipulative vendors. And since you can never tell precisely how this battery differs from that battery, you just have shit exploding battery acid all over the place.

      But if you really think they have no right doing these things, go live in a 3rd world country; they generally have the government telling you less about what to do. Except once in a while when they kill your family. You could be armed of course. You know what a totally armed society with a weak government looks like? Afghanistan.

      That being said, it's hard to see what business the government has engineering document formats. They could, on the other hand, specify disclosure of formats as a remedy in an anti-trust case, but they generally fall into one of two categories which precludes this: stupid or bought.

      --
      Expanding a vast wasteland since 1996.
    6. Re:Defaults by Strange+Ranger · · Score: 2

      Corporations use custom installs all the time to change the default Save Type. A common example was to have everybody's default save type revert to Word 95 (.doc) because only half of the company was up and running on 97.

      So, why don't more companies make RTF or now XML the default save type? They're already doing custom network installs anyway. If a majority of Fortune 500 companies did this it wouldn't matter what Jane & Joe Home User had as their default. They'll be used to what they see at work.

      One might imagine there are many readers here who have some influence over their IT department. Shouldn't be that hard to just say No to default .doc?

      --

      Operator, give me the number for 911!
    7. Re:Defaults by Yi+Ding · · Score: 2, Insightful

      RTF has been in office for years and it is an open, portable standard readable on many platforms and with many programs. The problem is that Microsoft chooses to retain their obfuscated binary format as the default save type for documents.

      Even though RTF is an open standard, many programs which claim compatibility are still not 100% compatible, and can screw up things like embedded images. I suppose Microsoft's implementation of XML will be similar. It will be open, but the more complicated documents would still be displayed differently by non-Microsoft products. It would also force everyone to switch to Microsoft XML, or at least be compatible with it, retaining the dominance of Office.

    8. Re:Defaults by sparkz · · Score: 2

      It is not a particularly open standard. For example, page numbers in headers/footers (a pretty common thing to use) are not even mentioned in the specification; the only way to work out how to do it is to do it in Word, save as RTF, and work out what it does.

      I know - I've had to do it! Even for a relatively simple document, the RTF Spec is not much use - you just have to do it in Word, and replicate that in your own code.

      Oh, and if Word decides it doesn't like the document, it doesn't return an error message: an ill-formatted RTF file is guaranteed to kill Word, and very likely to kill Windows.

      --
      Author, Shell Scripting : Expert Re
    9. Re:Defaults by tshak · · Score: 3, Insightful

      Most businesses do not build game machines.

      In an era of practicality most offices are still running on 500 MHz boxes with 128 MB of RAM and 5400 rpm HDs.

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
    10. Re:Defaults by interiot · · Score: 2
      • Most businesses do not build game machines.
      Hear hear. I work at a Fortune-100 company (well, it was last year anyway), and my current machine is sloooow and has very little memory. I've managed to make it reasonably peppy by replacing Outlook with a remote Mutt (HUGE improvement, if only for the 30 MB RAM savings), and making it just be a dumb terminal for remote Solaris boxes. The only things I run locally are TeraTerm, VNC, Winamp, and Phoenix. Now if Phoenix wouldn't be such a hog, I'd be happy.
    11. Re:Defaults by dillon_rinker · · Score: 5, Informative

      Yup. Government standards are why you can buy screws and nuts from different manufacturers and have them work together. They are why you can buy "orange juice" at the grocery store and know that it's not "juice" wrung out of a pile of autumn leaves (hey, it's juice, it's orange, what more do you want?). Government standards are why you can fly in an airplane and know it won't crash.

      Sure, all these needs could be fulfilled by voluntary industry standards, if it weren't for those pesky human beings, fallible and greedy creatures that they are.

    12. Re:Defaults by g4dget · · Score: 2
      RTF has been in office for years and it is an open, portable standard readable on many platforms and with many programs.

      I would dispute that RTF is "portable" or "standard". However, whatever it is, it simply does not seem to preserve appearance and markup sufficiently well to be used as an interchange format. Perhaps it could in theory, but in practice, it doesn't seem to.

    13. Re:Defaults by Malcontent · · Score: 2

      Who said anything about the govt? Unless of course you mean that the govt should not provide courts so that people can sue each other or that there should be no laws so that people can't be tried.

      --

      War is necrophilia.

    14. Re:Defaults by Galvatron · · Score: 2

      But many home users do. It's not like people only run Office in the office; we can also see how well this software performs on our home machines.

      --
      "The question of whether a computer can think is no more interesting than that of whether a submarine can swim" -EWD
    15. Re:Defaults by tshak · · Score: 2

      Run Opera instead of Phoenix. It's extremely lightweight especially considering how many features it has.

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
    16. Re:Defaults by donutello · · Score: 4, Insightful

      Government standards are why you can buy screws and nuts from different manufacturers and have them work together.

      Nonsense. Screw and nut sizes have been standardized without government involvement.

      --
      Mmmm.. Donuts
    17. Re:Defaults by donutello · · Score: 3, Insightful

      Amazing how many points you got wrong.

      You are goddamned fucking lucky that the government tells you what the default values for things should be. That's what the government is there for, mostly; to tell you that the default value for a building is to have a fire exit and that it may not be locked.

      That's a safety standard. The government does not tell you what color the walls should be, however. It doesn't tell you whether you should use carpet or hardwood on the floors.

      But if you really think they have no right doing these things, go live in a 3rd world country; they generally have the government telling you less about what to do. Except once in a while when they kill your family. You could be armed of course. You know what a totally armed society with a weak government looks like? Afghanistan.

      Assuming you're talking about Afghanistan before the US bombed the hell out of it, you are wrong again. The government in Afghanistan told you exactly what you could or could not do. It told you what you could wear and how much. It told you how long to keep your beard. It told you whether you could study or not (if you were a woman). It told you what you could study. It told you who you could sleep with.

      --
      Mmmm.. Donuts
    18. Re:Defaults by Kragg · · Score: 2

      It told you who you could sleep with.
      Mmmm.. Donuts


      Dude, you're sick.

      --
      If you can't see this, click here to enable sigs.
    19. Re:Defaults by Kashif+Shaikh · · Score: 2

      Government standards are why you can fly in an airplane and know it won't crash.

      ...and have the wonderful assurance that hundreds of seagulls and other birds were literally used to test the wing propellers.

      I should know, since my father worked for GE back in the days when they were in aviation, building bird-resistant propellers and missile shells. I don't know if they still do that stuff, though. But my father found the bird-testing sickening, as do I.

    20. Re:Defaults by Kashif+Shaikh · · Score: 2

      Microsoft would be unable to write an application that can quickly save and open text files that, on average, run well under 50 kilobytes.

      You haven't ever stored pictures in Word files, have you? Just having a couple of big pictures makes the size of a doc file grow to around 5 to 10 megs.

    21. Re:Defaults by kalidasa · · Score: 2

      In an era of 2+ GHz computers with 7200+ rpm hard drives, it seems odd that Microsoft would be unable to write an application that can quickly save and open text files that, on average, run well under 50 kilobytes.

      Problem is, that's 50 kB for a one-page memo inviting both colleagues in your department for lunch.

    22. Re:Defaults by Zordak · · Score: 2
      My mom is still running on a 450 Cellery with 96MB. Many people at home don't have the money to upgrade every 3 years.
      Ha, I've got you beat. My mother-in-law is running on a 75 MHz first generation pentium Compaq POS with a whopping 24 megs of RAM. About 4 months ago, I got in trouble with my wife for building us a system to replace our old 350 MHz box with 64M RAM and a 4G HD. Forget about having the money to upgrade every three years. When I built the new system, we tried to give the old one to the in-laws (I got frustrated when they bought a new HP printer, and it took like an hour to load the drivers on the old 75), and after one day, they told me to come back and hook up their old box again. It seems they didn't like not having their Compaq address book (some silly 2-bit app that came pre-installed), and couldn't be productive transitioning from MS Works to Word 2000. I tried to tell them that I could export the old address book and they'd have all the features plus many more if they would take the trouble to learn the new apps, but their mindset was that it was hard enough to learn how to use the stupid thing the first time around, and they didn't want to have to do it again. Figure out how to overcome that problem, and you've really got something.
      --

      Today's Sesame Street was brought to you by the number e.
    23. Re:Defaults by gvonk · · Score: 2



      Uh, OK.

      Swap the hard drives.

      Problem solved.

      --


      El Karma: excelente(principalmente la suma de moderación hecha a los comentarios de los usuarios)
    24. Re:Defaults by g4dget · · Score: 2

      And any design that mixes up the images with the text and thereby risks writing 5-10Mbytes every time you save a document is seriously broken. There are better ways of keeping images and text together than OLE structured storage.

    25. Re:Defaults by dublin · · Score: 2

      RTF has been in office for years and it is an open, portable standard readable on many platforms and with many programs.

      Obviously you haven't tried it. RTF has gotten more complaints from users than raw Word docs have!

      Replace "RTF" with "HTML" and you've got a winner, though.


      OK, Let's see you put a page break in that HTML document... Seriously, an extended HTML could make a very nice document format, some of the better ones, like the one used by HTMLDOC actually *do* let you put in line breaks and such. I've started using HTMLDOC to generate lots of my documentation now, because it does a pretty good job of retaining the gist of the formatting and produces very nice PDFs from the same web pages I have to generate anyway. This product has really improved lately. In fact, the only thing wrong with HTMLDOC, IMO, is that it uses the GPL rather than a truly free license.

      Now if only the Netscape/Mozilla team would add support for the HTMLDOC extended tags in Composer, and make HTMLDOC a standard output filter option (which would dramatically improve their ability to print web pages, anyhow...) we'd really have something.

      --
      "The future's good and the present is nothing to sneeze at." - Roblimo's last ./ post
    26. Re:Defaults by Planesdragon · · Score: 2
      OK, Let's see you put a page break in that HTML document...
      <br clear=all style='mso-special-character:line-break;page-break-before:always'>
      MS extended HTML to accommodate page breaks and other features that Office supports but most standard web page editors don't. And the fun part is that, aside from the bloated document, they don't really impede the HTML rendering.

      I remember HTMLDOC. It looked interesting to start off with, but it's missing a certain something to make it a worthy standard.

      I'd rather Mozilla support MHTML first--or even CHM, or even just the Moz-help system! (If it does and you know it, feel free to correct me with a link...)

      This product has really improved lately. In fact, the only thing wrong with HTMLDOC, IMO, is that it uses the GPL rather than a truly free license.

      If you don't like it, don't use it. Feel free to write your own, or buy Acrobat.

      The GPL is fine and dandy for standard, public-commons systems that run by themselves without amalgamation with any other software. It only imposes on a justifiable freedom if it's used on a standard module, library, or format.
  6. Can you copyright/patent a schema ? by aron_wallaker · · Score: 5, Insightful

    The big question (to me) is whether Microsoft can put a legal encumbrance on the XML schema they use for a new file format. Could you publish a schema but have it so wrapped in legalese that (for example) open source projects could not be allowed to use it ?

    1. Re:Can you copyright/patent a schema ? by Mysticalfruit · · Score: 2

      That's exactly what I'm thinking they'll do. There'll be a big disclaimer in the XML that says "These schemas are the intellectual property of Microsoft. Use of any program not licensed by Microsoft to interpret the data stored within these schemas is a breach of copyright..." or some other type of legalese...

      --
      Yes Francis, the world has gone crazy.
    2. Re:Can you copyright/patent a schema ? by davmct · · Score: 2, Insightful

      I don't think MS is so worried about people making their own open source software to interpret the XML, as it will most likely not be as efficient as MS software.
      As far as content is concerned, anybody could write their own XML parser. What MS knows is going to sell more copies of Word et al. is the fact that it has strong support for embedding ActiveX objects. So, the next time you want to embed a Rational Rose UML diagram in your Word document, you'll most likely find that other software packages aren't going to interpret how this is stored in XML as well as the MS Office suite could.

    3. Re:Can you copyright/patent a schema ? by anonymous+loser · · Score: 2
      No, because reverse-engineering for interoperability is specifically allowed by the DMCA:

      `(f) REVERSE ENGINEERING- (1) Notwithstanding the provisions of subsection (a)(1)(A), a person who has lawfully obtained the right to use a copy of a computer program may circumvent a technological measure that effectively controls access to a particular portion of that program for the sole purpose of identifying and analyzing those elements of the program that are necessary to achieve interoperability of an independently created computer program with other programs, and that have not previously been readily available to the person engaging in the circumvention, to the extent any such acts of identification and analysis do not constitute infringement under this title.
    4. Re:Can you copyright/patent a schema ? by anonymous+loser · · Score: 2

      WTF are you talking about? The question was whether the XML can be copyrighted such that OSS projects are not allowed to use it. I showed the part from the DMCA that specifically says you are allowed to reverse engineer code in order to achieve interoperability. It has nothing to do with MS being evil, it has to do with how the law is worded.

      BTW I personally would consider XML a computer program in this case (it is a computer language describing/implementing a particular function...isn't that pretty much the definition of a program?), but I guess that's up to a judge.

  7. XML... sharp?!? by wikthemighty · · Score: 2, Interesting

    Once again MS will embrace a standard, only to warp it enough that you get stuck using their version anyway...

    --
    "There are people who do not love their fellow human being, and I _hate_ people like that!" - Tom Lehrer
  8. "XML dialect"?!? by TrevorB · · Score: 4, Interesting

    "XML dialect"?

    It's called a schema.

    Talk about embrace and extend. Sounds like this will be more "XML-like" than real XML... :)

    1. Re:"XML dialect"?!? by Frobnicator · · Score: 5, Funny
      Who died and made you incorrect corrector of common terms of speach?
      ahem. speech

      :-)

      frob.

      --
      //TODO: Think of witty sig statement
    2. Re:"XML dialect"?!? by kaphka · · Score: 2
      "XML dialect"?

      It's called a schema.
      No. A schema is a set of rules that defines which constructs are allowed and which aren't. A dialect is what you get when you implement a schema.

      Think about the word language and the word grammar. Many people are perfectly good at speaking the English language, even though they know very little about English grammar. (Quick, is English a head-final, head-medial, or head-initial language? You don't know? Yet you managed to read that sentence just fine.)

      It's a subtle distinction, but it's real. If you happen to know what language and grammar mean in the technical sense, then it should be even clearer.

      Of course, I don't know what a CNET reporter's alleged misuse of the word "dialect" could possibly tell us about Microsoft's plans for world domination, but I assume that part of your post was just a troll.
      --

      MSK

  9. My Guess..... by jamesdood · · Score: 2, Interesting

    Would be that it will be "open" to other Microsoft technologies. This has been their method of operation in the past. As long as you only have a Microsoft environment everything works well with each other..

    --
    *narf!*
  10. "Could this be grounds for another lawsuit?" WTF? by Wakko+Warner · · Score: 5, Funny

    Yes, mister Hairtrigger, we should sue Microsoft simply because they won't release trade secrets. We will surely win.

    - A.P.

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
  11. This illistrates the shortcoming of XML by Anonymous Coward · · Score: 4, Insightful

    I've always said the XML Emperor has no clothes: all XML is is a meta-framework for markup languages. No more, no less. And pointless if schemas are never disclosed.

    1. Re:This illistrates the shortcoming of XML by halftrack · · Score: 2

      I've always said the XML Emperor has no clothes: all XML is is a meta-framework for markup languages. No more, no less. And pointless if schemas are never disclosed.

      I think that's the way many programmers think (I know I do): it's just a way to avoid yet-another-file-parser for every project. And some (a Norwegian SGML guy with a name comes to mind) say it is not a true, open format, because even though programmers can use schemas, you still need to know the schema. But then again, is it possible to create an open format which supports everything by default, is human readable and - to the extreme - does not require knowledge about the language? How are aliens going to crack the ASCII code in a binary radio stream from Earth? Is there some formula that makes it easier to decipher than hieroglyphs?

      --
      Look a monkey!
    2. Re:This illistrates the shortcoming of XML by rob_from_ca · · Score: 2

      I mostly agree, but it can't be totally useless if you can define a Turing machine with it...:-)

      http://www.unidex.com/turing/tmml.htm

    3. Re:This illistrates the shortcoming of XML by Matts · · Score: 2

      Nonsense. As the author of one of the available OpenOffice to HTML (and DocBook) converters out there, I can honestly say we did most of the work without the Schema in front of us (especially since that Schema is a 400+ page pdf). We just used plain old reverse engineering principles most of the time. Works damn well, and XML makes it infinitely simpler than a binary format.

      --

      Matt. Want XML + Apache + Stylesheets? Get AxKit.
    4. Re:This illistrates the shortcoming of XML by g4dget · · Score: 2
      XML has some clothes. While you may not be able to understand the content of arbitrary XML documents, you can understand their structure. That enables a lot of things that would not be possible with formats like Word's native format or even other markup languages.

      For example, being able to understand the structure of XML documents makes reverse engineering much easier. It also lets you embed one XML document inside another and deals with the resulting namespace issues correctly. And there are many other things that XML helps with--it's not sufficient for a universal format, but it takes care of the nitty-gritty that, if not taken care of, can break portability.
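
      To make the "you can understand the structure" point concrete, here is a minimal Python sketch that outlines an XML document without any schema, using only the standard library. The sample document, its `urn:example:vendor` namespace, and the element names are invented for illustration; they are not anything Microsoft or OpenOffice actually ships.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up sample: we pretend we have never seen a schema for this vocabulary.
SAMPLE = """<doc xmlns:x="urn:example:vendor">
  <x:para x:style="Heading1">Title</x:para>
  <x:para x:style="Body">Some <x:run bold="1">text</x:run></x:para>
</doc>"""


def outline(xml_text):
    """Count how often each (parent tag, child tag) pairing occurs."""
    root = ET.fromstring(xml_text)
    pairs = Counter()
    for parent in root.iter():      # every element in the tree, root included
        for child in parent:        # direct children only
            pairs[(parent.tag, child.tag)] += 1
    return pairs


if __name__ == "__main__":
    for (parent, child), count in sorted(outline(SAMPLE).items()):
        print(f"{parent} -> {child}: {count}")
```

      ElementTree expands each prefix to its namespace URI, so the nesting and the namespaces fall out of the parse for free. What the elements mean is still something you have to work out by hand.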

    5. Re:This illistrates the shortcoming of XML by Dirtside · · Score: 2

      Why is it a shortcoming? XML was designed to be a meta-framework for markup languages. That's all it's designed to do, and that's what it does. It's not a shortcoming if something does what it's designed to do. :)

      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
  12. NO! by halo8 · · Score: 2, Insightful

    Could this be grounds for another anti-trust suit against Microsoft?

    No it is not...

    The Bush administration made it clear on the first day they wanted this to go away. As long as Billy isn't taking your 401K I'm sure no one is going to bother him for a while..

    How many millions were spent on this farce? And for what? A verbal reprimand from the judge? Think about it.. all that money could have gone into tanks and bombs to bomb other countries and free us all from "terror"

    --
    The More Knowledge you have the Luckier you Get- J.R. Ewing
    1. Re:NO! by WasterDave · · Score: 3, Insightful

      Y'know, before posting I thought I'd check to see if anyone else had put what I was going to put. Tadaa, problem solved.

      After years of work, hundreds of thousands of lawyer man-hours, what do we have to show for it? "Expose your API's unless they are to do with security, and don't be bad again". Honestly, this should have been a bitch slapping of biblical proportions. Not only should the company have been broken up, but a tier 1 deity should have rained down the wrath of the ancients in order to make it happen.

      Another anti-trust suit? I don't think anyone's going to be going down *that* road in a hurry.

      Dave

      --
      I write a blog now, you should be afraid.
    2. Re:NO! by schon · · Score: 2, Insightful

      all that money could have gone into tanks and bombs to bomb other countries and free us all from "terror"

      OK, so is this a good thing or a bad thing?

    3. Re:NO! by halo8 · · Score: 2

      What do YOU think it is?

      who are YOU going to be voting for?

      Either which way.. I'm a Canadian in "Soviet Canuckistan" so I don't really care... I just use words like "we" and "us" to SOUND like an American so I can karma whore...

      Saying "WTF cares, I'm a Canadian, how does this affect me?" didn't help my karma any.

      --
      The More Knowledge you have the Luckier you Get- J.R. Ewing
  13. what does it matter by greechneb · · Score: 5, Insightful

    No matter what Microsoft does, all they will get is a slap on the wrist. Microsoft will just point to StarOffice and OpenOffice and say, hey, there's competition, it's not a monopoly.

    Big deal if they don't open it up anyway (I don't really expect them to), StarOffice/OpenOffice will crack it to a certain extent anyway. For most people's file conversions, it's not that much of a difference to convert documents. Doesn't always look pretty, but it works fairly well.

    Wake me up when something Microsoft does is surprising...

  14. InfoWorld articles by andynms · · Score: 5, Informative

    There are a couple of good articles on this at InfoWorld. Try here and here.
    Good quote:
    THE GOOD NEWS is that Office 11 supports XML Schema. The bad news is that XML Schema has been described even by XML experts as "confusing," "impenetrable," "fuzzy," and "as user-friendly as a stick in the eye."

    1. Re:InfoWorld articles by frisket · · Score: 5, Informative
      I was at the launch presentation of Office-11 by Jean Paoli at XML 2003 in Baltimore MD last week, and I'm also a late sign to MS's extended beta list for the product (now closed).

      To clear up some points people have commented on (based on a very preliminary inspection plus a lot of discussion at the conference):

      1. The default save format is still .doc (ie you have to go the extra click to save in XML format)
      2. If you pick to click it, the default XML format is MS's own office-document vocabulary, which retains all the formatting, held in attributes. Hairy but processable, and they will be shipping their schema for it so people can reprocess it externally. But this format will (of course) only represent the appearance, not any structure.
      3. It will also let you specify your own schema (or an industry standard one) and let you supply a binding of named styles to your element types, so you can edit using what look like styles but actually get represented in the saved file as XML markup. There is some debate as to whether this constitutes "being an XML editor" or just "being a wordprocessor that saves data in XML" (my money is on the latter).
      4. It will not support DTDs, so you're stuck with W3C Schemas whether you like them or not*
      5. The discussion over a [more?] suitable schema/DTD for handling office documents (wordprocessing, spreadsheet, presentation) continues at the OASIS TC on Open Office XML Formats **
      With Office-11, Microsoft has nearly caught up with Corel's WordPerfect, (which has had a fully-fledged SGML and XML editor built-in for years) and XMetaL (which Corel took over from SoftQuad earlier this year). MS still has a long way to go to match industrial-strength applications like ArborText's EPIC or even Emacs with psgml-mode et al , but Office-11 will be a solution for the masses who believe the Word interface to be more desirable, or the Microsoft licensing régime to be more attractive, or the software to be more stable.

      * [Bias note] I think W3C schemas were a big mistake; provision for data content typing and validation, namespaces, and extended grouping could have been achieved by extending DTD syntax; and wimpy programmers who moan about having two syntaxes to handle should get a life - it's not a big deal, the code is free and has been in use for 15 years :-)

      ** Sun has donated the OpenOffice (aka StarOffice) XML file formats to the public domain. It's worth remembering that {Star|Open}Office has been saving in XML as its native format for some time now, and has a lot more experience at this than MS.
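
      The style-to-element binding in point 3 can be sketched roughly as follows. This is only an illustration of the idea in Python: the `BINDING` table, the style names, and the `to_xml` helper are all hypothetical, and nothing here reflects how Office-11 actually stores or applies its bindings.

```python
import xml.etree.ElementTree as ET

# Hypothetical binding of word-processor style names to element types
# in a user-supplied schema.
BINDING = {"Heading 1": "title", "Normal": "para", "Quote": "blockquote"}


def to_xml(paragraphs, binding, root_name="article"):
    """Turn (style name, text) pairs into XML markup via the binding."""
    root = ET.Element(root_name)
    for style, text in paragraphs:
        # Styles without a binding fall back to a generic paragraph element.
        elem = ET.SubElement(root, binding.get(style, "para"))
        elem.text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = [("Heading 1", "Is Office open?"), ("Normal", "It depends.")]
    print(to_xml(doc, BINDING))
    # prints <article><title>Is Office open?</title><para>It depends.</para></article>
```

      The user keeps picking "styles" in the interface, and the saved file carries element names from their own schema instead of formatting attributes.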

  15. well, of course by Planesdragon · · Score: 5, Interesting

    Could this be grounds for another anti-trust suit against Microsoft?

    Of course it could. But so could any bit of news about MS on /. in the past twenty years, from EULA alterations to Palladium.

    But "could" and "is" are different things. I suspect MS will decide that closing XML will render it useless, and make it at least as open and usable as their MS-HTML files.

    So, at the worst, we'll have a new "save as" option that's a bit sloppy--but since MS won't have to extend XML to get their office functionality, they probably won't do it just to spite a few OSS coders who'll figure it out in a year anyway.

    1. Re:well, of course by Dirtside · · Score: 2
      Of course it could. But so could any bit of news about MS on /. in the past twenty years, from EULA alterations to Palladium.
      Twenty years! Damn, was /. running on a WWIV BBS back in the day, or something?
      what am I, a contradiction?
      No, just unnecessarily credulous. :)
      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
  16. XML-COM by wowbagger · · Score: 2

    I will bet all they will do is create an XML schema for the COM serialize function, since that is pretty much all any Microsoft application does when you select File->Save - it just calls the COM serialize function with the output pointed at the disk.

    So, you will have a file that is nominally XML, but is nothing but memory dump of the COM object.

    Technically, XML. Actually, COM.
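
    A small Python sketch of the "technically XML" worry, assuming nothing about what Office will really emit: any byte blob can be wrapped in a perfectly well-formed XML document, and a standard parser will accept it without telling you anything about what the blob means. The element names and the `encoding` attribute are made up.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_blob(blob):
    """Wrap arbitrary bytes in a well-formed XML document as base64 text."""
    root = ET.Element("document")
    content = ET.SubElement(root, "content", {"encoding": "base64"})
    content.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap_blob(xml_text):
    """Parse the XML and recover the original bytes."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.find("content").text)


if __name__ == "__main__":
    memory_dump = bytes(range(256))          # stand-in for a serialized object
    xml_text = wrap_blob(memory_dump)
    assert unwrap_blob(xml_text) == memory_dump
    print(xml_text[:60] + "...")
```

    The parser is happy and the round trip is lossless, but a third party is no closer to knowing how to render the document than they were with a binary file.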

    1. Re:XML-COM by wowbagger · · Score: 2

      Then why are MS-WordXP documents not compatible with MS-Word 2000?

      The bad thing about COM streams is that if you change the methods of the object, you render the data incompatible with previous versions.

      If you represent a paragraph as <p>, then you needn't worry if you redefine WordDoc::BeginParagraph.

    2. Re:XML-COM by wowbagger · · Score: 2
      I haven't seen an Office XP document, but that's not the point in any case.


      That is exactly the point - Microsoft has repeatedly and for no good reason introduced incompatibilities on every upgrade of Word - a standardized XML schema would prevent that. Since incompatibilities are how Microsoft forces everybody to upgrade, it is unlikely they would change.

      I don't see how one can expect the previous version of an application to open a file created with a newer version.

      You mean like how every version of WordPerfect can open any WordPerfect document, as long as you don't use features added in the newer version? It is child's play for a competent software engineer to design a format that describes a document in chunks, and to specify that a conforming processor will skip over the chunks it does not understand. For example, the Amiga's Interchange File Format (IFF) described an audio file in chunks, and the first words of each chunk were the chunk type and length. Perhaps you are familiar with it - Microsoft stole the format for WAV files (although they DID reorder the words from big endian to little endian).
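
      For anyone who hasn't seen a chunked format, here is a minimal Python sketch of the skip-what-you-don't-understand idea. It borrows the Amiga-style layout of a 4-byte type followed by a 4-byte big-endian length, but it is deliberately simplified (no outer container chunk), and the `TEXT` and `FNCY` chunk types are invented for the example.

```python
import struct


def make_chunk(chunk_id, payload):
    """Build one chunk: 4-byte type, 4-byte big-endian length, payload,
    plus one pad byte when the payload length is odd."""
    if len(chunk_id) != 4:
        raise ValueError("chunk type must be exactly 4 bytes")
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack(">I", len(payload)) + payload + pad


def read_chunks(data, known):
    """Yield (type, payload) for chunk types in `known`; skip all others."""
    pos = 0
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (length,) = struct.unpack(">I", data[pos + 4:pos + 8])
        if chunk_id in known:
            yield chunk_id, data[pos + 8:pos + 8 + length]
        # The length field tells us how far to jump, whether or not
        # we understood the chunk.
        pos += 8 + length + (length % 2)


if __name__ == "__main__":
    doc = (make_chunk(b"TEXT", b"Hello")
           + make_chunk(b"FNCY", b"\x01\x02\x03")   # feature from a newer version
           + make_chunk(b"TEXT", b"world"))
    # An "old" reader that only knows TEXT still gets all the text.
    print(list(read_chunks(doc, known={b"TEXT"})))
    # prints [(b'TEXT', b'Hello'), (b'TEXT', b'world')]
```

      The old reader never needs to know what `FNCY` means; it only needs the length field to step over it.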

      Second, Microsoft can hardly break their own code by "changing the methods of an object" because you can access the data in those compound documents with a few, well documented interfaces that have absolutely nothing to do with Office.


      Of course, those methods exist only under Windows, and only if the appropriate DLL is present. Have you ever looked at the very files you claim to be an expert on, without a Microsoft supplied DLL between you and the data?

      Of course, once you get the data you need to know how to format it, which is the real problem.


      And that is the real nub - once you have recovered the data, how do you format it - information that is curiously undocumented - and that is my point. The odds that Microsoft will of their own choosing define an XML format that allows everybody to see how to interpret and format the data are approximately the same as Sarah Brady sending a sympathy card to Charlton Heston.

  17. Sure, it's XML, but... by phong3d · · Score: 2, Redundant


    <document>
    <content>
    kdjf348o0jOIJ*$)J@#ijfO34ijf9o84j2193
    )#_@#)UJfnwmejh082u-(U@)*#u08ur@)#RU@
    f934J#EJELKJF%GHWI#UJ(@*#)!)@#@)#(@IF
    fijsjhF*(WU(*@U#IOJWEFJW)*OEURWIOJO:W
    </content>
    </document>

  18. Sure it's Open! by Halo- · · Score: 2

    "Open? Sure it's open! Just click here... and *poof* your document is open. What's that? You mean you want to open it with something other than M$ Office? Oh, well in that case maybe not..."

  19. Excellent. by llamalicious · · Score: 2

    That's great, wonderful even. Hopefully it's not Microsoft just using XML as a springboard for saying the equivalent of, "see, we're a good dog, and we're using open standards now," to cloud the judgement of any non-technical committee/court/public speaker that may attempt to point out their obvious monopoly.
    Meanwhile, myself, the company I work at, and the fire department I volunteer at will continue on with Office 97, happy as clams. Well, some Office 2000 too.

    Is there anything else of value they're going to bring to the table with Office 11? More speed, smaller disk footprint, free beer?

  20. Microsoft XML != XML by Grip3n · · Score: 4, Insightful

    But there's a catch: It has yet to disclose the underlying XML dialect

    Remember, you can also save a Word document as an HTML file, however the HTML is so disgusting, so non-standard that the only things that could possibly read it are more Microsoft products. The same, I would presume, will be happening to their XML feature.

    Additionally, it's not too far-fetched that Microsoft would make their own DTD (Document Type Definition).

    --
    To make a pun demonstrates the highest understanding of a language
    1. Re:Microsoft XML != XML by Planesdragon · · Score: 2

      Remember, you can also save a Word document as an HTML file, however the HTML is so digusting, so non-standard that the only things that could possibly read it are more Microsoft products. The same, I would presume, will be happening to their XML feature.

      Do you have Word 97, Word 2000, or Word 2002/XP?

      97 had abysmal HTML. Thankfully, I don't have to even touch it anymore.

      2000 and 2002 have, as far as I can tell, nearly identical HTML schemas. And, excluding the proprietary office tags ( and and the like), it's rather standard--if cumbersome--HTML.

      If you have Word 2000, you can even get an HTML filter that'll strip the custom HTML and CSS from the file, leaving an HTML file that really couldn't get much cleaner.
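
      As a rough idea of what such a filter has to do, here is a small Python sketch that drops the Office-specific `mso-` declarations from inline `style` attributes. It is not the actual Microsoft filter, only a regex-based illustration; the `strip_mso` name is made up, and a real tool would also have to deal with the custom tags and the stylesheet in the document head.

```python
import re

# Matches style='...' or style="..." (possibly spanning lines).
STYLE_ATTR = re.compile(r"""style=(['"])(.*?)\1""", re.DOTALL)


def strip_mso(html):
    """Remove CSS declarations starting with 'mso-' from inline styles."""
    def clean(match):
        quote, css = match.group(1), match.group(2)
        kept = [decl.strip() for decl in css.split(";")
                if decl.strip() and not decl.strip().lower().startswith("mso-")]
        return "style=" + quote + ";".join(kept) + quote
    return STYLE_ATTR.sub(clean, html)


if __name__ == "__main__":
    word_html = ("<br clear=all style='mso-special-character:line-break;\n"
                 "page-break-before:always'>")
    print(strip_mso(word_html))
    # prints <br clear=all style='page-break-before:always'>
```

      Standard declarations such as `page-break-before` survive; if every declaration was Office-specific, this sketch leaves an empty `style` attribute behind rather than removing it.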

  21. Hello DMCA! by Wee · · Score: 2, Redundant
    At least with XML it will not be very long until many software companies and projects reverse engineer the XML.

    And these other apps can cut into Office revenue. Which is as good a cease-and-desist argument as any.

    I suppose they could put some weird binary or encrypted data in the files, but that would defeat the purpose of XML.

    It defeats nothing if every app speaks the same binary/encrypted language. It prevents other apps from conversing with Office stuff, and that's probably seen as a good thing for MS.

    Anyone who thinks MS is using XML as their file format for the purpose of being "open" or playing well with others had better find another daydream. They're doing it because it helps them in some way, not because it'll help others. And there's actually nothing wrong with that. They're in business to protect shareholder value, after all.

    -B

    --

    Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.

    1. Re:Hello DMCA! by ILikeRed · · Score: 2
      And there's actually nothing wrong with that. They're in business to protect shareholder value, after all.
      This statement presupposes that it is OK to cheat, and that cheating is in the best interests of shareholders. But then, personally, I would not invest in a company that I know has a culture of cheating, or that lacks ethics. Also, I find it sad that a company's lack of ethics is defended as helping shareholders, when the same company seems to care very little for its shareholders. If it did care for its shareholders, I believe it would pay dividends.
      --
      I have come to a conclusion that one useless man is a shame, two is a law firm, and three or more is a congress -J Adams
    2. Re:Hello DMCA! by Anonvmous+Coward · · Score: 2

      " They're doing it because it helps them in some way, not because it'll help others. And there's actually nothing wrong with that. They're in business to protect shareholder value, after all. "

      I think you're sort of on the right track. You have to remember that MS is branching out to other platforms like Pocket PC. Text is very easy to get around and is quite mobile. (Hence HTML's popularity...)

      I agree there's nothing wrong with what they're doing. I have no doubt that people'll have to sift their way through it to make sense of it, so what? If it's really that important, it'll happen.

      From the article: " It has yet to disclose the underlying XML dialect. Could this be grounds for another anti-trust suit against Microsoft?"

      Um no. At best, that comment was meant to stir up the trolls. Everybody acts like Microsoft owes them everything. All I can say is, grow up. MS is in the business of making money and it will always be like that. They're not required to explain their dialect. Nobody is. You wouldn't be saying that if Sun did that with Star Office.

    3. Re:Hello DMCA! by Wee · · Score: 2
      This statement presupposes that it is OK to cheat, and that cheating is in the best interests of shareholders.

      As long as a public company's main focus is to preserve shareholder value, then you will always have ethical problems. CEOs can get sued for not protecting investments. That can cause them to cut corners. I'd bet most large shareholders (the kind with lawyers and such) care primarily about money. What they don't know won't hurt them. Sure, some folks won't buy RJ Reynolds and Philip Morris and whatever, but lots of people do.

      For the record, if MS wants to keep their file formats obfuscated, then more power to them. It's their right to do so. Is it nasty? Yeah. Do I like it? No. Do I understand why they do it? Yes. They have a responsibility to their shareholders. The only way they know to fulfill that responsibility is by engaging in shady business practices (viz. "embrace, extend, extinguish").

      -B

      --

      Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.

  22. Could this be grounds for another anti-trust suit by tmark · · Score: 2

    How - and why - should it be ? AFAIK, MS never disclosed their e.g. Word or Excel binary formats, so why should they be exposed if they fail to disclose, or even obfuscate, an XML schema ?

  23. Why? by maggard · · Score: 2
    Why would this be grounds for a suit?

    Insofar as I understand MS isn't under any court order to open their file formats, just not to continue with specific unethical tactics on others (wristslap.) So if MS claims they're using XML in Office v.11 (hey, didn't they claim that about Office v.10 too...) big whoop-de-doo, it's really their decision.

    Actually it's remarkable MS is even going for XML at all. MS's own internal formats are a terrible mess, the code that produces it apparently such a tangle MS has terrible trouble keeping on top of it, now trying to put this all into a new format has got to be a monster. Doing all of this while keeping all of the MS'isms and editing features and not breaking every other part (both theirs & third-party) that uses these services & components has got to be daunting.

    Yeah, it'll likely end up being idiosyncratic and quirky, full of all the bugs MS is famous for, but hell, a semi-legible format has gotta be better than the stuff MS pumps out now. Of course this whole "beta" process we're in right now has been pretty conclusively demonstrated to be a marketing sham with the significant decisions all made and the feature-set frozen long ago.

    --
    I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I wont bother either.
  24. Inside information... by davidstrauss · · Score: 2
    Microsoft Word's new XML format is as follows:

    <xml><worddoc>
    klj49ja90235%@#U42LKJDS9@#&@#$%(@# $90u89oj456@#%#@*#()$*$@%(F5f65F6@#%(&@#%&$#(*%*lk jdsflkjsdh
    </worddoc></xml>

    Technically, it is standard XML.

    1. Re:Inside information... by ceejayoz · · Score: 2
      Actually, that's not standard XML...
      <?xml version="1.0"?>

      <worddoc>
      klj49ja90235%@#U42LKJDS9@#&amp;
      @#$%(@#$90u89oj456@#%#@*#()$*$@
      %(F5f65F6@#%(&amp;@#%&amp;$#(*%
      *lkjdsflkjsdh
      </worddoc>
      That is :-)
    2. Re:Inside information... by davidstrauss · · Score: 2
      Actually, well-formed XML just involves having all tags opened and closed for a perfect hierarchical structure. The
      <?xml version="1.0"?>
      is optional. Refer to XML Prolog Type Declaration before correcting someone. Even if you try to say mine's not "valid", neither is yours.
    3. Re:Inside information... by ceejayoz · · Score: 2

      However, any ampersand character must be written &amp; in XML - otherwise it'll be treated as an entity and there can be problems.

      So, perhaps you should be checking the XML specs yourself before making witty corrections, eh?
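Both halves of this exchange can be checked with any conforming parser. Here is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the shortened sample strings are made up for illustration. It tries a document with no XML declaration, one with a bare ampersand, and one with the ampersand escaped.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# No XML declaration, but tags nest properly and there is no stray '&'.
no_prolog = "<worddoc>klj49ja90235</worddoc>"

# A bare '&' followed by '@' is not a legal entity reference.
raw_ampersand = "<worddoc>LKJDS9@#&@#$%</worddoc>"

# The same text with the ampersand escaped as &amp;.
escaped = "<worddoc>LKJDS9@#&amp;@#$%</worddoc>"

for label, doc in [("no prolog", no_prolog),
                   ("raw ampersand", raw_ampersand),
                   ("escaped ampersand", escaped)]:
    print(f"{label}: {is_well_formed(doc)}")
```

The declaration really is optional, and the unescaped ampersand really is fatal, so each poster was right about the other's example.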

  25. Points to remember... by MosesJones · · Score: 5, Insightful


    1) XML, SOAP and all these new technologies were pioneered by Microsoft

    2) They killed all the standards they didn't pioneer (CORBA anyone ?).

    3) There is NOTHING in the XML spec that _requires_ people to open up their schema definitions. It's purely a structure definition in the same way as Microsoft's old Word documents were stored, it's just that now the markers are in Text format and any standard XML parser will be able to read the file.

    4) Open Office can already read word documents even though they aren't in XML.

    5) So can Word Perfect.

    6) Using XML doesn't stop you embedding binary into the document, often people do this to store data (images for instance), thus an OLE reference might still be binary.

    7) Pure XML and XSLT are great ways to use up all the power on your processor. Binary has previously been used here because it's inefficient, if MS had opened the format up everyone would just complain that it's too inefficient and it's quicker to save using an older format. So MS are either trying to burn cycles or are customising the XML or their application for speed, is that wrong ? Would it be wrong if KDE did it ?

    8) People won't switch to or from Word because of XML, Open Office and other tools will be able to read the Word files because other tools (Google for instance) need the format and MS can see real business need to allow them to see it.

    9) XML is a meta-language as such anything can be written. Hell they could have a bitch of an external format and then a simple parser that makes it useful, but not tell anyone about the simple parser so everyone elses documents take years to load.

    10) XML is the buzzword of today, OLE to be replaced by SOAP as the buzzword for Office next ?

    Get off the high horse guys, whether it's binary or XML is irrelevant, making something XML doesn't make it open. That's like saying that everything you do makes sense, but just because people don't understand the Mayan Calendar and Ancient Greek they complain.

    MS will always use Mayan and Ancient Greek, and we _can_ understand them, it's just easier for them as it's their native language and calendar.

    --
    An Eye for an Eye will make the whole world blind - Gandhi
    1. Re:Points to remember... by NullProg · · Score: 4, Insightful

      1) XML, SOAP and all these new technologies were pioneered by Microsoft


      XML came out of "SGML for the Web" team sponsored by the W3C. I think this was back in 97/98.

      Enjoy,

      --
      It's just the normal noises in here.
    2. Re:Points to remember... by ryanvm · · Score: 4, Funny

      Get off the high horse guys, whether it's binary or XML is irrelevant, making something XML doesn't make it open.

      You keep using that phrase, I do not think it means what you think it means.

    3. Re:Points to remember... by Danse · · Score: 2

      Being able to validate it is pretty much worthless if you don't also know how to interpret it. That's the key to the whole thing. Including a schema doesn't fix the problem. It needs documentation on just what the hell all of it means.

      --
      It's not enough to bash in heads, you've got to bash in minds. - Captain Hammer
    4. Re:Points to remember... by yomahz · · Score: 2

      1) XML, SOAP and all these new technologies were pioneered by Microsoft

      Really?

      --
      "A mind is a terrible thing to taste."
    5. Re:Points to remember... by KidSock · · Score: 2

      MS didn't pioneer XML, saying Open Office can read word documents is technological hair splitting, writing binary memory snapshots to disk is not inefficient, and I don't understand 9) but I'm not trying to make a point, I just don't think your message should be labeled 'Informative'.

    6. Re:Points to remember... by Kashif+Shaikh · · Score: 2

      Get off the high horse guys, whether it's binary or XML is irrelevant, making something XML doesn't make it open.

      I believe people -- and this is my opinion -- think XML is more "open" because it's a tangible format, i.e. you can open it in notepad.exe and see some logical structure (but you can't interpret it), whereas all you see from a doc file is a bunch of binary gibberish.

  26. Of course by nuggz · · Score: 2

    That is probably what will happen.
    Technical compliance, while completely avoiding the spirit of the standard.

    Of course if I was MS, that is what I would do too.

  27. Open? by Grip3n · · Score: 4, Informative

    I'd say the title of this article (Is the New Microsoft Office Really Open?) is extremely misleading. Microsoft isn't even trying to be open, they're just adding support for another opensource language. A true open program would have its source code available. What this article is about has nothing to do with that. Microsoft Office is closed. Period.

    --
    To make a pun demonstrates the highest understanding of a language
    1. Re:Open? by jpmorgan · · Score: 2

      XML isn't an 'open source' language. It has nothing to do with open source/free software. It's just a document metaformat based on SGML.

    2. Re:Open? by scm · · Score: 2, Insightful

      "Open" used to imply something different before "Open Source" became popular. It meant that file formats, APIs, ABIs, etc. were well documented. Many Unix venders used to call their OSs "open" not because they gave away the source, but because everything was documented and accessible to third parties.

    3. Re:Open? by leandrod · · Score: 2
      > "Open" used to imply something different before "Open Source" became popular.

      Yes.

      > It meant that file formats, APIs, ABIs, etc. were well documented.

      No! It meant that there was conformance to open standards, that is, standards established by open organisations that congregated users and vendors from all over the world.

      > Many Unix venders used to call their OSs "open" not because they gave away the source, but because everything was documented and accessible to third parties.

      No, they were open because they conformed, and still conform, to POSIX, OSI and other relevant open standards.

      --
      Leandro Guimarães Faria Corcete DUTRA
      DA, DBA, SysAdmin, Data Modeller
      GNU Project, Debian GNU/Lin
  28. Adoption of standard no guarantee of interop... by Sigh+Phi · · Score: 5, Insightful

    Microsoft (and Netscape) essentially tried the same thing with HTML. Sure, we're using HTML, but to actually view our HTML, you have to use our browser.

    Adoption of a "standard" is no guarantee of interoperability. Understanding the conceptual underpinnings of the standard is just as important. The question is, when Microsoft says they are using XML as a document format, are they doing it because they believe in the principles underlying it, or solely for the cynical "this is what is selling now" aspect?

    The body of HTML out there is an unparseable babble of a mess, largely because the two dominant browser makers did not respect many of the underlying notions of markup and hypertext to begin with. The state of the art progressed, but not in the way a lot of people wanted it to go.

    This could bode poorly if the meme survives somehow that the Office format is now equivalent to XML. When it "doesn't work," who knows where the blame will fall?

  29. the new XML .doc file header looks like: by SirSlud · · Score: 2
    --
    "Old man yells at systemd"
  30. lets see which dialect of XML will they use? by myowntrueself · · Score: 2

    How about Microsoft Visual XML++?
    If it doesn't exist now it will...
    or something sufficiently based on XML
    that it can have XML in the name,
    but sufficiently different to XML that
    it's incompatible with XML from other vendors and developers will need to learn a whole new way of working with XML.

    Just a wild guess.

    --
    In the free world the media isn't government run; the government is media run.
  31. XML can be as cryptic as binary by Jelloman · · Score: 5, Insightful
    All the hype about XML seems to skip over the fact that XML is never guaranteed to be any less cryptic than binary data formats. For example:
    <?xml version='1.0' ?>
    <wordDoc>
    <base64 value='kjkjKJ+kyRgMhiuI9KqU/hjkj'/>
    <base64 value='OlRg8LKp8UI883Jjk+krNhjkj'/>
    <base64 value='pRhjjhO9asdJiQ99kjkjU8j=='/>
    </wordDoc>
    XML was designed to be machine-readable, not human-readable, much less human-understandable, or easily-reverse-engineerable.
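To make the point concrete, here is a small sketch in Python (standard library only) that builds a document shaped like the one above and reads it back. The payload bytes are invented stand-ins for a proprietary blob, not anything taken from a real Office file.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for a proprietary binary payload (hypothetical bytes).
payload = bytes([0xD0, 0xCF, 0x11, 0xE0, 0x00, 0xFF, 0x10, 0x80])

# Wrap the bytes in XML the same way the example above does.
root = ET.Element("wordDoc")
ET.SubElement(root, "base64", value=base64.b64encode(payload).decode("ascii"))
document = ET.tostring(root, encoding="unicode")
print(document)

# Any XML parser can read the structure back...
parsed = ET.fromstring(document)
recovered = base64.b64decode(parsed.find("base64").get("value"))

# ...but all it recovers is the same undocumented bytes.
print(recovered == payload)
```

The parser does its job perfectly and the reader is no closer to knowing what the bytes mean.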

    The Office file formats will be open if M$ decides to:
    • Document them, and
    • Not change them with every update.
    I doubt they will do either of those things.

    1. Re:XML can be as cryptic as binary by haggar · · Score: 2

      You are quite right. But not only this: I will explicitly say that even if they publish the DTD, they can still have a format that is NOT represented correctly by any other office suite but theirs. That's because having the DTD does not help you in the representation of the content. And yet, you could still contain representation information in the XML document, but that content would be not documented. So, yeah, you have the DTD, you can validate the XML document, and still you have no f*cking clue how to represent it: how should this thing look like, how does it print?
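A rough sketch of that gap, in Python: the `ALLOWED_CHILDREN` table below is a hand-rolled stand-in for a published DTD (the standard library has no DTD validator), and every element and attribute name in it is invented. The document passes the structural check, yet nothing in the check says what the attributes mean on the page.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child elements each element may contain.
# This stands in for a published DTD; the names are invented.
ALLOWED_CHILDREN = {
    "doc": {"blk"},
    "blk": {"r"},
    "r": set(),
}


def conforms(element: ET.Element) -> bool:
    """Check that every element only contains children its model allows."""
    allowed = ALLOWED_CHILDREN.get(element.tag)
    if allowed is None:
        return False
    return all(child.tag in allowed and conforms(child) for child in element)


sample = '<doc><blk x17="3"><r q="0x4F">Quarterly report</r></blk></doc>'
root = ET.fromstring(sample)

print(conforms(root))            # structure matches the model
print(root.find("blk").attrib)   # but what does x17="3" do on the page?
```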

      --
      Sigged!
    2. Re:XML can be as cryptic as binary by Dirtside · · Score: 2
      XML was designed to be machine-readable, not human-readable, much less human-understandable, or easily-reverse-engineerable.
      False. Point 6 of the W3's list of goals for XML is that it should be "human-legible and reasonably clear".
      --
      "Destroy science and religion. Science would re-emerge exactly the same; but not religion." - Penn Jillette, paraphrased
  32. Re:Lovely... by ceejayoz · · Score: 5, Funny

    Oh no! Heaven forbid someone extend the eXtensible Markup Language!

  33. Re:You've seen Microsoft generated HTML by JebusIsLord · · Score: 2

    IE does a pretty decent job of parsing xml already actually. It's perfectly strict. 6 does make some errors though that Moz gets right.

    --
    Jeremy
  34. Details on Microsoft's new XML format by MillionthMonkey · · Score: 3, Funny

    <officeDocument>
    <base64>
    R0lGODlhSwA3AIQAAA8NDZqHf5I/NldIQszKyEw qJ3ZoYcykoHBIOIZrYLannDEpJv39+244KW5Z
    T6uZj+dkb8i vo7xARPN0gZNRRSUeHOrY2MJSUfyVmIt4bvuwtLq4tl9UT9nX1 Ug5NAAAACH+Dk1h
    ZGUgd2l0aCBHSU1QACH5BAEKAB8ALAAAA ABLADcAQAX+4CeOZGmeptUpzxNkz0ZYaG3fuGg9DoVQ
    jqBj4 BkYiZ7kYrmoOJ/OQSdHrVkiFEFju0U4Ao9IREEWj8muVybj8Cy hFc+DVj11OA4FodOxaDQY
    GBMQFz4IBgkZCQZDRBwJkC0ZbgM clm5wFQMbU3UiBh4KnTcGBRUFTUwLAwYGMGgPGRweTqtCA021
    A3p0VR0eDjOeIg9uwBlgZCwvjJZIuU8GMr3DOgkDDzN0FhsBo BVMBaiqb3FJSqpw5AseBehMwI8K
    Mn3VNx0Bs5hQ6/1y9gAJI ODSBYGXBAESgonVykEzI27KwXEwCmCJHVkESNhIoaNBA2ASrsn QClGr
    RUf+8HAgkmlBBmHVAsiBedGChQNaxOlktySJq1csMhy p5OAStArSKuawsJLmjWLvesJo4c1hM3Qe
    8LDUxCugBwNOPW1 w1RChSJKMcGWq9fXBPAJ7qOFQEQCA3bt48+rduzfT3Qp89zKJE 2ADJ4soOvBA
    cq7xOVYK5CK2yCMBgiNJiAxeC+7JAgWTqxAI4 MCyQQeniw4QUjoBa8z74HggEBrHA4JcEMAwI4aq
    QmZDlKzdJ Dm0hYwSBGhpgMCswpFrklFd03DWUSdylNqzEKBBBhmAIECYcKE QEGSxEiWw1CoAwljG
    ira7rmkawDsO6PXyA4h88i06ifMYMq/ EQgRRQaz+lEoUh1VjwCb12EBAAwKOw8QAa7zXwjIcLLGa
    arR 4ltQwD1TwQIQ2KABNLj05p4AYyyiiT1YrhVgBB3qg6MtXtNWhg gJJcODKQjC2kIZrSECUijQ9
    2gNKWD5OIiR6L7LggiIGdJiJB 99BKdoCYBU3wgpvESDDaE18tUha50jEGTiVqBVHDHFRQUCaNdJ 3
    CipwBPZmLQu448RdqQQKzxIc0FnRBgMAFpifnv1ZS2dNPMo XOJXaFYN2tSn2oGOZgbRBbaTqoAAo
    74Cq6ialAsRdI8+suCA /bywQQKueRJABBUgOBVFE6gw3Kq4ocBNBAIac1spKR3jBgUEzA pvJrcT+
    ltAdcwZl60UQkERHUhDN5MmZS9WScBxuAzW30IYKM APJQxG5yWC5I9yWW7Zg8AaLkWmg1Ya8oYg5
    2QEJdLRcFyDxm y9vByyjBhuxPeFBAJwixp1yGCvHHDIhvbCmQwhRZ8nImqkzYm0 boBZABBdsVJ5H
    pYWEpRF4kATDJEFqRt9MAvtiAALZwPVHIIM U4gXHLyTkmkMkgTGJLVnKCQWEiBHAin46DF20AAO1
    E+A5IQc QlCMj6zOrBw1WY8GDOV70hwbk5RSgOD1xEJ1bBtL8Gp9OcJC2J wR4kECdNUyoE4WGeviT
    2GhAnFWCW9VyMuC2elnCAzphAk8ir rj1nAP+uXw4xFEu/W0njz2LsIEpnWG6RGlTmaEAG9YVoZKN
    C +RhOSkLbJC66iFKxA4MD0NHuzFxBl+YjjtysPtF3jxoSxFTvXg lJKodEYScuv+eGI/V3BEohgtV
    eaUi8rkjHDhMumrE8xJmwI4 DVDbc7yLBZYIj/DYEUIEoFiGASRyyrvOhhDPYsI+DQlGxmphgE mRh
    BCI+pqA/eehAGciR9z4QgMpJ5keNioM+jAANYDSkZJLyj DikNrG38KFna+OSHsz0NHLsxHVLyEs/
    NkOO9fUwM6ygk5h+Y ZcnFNFRAMDUoCyVlze5oYmRihSYFLWNsSCRiViEIg4XkEUdZio UMggBAQA7
    </base64>
    </officeDocument>

    1. Re:Details on Microsoft's new XML format by MillionthMonkey · · Score: 2

      Actually I was surprised by that too. Although I'm sure the XML format that Microsoft is using wouldn't get by.

    2. Re:Details on Microsoft's new XML format by Dog+and+Pony · · Score: 2

      Is that the specification or an example?

    3. Re:Details on Microsoft's new XML format by MillionthMonkey · · Score: 2

      It was an example. And a joke example at that.
      Microsoft would never specify the base "64" in plaintext. :)

    4. Re:Details on Microsoft's new XML format by Dog+and+Pony · · Score: 2

      Well, it was a joke question at that, too. Microsoft would never publish a specification in plaintext either. :)

  35. Re:Mod parent down by peterpi · · Score: 2
    "There are doorways I haven't opened... and windows I've yet to look through..."

    Well, I guess that's one less now!

  36. Boo Hoo Hoo by VividU · · Score: 2, Insightful

    The problem is that Microsoft chooses to retain their obfuscated binary format as the default save type for documents.

    Comments like this give me the creepies. As a software developer, the last thing I want is some entity telling me what my default format should be.

    It's also indicative of the elitist attitudes of many Linuxites. In effect, the poster is saying that users will never have the capability to inform themselves and make a choice as to how they want to use their computers.

    1. Re:Boo Hoo Hoo by WasterDave · · Score: 2

      As a software developer, the last thing I want is some entity telling me what my default format should be.

      I used to agree with you, wholeheartedly. After all, it's a shitload easier to cook up your own spec and code to that. Or, more likely, just code. Many Linux apps only get away with this attitude because their files are primarily plaintext, and therefore a complete absence of formality is generally OK.

      But I've spent the last nine months going into battle, daily, with a video compression standard that's the hugest bastard in the whole world to work with. It's patent encumbered, it's not trendy, it doesn't have a pretty GUI, it won't enamour me to the OSS community and the whole experience has nearly killed me. So why bother?

      Metcalfe's law: The utility (usefulness, approximately) of a network is proportional to the square of the number of nodes on a network. When I've finished building my stuff based on this bastard standard, it'll be compatible with the umpteen million other devices that also use it. That's one whole shitload of value proposition, right there.

      Non standard format => Use to communicate with the other eight people who use the product.
      Standard format => Use to communicate with the other two million people who use the standard.

      See? Step 1, use standards. Step 2, ? Step 3, Profit!
      Dave
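For a sense of scale, the two cases above can be run through Metcalfe's law directly. This is only back-of-the-envelope arithmetic in Python, taking "proportional to the square" at face value with a constant of 1 and using the eight and two million figures from the comment.

```python
def metcalfe_utility(nodes: int) -> int:
    """Relative utility of a network under Metcalfe's law (proportional to n^2)."""
    return nodes ** 2


proprietary_users = 8          # people who use the one product
standard_users = 2_000_000     # people who use the standard

ratio = metcalfe_utility(standard_users) / metcalfe_utility(proprietary_users)
print(f"standard format is roughly {ratio:.2e} times more useful")
```

The user count differs by a factor of 250,000, but because the law squares it, the utility differs by a factor of about 6 x 10^10.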

      --
      I write a blog now, you should be afraid.
  37. This is very simple by mao+che+minh · · Score: 4, Interesting

    If they really wanted to join the open market and truly compete, then they would just open the .doc format. This is nothing more than a pitiful pandering to open source advocates or those businesses that are interested in OSS. Any person with a shred of common sense and a basic knowledge of technology developments over the past 5 years can plainly see how pointless this is.

  38. Open but Secure by mugnyte · · Score: 5, Interesting

    Something in my gut tells me that beyond all the extraneous tags, attributes and data types, the XML is going to have a hash code built into it.

    Edit this file outside of MS Office (invalidating the hash code) and suffer the consequences: MS treats it as "untrusted" input and rips out only the text content, no formatting.

    The hash will be a giant number created through a secure portion of the Intel-ish hardware calls. Keys hidden where? That'll be interesting to see who posts 'em first. Perhaps on a .NET server at MS hosting? Nah, this cripples offline Office. Keyless hash?
    Curious Curious.

    mug
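The mechanism being guessed at here is easy to sketch, with the caveat that nothing below is known to be what Office does. This Python example assumes a keyed hash (HMAC-SHA256) with a made-up secret; `SECRET_KEY`, `sign`, and `is_trusted` are hypothetical names.

```python
import hashlib
import hmac

# Hypothetical secret held by the editing application.
SECRET_KEY = b"not-a-real-key"


def sign(body: str) -> str:
    """Return a hex HMAC-SHA256 tag for the document body."""
    return hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()


def is_trusted(body: str, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(body), tag)


original = "<p><b>Budget</b> approved</p>"
tag = sign(original)

edited = original.replace("approved", "rejected")

print(is_trusted(original, tag))  # untouched document
print(is_trusted(edited, tag))    # edited outside the application
```

This is also why the "where are the keys hidden?" question matters: a plain unkeyed hash could be recomputed by any third-party tool after editing, so only a keyed scheme would actually lock anyone out.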

    1. Re:Open but Secure by el_chicano · · Score: 2
      Edit this file outside of MS Office (invalidating the hash code) and suffer the consequences: MS treats it as "untrusted" input and rips out only the text content, no formatting.
      So my word processor and spreadsheet will refuse to let ME, the user, do what I want with MY documents? I myself would refuse to run those apps.

      Is this really a big problem? Unless MS cripples the export function you can always Save->As a RTF or CSV file. You can then parse the file and then format the data using the XML schema of your choice. Or am I missing something?
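As a sketch of that escape route, here is the CSV half in Python, standard library only. The sample data, the `rows`/`row` element names, and the function name `csv_to_xml` are all invented for illustration; the output schema is whatever you choose it to be.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text: str) -> ET.Element:
    """Turn CSV text with a header row into <rows><row><col>...</col></row></rows>."""
    root = ET.Element("rows")
    for record in csv.DictReader(io.StringIO(csv_text)):
        row = ET.SubElement(root, "row")
        for column, value in record.items():
            # Column names become element names, so they must be valid XML names.
            ET.SubElement(row, column).text = value
    return root


exported = "item,qty\nwidgets,12\ngadgets,7\n"
root = csv_to_xml(exported)
print(ET.tostring(root, encoding="unicode"))
```

What this route loses is exactly what the parent was worried about: formulas and formatting do not survive a CSV export, only the cell values do.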
      --
      A man who wants nothing is invincible
  39. XML-Dev thread on WordML by watchful.babbler · · Score: 4, Informative
    There was a fairly recent thread on this issue over at the XML-Dev list (see here). The upshot, according to W3C XMLWG member (and occasional Microsoft foe) Tim Bray, is that Word is capable of saving documents in a WordML format that is parsable even without a DTD:
    I didn't see anything that I couldn't pick apart straightforwardly with Perl, and if someone asked me to write a script to pull all the paragraphs out of a Word doc that contain the word "foo" in bold, well you could do that. Which seems pretty important to me.
    So, from a technical perspective, there isn't much to worry about right now. From a legal perspective, no, there's no grounds for another antitrust suit, any more than there's grounds for suing Quark for not disclosing their file format.
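For anyone wondering what the "paragraphs containing foo in bold" script might look like, here is a rough Python equivalent of the Perl job Bray describes. The markup is an assumption: the element names (`p`, `r`, `rPr`, `b`, `t`) are made up for the sketch and should not be read as the actual WordML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <p> paragraph, <r> run, <rPr><b/></rPr> marks a bold run,
# <t> holds the text. These names are assumptions, not the real schema.
SAMPLE = """
<body>
  <p><r><rPr><b/></rPr><t>foo</t></r><r><t> is bold here</t></r></p>
  <p><r><t>foo is plain here</t></r></p>
  <p><r><rPr><b/></rPr><t>bar</t></r></p>
</body>
"""


def paragraphs_with_bold_word(root: ET.Element, word: str) -> list[str]:
    """Return the full text of each paragraph that has `word` in a bold run."""
    matches = []
    for para in root.iter("p"):
        bold_hit = any(
            run.find("rPr/b") is not None and word in (run.findtext("t") or "")
            for run in para.iter("r")
        )
        if bold_hit:
            matches.append("".join(t.text or "" for t in para.iter("t")))
    return matches


print(paragraphs_with_bold_word(ET.fromstring(SAMPLE), "foo"))
```

Only the first paragraph qualifies: the second has "foo" but not in bold, and the third has bold text but not "foo".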
    --
    "Freedom is kind of a hobby with me, and I have disposable income that I'll spend to find out how to get people more."
  40. Are you paying attention? It's Microsoft. by burgburgburg · · Score: 4, Insightful
    Of course it isn't open. It's a silly question. Open is EVIL. Actually open would eliminate advantages. People would be able to create their own tools to interact with documents, instead of with MS tools. Where's the money in that?

    Dancing MonkeyBoy doesn't hop across a stage for his health. He "loves this company" because it makes money as only a monopoly can.

    Silly rabbit. Open is for kids.

    1. Re:Are you paying attention? It's Microsoft. by gmack · · Score: 5, Insightful

      That right there is one of the things that makes working with windows a pain.

      On any Unix or Unix clone you can just run standard tools or write your own.

      Unfortunately with everything in a proprietary format you then end up having to build scripting languages into everything making all of your data files potential entry points for malicious code.

      The move to XML has the potential to eliminate that sort of brain damage once and for all provided they actually open their file formats.

      I hope they do it.. but given their past I'm not holding my breath given that the options are long term financial security for MS or Security for their customers and the risk of losing market share in the future.

    2. Re:Are you paying attention? It's Microsoft. by Sivar · · Score: 3, Insightful

      News flash: Hundreds of thousands of developers worldwide already develop their own tools to interact with MS documents. Some if not most serious developers have made a lot of money off writing programs for Windows/Office. Open your eyes and you will see that Microsoft makes business a lot of money. MS is a big help to the economy in that perspective.

      Really? Excellent! Please point me to the specification for the MS Office format, so I can write a cross-platform tool to open their files.

      --
      Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
    3. Re:Are you paying attention? It's Microsoft. by Axe · · Score: 3, Interesting
      ..so I can write a cross-platform tool to open their files.

      ifstream("MyOfficeFile.doc", ios::in);

      Crossplatform enough for you?

      Oh, you mean edit the files? I remember writing VBA code that did that just fine.. Good documentation how to do that - much easier than working with a crazy-ass XML schema?

      So what exactly are you asking for?

      --
      <^>_<(ô ô)>_<^>
    4. Re:Are you paying attention? It's Microsoft. by Sivar · · Score: 4, Funny

      ..so I can write a cross-platform tool to open their files.

      ifstream("MyOfficeFile.doc", ios::in);
      Crossplatform enough for you?


      As funny as it is useful. I can read the most thoroughly encrypted files that way, too. It's good to have a Windows programmer around...

      Oh, you mean edit the files? I remember writing VBA code that did that just fine.. Good documentation how to do that - much easier than working with a crazy-ass XML schema?

      It seems that between your first sentence and your second, you forgot the "cross-platform" part. Of course, if you're a VB programmer I can't blame you--you were probably born that way.
      (I'm just kidding, no personal insult intended)
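The gap between opening a file and reading it fits in a few lines of Python. The eight-byte signature below is the one that begins OLE2 compound files, the container binary .doc files are stored in; the helper name `looks_like_ole2` and the fake header are made up for the sketch.

```python
# Signature that begins OLE2 compound files, the container used by binary .doc files.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole2(data: bytes) -> bool:
    """True if the bytes start with the compound-file signature."""
    return data.startswith(OLE2_MAGIC)


# Opening a file only yields bytes; here a fake header stands in for a real file.
fake_doc = OLE2_MAGIC + b"\x00" * 24
print(looks_like_ole2(fake_doc))
print(looks_like_ole2(b"<?xml version='1.0'?><wordDoc/>"))
```

Recognizing the container is about as far as you get without documentation; everything after those eight bytes is where the specification would be needed.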

      --
      Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
  41. Re:ms relies on office formats by Christianfreak · · Score: 2

    aahhhrrg! Every one says this "OpenOffice can't handle complex formatting of word docs"

    What complex formatting? I've been using OO instead of Word for a long time now. I've converted tables, footnotes, tabstops, embedded images, bulleted lists, graphs and combinations of it all in the same document. I've never had a problem with formatting. More often I have problems with Word 97 or RTF documents opening in Office XP. Screws it up every time. So please tell me what OO can't do; I'm dying to know.

  42. Re:"Could this be grounds for another lawsuit?" WT by commodoresloat · · Score: 2

    I realize it's a joke, but it seems to me that mucking with an open standard and then closing it in order to extend their monopoly might just be a reasonable cause of action. XML is not a "trade secret," and making their version incompatible with the rest of the world's in order to force the world to adopt MS products is not "innovation." Reminds me of what they did with Kerberos a couple years ago. This may or may not be worth a lawsuit, but it would certainly be anticompetitive of them.

  43. Re:Embrace and Extend by Rick+the+Red · · Score: 5, Funny

    The difference between Microsoft and their competitors is that MS is willing to take a long-term view:

    1) Establish a monopoly on office productivity software
    2) Profit!
    3) See income drop once everyone has Office. Market saturation!
    4) Less Profit :-(
    5) Release new Office with new file formats; use monopoly to get it pre-loaded on all new PCs.
    6) Eventually everyone else upgrades Office in order to read new file formats they're getting from their co-workers.
    7) Profit!
    8) Release new OS with filesystem that looks like a database.
    9) Release YAO (Yet Another Office) [see 5 & 6] that only works with new database/filesystem in new OS.
    10) Now, not only do the masses have to upgrade Office to read co-workers files, they have to upgrade Windows as well.
    11) Profit!!!!!

    --
    If all this should have a reason, we would be the last to know.
  44. Re:Why not by mao+che+minh · · Score: 2

    This may all be true, but Microsoft never achieved the technological prowess and glory (IMHO) that Sun Microsystems enjoys with their achievement of making the computer the network.

  45. FUD alert by The+Bungi · · Score: 2, Insightful
    Rischel said. "Right now, Microsoft can set the price of Office products based on knowing their large clients don't have an alternative." Open formats "would create a market for other products" and competitive pricing

    Nope. Microsoft can set the price of Office because the applications fulfill the needs of its customers. The fact that the file format is proprietary has little if anything to do with it.

    The last time I saw StarOffice running on Windows, I damn nearly puked. It's written in something that looks like Java/AWT, the apps take bloody ages to load, opening a document takes even more bloody ages, the UI looks childish and the printing sucks. And I didn't really spend much time with it.

    OTOH, the Office apps load damn near instantaneously on even a PII 450, opening even ~50MB documents with hundreds of embedded images never takes more than a few seconds, the GUI is consistent and tight, and the things just work.

    Sun (and everyone else) has a problem if it thinks that it can compete with Office on Windows with that stuff, and unless they provide an alternative to VBA, they'll never even make a dent. There are hundreds of thousands of people who write full-fledged business applications using VBA and aggregating Office functionality, and that's not something that a company will just throw away because the file formats are now compatible. w00t.

    If anything, opening the formats up will increase the popularity of office suites in Linux, because people won't have to dual boot or whatever to a) be productive; and b) read the stuff that the rest of the world produces.

    1. Re:FUD alert by cranos · · Score: 2

      I hate to disappoint you but VBA is one of the biggest loads of crap I have ever seen. People write full blown apps in this shit because PHBs don't give them the fundage necessary to use proper development tools. Trust me on this one, I had to provide support once for a company whose two biggest apps were written using Word Macros and MS Access 2.

      Seriously if someone in the company I work for said that they wanted a complete App written under VBA, I would have to tell them to think again. At the moment we're stuck with bloody excel macros for a lot of what we do and boy do they bite.

      As for functionality, I think you'll find that Open Office offers all the functionality that your average user is going to need.

      Have you looked at Open or Star Office running on anything but Windows? It runs fine on my Linux box and I'm running a PII 450.

    2. Re:FUD alert by cranos · · Score: 2

      Alright Macros might != VBA but I still say that using what is essentially a hack of a hack of a weak language to run business applications on is not a good thing.

      I realise that you were talking about Windows, but in regards to Open and Star Office, one of the major benefits they provide is the ability not to get locked into the whole Dual Boot nightmare. By allowing people to access Word Docs, Excel SpreadSheets and so on, on other platforms, it opens up the arena in terms of competition on the desktop. We've already replaced a lot of MS Office licenses around here with Open Office and will be replacing more once the lack of Scripting Support is remedied.

      Remember Competition Good, unlawfully Maintained Monopoly bad.

  46. XML IS Office 11? Pah by TrippyZ · · Score: 2, Interesting

    Does everyone remember how Office 10 was promoted as the BIG XML release? And now Hailstorm has disappeared too.

  47. Yes it could be grounds. by GOD_ALMIGHTY · · Score: 4, Informative

    This is a monopoly. They have been found in violation of Anti-Trust laws and held up on appeal. The government has a legitimate reason to tell them how to conduct their business and every right to do so.

    Simply because the Anti-Trust trial focused on the OS rather than Office software, does not mean that the government has no reason to impose restrictions to keep MS from shifting their monopoly power. MS's monopoly has been under government scrutiny for almost 10 years, but we still get a bunch of posts on here about how the government shouldn't be able to tell 'a company' what to do. Either the trolls are really busy or you guys decided to skip Economics 101 for Libertarian Fanaticism 101.

    In order to maintain a capitalist system, we must have competition. Without healthy competition, we don't have capitalism. The government has an obligation to step into an otherwise free market to ensure that competition stays healthy. There is no magical 'Free Market Fairy' that is going to come along and restore health to the industry.

    So yes, depending on the result of the States' AG cases and the DOJ's settlement, MS could very much be liable for making their document formats some sort of completely bastardized XML. If you want to know the probability, then you should go read the settlements, and the grievances in the new filings against MS.

    --
    Arrogance is Confidence which lacks integrity. -- me
    1. Re:Yes it could be grounds. by poot_rootbeer · · Score: 2


      YANAL, STFU.

      IANAL either, but my understanding is that when a company is found guilty of monopolistic business practices, the remedies must specifically address those practices. Until Microsoft is found guilty of abusing its market status in the realm of application software, the government has no authority to tell MS how to run its Office division.

      I don't trust the business world to police itself enough to propose a true laissez-faire system, but neither do I think it's a good idea to give government unlimited power to meddle in business affairs.

      Besides which, there's nothing illegal about having a closed document format, even if it's encapsulated within an open structure like XML.

    2. Re:Yes it could be grounds. by Malcontent · · Score: 2

      "I don't trust the business world to police itself enough to propose a true laissez-faire system, but neither do I think it's a good idea to give government unlimited power to meddle in business affairs."

      But the govt does not have unlimited power. There has to be a trial and endless appeals first. Besides MS had 50 billion to spend on this trial which is much less than the budget the justice dept allocated for it.

      --

      War is necrophilia.

    3. Re:Yes it could be grounds. by runderwo · · Score: 2
      Why are AC's invariably clueless idiots?
      So MS was found to have a monopoly on i86 IBM Clone PCs, whoop dee do.
      Having a monopoly is not illegal. Using anti-competitive practices and product tying when you are a monopoly is illegal.
      MS has triumphed on 99% of everything since because all the judges see that there is in fact NO CASE.
      No, in fact, Jackson threw the book at them, and they got off on a technicality (he blabbed to the press before the case was over).
      Suing for inclusion of IE into Windows for free? Well Sue ALL of Open source for being free then!
      This statement is so idiotic, I don't know where to begin. First of all, Internet Explorer was originally a separate product. Microsoft then tied it to Windows when they saw that Netscape was remaining dominant in the browser market.

      Second, "Open Source" is not a monopoly. If, for instance, Red Hat somehow gained monopoly status in the next few years, then they would be subject to the same rules that MS was subject to, in that they can't take separate products and bundle them with the monopoly product.

      You are obviously giving away something of value in order to gain market share and destroy a competitor. THAT IS ILLEGAL, even when you don't have a "monopoly".
      Are you kidding? Businesses give stuff away for free all the time, and it's not illegal in the slightest. Unless you are a monopoly, in which case different rules apply; the reason is to prevent horizontal expansion, and to prevent the monopoly from erecting barriers to entry in the monopolized market.

      A monopoly, by definition, has no need to gain market share; and so anti-competitive acts and other things that were fair business when they weren't a monopoly are no longer fair, because they have no reason to use them except to maintain their monopoly status. And that's the whole point of antitrust law, so that monopolies are not indefinitely maintained.

      I find that the O.S. and Linux crowd in general to have far LESS integrity than MS has ever shown. You and your post are further proof of that.
      You have proven yourself to be such an idiot that I doubt anybody could care less what you think.

      Yes, IHBT, whatever.

    4. Re:Yes it could be grounds. by Arandir · · Score: 2

      There is no magical 'Free Market Fairy' that is going to come along and restore health to the industry.

      You're right, in the face of monopoly, only the ultimate monopoly can save us. After all, Linux didn't appear out of thin air, it was the government that created it! Linus Torvalds was sitting at home whining about DOS before the all powerful state stepped in and wrote an OS for him...

      --
      A Government Is a Body of People, Usually Notably Ungoverned
    5. Re:Yes it could be grounds. by runderwo · · Score: 2
      Or maybe MS thought that browsers were so useful that users would want a browser with their OS?
      You can bundle a product without tying it. Contrast the following scenarios for me, please:
      • distributing a copy of Internet Explorer for free, or free download, with each copy of Windows sold
      • integrating Internet Explorer into the very guts of Windows so that it would be very difficult if not impossible to remove, and present onerous licensing terms to OEMs that prevent them from shipping any other browser on Windows systems

      Guess which one's legal? Guess which one MS did?

  48. Draw you Own Conclusions by Alien54 · · Score: 5, Funny
    well, tongue in cheek

    the Love Calculator demonstrates that

    Draw your own conclusions. cute little widget.

    --
    "It is a greater offense to steal men's labor, than their clothes"
    1. Re:Draw you Own Conclusions by Anonymous Coward · · Score: 2, Funny

      Or the 47% "penis loves vagina." Its description isn't terribly promising for the human race!

    2. Re:Draw you Own Conclusions by EschewObfuscation · · Score: 2
      On a hunch I also discovered:
      • Microsoft Loves Virus 97%, and vice versa
      --

      (email addr is at acm, not mca)
      We are Number One. All others are Number Two, or lower.
      --The Sphinx
    3. Re:Draw you Own Conclusions by Alien54 · · Score: 2
      These are the results of the calculations by Dr. Love:
      Bill Gates Free Software Foundation 95 %

      Well, this is based on numerology somehow ... The usual industrial size grain of salt applies.

      But maybe it means that Bill Gates is desperately fighting against his inner geek, who would really love Free Software, etc.

      --
      "It is a greater offense to steal men's labor, than their clothes"
    4. Re:Draw you Own Conclusions by Alien54 · · Score: 2
      I'm not gonna comment on the 84% Love Calculator gave "Richard Stallman loves Microsoft".

      What you want is Richard Stallman loves Steve Ballmer 26%

      --
      "It is a greater offense to steal men's labor, than their clothes"
    5. Re:Draw you Own Conclusions by sharkey · · Score: 2

      Bill Gates Loves Janet Reno 92%, giving them a good chance of having a good relationship.

      On the other hand, Bill Gates Loves Spuds Mackenzie 63%. Maybe Bill is a closet necrophile who's obsessed with small dogs?

      --
      "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
    6. Re:Draw you Own Conclusions by Alien54 · · Score: 2
      Also, for some reason the following query bugs out completely: http://www.lovecalculator.com/love.php?name1=Microsoft+Office&name2=a+retarded+sewer+rat

      The limit is three words per field [shrug]

      remember, this is not my widget [smile]

      --
      "It is a greater offense to steal men's labor, than their clothes"
    7. Re:Draw you Own Conclusions by alexburke · · Score: 2

      Likewise, from corporate and personal perspectives, the Love Calculator is [right on the money|pretty accurate|crazy|WTF?].

    8. Re:Draw you Own Conclusions by nicodaemos · · Score: 2

      Dr. Love thinks that a relationship between bill gates and osama bin laden has a very good chance of being successful

      In fact he gives it an 89% chance.

      Makes a lot of sense to me. They're both doing their part to kill corporate America. Bin Laden with bombs and Bill G with his monopoly. One has to wonder who is winning between them.

  49. Microsoft opening? Naw... Waiting for Palladium. by Anonymous Coward · · Score: 2, Interesting

    I seriously doubt that Microsoft is opening anything that they previously held private. This just isn't Microsoft's way. They've previously held .DOC, .XLS, etc private and obscured them to the point that 3rd party programs have a difficult time accurately opening them. This has worked fairly well for them, but it is also a thorn in Microsoft's side, as each new version of Office needs to stay compatible with all that legacy stuff, plus the new formats.

    What if they could scrap all that and have an easily read document format? They could tighten integration with IIS -> Office and web pages generated from saved documents, spreadsheets, etc. An XML file format can do it. This would be something MS would like to do.

    The problem is XML could be readable by anyone. Or at least it CURRENTLY could. But, what if, MS had a technology to transparently encrypt/decrypt files on the save/read? And, what if the keys to those files were then stored in a protected memory vault that only trusted apps could get to? A trusted nub could ensure that the apps weren't tampered with... You can see where this is going.

    As I understand it, with Palladium, MS could declare that the next Word format is PlainText, but documents still wouldn't be able to be opened by 3rd party software, as they aren't trusted by MS to hold the keys to decrypt the data files.

    It's a win/win for Microsoft. They get to dump legacy code and create something simpler, while gaining greater control over how people use their own files. It's a win/lose for the consumer, though. They'll get new functionality if they stay all Microsoft, but will be locked into an all/nothing choice of whether they choose the MS route, or not.

    THAT, to me, sounds like a typical MS business plan.

  50. Even grep replacing doesn't help by burgburgburg · · Score: 5, Informative
    Word HTML output was always atrocious. It failed at everything from correct tag order (as is shown above) to properly quoting parameters (sometimes it uses ", sometimes it uses ', sometimes nothing). It also nests multiple tags, all with different styles, one after another (actual example below):
    <b style='mso-bidi-font-weight:normal'><i style='mso-bidi-font-style:normal'><span
    style='f ont-size:12.0pt;mso-bidi-font-size:10.0pt;font-fam ily:Arial;mso-fareast-font-family:
    "Times New Roman";mso-bidi-font-family:"Times New Roman";color:black;
    mso-ansi-language:EN-US;mso-f areast-language:EN-US;mso-bidi-language:AR-SA'><br
    clear=all style='page-break-before:right;mso-break-type:sect ion-break'>
    </span></i></b>

    Even with grep replace tools, cleaning up this crap takes hours.
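
    For what it's worth, the mso-* junk in an example like the one above can be stripped mechanically. Here is a minimal Python sketch, assuming you only want to drop the mso-* declarations and any style attribute left empty afterwards. The names strip_mso, MSO_DECL and STYLE_ATTR are made up for illustration. It assumes the stray spaces in the pasted markup are removed first, and a regex pass like this does nothing about bad tag order or unquoted attributes.

```python
import re

# One "mso-..." declaration inside a style value, e.g. mso-bidi-font-weight:normal;
# The value is either a double-quoted string or a run of characters up to the next ";".
MSO_DECL = re.compile(
    r"""(?<![\w-])mso-[a-z0-9-]+\s*:\s*(?:"[^"]*"|[^;'"]+)\s*;?\s*""",
    re.IGNORECASE,
)

# A style attribute quoted with either ' or " (unquoted attributes are not handled).
STYLE_ATTR = re.compile(r"""\sstyle\s*=\s*(['"])(.*?)\1""", re.IGNORECASE | re.DOTALL)


def strip_mso(html: str) -> str:
    """Remove mso-* declarations; drop style attributes that end up empty."""

    def clean(match: re.Match) -> str:
        quote, body = match.group(1), match.group(2)
        kept = MSO_DECL.sub("", body).strip().rstrip(";").strip()
        return f" style={quote}{kept}{quote}" if kept else ""

    return STYLE_ATTR.sub(clean, html)


EXAMPLE = (
    "<b style='mso-bidi-font-weight:normal'>"
    "<span style='font-size:12.0pt;mso-bidi-font-size:10.0pt;font-family:Arial;"
    "mso-fareast-font-family:\"Times New Roman\";color:black'>x</span></b>"
)
```

    Calling strip_mso(EXAMPLE) leaves only the ordinary CSS properties on the span and removes the style attribute from the b tag entirely. Anything heavier than that is probably better handled with a real HTML parser than with regexes.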

    1. Re:Even grep replacing doesn't help by HiThere · · Score: 2

      Did you insert those spaces, or did MSWord, or did SlashDot?

      At first I thought that it was impossibly bad, and then I remembered that slashdot filter.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    2. Re:Even grep replacing doesn't help by sgarrity · · Score: 2, Informative

      I use this Word HTML cleaner web service. Works well. Drop a penny in the paypal bucket if you like it.

    3. Re:Even grep replacing doesn't help by kazad · · Score: 2, Informative

      Dreamweaver has a "clean up word html" option. But then again, another proprietary solution.

  51. Re:"Could this be grounds for another lawsuit?" WT by Wakko+Warner · · Score: 3, Insightful

    XML, as a language spec, is most certainly open. It's what you do with the spec that makes it closed. C is also an open spec, but if I write a program in C, I'm by no means obligated to give everyone the source code to it (despite what some people here insist is the "right thing to do" in all cases.)

    - A.P.

    --
    "Remember when the U.S. had a drug problem, and then we declared a War On Drugs, and now you can't buy drugs anymore?"
  52. Could new .XML doc format be LESS open than .DOC? by NetShadow · · Score: 2, Insightful

    One thing that nobody seems to have considered yet is the possibility that, not only might this new XML Word Document format not be "open" as currently being assumed and touted, but it might be less open than the binary junk that Word spits out now.

    It seems from the context of the quotes in the article, Microsoft is very much concerned about how interoperable Word documents are now that they have been reverse-engineered and implemented from scratch in OpenOffice / StarOffice, WordPerfect, etc. .DOC is too open (meaning well-understood with a large base of source code to process it). They have stated as much in the article. MS Office is now becoming "just another Office Suite, same as the rest." They want Word to be "less of a commodity".

    Here's my theory:

    Besides value-added features, such as the internet calendar and workgroup features that have been dropped, the best way to achieve this differentiation would be to engineer an incompatible default format (an obfuscated XML DTD or binary encoding format) for new Word documents, leverage their massive installed base of desktop users, and fire up the good-ole FUD-o-matic 9000...

    Boom! Office 11 Ships, creating new, incompatible format with new, incompatible documents floating around the LAN, marginalizing the use of Open Source / "fringe" Office software.

    MS FUD: "But Open Source / Free Software Word Processors just don't work properly with the cutting-edge features of Office 11!". "They don't have the new whiz-bang features like 'Enhanced' XML, which Office depends on."

    No, Mr. Hacker, you can't use Open Office. The company policy is for everyone to use Microsoft Word, because we want everyone to be able to read everyone's documents. By the time the OSS hackers completely reverse engineer the file format, the damage will have been done. And the few compatibility glitches that remain in OSS Office Software will be more fuel for the FUD fire, emphasising how buggy open source software is, and how Microsoft is the best choice for 100% correct display and authoring of Word Documents for your MS Office-Run Business.

    And until Office 11 ships and they're ready to roll with this new spin, they can take advantage of the hype regarding XML and how wonderful their new file-format will be, see, this Open Office package isn't so special! We can do you one better! XML is designed to be Open, see?

    Then, in reality, the new document format will be more closed to us, because we don't know how to read it. Trust me, they won't make it easy. They gain too much by closing up the new format and throwing away the key, profiting from the time it takes to pick and chisel away at the locks.

    --
    NetShadow
  53. Closed file formats are worse than closed apps by Anonymous Coward · · Score: 2, Insightful

    Business and personal users are starting to wake up to the fact that storing valuable, durable information and knowledge in proprietary file formats is not a good idea. Internet formats and communication standards illustrate the power of widely-adopted technical standards well. Business documents, technical documents, personal records, photographs, music, movies -- anything that may be of value and interest in the unforeseeable future must be stored in an open format to retain that value.

    I think this is a more compelling "pitch" for open source than the usual line of "if you can't get the source you can't fix the bugs".

  54. Doh! Ah here is the code above was the output by twistedemotions · · Score: 2, Interesting

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?mso-application progid="Word.Document"?>
    <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/ 2002/8/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibra ry/2002/8/core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instan ce" xmlns:aml="http://schemas.microsoft.com/aml/2001/c ore" xmlns:wx="http://schemas.microsoft.com/office/word /2002/8/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xml:space="preserve"><w:docInfo><w:tit le w:val="The dog ran up the hill"/><w:author w:val="Peter James Templeton"/><w:template w:val="Normal.dot"/><w:lastAuthor w:val="Peter James Templeton"/><w:revision w:val="1"/><w:appName w:val="Microsoft Word 11.0"/><w:totalEdit w:val="1"/><w:created w:val="2002-12-19T21:50:00Z"/><w:lastSaved w:val="2002-12-19T21:51:00Z"/><w:pages w:val="1" wx:estimate="true"/><w:words w:val="13" wx:estimate="true"/><w:characters w:val="80" wx:estimate="true"/><w:lines w:val="1" wx:estimate="true"/><w:paras w:val="1" wx:estimate="true"/><w:charactersWithSpaces w:val="92" wx:estimate="true"/><w:version w:val="11.4523"/></w:docInfo><w:docPr><w:vie w w:val="normal"/><w:zoom w:percent="175"/><w:doNotEmbedSystemFonts/><w:proo fState w:spelling="clean" w:grammar="clean"/><w:documentProtection/><w:defau ltTabStop w:val="720"/><w:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"/><w:summaryLength w:val="0"/><w:punctuationKerning/><w:characterSpac ingControl w:val="DontCompress"/><w:optimizeForBrowser/><w:va lidateAgainstSchema/><w:saveInvalidXML w:val="off"/><w:compat><w:breakWrappedTables/><w:s napToGridInCell/><w:wrapTextWithPunct/><w:useAsian BreakRules/></w:compat></w:docPr><w:fonts><w:fo nt w:name="Wingdings"><w:panose-1 w:val="05000000000000000000"/><w:charset w:val="2"/><w:family w:val="Auto"/><w:pitch w:val="variable"/><w:sig w:usb-0="00000000" w:usb-1="10000000" w:usb-2="00000000" w:usb-3="00000000" 
w:csb-0="80000000" w:csb-1="00000000"/></w:font></w:fonts><w:lists><w :listDef w:listDefId="0"><w:lsid w:val="47EF5BD8"/><w:plt w:val="HybridMultilevel"/><w:tmpl w:val="7EE46F94"/><w:lvl w:ilvl="0" w:tplc="04090001"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="h"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="720"/></w:tabs><w:ind w:left="720" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Symbol" w:h-ansi="Symbol" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="1" w:tplc="04090003" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="o"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="1440"/></w:tabs><w:ind w:left="1440" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Courier New" w:h-ansi="Courier New" w:cs="Courier New" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="2" w:tplc="04090005" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="X"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="2160"/></w:tabs><w:ind w:left="2160" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Wingdings" w:h-ansi="Wingdings" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="3" w:tplc="04090001" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="h"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="2880"/></w:tabs><w:ind w:left="2880" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Symbol" w:h-ansi="Symbol" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="4" w:tplc="04090003" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="o"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="3600"/></w:tabs><w:ind w:left="3600" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Courier New" w:h-ansi="Courier New" w:cs="Courier New" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="5" w:tplc="04090005" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="X"/><w:lvlJc 
w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="4320"/></w:tabs><w:ind w:left="4320" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Wingdings" w:h-ansi="Wingdings" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="6" w:tplc="04090001" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="h"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="5040"/></w:tabs><w:ind w:left="5040" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Symbol" w:h-ansi="Symbol" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="7" w:tplc="04090003" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="o"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="5760"/></w:tabs><w:ind w:left="5760" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Courier New" w:h-ansi="Courier New" w:cs="Courier New" w:hint="default"/></w:rPr></w:lvl><w:l vl w:ilvl="8" w:tplc="04090005" w:tentative="on"><w:start w:val="1"/><w:nfc w:val="23"/><w:lvlText w:val="X"/><w:lvlJc w:val="left"/><w:pPr><w:tabs><w:tab w:val="list" w:pos="6480"/></w:tabs><w:ind w:left="6480" w:hanging="360"/></w:pPr><w:rPr><w:rFonts w:ascii="Wingdings" w:h-ansi="Wingdings" w:hint="default"/></w:rPr></w:lvl></w:listDef><w:l ist w:ilfo="1"><w:ilst w:val="0"/></w:list></w:lists><w:styles><w:version OfBuiltInStylenames w:val="3"/><w:style w:type="paragraph" w:default="on" w:styleId="Normal"><w:name w:val="Normal"/><w:rPr><wx:font wx:val="Times New Roman"/><w:sz w:val="24"/><w:sz-cs w:val="24"/><w:lang w:val="EN-US" w:fareast="EN-US" w:bidi="AR-SA"/></w:rPr></w:style><w:styl e w:type="character" w:default="on" w:styleId="DefaultParagraphFont"><w:name w:val="Default Paragraph Font"/><w:semiHidden/></w:style><w:sty le w:type="table" w:default="on" w:styleId="TableNormal"><w:name w:val="Normal Table"/><wx:uiName wx:val="Table Normal"/><w:semiHidden/><w:rPr><wx:fon t wx:val="Times New Roman"/></w:rPr><w:tblPr><w:tblI nd w:w="0" w:type="dxa"/><w:tblCellMar><w:top w:w="0" 
w:type="dxa"/><w:left w:w="108" w:type="dxa"/><w:bottom w:w="0" w:type="dxa"/><w:right w:w="108" w:type="dxa"/></w:tblCellMar></w:tblPr></w:style>< w:style w:type="list" w:default="on" w:styleId="NoList"><w:name w:val="No List"/><w:semiHidden/></w:style></w:styles><w:body ><wx:sect><w:p><w:r><w:t>T he dog ran up the hill</w:t></w:r></w:p><w:p><w:pPr><w:rPr><w:b/></w :rPr></w:pPr><w:r><w:rPr><w:b/></w:rPr><w:t>The dog ran up the hill</w:t></w:r></w:p><w:p><w:pPr><w:rPr><w:i/></w :rPr></w:pPr><w:r><w:rPr><w:i/></w:rPr><w:t>The dog ran up the hill</w:t></w:r></w:p><w:p><w:pPr><w:listPr><w:ilv l w:val="0"/><w:ilfo w:val="1"/><wx:t wx:val="P"/><wx:font wx:val="Symbol"/></w:listPr><w:rPr><w:i/></w:rPr>< /w:pPr><w:r><w:rPr><w:i/></w:rPr><w:t>Dog</w:t></w :r></w:p><w:p><w:pPr><w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="1"/><wx:t wx:val="P"/><wx:font wx:val="Symbol"/></w:listPr><w:rPr><w:i/></w:rPr>< /w:pPr><w:r><w:rPr><w:i/></w:rPr><w:t>Ran</w:t></w :r></w:p><w:p><w:pPr><w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="1"/><wx:t wx:val="P"/><wx:font wx:val="Symbol"/></w:listPr><w:rPr><w:i/></w:rPr>< /w:pPr><w:r><w:rPr><w:i/></w:rPr><w:t>Up</w:t></w: r></w:p><w:p><w:pPr><w:listPr><w:il vl w:val="0"/><w:ilfo w:val="1"/><wx:t wx:val="P"/><wx:font wx:val="Symbol"/></w:listPr><w:rPr><w:i/></w:rPr>< /w:pPr><w:r><w:rPr><w:i/></w:rPr><w:t>The</w:t></w :r></w:p><w:p><w:pPr><w:listPr><w:ilvl w:val="0"/><w:ilfo w:val="1"/><wx:t wx:val="P"/><wx:font wx:val="Symbol"/></w:listPr><w:rPr><w:i/></w:rPr>< /w:pPr><w:r><w:rPr><w:i/></w:rPr><w:t>Hill</w:t></ w:r></w:p><w:p><w:pPr><w:rPr><w:i/></w:rPr></w:pPr ></w:p><w:sectPr><w:footnotePr><w:p os w:val="page-bottom"/></w:footnotePr><w:endnotePr>< w:pos w:val="doc-end"/><w:numFmt w:val="lower-roman"/></w:endnotePr><w:typ e w:val="next-page"/><w:pgSz w:w="12240" w:h="15840" w:orient="portrait"/><w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800" w:header="720" w:footer="720" w:gutter="0"/><w:noEndnote w:val="off"/><w:docGrid 
w:line-pitch="360"/></w:sectPr></wx:sect></w:body> </w:wordDocument>

    1. Re:Doh! Ah here is the code above was the output by JebusIsLord · · Score: 2

      So let's look at this... the namespaces are all published in clear text as urls, no binary data is apparent, and no dtd or schema is even used. It is also well-formed using the W3C's own validator. In short, I see no problems at all, especially when saved as an xml doc and viewed in IE where it formats it all pretty.

      --
      Jeremy
    2. Re:Doh! Ah here is the code above was the output by JebusIsLord · · Score: 2

      I know it doesn't NEED to have urls as namespace declarations, but it does which is nice - especially if those eventually actually point to real documents (i assume they dont right now). By validation I don't mean xhtml validation obviously, i mean it is simply well-formed xml.
      Oh and the original poster was trying to make a show of how hard the thing is to read, and my point regarding formatting was just that he dumped it all as a block there, which can be easily sorted out if for instance you viewed it as an xml document in IE. Relax dude, I know its not html :)
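
      If you don't have IE handy, the same two checks (is it well-formed, and can it be laid out readably) take a few lines of Python with the standard library. This is only a sketch under assumptions: SAMPLE is a tiny hand-made document in the style of the dump above, not the real file. The namespace URI is copied from the dump with the stray space removed, on the assumption that the space is a Slashdot artifact. The function names are made up for illustration.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Namespace URI as shown in the dump, minus the stray space (assumed to be a Slashdot artifact).
W_NS = "http://schemas.microsoft.com/office/word/2002/8/wordml"

# Hand-made stand-in for the real document, kept tiny on purpose.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="no"?>'
    f'<w:wordDocument xmlns:w="{W_NS}">'
    "<w:body>"
    "<w:p><w:r><w:t>The dog ran up the hill</w:t></w:r></w:p>"
    "<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>The dog ran up the hill</w:t></w:r></w:p>"
    "</w:body></w:wordDocument>"
)


def is_well_formed(text: str) -> bool:
    """True if the text parses as XML; this checks well-formedness only, not validity."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def paragraphs(text: str) -> list:
    """Collect the text of every w:t element, grouped by enclosing w:p."""
    root = ET.fromstring(text.encode("utf-8"))
    return [
        "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
        for p in root.iter(f"{{{W_NS}}}p")
    ]


def pretty(text: str) -> str:
    """Re-indent a one-line dump so a human can read it."""
    return xml.dom.minidom.parseString(text.encode("utf-8")).toprettyxml(indent="  ")
```

      Printing pretty(SAMPLE) gives one element per line, and paragraphs(SAMPLE) pulls the two sentences back out. None of this tells you what the elements mean, which is the part that still needs documentation.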

      --
      Jeremy
  55. XML dialect - Say it with me. by Wolfier · · Score: 2

    Urx Earm Alloa diaolig!!

  56. XML..... by Tsali · · Score: 2

    Did they mention Extensible Markup Language in the article or could it be one of these???

    - eXtra Money Language
    - eXtremely Microsoft Language
    - eXtra MuddLed.
    - eXtraneous Markup Language
    - eXtrapolated Modded Licensing
    - XBox Machine Language
    - XDocs Monopoly Language

    Can someone clear this up? I don't have the time to tinker with the whole "reading articles" concept.

    --
    This space for rent.
  57. exactly by ink · · Score: 3, Informative

    I wish I had some mod points for you; that's exactly what Microsoft means when they say that their documents are saved using XML. They include Win32 class-ID objects all over the place.

    --
    The wheel is turning, but the hamster is dead.
  58. Um... by RomSteady · · Score: 2
    Let me see...Office 11 is in beta. Microsoft often makes tweaks of file formats and internal structures for their products up to the last minute. The beta is only in use at a limited number of sites, and is merely a tech beta, not even a feature beta. Documentation for products isn't even ready for final tech review until 16 weeks prior to the product being done. It's very possible that the documentation that they're asking for is either 1) not done, or 2) done, but in an internal spec document that is subject to change.

    I'd say wait and see what happens at release. Anything developed off of assumptions made based on the current state of the product will most likely be broken at release anyway. If it isn't released at ship time, then worry. Until then, it's kind of pointless to ask for the stuff.

    --
    RomSteady - I came, I saw, I tested. GamerTag: RomSteady / http://www.romsteady.net
  59. I can't believe it... by fudgefactor7 · · Score: 2

    From the snippet: "But there's a catch: It has yet to disclose the underlying XML dialect.'"

    Just because the XML dialect isn't readily available people are already assuming MS will not make it open? Got news for ya, Office 11 is still in beta, that means things may still change. And as you all know, MS publishes an absolute shitload once they set their mind to it.

    So, chill out a little, will ya? Wait for it, then bitch when it doesn't appear. It's almost like you guys are new at complaining, or something.

  60. In other news by quintessent · · Score: 2

    Bill Gates paused in a grocery store line to let someone in front of him. We're not sure what he was up to, but we noticed he didn't let any other customers in front of him. This could be a deliberate attempt at gaining another monopoly in yet another critical area, and we're pretty sure it has to do with cash register printers and XML. Could this be the Achilles heel that brings down the giant in the courts? Citizens arm and unite!

  61. /. inserted the spaces but look at the rest! by burgburgburg · · Score: 2

    Just take a look at the rest of it, without the spaces. Or copy to your editor and remove them yourself. It's still ridiculous, atrocious and pathetic.

    1. Re:/. inserted the spaces but look at the rest! by HiThere · · Score: 2

      I assume you mean that slashdot inserted the spaces.

      (Yes, I agree that it's quite bad. But there's a significant difference between quite bad and impossibly bad.)

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
  62. Re:"Could this be grounds for another lawsuit?" WT by Danse · · Score: 2

    That's the problem. Microsoft is not obligated to release the info on their format. As a convicted monopolist, they should be. This is yet another example of just how poorly our judicial system handles this kind of case. By the time you can prove something and get through all the appeals, it doesn't fucking matter anymore!

    --
    It's not enough to bash in heads, you've got to bash in minds. - Captain Hammer
  63. has the RTF spec been kept up to date? by jlusk4 · · Score: 2

    Has the rtf spec been kept up to date as Word doc formats have changed?

    I had the feeling the existing spec was old and outdated.

    1. Re:has the RTF spec been kept up to date? by MadFarmAnimalz · · Score: 2

      I don't know if the spec has been kept up to date or any such thing, but it does occur to me that it is not a real alternative to begin with; you lose formatting features, most notable to me being the lack of footnotes and end notes, along with a million other things.

      Yes, these limitations could be overcome by changing the way you structure your document, but it is just easier to go back to saving in .doc.

      This is what I personally expect will sink MS Office's XML feature: the loss of formatting features.

      --
      Blearf. Blearf, I say.
  64. Why Not Wait Till Word 11 Ships? by Flamesplash · · Score: 2

    As far as I can tell Word 11 hasn't shipped, so why is it so bad that they haven't given info about an aspect of a currently unavailable product? It's like worrying a date will dump you and then yelling at them without actually knowing.

    "How dare you dump me"
    "huh? what are you talking about."

    Patience Is.

    The software maker says it plans to disclose additional information on Office 11's XML schemas, possibly when the update ships next spring.

    Sounds to me like they plan on telling people when the functionality is actually usable. While it may not be the "ideal" timeline for some, I see no problem with it. You get the functionality, you get the outline of the XML.

    Maybe I missed something in the article, maybe Word 11 has been out for a while already, if I have I apologize.

    --
    "Not knowing when the dawn will come, I open every door." - Emily Dickinson
  65. It will be easier by PineHall · · Score: 2

    If Microsoft keeps its schemas proprietary, reading the XML output will still make it easier to work out a schema for yourself than reverse-engineering the binary Word format is now. But people likely will still want to save in the default Word format, instead of XML. Hmm, using the XML output may make it easier to decode the Word format. That would be nice. Microsoft is not going to give away its advantage, but I think they are confident enough of their market share to let things become just a little easier for the other word processors.

  66. maybe they havent released the specification..... by InnovATIONS · · Score: 2
    Because it isn't FINISHED? After all we are speaking about Office 11, a product that is not itself released either. I am at least willing to reserve judgement until I see what the thing really is. I don't expect to be really surprised, but I could be wrong.

    And bear in mind that XML docs are not likely to be simple because word processing documents are not simple. People grouse about Word HTML docs but most of that complexity was necessary to create an HTML document that actually looked like the original Word document. XML docs are unlikely to be all that concise because users are going to be unwilling to sacrifice layout and formatting features just in order to have the resulting document be pretty-looking XML.

    You could create a word processor that simplified and structured its features toward creating nicely structured HTML but then it would be FrontPage and not Word.

  67. It would be the first time they documented anythin by RockyJSquirel · · Score: 2

    Not only was RTF never fully documented, but different versions of Word had incompatible RTF readers.

    If you examine an RTF file you'll notice all kinds of redundant codes that are put in to cope with incompatible MS Word versions.

    Fully design, fully document a protocol, Microsoft?
    I just spit out my drink.

    Rocky J. Squirrel

  68. Re:Could new .XML doc format be LESS open than .DO by AnyoneEB · · Score: 2, Informative

    Someone will end up with a leaked alpha or beta copy of Office 11 and start working on the file format. Whether they will be able to figure it out fast enough is the question. It's possible, but if it's not done completely enough by Office 11's release, what you describe will happen. Someone else said that Microsoft won't change .doc anymore, partially because Google supports returning .doc's in search results... of course that just requires stripping all formatting, which would probably be pretty easy.

    --
    Centralization breaks the internet.
  69. What I Expect by ChristopherLord · · Score: 2, Insightful

    What I am hoping for (and expecting) in this new format is something like XSL-FO plus binary sections for ActiveX controls, etc.

    For the 5 or so posters saying this will be something like:

    <data>
    ASdfksjdfFjfjAAASADFfddfds==
    </data>
    I highly doubt it. They are on record in several places as saying they want these new files to be indexable and parsable with standard tools, and base64-encoded blocks, I am sorry to say, are not indexable. But of course embeddable objects will probably be forced to manifest this way.

    Regarding the claims that this will be like their horrid HTML implementation, I think it is clear you've not done much work with XML. Either a document is well-formed or it is not. If it's not well-formed, a conforming parser will simply reject the file (unlike HTML, where browsers just deal with the problems). If a document is well-formed, there should be no tool that doesn't properly load and parse it into the DOM, unless the tool is somehow broken!
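
    For what it's worth, here is a minimal sketch of that "reject or accept" behaviour, using Python's standard `xml.etree.ElementTree`. The sample strings and the helper name `is_well_formed` are made up for illustration. Note that this parser only checks well-formedness; validity against a DTD or schema is a separate check that it does not perform.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Unlike an HTML browser, the XML parser refuses to guess.
        return False


# Hypothetical samples: the second one closes its tags in the wrong order,
# which a typical HTML browser would tolerate.
good = "<doc><p>Hello <b>World</b></p></doc>"
bad = "<doc><p>Hello <b>World</p></b></doc>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

    The point is only that there is no "tag soup" mode here: the mis-nested sample is refused outright rather than repaired.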

    The question for me is how well they implement content-presentation separation. Will there be a 'Word 11 XSL file' with the actual content of the file separated nicely into tags like

    <SectionHeader>Resume</SectionHeader>
    or will the style and content be mashed together like so:
    <font size="50pt">Resume</font>
    This is the question I want answered more than anything, and I can't wait to see which way they go with it. If everything is separated nicely, we may just have an excellent source for user-produced well-formed XML documents which can be integrated into XML-based content management systems with PDF-based presentation and HTML previews, etc.
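
    To make the separation concrete, a rough sketch in Python: the content lives in semantic tags, and a separate style mapping decides how it looks. Everything here is an assumption for illustration: the `Document` and `Paragraph` element names, the `STYLES` table, and `render_html` are invented, and a plain dictionary stands in for what a real XSL stylesheet would do.

```python
import xml.etree.ElementTree as ET
from html import escape

# Content only: no fonts or sizes anywhere in the document itself.
SOURCE = (
    "<Document>"
    "<SectionHeader>Resume</SectionHeader>"
    "<Paragraph>Ten years of XML &amp; SGML.</Paragraph>"
    "</Document>"
)

# Presentation only: which HTML element each semantic tag maps to.
STYLES = {"SectionHeader": "h2", "Paragraph": "p"}


def render_html(xml_text: str, styles: dict) -> str:
    """Render each top-level child using the tag chosen by the style table."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        tag = styles.get(child.tag, "div")  # unknown elements fall back to div
        parts.append(f"<{tag}>{escape(child.text or '')}</{tag}>")
    return "\n".join(parts)


print(render_html(SOURCE, STYLES))
```

    Swapping in a different `STYLES` table changes the output without touching the document, which is exactly what you lose once the file says `<font size="50pt">` instead.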
  70. XML is as XML does by roffe · · Score: 2

    I think I'll just point to something I wrote a long time ago, at the time Microsoft first announced XML support but before the US Courts gave Microsoft unlimited license to do as they damn well please.

    --
    -- Rolf Lindgren, cand.psychol
  71. it has to be the default by g4dget · · Score: 2
    You are (presumably) not a convicted monopolist, so you can do whatever you want to when it comes to file formats. But Microsoft is a convicted monopolist, and it is proper for the government to tell them what format to save their files in, in particular when their choice of format is one of the principal ways by which they are able to maintain their monopoly.

    As for making it the default, if it isn't the default, it won't work. Not only do most users not understand how to save in other file formats, if it isn't the default, it probably will be too buggy to be used. None of the non-Microsoft formats in Word, PowerPoint, or Excel are really usable for day-to-day use because they lose formatting or worse.

  72. MIRROR: Original XML by gazbo · · Score: 2, Informative
    I've mirrored the actual xml file that has not been mutilated by slashcode policies.

    Look here using a browser that will display the raw xml nicely formatted - IE works fine, supposedly Mozilla does too but I can't seem to get it to work; it parses the file and just displays the text.

    Shame this is all so hidden away in the story.

    1. Re:MIRROR: Original XML by Reziac · · Score: 2

      Well, that's weird... NS3 thinks XML saved from OfficeXP locally is fine and displays it correctly, but thought this sample page was plaintext. Must be something different in the tags somewhere.

      --
      ~REZ~ #43301. Who'd fake being me anyway?
  73. Linux? by The+Pi-Guy · · Score: 3, Funny

    Dr. Love thinks that a relationship between microsoft and linux has a reasonable chance of working out, but on the other hand, it might not. Your relationship may suffer good and bad times. If things might not be working out as you would like them to, do not hesitate to talk about it with the person involved. Spend time together, talk with each other.

    D'oh!

  74. Yes, we should. by g4dget · · Score: 2

    The rules are indeed different for convicted monopolists, or even companies that dominate a market. It's OK for you to release a broken version of XML for your office suite, it's not necessarily OK for Microsoft.

  75. Re:They Have Too by Malcontent · · Score: 2

    Then why even bother with XML? Why pretend that it's some kind of an open format? Why not just stick with the proprietary format they have now?

    --

    War is necrophilia.

  76. It's XML, get over it. by Ankh · · Score: 5, Informative

    Wow, what a lot of false information. Maybe this will help a little. Disclaimer: I am XML Activity Lead at W3C, so I have a bias.

    The new Visio is using SVG.

    The new Word lets you use any XML vocabulary you like. How obfuscated it is is *entirely* up to you.

    It's not using base64 to put binary proprietary data into XML documents. It's using plain XML.

    It's well-formed, and Word appears not to make up thousands of elements. The person in charge of this project is actually clueful, and was in the W3C XML Working Group (1996-1998 by the way).

    The tools all use XSLT extensively.

    It wouldn't surprise me if you could get Word to read and write the OpenOffice format just fine. There's a restriction that you can't re-order content in Word right now, I think.

    People claiming to have "insider info" and then posting blatant falsehoods, or claiming you can put binary data directly in XML, aren't helping here. Even if you get high from hating Microsoft, the open source community and Free software world need to understand that the goalposts have moved a little.

    The extent of corporate assets tied up in memos, reports and other documents is very large, massively higher than the collective value of relational databases.

    Yes, it looks as if Microsoft has suddenly discovered XML just as they suddenly discovered the Web. In fact, they were involved heavily in XML from the start, were among the first to ship commercial support for XML, and have been working on XML in Office 11 for a long time.

    --
    Liam Quin

    --
    Live barefoot!
    free engravings/woodcuts
    1. Re:It's XML, get over it. by mkweise · · Score: 2, Insightful

      If they were going to use XML as the native document format, I'd be impressed. But adding it as an export format that most users probably won't even notice unless they actively look for it? That's not exactly what I call embracing the standard.

      --
      Gentlemen! You can't fight in here, this is the War Room!
  77. Humpty Dumpty by jefu · · Score: 2

    A short quote:

    'When I use a word,' Humpty Dumpty said, in a rather scornful tone, 'it means just what I choose it to mean, neither more nor less.'

    'The question is,' said Alice, 'whether you can make words mean so many different things.'

    'The question is,' said Humpty Dumpty, 'which is to be master - that's all.'

  78. I'm so fucking tired of this FUD by NineNine · · Score: 2

    All I've been reading about on Slashdot is that "the *only* reason that our company is still using Windows is because Office file formats are proprietary. We're tied to Office and Windows." Now, at least at this stage, this is the BEST possible fucking news, and everybody is still bitching. Nothing is more open than XML. That's all we know right now. Office data may be in completely open, standard XML. There's no telling what it'll look like, but there's no possible better news to hear than the Office formats may be wide open.

    Yet, everybody's still bitching. I have a feeling that what it is is that all you l33t *nix gurus are finally gonna have to put your money where your big fucking mouths are when the format is open, and you're gonna have to actually move to OSS/StarOffice, etc., and you're still looking for reasons not to.

  79. Don't get confused. by twitter · · Score: 4, Interesting
    You are goddamned fucking lucky that the government tells you what the default values for things should be. That's what the government is there for, mostly; to tell you that the default value for a building is to have a fire exit and that it may not be locked.

    Most rational specifications are for performance. The method should not matter as much as the end result. Fire codes are an extreme example, but even there the specification is flexible. The local government does not tell people how to build buildings, only that there needs to be so many exits per so many people and floor space. They don't nail you down to real specifics. Most rational specs are like the mil-specs for acrylic: it must be able to sit in the South Florida sun for one year without delaminating. How you make the thing does not matter, so long as it does what it should.

    By these rational and objective standards M$ junk generally fails. If you say that a Word doc should be legible and keep its formatting for a number of years, Word fails. The same thing can be said of all other M$ junk: it's designed to break and therefore government should reject its use anywhere records are kept. That's all public work. That's hardly engineering the document, it's simply stating the thing should work as advertised.

    All normal standards, from ASCII to WWWC are formed by professional agreement. Government intervention is not needed. Disruptive vendors are generally seen through.

    --

    Friends don't help friends install M$ junk.

  80. <word>TVqQ93JSF0ds92jJs</word> by yerricde · · Score: 2

    you can't for instance have binary-encrypted elements

    Oh yes you can: just put a doctype, then <word xmlns="http://xmlns.microsoft.net/office/11/word"> , then a block of MIME encoded data, then </word>. If not, what in the XML specification prohibits this?
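
    A quick sketch of the point, assuming Python's standard `base64` and `xml.etree.ElementTree` modules. The namespace URI is the one typed above, not a real Microsoft schema, and the payload is arbitrary bytes. It suggests that a base64 block inside an element parses as perfectly ordinary XML, while a raw control byte does not.

```python
import base64
import xml.etree.ElementTree as ET

# Namespace URI copied from the comment above; treat it as hypothetical.
NS = "http://xmlns.microsoft.net/office/11/word"

# Arbitrary binary payload standing in for an opaque document blob.
payload = bytes(range(256))
encoded = base64.b64encode(payload).decode("ascii")

# The base64 alphabet (A-Z, a-z, 0-9, +, /, =) needs no XML escaping.
document = f'<word xmlns="{NS}">{encoded}</word>'
root = ET.fromstring(document)
recovered = base64.b64decode(root.text)
print(recovered == payload)


def parses(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Raw binary is a different story: most control characters are not legal
# XML 1.0 characters, so the parser rejects the document.
print(parses("<word>\x01</word>"))
```

    So nothing in the syntax rules stops a vendor from shipping an encoded blob; what the rules stop is unencoded binary.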

    --
    Will I retire or break 10K?
  81. 1337ness by 1g$man · · Score: 3, Funny

    I guess the cool thing now is to put the tagline "Could this be grounds for another anti-trust suit against Microsoft?" on every Microsoft story, even when the context has absolutely nothing to do with anti-trust.

    Huh.

  82. Re:They Have Too by Sivar · · Score: 2

    Microsoft is switching to XML because it will become the standard data exchange format of all things .NET (other than source code, obviously), and because it is faster and simpler to parse.

    After the format wars between Office and WordPerfect--the wars to make each incompatible with the other, I have heard the Office format described as:
    "...is not just a data format. It is an entire world philosophy in and of itself. It is more complex than a space shuttle, more confusing than trying to complete the Fourier analytic proof of quadratic reciprocity."
    I've seen Office 2000 corrupt two of its own documents twice in the last two months. This may be why.

    --
    Computer Science is no more about computers than astronomy is about telescopes. --E. W. Dijkstra
  83. Word dumps RAM by crovira · · Score: 2

    Word files are RAM dumps. The memory is allocated, uh, oddly and chunks are scattered all over and over and over (because parts have been re-indexed but not yet over-written or garbage collected.)

    If you don't know the scheme, you haven't got much of a chance of reconstituting the document. Even if you DO know the scheme, it still bites. In fact that's why versions of Word files are incompatible. Not even M$ can do that properly. (Actually it's because they'd need to have redundant implementations of code to perform the same functions from the different versions. It's easier to turn that incompatibility into a marketing lever.)

    The streaming I/O performance is actually quite poor compared to that of WordPerfect. And they lock up the files so you have to use DDE or OLE to get at the actual text stream.

    --
    MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
    1. Re:Word dumps RAM by Planesdragon · · Score: 2

      Word files are RAM dumps. The memory is allocated, uh, oddly and chunks are scattered all over and over and over (because parts have been re-indexed but not yet over-written or garbage collected.)

      Can you provide a source? Not that I doubt you, I just want to see the original for myself so I'm absolutely sure that you're correct.

      It seems to me that a RAM dump would be faster to load and save than a text stream, and easier to implement version changes in, to boot. So I don't think that it's something they did to be malicious...

  84. Open as in chest wound... by mkweise · · Score: 3, Funny

    ...not as in can of worms.

    In other words they're involuntarily providing the bare minimum of interoperability that the marketplace demands. News for nerds to yawn at.

    --
    Gentlemen! You can't fight in here, this is the War Room!
  85. HTML Tidy by Pseudonymus+Bosch · · Score: 2

    You don't seem to know HTML Tidy; one of its capabilities is cleaning up Word's pseudo-HTML.

    --
    __
    Men with no respect for life must never be allowed to control the ultimate instruments of death.
    GW Bu
  86. Re:A 20 year old solution: by glenstar · · Score: 2
    To begin with, GNU will be a kernel

    Oooops! Missed the boat by a little bit there, eh Richard?

  87. Re:A 20 year old solution: by glenstar · · Score: 2
    GNU will run on them at an early date.

    Uh oh... *double* oops.

  88. Re:They Have Too by Malcontent · · Score: 2

    " Microsoft is switching to XML because it will become the standard data exchange format of all things .NET (other than source code, obviously), and because it is faster and simpler to parse."

    This makes no sense on two levels.

    One is that you are presuming that the .NET platform (or whatever the fuck it is today) is incapable of exchanging binary formats. In fact it is probably more efficient to send .doc files back and forth instead of streaming them to text and back.

    Two is that the MS-XML that Office will be using will not be interchangeable with anybody else's parser. If you are going to embed binary data into the XML document then you are going to have incompatible documents.

    --

    War is necrophilia.

  89. Re:They Have Too by Malcontent · · Score: 2

    "You'll have to ask Microsoft why they hav a suddern desire to switch everything to XML, I have no idea."

    Mmm very interesting. Either they are stupid or evil.

    Do you really think they can force everybody else to stream their version of XML into office files?

    --

    War is necrophilia.

  90. Re:"Could this be grounds for another lawsuit?" WT by Danse · · Score: 2

    But Your Honor, it makes no difference that I was convicted of raping that other woman! This is a completely different woman! I've never been convicted of raping this particular woman! Don't you see that you should give me the benefit of the doubt?

    In case anyone still doesn't understand, what I'm saying is that a company convicted of monopolizing one market should not simply be reprimanded for that one market while being allowed to monopolize another area. The judicial system is incredibly inadequate when it comes to dealing with problems like Microsoft. By the time anything gets done, it simply doesn't matter anymore and MS has found some other way to monopolize the market. Then the whole thing starts over again. We've been playing that stupid game with MS for nearly 10 years now. It's ridiculous that they continue to get away with it and have never gotten more than a slap on the wrist.

    --
    It's not enough to bash in heads, you've got to bash in minds. - Captain Hammer
  91. Re:EXCEL SAMPLE by zeugma-amp · · Score: 3, Interesting

    This displays really well as source in Phoenix .5. There is a blurb at the top that says "This XML file does not appear to have any style information associated with it. The document tree is shown below." ... Then it displays it as prettily formatted (though fairly useless) code.

    I'd like to see a clean HTML version of the same. It might make it somewhat easier to understand more or less what it is doing.

    --
    This is an ex-parrot!
  92. Forget new features. How about something stable? by vandan · · Score: 2

    We use Access 2002 as a front-end to our SQL Server / MySQL databases. Access 2002 is the most unstable product we have ever had from anyone, apart from maybe Windows 3.11. It regularly crashes and damages databases with dialog boxes saying "Microsoft apologises for the inconvenience. Would you like to send a bug report?". And once the mdb file gets bigger than about 10MB (forms and code, no data) things get really strange. Forms get corrupted and dropped. Saving changes to anything takes 5+ minutes, and often results in a crash. It really is a pile of shit. If only there were a reasonable open-source alternative that didn't require learning some obscure language like Object Pascal (for God's sake, what were they thinking).
    No upgrading for us anyway. We'll put up with this and save our money for faster machines.

  93. same as usual by jdkane · · Score: 2
    "Could this be grounds for another anti-trust suit against Microsoft?"

    No. Because in XML you are allowed to define your own application of it. Hopefully I as a developer could also create my own XML application (cryptic or not) without getting in legal troubles. Otherwise I might as well start learning a trade if the computer world is really that much of a mess.

    The move could also hamper data exchange with competing desktop productivity software that recognizes XML, such as Corel's WordPerfect or Sun Microsystems' StarOffice, say analysts and competitors.

    Just because somebody else is first to the game doesn't mean the last guy has to follow. Microsoft has always created their own standard. They will do it again. That should be of no surprise to anybody. And MS Word won't change much as a result because it is currently proprietary and most likely will continue to be.

    However I can definitely see that if Microsoft uses common XML standards that are compatible with other office suites then the underdogs might get a chance. So should we blame Microsoft if they don't do this? Microsoft is not open source, they are about the money. They have no reason to support standards and compatibility if it will hurt their bottom line. On the other hand, they might shoot themselves in the foot with such a strategy because people may not like it. Of course history hasn't taught us this lesson even though we would like to see it learned from an open source standpoint.

  94. Re:"Could this be grounds for another lawsuit?" WT by commodoresloat · · Score: 2

    Also this is not about their past criminal record on unrelated crimes; they were abusing their monopoly power in both the OS market and the Office market at the same time, and their monopoly in one area aided and abetted their monopoly in the other. These are separate crimes only by legal fiction.

  95. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  96. Re:HTML Tidy and a slight rant by commodoresloat · · Score: 2
    I have used several tools like this including tidy. There's one called demoronizer, and there's one that used to be a BBEdit plugin (I can't seem to find it now); these things all help but as someone else noted, Word's output of HTML is so screwy there are very few scripts that will really fix these things consistently. It's almost as if they went out of their way to confound all clean-up tools. How hard can it be to automatically output HTML that isn't a complete and utter nightmare? Isn't that one of the reasons we have standards in the first place? For some reason I keep thinking optimistically that MS will fix this in the next release of Word. It's like the OS X bug where dragging an icon to the left side of the screen and letting it go makes everything jump to the right (which still happens in 10.2.3). It's stupid, annoying as hell, even inexcusable, and it's probably frightfully simple to fix, yet it's ignored in every update. Of course, the OS X bug is a minor annoyance compared to the absolute drain on productivity provided by the MS bug.

    (And before anyone tells me "if you don't like it don't use it": I don't use it. I mean, Word is great for writing academic papers and all (I don't know any other office-type product that works well for people who write with a lot of footnotes, and no I don't have time to learn LaTeX, as cool as I might think that would be) but I would never think of using Word to output HTML. But the problem is that you still get documents from other people who only use Word, no matter what.)

    I tell you what, the killer app, at least for the average desktop user, would be a streamlined version of Word that only did what a word processor should do, and that automatically (and preferably seamlessly) sent other tasks to more well-designed applications for those purposes. I mean, I understand why relatively clueless people use Word for HTML, but why the hell do they try to use it for desktop publishing, for image manipulation, even for freakin' web browsing? The program shouldn't encourage such behavior by providing bad implementations of these tasks; instead it should send the task to a program that knows what it's doing.

    The craziest thing about this is that MS is in a unique position to deliver exactly such an app: they have Word already recognized as the absolute standard, they have their own desktop publisher, image manipulation tools, web design tools, and web browser. If they were willing to let go of the bloatware and open up and standardize their formats, this project would be a no-brainer. Since they won't do it, somebody else should. Apple is too committed to Word to do this (I don't think AppleWorks is taken seriously by anyone, though I could be wrong), so there really is the possibility of projects like OpenOffice or KOffice being able to deliver something like this.

  97. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  98. Re:They Have Too by Malcontent · · Score: 2

    I seriously doubt it. There is just too much pressure from OpenOffice, which is free and has a completely open XML file format. Sure, some CIOs will stick with MS Office but little by little that monopoly will fade. It will start at the small business level because they can least afford Office. As those businesses grow they will continue to use OpenOffice just out of momentum if nothing else. Also there will be a tremendous number of foreign countries which cannot afford MS Office.

    Unfortunately for MS their twin monopolies are being threatened by free competitors which are pretty damned good. Given a choice between pretty-good-and-free and better-but-expensive most rational people will choose the former.

    --

    War is necrophilia.

  99. XML != open. XML only makes *syntax* clear by divec · · Score: 3, Insightful

    Just because a file format is XML, it does not mean it's open. Even if it's "real" XML and not a wrapped binary dump (Vvjfio1@1/515...). All XML does for you is to make the *syntax* of the file format clear, not the underlying meaning. Analogously, in German, every noun begins with a capital letter, and root verb forms generally end with "-en"; this tells you a bit about the phrase "Mit grossem Bedauern haben wir vom Ableben Ihres Gatten erfahren", but it's certainly not enough to understand it.


    Even an XML schema is not enough - that just tells you which elements can appear where and what they can contain. That's like knowing that a normal German sentence has the main verb in the second position in the sentence. This still doesn't tell you the meaning of the above sentence, though you can see that "haben" is the verb and "Mit grossem Bedauern" is the first part of the sentence.


    For an XML language to be open, you need a full description of what each possible construct in that language means.
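
    To illustrate with something runnable, here is a small Python sketch. The element and attribute names in `OPAQUE` are deliberately meaningless and entirely made up; they stand in for an undocumented vendor vocabulary. A standard parser recovers the whole tree, yet nothing in that tree says what `q9` or `k` is supposed to mean.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented, undocumented vocabulary: well-formed, but semantically opaque.
OPAQUE = (
    '<x1 a7="3">'
    '<q9 k="12.0"><z4>Hello World.</z4></q9>'
    '<q9 k="2"><z4/></q9>'
    "</x1>"
)


def structure_summary(xml_text: str) -> Counter:
    """Count how often each element name occurs: pure syntax, no meaning."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


summary = structure_summary(OPAQUE)
print(summary)

# We can even read every attribute value...
for element in ET.fromstring(OPAQUE).iter("q9"):
    print(element.tag, element.attrib)
# ...but whether k="12.0" is a font size, a margin or a revision number
# is something only a description of the format can tell us.
```

    That is the German sentence problem again: you can diagram it perfectly and still not know what it says.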

    --

    perl -e 'fork||print for split//,"hahahaha"'

  100. I think you mean... by Dog+and+Pony · · Score: 2

    "Open? Sure it's open! Just click here... and *poof* your document is open. What's that? You mean you want to open it with something other than M$ Office? Why on earth would you want to do that?"

  101. Matter Of Air Superiority by Puu · · Score: 2, Informative

    The testing is sickening. But it's us or them, really.

  102. Re:TVqQ93JSF0ds92jJs by JebusIsLord · · Score: 2

    No no no, that's not my point. I agree with you there. My point is the ELEMENT still has to have plain-text tags and attributes, but the data INSIDE the element can be encoded binary. This might sound like a silly/pointless thing to say, but the fact is, an XML file containing nothing but base64 binary data is STILL parsable by anyone's text viewer, just the DATA isn't (yes I know that is the important part). This is at least a little easier to read than a pure binary file, because the binary blocks have to have some ASCII metadata attributed to them.

    --
    Jeremy
  103. Re:HTML Tidy and a slight rant by jbert · · Score: 2

    I mean, I understand why relatively clueless people use Word for HTML, but why the hell do they try to use it for desktop publishing, for image manipulation, even for freakin' web browsing?

    Not just MS users...
    *cough* emacs *cough*

    OK. (So this *could* be taken as a troll, but hey, try to see it in a jovial, festive spirit :-)

  104. Hello World by sharkey · · Score: 2


    <html xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:w="urn:schemas-microsoft-com:office:word"
    xmlns="http://www.w3.org/TR/REC-html40">

    <head>
    <meta http-equiv=Content-Type content="text/html; charset=windows-1252">
    <meta name=ProgId content=Word.Document>
    <meta name=Generator content="Microsoft Word 9">
    <meta name=Originator content="Microsoft Word 9">
    <link rel=File-List href="./Hello%20World_files/filelist.xml">
    <title>Hello World</title>
    <!--[if gte mso 9]><xml>
    <o:DocumentProperties>
    <o:Author>Seth Ramsey</o:Author>
    <o:LastAuthor>Seth Ramsey</o:LastAuthor>
    <o:Revision>1</o:Revision>
    <o:TotalTime>1</o:TotalTime>
    <o:Created>2002-12-20T13:09:00Z</o:Created>
    <o:LastSaved>2002-12-20T13:10:00Z</o:LastSaved>
    <o:Pages>1</o:Pages>
    <o:Company>Arlington/Roe &amp; Co., Inc.</o:Company>
    <o:Lines>1</o:Lines>
    <o:Paragraphs>1</o:Paragraphs>
    <o:Version>9.4402</o:Version>
    </o:DocumentProperties>
    </xml><![endif]-->
    <style>
    <!-- /* Style Definitions */
    p.MsoNormal, li.MsoNormal, div.MsoNormal
    {mso-style-parent:"";
    margin:0in;
    margin-bottom:.0001pt;
    mso-pagination:widow-orphan;
    font-size:12.0pt;
    font-family:"Times New Roman";
    mso-fareast-font-family:"Times New Roman";}
    @page Section1
    {size:8.5in 11.0in;
    margin:1.0in 1.25in 1.0in 1.25in;
    mso-header-margin:.5in;
    mso-footer-margin:.5in;
    mso-paper-source:0;}
    div.Section1
    {page:Section1;}
    -->
    </style>
    </head>

    <body lang=EN-US style='tab-interval:.5in'>

    <div class=Section1>

    <p class=MsoNormal>Hello World.</p>

    </div>

    </body>

    </html>

    --

    --
    "Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
  105. Yes and No by burgburgburg · · Score: 2
    Yes, I wrote that Slashdot inserted the spaces (using the commonly accepted /. as a substitute).

    I never said it was impossibly bad. I have grepped out the stupid ridiculous errors. It just took far more time than it should have because the code was just so atrocious.

    I often found it easier to save Word files as raw text and write the HTML around them instead of having Word do it. It saved time. That's a sad, pathetic statement.
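
    For anyone wanting to try the same route, a minimal sketch in Python of "save as raw text, then write the HTML around it". It assumes paragraphs in the saved text are separated by blank lines; the function name `text_to_html` and the sample input are invented for illustration, and this is not the script I actually used.

```python
from html import escape


def text_to_html(raw: str, title: str) -> str:
    """Wrap blank-line-separated paragraphs of plain text in minimal HTML."""
    paragraphs = [p.strip() for p in raw.split("\n\n") if p.strip()]
    body = "\n".join(
        # Collapse hard line wraps inside a paragraph and escape &, < and >.
        f"<p>{escape(' '.join(p.split()))}</p>"
        for p in paragraphs
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>\n{body}\n</body>\n"
        "</html>"
    )


sample = "Hello World.\n\nAT&T <rocks>,\nor so I hear."
page = text_to_html(sample, "Hello World")
print(page)
```

    Compare the result with the fifty-odd lines Word emits for the same "Hello World" and the time saving is not hard to believe.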

  106. Re:Think DXF by leandrod · · Score: 2
    > Think how few docs now use any sort of templates or employ style sheets.

    Agreed about your point, but I want to point out a reason I consider to be partial explanation for the fact.

    Templates and stylesheets in MS Office are difficult to use, do not work at all for complex stuff, and break from one version to another.

    When I used MS Word for DOS and OS/2, from versions 3 to 5.5, we had stylesheets and templates as separate things. Templates were just documents set aside as documents. Stylesheets were separate files that contained only the style definitions and formatting. You could easily apply different stylesheets to any document, thus getting the desired output.

    When MS Word for Windows, in its version 6 if memory does not fail me, merged templates and stylesheets, chaos ensued. I could not convert my old documents properly. I failed to reproduce the efficiency of the old work flow. I had been educating fellow users on the benefits of structuring and separating formatting onto stylesheets long before I heard of LaTeX or SGML, but now even I could not make it work. Even when I could structure complex documents, they would break in other systems. Never again could I separate content and formatting, and apply different stylesheets to the same document.

    I have heard it said of Microsoft systems that they are a matter of luck. That some people (whom I never met) have bulletproof systems (I doubt) and some others have just bad luck. Even if it was true, which I doubt, it would still be a comment on the sad state of things that so much depends on sheer luck. As it is, the best explanation I can find for these very different perceptions is that some people had known Unix, DOS and mainframe systems (like I did), and so they find MS-W32 to be worthless; many more people have been reared on DOS and MS-W16, and so find MS-W32 to be the greatest thing on Earth.

    --
    Leandro Guimarães Faria Corcete DUTRA
    DA, DBA, SysAdmin, Data Modeller
    GNU Project, Debian GNU/Lin
  107. Do they still Not Get It? by billstewart · · Score: 2

    " But this format will (of course) only represent the appearance, not any structure." WHAT!?!?! Do they still not bloody get the bloody concept, or are they deliberately trying to make interoperability unusable? They did this in earlier versions of Office with their save-as-html modes, which did stupid things like saving a "Header Type 2" as "14-point-boldface-text" or whatever your current style was rather than saving it as an HTML "H2", but at that point it could be attributed to stupidity and/or incompetence, since some people think for some reason that HTML is an appearance description language rather than an specific implementation instance of a content description metalanguage, which is a bit too abstract for some people. But XML is much more explicit about being a content description metalanguage, and if you've got enough of a clue about it to output your material as XML, you've got to get that much of the concept. I'd attribute this one to malice.

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks