Slashdot Mirror


MS Office XML Format Now In TextEdit

computerdude33 writes "Apparently, Apple heard of Microsoft Office changing to XML formats. If you have OS X 10.4.2, you can save documents in TextEdit in Word XML Format. They are saved with a *.xml extension, and are riddled with references to Word. Here is an example of one of these documents."

7 of 86 comments

  1. Re:Ugly format.. by Heisenbug · · Score: 4, Insightful

    I don't really see the problem with "bloated" xml, when the files are zipped by default. Instead of smushing your efficiency requirements in with your readability and standardization requirements (and screwing all three), you first handle readability and standardization and then wrap it in a standard efficiency layer. The upshot is, not only are the files often *smaller* than the old Word equivalent, but I can also hack through them using a couple of standard perl packages that have come with linux, OS X and cygwin for years.

    Where's the downside?
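
A minimal sketch of that layering, with some loud assumptions: it uses Python rather than the perl packages mentioned above, gzip stands in for whatever zip container the real files use, and the element names are made up for illustration (they are not the actual Word XML schema).

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical verbose markup; these tag and attribute names are invented,
# not taken from the real Word XML format.
paragraphs = "".join(
    f'<paragraph style="Normal"><run font="Helvetica" size="12">'
    f"<text>Line {i} of the document</text></run></paragraph>"
    for i in range(200)
)
xml_bytes = f"<document>{paragraphs}</document>".encode("utf-8")

# Efficiency layer: generic compression applied on top of the readable format.
compressed = gzip.compress(xml_bytes)

# Readability layer: once decompressed, any standard XML parser can read it.
root = ET.fromstring(gzip.decompress(compressed))
texts = [node.text for node in root.iter("text")]

print(len(xml_bytes), len(compressed), texts[0])
```

The repeated tags are exactly what a general-purpose compressor handles well, so the verbosity costs little on disk while the uncompressed file stays parseable with stock tools.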

  2. Re:in case you're curious... by That's Unpossible! · · Score: 4, Insightful

    So a simple two word text file has the following 33 XML tags pasted here with the greater and less than signs removed...

    What is your point? Oh lord, this file is 1200 bytes long, for "just two words of text."

    I created the same two-word document and saved it in several text-based formats that preserve the formatting. HTML (2700 bytes), RTF (3600 bytes), PDF (16,600 bytes), and of course, Word .doc format (20,000 bytes).

    The XML version is smaller than all four, and I dare say, easier to parse and manipulate with a third-party program.

    Yeah, if you don't want any formatting information stored with your text, use plain text. But otherwise, XML seems to be as good a format as any of the other markup doc formats commonly used in Office.

    --
    Ironically, the word ironically is often used incorrectly.
  3. Re:Who is maintaining the "standard"? by fm6 · · Score: 2, Insightful
    Netscape never "defined the standard". There have always been W3C specs for HTML. The problem was that in the mid-90s, W3C was taking forever to define specs for anything more than the most trivial web pages, and Netscape wasn't willing to wait on them.

    Nor was it true that "nobody cared". Lots of people bitched about it.

  4. Re:in case you're curious... by Trillan · · Score: 2, Insightful

    I thought he was demonstrating different exports from Word. Word 2004 (Mac) makes it 2,167 bytes. Granted, that's horrible HTML...

  5. Re:in case you're curious... by NutscrapeSucks · · Score: 3, Insightful

    Granted, that's horrible HTML...

    It's also a fair example, because Word-HTML can "round-trip" back to Word with no loss in fidelity. A barebones HTML file can not.

    --
    Whenever I hear the word 'Innovation', I reach for my pistol.
  6. Re:Who is maintaining the "standard"? by martinX · · Score: 2, Insightful

    Which is why even the dedicated MS-haters blanched at having to use NN4. It was bloated, buggy, crappy.

    MS didn't achieve browser dominance just through (mis)use of their monopoly. Netscape helped them by releasing NN4.

    --
    When they came for the communists, I said "He's next door. Take him away. Goddam commies."
  7. Re:Interesting... by King Babar · · Score: 2, Insightful
    An interesting thing is that trying to open one of those files in Pages results in a dialog that says "This XML files was created with an unsupported beta version of Word" and it doesn't open it. I'm not drawing any conclusions, I just think it's interesting.

    Ah, Pages. The program has some neat features, but has all of the hallmarks of being rushed out of the door for the 1.0 release. It's a nifty program for making flyers, and maybe short newsletters, but it's pretty much a loss to do any serious word processing in the thing, as it currently stands. In a way, it doesn't surprise me to hear that TextEdit is leading the way on the XML front, despite the fact that Pages has an XML native format...

    --

    Babar