Slashdot Mirror


Office 2003 and XML

zachlipton writes "Internet World is reporting that initial reports from Office 2003 beta testers don't look good for those hoping to share documents with non-MS systems using the XML file format. Gary Edwards, the OpenOffice.org representative for the OASIS XML file-format group, is quoted as saying "although it's still early in the review process, it does look as though XP XML has been so seriously crippled as to be useless to anyone but the big content management and collaboration system providers." Apparently, all formatting and presentation information is removed from the XML. Furthermore, Office's new collaboration features will only work with users who are also running Office 2003 (requiring Windows 2000 or 2003) and who are connecting over XP servers." So Microsoft will continue its efforts to lock in users with proprietary formats, and hopefully the rest of the world will produce an XML standard document format without them.

99 of 502 comments

  1. Duh. by McDutchie · · Score: 3, Insightful

    Well, it friggin' figgers, doesn't it? Anyone who didn't see this coming must have been living on another planet.

    With the US antitrust suits off now, the EU is our only hope to curb their anticompetitive practices.

    1. Re:Duh. by t0ny · · Score: 5, Insightful
      How do you figure this is anti-trust? This is simply a company that has the dominant product protecting its lead. And quite honestly, I don't see anything wrong with that, as long as they confine their practices to their product (i.e. they aren't making Office the only suite that can run on Windows)

      Have you ever played a game like Civilization or Alpha Centauri? You would be amazed at how much those games make you understand politics. Once you are in the lead, you do anything you can to protect that lead. And why would you expect the real world to be any different?

      But this isn't a game, this is business. And since businesses are SUPPOSED to make money, they need to make sure people continue to buy MS Office. And making an office suite that shares documents with all the various third-tier office suites just doesn't do that. Why should my company buy MS Office if the documents it produces are exactly the same as those of FreeBeerOffice? Now, if FBO cannot do things MSO can do, then there is an incentive...

      --

      Manipulate the moderator system! Mod someone as "overrated" today.

    2. Re:Duh. by Kyaphas · · Score: 2, Insightful

      True, but from what I've come to understand, when you have a monopoly, the rules change. You can't "do anything you can to protect that lead".

      --
      ---- The price of freedom is eternal vigilance. -Thomas Jefferson
    3. Re:Duh. by jd142 · · Score: 2, Insightful

      Civilization and Alpha Centauri are pretty much zero-sum games in practice if not in theory. Life isn't like that. Sometimes you and your neighbor can both win and both feel like winners. As you say in your post, once you are in the lead, you have to stay in the lead.

      In the real world, once you are in the lead (say a civilization with advanced sciences and arts, bounty for all, etc) why would you work to keep other civilizations/countries down? You'd work to improve their science, their arts, their housing. Then you'd both win.

      Might as well say that an understanding of Risk gives you the ability to command armies and understand the way countries interact.

    4. Re:Duh. by itwerx · · Score: 2, Insightful

      Cool, enough bricks and I can build a house!
      But seriously:
      Windows became a monopoly before Office did. Office's present monopoly is based on the foundation laid by Windows. Office's status as a monopoly is grandfathered from Windows. QED
      (It is possible for something to be true without a judicial ruling...)

    5. Re:Duh. by kin_korn_karn · · Score: 4, Funny
      Separating data from format is one of the strengths of xml.

      Also, of the comma-delimited file.
    6. Re:Duh. by zmooc · · Score: 2, Insightful
      And quite honestly, I don't see anything wrong with that, as long as they confine their practices to their product (i.e. they aren't making Office the only suite that can run on Windows)

      How is that any different? They do open up their Windows API so people can write software for it, but they don't open up the document format so people can write documents for it. How is closing up Windows so it can run only Office any different from closing up Word so it can open only Office documents?

      --
      0x or or snor perron?!
    7. Re:Duh. by H310iSe · · Score: 2, Insightful

      Hello, a decent Word document converter has been needed for ages. Thing is, PostScript converters will lose things like tables (?), auto numbering and lots of other things that make working in modern word processors such a joy. MS has completely obfuscated the .doc file format (the way the document is encoded is a nightmare), so converters have been limited... if someone out there could suffer through the .doc format and write a converter, this whole waiting for MS to get with XML would be moot.

      --
      closed minded is as closed minded does
  2. At some point..... by i_want_you_to_throw_ · · Score: 5, Insightful

    Microsoft will have to learn IBM's lesson about transforming from a company that makes standards, to one that contributes to them.
    They still don't get that their attempts to "embrace and extend" the whole damn internet aren't going to work.

    The rest of the world WILL produce an XML standard document format without them, thank heavens.

    1. Re:At some point..... by McDutchie · · Score: 4, Insightful
      The rest of the world WILL produce an XML standard document format without them, thank heavens.
      Which will be an irrelevant format because everyone will still need Word to read all the ubiquitous crippled Word XML format documents flying around on the net.
    2. Re:At some point..... by gmuslera · · Score: 4, Insightful

      Word (or even complete Office), Win2k/XP as desktop and server. If someone sends me a document in Office 2003 format that he says I "MUST" read, I ask him to choose between sending me US$2003 so I can read it, or sending it to me in a really open format.

    3. Re:At some point..... by bfree · · Score: 4, Interesting

      Why? The attitude sounds harsh when expressed so simply, but if you tell your "client" that you can't read the file and that your company has decided not to purchase the software required to be able to do so, as otherwise they would have to pass on the associated costs to their clients, so could they please send the file in a format you can read instead (even Word XP or earlier thanks to oo.o) or fax it, should the client really have a problem, and if so is it worth keeping the client (yes I really said that, lots of the time troublesome clients aren't worth keeping without changes if you actually can cost them completely)? Similarly with a coworker you can ask them if you can buy the software from their budget (in a company setting there should be company standards so this should be easy)!

      --

      Never underestimate the dark side of the Source

    4. Re:At some point..... by ccp · · Score: 5, Insightful

      Why not?

      If your clients tell you to bend over, you bend over? You seem to have a very sad life. Grow some spine, explain things to them, and you'll be surprised about how many of them get it.

      And, in case you wonder,

      I'm not a student.
      I own a business.
      And yes, I'm doing rather well even with principles.

      Cheers,

    5. Re:At some point..... by NDPTAL85 · · Score: 2, Interesting

      Why? Because it's a bullshit attitude to have when dealing with clients.

      You're supposed to bend over backwards to help and assist your clients, not make them do that for you. Of course if you do business with a holier-than-thou Free Software ethos then yeah I guess you wouldn't see a problem with acting like that. And I'm not saying it would put you out of business either. You'll simply be regarded as a jerk.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
    6. Re:At some point..... by LordNimon · · Score: 3, Insightful
      You're supposed to bend over backwards to help and assist your clients, not make them do that for you.

      Not necessarily. What if bending over backwards forces you to spend thousands of dollars more on software, just because one or two clients are unwilling to use the "Save As..." option in their word processor? Would you hire a consultant who charged an extra $10/hour because some of his other clients are too stupid/lazy/arrogant to cooperate with the consultant to get the job done at the lowest cost and in the least amount of time?

      --
      And the men who hold high places must be the ones who start
      To mold a new reality... closer to the heart
    7. Re:At some point..... by ccp · · Score: 2, Interesting

      It must be a cultural thing, but...

      Do you have to reduce EVERYTHING to dollars and cents?

      Have you heard of principles, dignity, pride?

      Who said that you have to bend over backwards to satisfy your clients? I'd hate to have these clients. Mine are satisfied with a good deal, and I mean good to both parties.

      Cheers,

    8. Re:At some point..... by Clockwurk · · Score: 2, Insightful

      but if you tell your "client" that you can't read the file and that your company has decided not to purchase the software required to be able to do so

      You probably won't be able to keep that client (or get their business). Example: our family business does all drawings on AutoCAD, and many of our clients also use it. One of our very best (and profitable) clients however, has switched all their engineers over to Pro-E (many thousands of dollars per seat) and will only send us files in that format. We have two options: tell the customer that we won't purchase Pro-E (they will take their business to our competitors; we lose much $$$), or we can bite the bullet and get some seat licenses for Pro-E (we keep their thousands/millions in business, but we buy software we don't want).

      Most businesses don't see your not purchasing needed software as saving consumers money, they see you as a penny-pincher who doesn't understand the concept of overhead (you have to spend money to make money).

      If you decide not to purchase Office, that's your decision, but you shouldn't expect others to bend over backward because you're a cheapskate or an idealist.

    9. Re:At some point..... by Photon+Ghoul · · Score: 2, Interesting

      I hope that your company keeps you in the back room so that you don't have to deal with any clients.

      Understand that I solely use OpenOffice for my documents when I'm not using vi. I prefer vi over all other types of documenting - it's fast and easy for me. Anyway, the point is - your personal preference doesn't matter. Communicating quickly and efficiently does.

      Yes, Microsoft sucks, closed formats suck, etc. The truth of the matter is, you either adapt to what (in this case) your clients use or they don't do business with you. Most businesses that I've worked for tend to opt for the adapt method rather than the 'go to hell' method.

      It has nothing to do with 'bending over backward' for everything the customer asks for. It has everything to do with communication and making stupid things like file formats as transparent and unimportant to the client as possible.

      If you lose business over something as stupid as a file format, I only assume that business must be booming with the rest of the customers that only deal in your preselected formats.

    10. Re:At some point..... by NDPTAL85 · · Score: 2, Insightful

      I don't understand your logic at all. Buying the software is a one-time cost. You don't have to buy it each time a client wants to send you something. It's just a one-time cost. Once it's paid, you can stop being annoying to your clients. I consider that well worth the money. I receive far too many various MS files to be asking each sender to use "Save As". It disrupts the flow of business for a not good enough reason.

      It would be like McDonalds asking all their customers to remove their shoes and socks when they enter the restaurant and place them in the closet before ordering their food. It might keep the floors clean but it's just really inconvenient.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
    11. Re:At some point..... by NDPTAL85 · · Score: 2, Insightful

      You might hate to have those clients, but others would say they'd hate to have to do without their money. ;)

      And what do you mean a cultural thing? Some folks make too much out of being dignified and prideful. They'll turn any insignificant issue into a matter of principles.

      If you want to be principled then choose a REAL issue to make your stand on. Something like human rights, homelessness activism or education reform. But the use of MS Software? Puhlease.

      --
      Mac OS X and Windows XP working side by side to fight back the night.
    12. Re:At some point..... by MeanMF · · Score: 4, Interesting

      You could also just download the free MS Word viewer that Microsoft provides here.

    13. Re:At some point..... by urmensch · · Score: 2, Interesting

      so what if I'm not running windows?

      System Requirements for Using Word Viewer

      * Microsoft Windows® 95 operating system or Microsoft Windows NT® Workstation operating system 3.51 or later

    14. Re:At some point..... by Uwe+Barschell · · Score: 2, Interesting

      I read the article. A representative from OpenOffice said that according to reports he has heard, the MS-XML format is crippled. An Office 2003 beta tester quoted in the article had a different view:

      Gary Edwards (OpenOffice Representative): "Although it's still early in the review process, it does look as though XP XML has been so seriously crippled as to be useless to anyone but the big content management and collaboration system providers. Reports are that when saving to XML, [Office 2003] strips out the presentation and formatting information, leaving near raw content."

      Mark McWilliams (MS-Office 2003 beta tester): "The opened XML document looks exactly like the original .doc file. And if I open up the XML file in a text editor, I can see that all of the formatting is properly maintained in the XML file."

      This beta tester also said the document he used was heavily formatted, and that there is an alternative, data-only XML format in MS Office 2003 that does remove the formatting.

      Who is right? I don't know and I don't care, because I don't use MS-Office 2003. However, I am usually suspicious of criticisms of a product that are levelled by its competitors. The users of that product usually have a more objective and accurate view.

    15. Re:At some point..... by aaarrrgggh · · Score: 4, Insightful

      I agree with what you are saying, but there is a caveat: once a product has reached critical mass, you have to go along with everyone else.

      I remember problems with AutoCAD back 7 years ago or so, going from release 12 to release 13. 13 was a dog. It had an incompatible file format, forcing upgrades for everyone that shared the same document. Since 13 didn't offer enough incentive for them to reach critical mass, it died, with most people sticking with 12 until the next release came out... which solved a lot of problems. Autodesk got a humility pill and realized that forcing the upgrades is bad policy, although you can do things to encourage it (default format save).

      The trouble with MSFT's approach is that it breaks too many things at once; you have to get critical mass not only on the office application, but also on the operating system and servers. A company that is not poised for this migration will not do it. If a single client requires it, then they will hire a secretary to do a Save As down to a more manageable format. If half the clients require it, it is difficult to avoid the upgrade.

    16. Re:At some point..... by frozenray · · Score: 4, Interesting

      > You could also just download the free MS Word viewer that Microsoft provides here [microsoft.com]

      For those not running Windows, the Word viewer comes "free" with a $199.- (list price) version of Windows, a good sized chunk of your system disk (not that it really matters much given today's HD prices and capacities) and the usual installation hassles, like drivers for equipment which isn't included on the CD etc. Even if you got Windows "free" with your PC from the manufacturer, you just paid the Microsoft tax up front, and will continue to pay if you want to keep your system up to date.

      That's like saying the Grappa I got offered after shelling out $150.- for dinner with a date last Saturday was "free". Sure, I didn't pay for it, but you can't get it without buying dinner first.

      Yes, I know there are solutions for reading MS Office documents on Linux. But I always cringe when people tell me to use the "free" readers - they're not free in any sense of the word in my book.

      --
      "There are already a million monkeys on a million typewriters, and Usenet is NOTHING like Shakespeare." - Blair Houghton
    17. Re:At some point..... by ender- · · Score: 2, Insightful

      I don't understand your logic at all. Buying the software is a one-time cost. You don't have to buy it each time a client wants to send you something. It's just a one-time cost. Once it's paid, you can stop being annoying to your clients

      Except that with Win3k/Office3K that is no longer the case. With their new licensing schemes, you will be forced to upgrade when Microsoft says to upgrade. And with Palladium, Microsoft will be able to disable your old software if you don't pay to upgrade. It will no longer be a simple matter of continuing to use old software, because you won't be able to.

      There's plenty of companies still using old software for both servers and desktops. If the old software performs the tasks you need, why should you waste money upgrading? Are the documents your clients sending you so complicated that they can't be written in Word 97? I seriously doubt it.

      I for one have recently given up Microsoft completely on my personal machines. And if I'm ever in a position to determine the IT purchasing for a company, I will avoid Microsoft if it is at ALL feasible.

      Ender

    18. Re:At some point..... by mOdQuArK! · · Score: 3, Insightful
      Buying the software is a one time cost.

      Actually, once Microsoft succeeds in transitioning to the subscription model, then buying Microsoft software will be a regular, on-going cost.

    19. Re:At some point..... by MrResistor · · Score: 4, Insightful

      You could also just download the free MS Word viewer that Microsoft provides here [microsoft.com].

      Strangely, there doesn't seem to be a Linux version. Or a Mac version, either. It's not so free when I'd have to buy a copy of Windows and spend 2 hours installing it, is it?

      --
      Under capitalism man exploits man. Under communism it's the other way around.
    20. Re:At some point..... by RoLi · · Score: 2, Funny
      It would be like McDonalds asking all their customers to remove their shoes and socks when they enter the restaurant...

      Or like Microsoft asking all their customers to sign away all their rights and let the MS-police (aka "BSA") search everything in their organization.

    21. Re:At some point..... by walt-sjc · · Score: 2

      Unfortunately, the process of clueing in the users takes too long.

      Hmm. If you set your company's email server to automatically reject any ".doc" attachments with an appropriate autoresponse, the "clue" process can actually be quite quick and painless.

      As a side note, I was quite surprised when my realtor forwarded house information in .rtf format with a jpeg picture as opposed to a proprietary .doc format. RTF isn't great, but at least I could read it easily on my linux box.
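
      The auto-reject idea in this comment could be sketched roughly as follows. This is a minimal illustration using Python's standard `email` package, not a configuration for any particular mail server; the names `BLOCKED_EXTENSIONS`, `AUTOREPLY`, `blocked_attachments` and `autoresponse_for` are hypothetical, and a real gateway would hook a check like this into its own filtering mechanism.

```python
from email.message import EmailMessage

# Hypothetical policy: attachment extensions the gateway refuses to accept.
BLOCKED_EXTENSIONS = (".doc",)

# Hypothetical autoresponse text; the wording is illustrative only.
AUTOREPLY = (
    "Your message was not delivered because it contained a .doc attachment. "
    "Please resend it as RTF, PDF, or plain text."
)


def blocked_attachments(message):
    """Return the filenames of attachments whose extension is blocked."""
    names = []
    for part in message.iter_attachments():
        filename = part.get_filename() or ""
        # Compare case-insensitively so "Listing.DOC" is caught too.
        if filename.lower().endswith(BLOCKED_EXTENSIONS):
            names.append(filename)
    return names


def autoresponse_for(message):
    """Return the rejection text if the message should bounce, else None."""
    return AUTOREPLY if blocked_attachments(message) else None


# Build a sample message with one blocked and one allowed attachment.
sample = EmailMessage()
sample["Subject"] = "House listing"
sample.set_content("Details attached.")
sample.add_attachment(b"fake word bytes", maintype="application",
                      subtype="msword", filename="Listing.DOC")
sample.add_attachment(b"{\\rtf1 hello}", maintype="application",
                      subtype="rtf", filename="listing.rtf")

print(blocked_attachments(sample))
```

      Matching on the filename extension is the simplest possible rule; it says nothing about the actual content of the attachment, so it is only a sketch of the policy, not a robust filter.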

    22. Re:At some point..... by Pxtl · · Score: 2, Informative

      Well, it does produce a burning sensation... but no, it's a very, very strong drink. A byproduct of making wine, and a fun one at that.

  3. Separating Content from Presentation a Good Thing by avdi · · Score: 4, Insightful
    Apparently, all formatting and presentation information is removed from the XML.
    And this is bad how? Isn't this the dream that XML document proponents have aspired to for years? You just can't please some people...
    --
    CPAN rules. - Guido van Rossum
  4. Style Sheets by FattMattP · · Score: 5, Insightful
    Apparently, all formatting and presentation information is removed from the XML.
    Good. That's the point of XML. Formatting and presentation goes in style sheets.
    --
    Prevent email address forgery. Publish SPF records for y
    1. Re:Style Sheets by danlyke · · Score: 3, Interesting

      Yeah, but...

      It's unclear from the article whether that leaves the style information intact, and obviously Gary Edwards has an ax to grind, but in the systems I implement, sometimes I can't get users to adopt the use of style sheets, but I can extract the semantic information from stylistic patterns. It's not all that difficult to look at the formatting for a screenplay, for example, and pull out the meta information about what actors appear in what scenes based on the bold outdented bits.

      If I can get to the presentation markup as well, if the style sheets are in an easy to use format, then this is no problem. If the XML is a simple export format rather than the full document then I may as well be printing to PostScript and trying to reverse engineer the semantics from that.
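
      To make the point about recovering semantics from presentation concrete, here is a rough sketch. The XML vocabulary (`para` elements with `bold` and `indent` attributes) is entirely made up for illustration and is not the Office 2003 format; the convention that scene headings are bold with indent 0 and speaker names are bold with any other indent is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical export that still carries presentation attributes.
SAMPLE = """
<document>
  <para bold="true" indent="0">SCENE 1</para>
  <para bold="true" indent="2">ALICE</para>
  <para indent="1">Did the export keep the formatting?</para>
  <para bold="true" indent="2">BOB</para>
  <para indent="1">Only the text.</para>
  <para bold="true" indent="0">SCENE 2</para>
  <para bold="true" indent="2">ALICE</para>
  <para indent="1">Then we are back to guessing.</para>
</document>
"""


def speakers_by_scene(xml_text):
    """Map each scene heading to the speakers that appear under it."""
    root = ET.fromstring(xml_text)
    scenes = {}
    current = None
    for para in root.iter("para"):
        # Only bold paragraphs carry structure in this assumed convention.
        if para.get("bold") != "true":
            continue
        text = (para.text or "").strip()
        if para.get("indent") == "0":
            # Bold and flush left: start of a new scene.
            current = text
            scenes[current] = []
        elif current is not None and text not in scenes[current]:
            # Bold and indented: a speaker name within the current scene.
            scenes[current].append(text)
    return scenes


print(speakers_by_scene(SAMPLE))
```

      If the `bold` and `indent` attributes were stripped on export, every paragraph would look alike and this kind of extraction would have nothing to work from.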

    2. Re:Style Sheets by sketerpot · · Score: 3, Informative
      If I read the article correctly (and it isn't very well written, so I could be wrong), they just take all the format and presentation information out. If you have something boldface in your document, it doesn't get noted in the XML file. However, the only real way to find out for sure just what this XML is like is to see one of the XML files---and they don't look like they're going to make it that easy.

      Anyway, Office has a ridiculously complicated format. Any XML that it generates will most likely be a nightmare even if they don't try to make it that way.

    3. Re:Style Sheets by Captain+Large+Face · · Score: 4, Interesting

      The problem is that they don't include it elsewhere.. So in order to share documents in the style intended by the user, it must be saved as the proprietary format.

      IMHO, this ensures the user will opt-out of the XML format, and stay with the proprietary format. As I posted above, if Microsoft are going to do this, then they should bundle an XSL document with each XML document.

    4. Re:Style Sheets by AndyS · · Score: 2, Interesting

      Why? This is a file format. The word processor will handle it all for you.

      I don't think this is going to be like HTML when you expect to write it yourself. I imagine this will look more like the OpenOffice file format where you have multiple XML files inside a ZIP file (along with graphics and other multimedia stored inside the zip)
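
      The "multiple XML files inside a ZIP file" layout can be illustrated with a small sketch. The package below is built in memory with made-up contents so the example is self-contained; the member names `content.xml` and `styles.xml` follow the OpenOffice.org convention, but the element names inside them are simplified placeholders, not the real schema.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def build_sample_package():
    """Create a tiny ZIP package laid out like an office document."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # Content and presentation live in separate XML members.
        package.writestr("content.xml", "<doc><p>Hello</p><p>World</p></doc>")
        package.writestr("styles.xml", "<styles><style name='Default'/></styles>")
        # Graphics sit alongside the XML as ordinary binary members.
        package.writestr("Pictures/logo.png", b"not really a PNG")
    return buffer.getvalue()


def xml_members(package_bytes):
    """List the XML files stored inside the package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return sorted(n for n in package.namelist() if n.endswith(".xml"))


def paragraph_texts(package_bytes):
    """Read content.xml from the package and return its paragraph text."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return [p.text for p in root.iter("p")]


package_bytes = build_sample_package()
print(xml_members(package_bytes))
print(paragraph_texts(package_bytes))
```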

  5. Re:Windows 2003? Where's that? by burninginside · · Score: 2, Informative

    Windows 2003 Server is coming out on April 24
    Windows 2003

  6. Wow. by deviator · · Score: 4, Funny

    I am shocked. Shocked! I'm shocked that Microsoft would do something like this that wasn't in the best interest of their customers.

  7. Do Better? by 4of12 · · Score: 2, Interesting

    This is hardly surprising news.

    My question, though, is whether it is possible for other vendors and OpenOffice to create a better, more pleasing formatting and presentation of the content in the XML than Office 2003 does?

    --
    "Provided by the management for your protection."
  8. Missing the point by graphicartist82 · · Score: 4, Insightful

    So Microsoft will continue its efforts to lock-in users with proprietary formats, and hopefully the rest of the world will produce an XML standard document format without them.

    I'm not trying to start a flame war here, but it seems that they're missing the point! We don't want it to be MS with one format and the rest of the world with another. That really wouldn't make it much different from how it is now. At least the way it is now, non-MS office software can read the MS formats. If it comes down to the choice between using the MS format or the "rest of the world" format, MS is going to win every time..

  9. Re:Separating Content from Presentation a Good Thi by molarmass192 · · Score: 4, Insightful

    I think the point is that if you save to their XML specification, you will lose all your document formatting. So yeah, the data is there, but it can't be reopened in Office or any other word processor in a structured way. Essentially, it is the same as just saving as plain text, which has already been available since Office 95.

    --

    Good people do not need laws to tell them to act responsibly, while bad people will find a way around the laws-Plato
  10. Re:Separating Content from Presentation a Good Thi by DaveAtFraud · · Score: 4, Insightful

    I have to agree. The basic concept behind SGML and its diminutive offspring, XML, was to separate content, structure and presentation. This just means that you have to share a style sheet, FOSI, or whatever when you share a document if you expect the person you share it with to be able to view it.

    There may be other *valid* criticisms of what Microsoft is doing but this isn't one of them.

    --
    They that can give up essential liberty to obtain a little temporary safety deserve neither safety nor liberty.
    Ben
  11. Re:Separating Content from Presentation a Good Thi by JordoCrouse · · Score: 4, Insightful

    And this is bad how? Isn't this the dream that XML document proponents have aspired to for years? You just can't please some people...

    Unfortunately, Manny Manager and Sarah Secretary are now very used to depending on the formatting and presentation information. To be honest, not too many people these days subscribe to the whole minimalist document theory (unless your idea of starting your editor is typing 'vi').

    The main point here is to encourage the .XML format for interoperability. If the XML format can't figure out the fonts, colors, and various drawing elements in your document, then people will abandon it for something that does - at the expense of the rest of us.

    --
    Do you have Linux and a DotPal? Click here now!
  12. bollocks by graveyhead · · Score: 4, Insightful
    hopefully the rest of the world will produce an XML standard document format
    This is just so wrong. It smacks of a writer who doesn't really understand the utility of XML. There doesn't need to be "The One True Document Format"... that's not what XML is all about.

    Instead, create an XML format that is specific to your needs and write a DTD or XML Schema that describes it. If you need to translate it to someone else's XML document format, a quick XSLT stylesheet will transform the document with a minimum of effort.

    Just my 2 cents.
    --
    std::disclaimer<std::legalese> sig=new std::disclaimer; sig->dump(); delete sig;
    1. Re:bollocks by graveyhead · · Score: 2, Informative
      I don't see how average office worker could ever do anything with XSL(T)
      An average office worker should NEVER have to deal with XSLT, and probably shouldn't even be messing with XML outside a visual editor that conforms to your DTD, ala programs like XML spy.

      The point is, if you have to translate to another format, you hire a developer to do it once, and the XSLT stylesheet that he/she develops can be reused again and again to transform documents. Maybe make a drag & drop script to do the transformation, or possibly a web based back end solution. You don't have to write a separate XSLT stylesheet for every single document, just once to support a required combination of input and output formats.
      --
      std::disclaimer<std::legalese> sig=new std::disclaimer; sig->dump(); delete sig;
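
      The "write the translation once, reuse it for every document" idea can be sketched as below. Python's standard library has no XSLT processor, so this uses a plain ElementTree tag-renaming pass as a stand-in for an XSLT stylesheet; it is not XSLT itself. Both vocabularies and the `TAG_MAP` table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one in-house vocabulary to another.
# In the XSLT approach, this table would be the reusable stylesheet.
TAG_MAP = {"memo": "document", "heading": "title", "para": "p"}


def translate(element):
    """Recursively copy an element, renaming tags according to TAG_MAP."""
    new = ET.Element(TAG_MAP.get(element.tag, element.tag), dict(element.attrib))
    new.text = element.text
    new.tail = element.tail
    for child in element:
        new.append(translate(child))
    return new


def convert(xml_text):
    """Parse a document, translate its vocabulary, and serialize it again."""
    return ET.tostring(translate(ET.fromstring(xml_text)), encoding="unicode")


print(convert("<memo><heading>Q3</heading><para>Sales rose.</para></memo>"))
```

      The same `convert` function handles any document in the source vocabulary, which is the reuse the comment describes; a real XSLT stylesheet would additionally allow restructuring, not just renaming.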
  13. MS .doc / Adobe PostScript & PDF by PerlPunk · · Score: 5, Interesting

    All Microsoft needs to do is make their standard an open one (that can be used by others), like Adobe has done with their PostScript and PDF formats. Adobe has done quite well with their products based on these formats, too. Products like Adobe Illustrator and Photoshop (which works very well w/ bitmaps saved in PostScript) are the industry standard in digital art. If Microsoft followed a similar model, I'm sure that Microsoft Word would continue to be the industry standard in word processing software, and Microsoft as a business wouldn't be any less rich for it.

    1. Re:MS .doc / Adobe PostScript & PDF by Sabalon · · Score: 2, Insightful

      And I can take that gif, jpg, psd or pdf and open it in another application, make changes, etc...

      Basically, I'm not forced to use the Adobe product.

      I'm sure that Microsoft realises this and would hate to let the users have a choice of what they can use. Why let them choose when they can almost be forced to use the MS product?

      I dunno...maybe I've been hanging out on /. too long :)

    2. Re:MS .doc / Adobe PostScript & PDF by krammit · · Score: 2, Insightful

      Agreed. Truth is, even if they exposed ALL of the formatting properties available in Office documents via XML, they would still be the only product on the market to implement all of the formatting features completely. By the time anyone caught up, MS could extend the functionality further. It's one thing to own the standard document format, it's another to be the market leader with the only product that fully supports the industry's open standard.

      --
      "Watch your cornhole, bud."
  14. Re:Separating Content from Presentation a Good Thi by gorilla · · Score: 5, Insightful

    There is a big difference between separating presentation from content and removing the presentation totally.

  15. Part of the concept by nhavar · · Score: 4, Insightful

    Isn't part of the concept of XML relating DATA and being able to separate presentation from pure content? Isn't the additional concept of XML its extensibility and adaptability, for one group to use it differently than another? Because if not, I've been using XML wrong for about 2 years now.

    This article makes it sound as if MS is doing something completely improper with XML (i.e. changing its "standard"). But it seems to me that MS is simply separating content from presentation and relying on ????(something proprietary, xsl, more xml) to provide presentation. Just because they don't use the standard the same way you want them to doesn't mean that they are breaking the standard. I'm sure if you look at the XML that they output it's all standard XML. It also sounds as if they are not using any of the "tricks" that others have complained about (i.e. storing binary data in an xml tag).

    Instead of bitching about the problem maybe we should
    1) provide feedback if we are a beta tester
    2) wait for it to be released
    3) ready some tools to provide interoperability
    4) work harder on creating tools better than MS

    --
    "Do not be swept up in the momentum of mediocrity." - anon
    1. Re:Part of the concept by Qzukk · · Score: 2, Insightful

      This article makes it sound as if MS is doing something completely improper with XML

      And you know what, the article is absolutely right. Microsoft is doing something horrible with XML... using it for something it wasn't intended.

      When was the last time you saw a word document consisting of only data? No bold, italics, font settings, formatting, or any of that other "unwanted" presentation information. What would powerpoint be, without presentation metadata? A collection of words and images?

      Now, I haven't used the software, and the article doesn't mention how much is actually stripped out... is it basically a text dump bordered by ...? Does it include formatting tags (that may or may not be publicly defined)? Does it include some kind of tag for images and other embedded objects? Does it include markup for change tracking, annotation, and other Office features?

      I would reserve final judgement until I saw an .xml file generated by Word.

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    2. Re:Part of the concept by Fnkmaster · · Score: 3, Insightful

      Did you read the article? It's not about breaking a standard, it's about making a fucking USELESS file. If no formatting information is saved, it's no better than File->Save As Text. Clearly, separation of presentation and content is not unreasonable, and I think everybody would say they support that. But that's not what they've done. What they have done (at least according to the article, we won't know for sure till it's released) is eliminate the presentation data from their XML format. ELIMINATION of presentation makes the format useless for document exchange, and thus an essentially useless feature, period.

  16. Vague article doesn't have the details I want by jfmiller · · Score: 2, Insightful

    It is obvious that Office 2003 will not have a beautiful open standard that will interoperate with any piece of software. I find that unfortunate, but not unexpected. As the Oasis link points out, Microsoft is not really interested in letting its consumers out of the box of proprietary formats they are currently stuck in.

    The article is on the other hand very vague (probably because the information still isn't available) about what information is left in. My interest is not so much in being able to read OfficeXML documents, though as a WordPerfect user I would find this handy. What I am really interested in is whether Word 2003 can in any way be cajoled into being an authoring tool for already existing XML formats like DocBook. WordPerfect 2000's support for XML is present, but clunky. My real hope was that Microsoft would offer a more useful solution, and despite the bad rap about "presentation information" being removed, if other more useful information like 'heading,' 'strong,' 'table' etc. is still present, then I think it is a(n admittedly small) step in the right direction.

    JFMILLER

    --
    Strive to make your client happy, not necessarily give them what they ask for
  17. Separation of Content and Format? by budGibson · · Score: 2, Insightful

    Isn't the whole idea of XML to separate content from format? So, Microsoft is guarding the last mile from the software infrastructure (including their data format) to the user's brain (supplied by formatting). So, to use Microsoft's data format, I have to come up with my own styling. Isn't this what happens with rss and rdf already? Isn't this potentially a win? Couldn't an industry spring up using microsoft's data format and a set of styling sheets built to transform that format (ie, xslt).

    I sense some of the shock and outrage around this article is that people would like to be able to use excel as their data viewer, with an open file format that they could write to. What about simply treating excel as a data publishing system, perhaps even transforming its output to the more open standard developed by OASIS? This starts to consign excel to legacy that needs an adaptor.

  18. Re:No good to me... by wjsteele · · Score: 2, Informative

    Um... XP has been out for quite a while now... do you mean Office 2003???

    Anyway, I don't see what the fuss is all about... if everybody would care to read the article, it describes exactly what they are doing.

    Key Point: XML is for DATA. DATA, not formatting; XSL is for formatting. The content is stored in XML. The content (data) is what would be needed by other systems. InfoPath (formerly XDocs) also has content (in XML) AND formatting (in XSL.)

    It can use the XML from Word/Excel with ease... you should try it. By having the content stored in XML, it makes it very easy to take that Word/Excel Document and submit it to a Web Service for further processing.

    Bill

    --
    It's my Sig and you can't have it. Mine! All Mine!
  19. sometimes.. by siphoncolder · · Score: 5, Interesting
    I wonder if michael is testing us for stupidity, literacy, and actual technical knowledge of the issues.

    1) Take MS, make a report that says they did something bad, watch how many people flock to bash them DESPITE THE FACTS PRESENTED IN THE ARTICLE, which leads me to:

    2) How many people read the article? And of those people who DID, :

    3) How many of them know that XML is supposed to be a divorce of data from presentation? Why this comes as a shock to people is obvious - they didn't know that.

    The poster above who said "style sheets" - bravo. You couldn't have made a better point with two words.

    --
    i'm amazed that i survived - an airbag saved my life.
    1. Re:sometimes.. by bryanbrunton · · Score: 2, Insightful

      You must not be familiar with the Slashdot business model:

      (1) Post Inflammatory (or sometimes Blatantly Unfactual) Story on Issue X
      (2) Get lots of hits from pro and anti-Issue X people
      (3) Get lots of hits from people who waste time informing everyone how ignorant the Slashdot editors are
      (4) Profit!

      Michael and CmdrTaco specialize in these stories. See CmdrTaco's recent post about SuSE "back away from UnitedLinux" to see an excellent example of this.

      It really comes down to this: all they want are page hits. They couldn't care less, or may be too ignorant to care, about things like journalistic integrity.

  20. This is great news by tsa · · Score: 2, Funny

    Many offices will soon have to upgrade their PC's and software to be able to use XP together with this new MS software. Apart from this being a Good Thing for the economy, this has another important side effect: the 2nd hand market will be flooded with PIII's and cheap Athlons. I was thinking of buying a new computer to make a nice Linux server but I guess I will wait until this new Office thing comes out.

    --

    -- Cheers!

  21. Re:Separating Content from Presentation a Good Thi by Captain+Large+Face · · Score: 3, Insightful

    I don't think this means that there is no stylistic information in the document, rather that the style information is contained within the proprietary code segment of the document.

    If Word documents all utilised the same style for various elements, it'd all be hunky-dory. However, users like their choice of a 50pt purple serif font for a title to stand, so the formatting information MUST be included with the document.

    Perhaps a better format would be a zipped file that contains separate XML and XSL documents...

  22. Re:is it possible... by 4/3PI*R^3 · · Score: 2, Interesting

    If Micro$oft incorporates DRM into the proprietary file format they are under no legal obligation to document the format according to the antitrust settlement.

    If there is no documentation then any reverse engineering of the file format would be at least a violation of the EULA.

    In the worst case, since reverse engineering the format would allow a person access to a copy protected data set, this would be a violation of the DMCA.

    Did any of us really think that B.G. hadn't thought this whole thing out years ago? He may be the scourge of the industry but he's not an idiot. B.G. doesn't do anything on the spur of the moment; he plans everything.

  23. Content - Presentation = GOOD by Muggin · · Score: 2, Informative

    The point of XML is to separate the presentation from the content anyway. If you add in formatting and what have you directly into XML you have defeated that purpose. That is why there is XSL and CSS. Those are the things you are supposed to use for the actual presentation and formatting.

  24. Two ways to look at it... by DigitalSorceress · · Score: 2, Interesting

    I'm sort of of two minds on this -

    On one hand, there are a lot of folks who have very strong opinions about the fact that the data should be separated from presentation... If Office 2003 were to strip the MS-apps-specific formatting (which is probably NOT very standards-friendly), but leave the style markings (heading, paragraph, footnote, etc...) then really, they would be providing a semi-structured document that conformed to XML standards.

    As a web application developer/web author, there have been many times when I have been given MS Word docs and Excel spreadsheets as content for our web site... In the past, I have resorted to copying the whole page directly onto a text editor (thereby scrubbing all formatting information) and then using HTML markup to make the document look much like the Word original, but without having to deal with that rather poor HTML output the Word and Excel's Save as HTML features produced. If I could have a semi-structured document, it would have been easier to write some macros to parse the XML structure to automate some of the rough formatting (hooks for stylesheets or somesuch).

    On the other hand, it seems to me that it might be in Microsoft's best business interest (the selfish ones) to make darn sure that it's not possible for OpenOffice to fully interoperate with MS Office documents. I don't think they would be very smart (current business model-wise) if their new products (which will rapidly become de-facto business standards) helped to enable Open Office standards to take away their marketshare.

    In the final analysis, I probably wouldn't worry too much until there's a critical mass of people using it. By then, a bunch of folks will have figured out what CAN be done with whatever format MS ended up with. At that point, Office 2003's XML format will probably make it possible for people to do something they couldn't do before or at least, to do something easily that once was more trouble than it was worth.

    That's worth something...

    --

    The Digital Sorceress
  25. I have Office 2003 and this article is BS by Anonymous Coward · · Score: 5, Informative

    I have Office 2003 Beta 2 freshly downloaded from MSDN. This article is completely wrong. I did the following:

    1. Opened a heavily formatted .DOC Word document with tables, multiple fonts, etc.
    2. Saved the document as XML.
    3. Opened up the XML document in Word and it looks EXACTLY like the original .DOC format.

    I also opened the XML file in a text editor and sure enough it contains complete formatting information.

    1. Re:I have Office 2003 and this article is BS by RanmaSan · · Score: 4, Informative

      It's not pretty, but it works:

      <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
      <?mso-application progid="Word.Document"?>
      <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibrary/2003/2/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word/2003/2/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" xml:space="preserve">
      <o:DocumentProperties><o:Title>Some Centered Bolded Text</o:Title><o:Author>Mark McWilliams</o:Author><o:LastAuthor>Mark McWilliams</o:LastAuthor><o:Revision>1</o:Revision><o:TotalTime>2</o:TotalTime><o:Created>2003-03-13T17:30:00Z</o:Created><o:LastSaved>2003-03-13T17:32:00Z</o:LastSaved><o:Pages>1</o:Pages><o:Words>10</o:Words><o:Characters>57</o:Characters><o:Company>i-FRONTIER</o:Company><o:Lines>1</o:Lines><o:Paragraphs>1</o:Paragraphs><o:CharactersWithSpaces>66</o:CharactersWithSpaces><o:Version>11.4920</o:Version></o:DocumentProperties>
      <w:fonts><w:defaultFonts w:ascii="Times New Roman" w:fareast="Times New Roman" w:h-ansi="Times New Roman" w:cs="Times New Roman"/><w:font w:name="Tahoma"><w:panose-1 w:val="020B0604030504040204"/><w:charset w:val="00"/><w:family w:val="Swiss"/><w:pitch w:val="variable"/><w:sig w:usb-0="61007A87" w:usb-1="80000000" w:usb-2="00000008" w:usb-3="00000000" w:csb-0="000101FF" w:csb-1="00000000"/></w:font></w:fonts>
      <w:styles><w:versionOfBuiltInStylenames w:val="3"/><w:latentStyles w:defLockedState="off" w:latentStyleCount="156"/>
      <w:style w:type="paragraph" w:default="on" w:styleId="Normal"><w:name w:val="Normal"/><w:rsid w:val="7765DB"/><w:rPr><w:rFonts w:ascii="Arial" w:h-ansi="Arial"/><wx:font wx:val="Arial"/><w:sz-cs w:val="24"/><w:lang w:val="EN-US" w:fareast="EN-US" w:bidi="AR-SA"/></w:rPr></w:style>
      <w:style w:type="character" w:default="on" w:styleId="DefaultParagraphFont"><w:name w:val="Default Paragraph Font"/><w:semiHidden/></w:style>
      <w:style w:type="table" w:default="on" w:styleId="TableNormal"><w:name w:val="Normal Table"/><wx:uiName wx:val="Table Normal"/><w:semiHidden/><w:rPr><wx:font wx:val="Times New Roman"/></w:rPr><w:tblPr><w:tblInd w:w="0" w:type="dxa"/><w:tblCellMar><w:top w:w="0" w:type="dxa"/><w:left w:w="108" w:type="dxa"/><w:bottom w:w="0" w:type="dxa"/><w:right w:w="108" w:type="dxa"/></w:tblCellMar></w:tblPr></w:style>
      <w:style w:type="list" w:default="on" w:styleId="NoList"><w:name w:val="No List"/><w:semiHidden/></w:style>
      <w:style w:type="paragraph" w:styleId="StyleBoldCentered"><w:name w:val="Style Bold Centered"/><w:basedOn w:val="Normal"/><w:rsid w:val="7765DB"/><w:pPr><w:pStyle w:val="StyleBoldCentered"/><w:jc w:val="center"/></w:pPr><w:rPr><wx:font wx:val="Arial"/><w:b/><w:b-cs/><w:sz-cs w:val="20"/></w:rPr></w:style>
      <w:style w:type="paragraph" w:styleId="SmallTitle"><w:name w:val="Small Title"/><w:basedOn w:val="StyleBoldCentered"/><w:rsid w:val="7765DB"/><w:pPr><w:pStyle w:

  26. The authors of the article didn't bother to RTFM.. by malakai · · Score: 5, Informative

    The point of the Office 2003 "Save as XML" with the "Data Only" checkbox is _NOT_ a poor man's Save As XHTML. It's designed to allow the data of the document to get placed into an XML document based on a schema. You literally can make your own schema file/XSD, and use a tool inside Word to map the elements of a Word document to elements of the schema. If you simply map a paragraph to a string you will lose formatting. Unless of course you define in your schema how you'd like to store formatting information. But that is generally overkill.

    Think of a resume. you could define an XSD for a resume, and be able to save resumes against this XSD, as validated pure XML.
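
To make the resume idea concrete, here is a rough sketch of what consuming such a "data only" file might look like on the receiving end. The element names (`resume`, `name`, `job`, `employer`, `title`) are made up for illustration and are not anything Word defines, and the required-field check is only a crude stand-in for real XSD validation, which the Python standard library does not provide.

```python
import xml.etree.ElementTree as ET

# Hypothetical "data only" output for a made-up resume schema.
RESUME_XML = """<resume>
  <name>Jane Example</name>
  <job><employer>Acme</employer><title>Engineer</title></job>
  <job><employer>Initech</employer><title>Analyst</title></job>
</resume>"""

REQUIRED = ("name", "job")  # assumed required elements, not a real XSD


def load_resume(xml_text):
    """Parse the resume and return plain data with no formatting attached."""
    root = ET.fromstring(xml_text)
    missing = [tag for tag in REQUIRED if root.find(tag) is None]
    if missing:
        raise ValueError(f"missing required elements: {missing}")
    return {
        "name": root.findtext("name"),
        "jobs": [{child.tag: child.text for child in job}
                 for job in root.findall("job")],
    }


resume = load_resume(RESUME_XML)
print(resume["name"], len(resume["jobs"]))
```

Nothing about fonts or bold survives here, which is the point of the data-only mode: the consumer gets fields, not a document.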

    Now, if you want to produce a document, using an XML syntax but want to combine both data and presentation, then you want WordML.

    WordML uses Word's own tags to mark up the Word document. I was going to show you an example of WordML but I don't feel like escaping all the greater-than/less-than signs. Anyhow, WordML contains all the formatting and everything necessary to display a Word document as it is supposed to look.

    I think this Open Office guy is looking for a devil in Office 11 that isn't there. That or he didn't read the friggin manual.

    -Malakai

  27. Wait a minute... by sheldon · · Score: 4, Insightful

    "has been so seriously crippled as to be useless to anyone but the big content management and collaboration system providers."

    That indicates to me that the problem is really that the document format is so complicated that it takes tremendous resources to understand and implement compatibility with it, as this implies that larger companies like say a Xerox will have no problem producing tools to work with it.

    So from a business consumer perspective this is still a tremendous win.

    This sounds like more whining from the open source crowd.

  28. Re:Separating Content from Presentation a Good Thi by djoham · · Score: 5, Informative

    This may be bad (keeping in mind the jury is still out on exactly how Microsoft is making this work) because in the case of office documents, the style is actually *part* of the content, from the perspective of Joe Office User.

    If Microsoft just puts the raw text data into a .xml file, then that .xml file is practically useless to anyone who wants to collaborate with the original author since all of the styling information is lost.

    As an example of a good way to do this (IMHO), take a look at how OpenOffice.org builds their files. When you make a .sxw (the default writer format) you're actually taking the raw data of the document, the styling rules for the document and a few other important bits and pieces and zipping them up into a single file.

    After unzipping this file, the following directory structure was exposed:

    content.xml
    META-INF/manifest.xml
    meta.xml
    mimetype
    settings.xml
    styles.xml

    With this type of design, you can get the best of both worlds. Technically, there is a separation between your presentation and content which allows simple programmatic access to the data when necessary. At the same time, this design allows for full collaboration between people who also consider the styling of the data to be part of the content because the style rules for the content are included with the document.
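
As a rough sketch of that "simple programmatic access", the snippet below builds a tiny zip package with the same layout and then pulls the content and style parts back out separately. The XML bodies are placeholder strings, not real OpenOffice.org markup, and the mimetype string is my assumption of what a Writer file carries.

```python
import io
import zipfile

# Placeholder parts mirroring the directory listing above; real files are far richer.
FILES = {
    "mimetype": "application/vnd.sun.xml.writer",  # assumed value
    "content.xml": "<office:document-content>Hello</office:document-content>",
    "styles.xml": "<office:document-styles/>",
    "meta.xml": "<office:document-meta/>",
}


def build_package(files):
    """Zip the parts into one in-memory archive, like a single document file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, text in files.items():
            zf.writestr(name, text)
    return buf.getvalue()


def read_parts(package_bytes, wanted=("content.xml", "styles.xml")):
    """Return the requested parts as text, keeping content and style apart."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        names = set(zf.namelist())
        return {n: zf.read(n).decode("utf-8") for n in wanted if n in names}


package = build_package(FILES)
parts = read_parts(package)
print(sorted(parts))
```

A tool that only cares about the data can read content.xml and ignore styles.xml, while an editor reads both.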

    With xml-saved Office documents containing only data and no style, collaboration between non-office users (and apparently Win9x users as well) will be no better off than before. Perhaps worse, assuming the binary .doc, .xls etc formats have changed and will need to be reverse-engineered again.

    If this article is true and Microsoft has decided to remove the styling of their xml-saved office documents, I see two possible reasons for this:

    The first is obvious. You're not using Office? Ok, second class citizen, here's the data but in a format that is next to useless for you to use.

    The second possibility involves Microsoft just not being where they want to be with the Office XML sharing. Keep in mind that it took OpenOffice.org something like a year and half or so to define their XML interchange format. Microsoft may be going there, but due to overwhelming inertia, it just might not be going there very quickly.

    Personally, I think the first option is the most likely. However, with OpenOffice.org working with OASIS and others on a common XML interchange format, I'm hoping Microsoft will be forced by the marketplace into option 2.

    Best regards,

    David

  29. Look how surprised I am. by pete-classic · · Score: 2, Funny
    Microsoft will continue its efforts to lock-in users with proprietary formats[. . .]


    Look how surprised I am:

    :^|


    -Peter
  30. XSL and FO by StevenYelton · · Score: 2, Interesting
    I suppose what really needs to happen is they supply the XML document for the content and a style sheet for the presentation.

    It would also be nice to be able transform the XML via a provided XSLT into fo (FO at W3C and FOP). Then you could present the document as a PDF, RTF, Doc, Java applet, or whatever.

  31. Re:Separating Content from Presentation a Good Thi by Delirium+Tremens · · Score: 2, Insightful
    This is probably a troll, but I'll byte.

    You seem to forget that, in the context of office programs such as Word, the 'content' is the sum of 'text' + 'formatting' + 'presentation'. You need all 3, or you do not have a workable document. Having 'text' only is not enough. We are not talking about being able to read a .doc file on your scrollable cellphone screen here. We are talking about interoperability between all major office suite producers.

  32. Proprietary Document Formats by Daimaou · · Score: 4, Insightful

    Proprietary document formats were fine at one point. Most people shared documents via printed paper, or shared them via "soft copy" within their own organizations. However, the time for printed documents and interoffice "soft copies" is over. We need the ability to share documents with the world in an easy to use, feature rich, and easy to edit format. Since a significant part of a document's legibility is in its style and formatting (or at least people are more apt to read a well formatted document over one which is not) text files are out.

    Once an easy to use, open document format is created, and the ability to read and write those documents is built into many programs, I think we will see an end of .DOC file attachments.

    While there are currently some "open" formats like PDF and PS, the problem is that they are not easy to create for the average user, nor are they easy to edit. While PDF may be a good format, we need something better.

    XML is a logical choice as a base for an open format because it is a well defined standard, it is text based, and is quite easy to parse.

    But I ramble.

  33. Re:Forget XML, doc, and other crap by TheRaven64 · · Score: 2, Informative
    Or, if you INSIST on something else, use rtf.

    Wow. Someone on /. suggesting we use an MS file format.

    For those of you that aren't aware, RTF is an 'Open' format created by MS. All native word files I've looked at ('97 and earlier) used an RTF derived format. The RTF spec is available from Microsoft, and is the most obfuscated document I have ever had the misfortune of having to read (in the end I gave up and derived the format from the output of wordpad, it was easier).

    --
    I am TheRaven on Soylent News
  34. The article is blatantly wrong... by malakai · · Score: 5, Informative

    Read some other articles, or better yet get ahold of a beta and try it out. The authors of this article will feel like schmucks when they realize what they missed.

    First off, by default, if you save the word document as XML, it gets saved as WordML,which preserves Word's styles and formatting in an XML name-space that's separate from the one bound to the schema-controlled data.

    If you check off the checkbox "Data Only" then you will lose all formatting and your own XSD will be used to map this document into XML data.

    WordML looks like an XML'ified RTF language. It would be trivial to create an XSL stylesheet that transforms WordML into HTML/CSS with all formatting (that HTML is capable of) which directly mimics MS Word. OpenOffice could also eat WordML quite easily and have all the formatting/style of Word.
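
To give a feel for how small such a converter could start out, here is a sketch in Python rather than XSL (the standard library has no XSLT engine, so this walks the tree by hand instead). It assumes the namespace URI seen in the beta output quoted elsewhere in this thread, and it handles only paragraphs, runs, text, and bold; the sample input is hand-written for illustration and is not real Word output.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace as it appears in the beta sample posted in this thread (may change by release).
W = "http://schemas.microsoft.com/office/word/2003/2/wordml"
NS = {"w": W}

# Hand-written illustrative fragment, not actual Word output.
SAMPLE = f"""<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Some Centered Bolded Text</w:t></w:r></w:p>
    <w:p><w:r><w:t>a &lt; b</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def wordml_to_html(xml_text):
    """Turn each w:p into <p>, wrapping runs marked w:b in <b>."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.iterfind(".//w:p", NS):
        runs = []
        for r in p.iterfind("w:r", NS):
            text = escape("".join(t.text or "" for t in r.iterfind("w:t", NS)))
            if r.find("w:rPr/w:b", NS) is not None:
                text = f"<b>{text}</b>"
            runs.append(text)
        paragraphs.append(f"<p>{''.join(runs)}</p>")
    return "\n".join(paragraphs)


html_out = wordml_to_html(SAMPLE)
print(html_out)
```

Styles, tables, and fonts would need a lot more mapping code, but the structure of the job stays the same: find an element, emit its counterpart.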

    What the authors of this article are REALLY bitching at is the fact that MS didn't buy into the OpenOffice Document Specification from OASIS. MS prolly sees OASIS as the US sees the UN. Defunct, not needed.

    If you describe your data using XML semantics, and all it takes to convert from semantic style A to B is some XSL, then who cares about forcing everyone to use one specific format.

    -malakai

  35. Re:Separating Content from Presentation a Good Thi by Delirium+Tremens · · Score: 2, Interesting

    Check your favourite HTML tutorial.
    Yes, good HTML is valid XML.
    Unlike your example, which is not even valid XML. But that's beside the point.

  36. WordML by malakai · · Score: 4, Informative

    If you "Save as XML" in Office 11, then by default the data is saved as WordML. WordML is an XML version of MS's internal storage format (basically RTF). OpenOffice could quite easily write an interpreter for WordML. Hell, I could write a WYSIWYG editor for WordML in a day. If that. It's pretty simple if you understand the basics of RTF.

    It's only when you Save as XML with the "Data Only" checkbox that you get into stripping formatting (and rightly so). Word WARNS you about this. In addition, you can specify your own XSD to save to. And Word will VALIDATE this for you. Not to mention, you can use a Word tool to map elements of Word documents to elements of your schema. DAMN COOL.

    In addition (as if that isn't enough) when you save, in either way, you have the option of specifying an XSL style sheet. It'll go ahead and transform the output for you as part of the save.

    The only thing the OpenOffice people are upset about is that MS didn't buy into the OASIS/OpenOffice Document Specification. Tough shit. I'll write them an XSL that'll work against WordML to solve that for them. Lazy bastards.

    -malakai

  37. REPEAT AFTER ME: XML IS NOT A FILE FORMAT by Trailer+Trash · · Score: 5, Interesting

    Internet World is reporting that initial reports from Office 2003 beta testers don't look good for those hoping to share documents with non-MS systems using the XML file format...

    That's because XML is not a file format, it is instead a format for file formats. To quote the O'Reilly "Learning XML" book, page 2:

    Note that despite its name, XML is not itself a markup language: it's a set of rules for building markup languages.

    I've said this many times on /. (look at my history), but the fact that a particular format is XML-based says nothing of your ability to read it. I'm even going beyond the fact that Microsoft could simply stick their traditional file formats into a CDATA and claim XML compliancy.

    The statement "If Microsoft used a standard XML format for their documents then anyone could read them" makes as much sense as an equally stupid statement like "If Microsoft just used 8-bit bytes in their file formats then anyone could read them".
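
A small sketch of the CDATA point, for anyone who wants to see it rather than take my word: the file below is perfectly well-formed XML and any parser will accept it, yet all the parser hands back is an opaque blob. The `document` element name and the payload bytes are arbitrary stand-ins, not anything Microsoft ships.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary bytes standing in for a proprietary binary file.
blob = bytes([0xD0, 0xCF, 0x11, 0xE0]) + b"opaque binary payload"

# Wrap the blob in a single element; base64 keeps the CDATA section valid.
wrapped = ("<document><![CDATA["
           + base64.b64encode(blob).decode("ascii")
           + "]]></document>")

root = ET.fromstring(wrapped)          # parses without complaint
payload = base64.b64decode(root.text)  # and all we get back is the blob

print(root.tag, len(root), len(payload))
```

The parser reports one element with zero children. Whether you can actually read the document depends entirely on knowing the vocabulary and the payload format, not on the angle brackets.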

    Sorry to rant, but the level of cluelessness around XML is astounding. Please read up, there's a ton of useful information on XML around the internet.

    MDC

  38. Save As XML = WordML by malakai · · Score: 5, Informative
    Taken from a real review of the XML/Office features:

    Once valid, the document can be saved as XML in two ways. The default is to create WordML, which preserves Word's styles and formatting in an XML name-space that's separate from the one bound to the schema-controlled data. You can optionally save through an XSLT transformation which, in a publish-to-the-Web scenario, could translate WordML formatting into HTML/CSS formatting. Alternatively, if you tick the Save as Data option, you can instead save just the raw XML data. In that case, you can bind one or more XSLT stylesheets to the document, each of which can generate WordML styles and formatting.


    InternetNews is authored by morons.

    -malakai
    1. Re:Save As XML = WordML by Hangtime · · Score: 5, Informative

      Same thing with Excel, you can save as Excel with formatting or not. This comes from the Excel XML with formatting. Quite simply the article is flamebait.

      <Style ss:ID="s26" ss:Parent="s16">
      <Borders>
      <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
      <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
      </Borders>
      <Font ss:FontName="Times New Roman" x:Family="Roman" ss:Size="12" ss:Bold="1"/>
      <NumberFormat ss:Format="_(* #,##0_);_(* \(#,##0\);_(* &quot;-&quot;??_);_(@_)"/>
      </Style>
      <Style ss:ID="s27">
      <Alignment ss:Vertical="Bottom"/>
      <Borders/>
      <Font ss:FontName="Geneva"/>
      <Interior/>
      <NumberFormat/>
      <Protection/>
      </Style>
      <Style ss:ID="s28">
      <Font ss:FontName="Geneva" ss:Size="12"/>
      <NumberFormat ss:Format="0.0"/>
      </Style>

      <Stuff in between here to get around Lameness filter>

      <Style ss:ID="s27">
      <Alignment ss:Vertical="Bottom"/>
      <Borders/>
      <Font ss:FontName="Geneva"/>
      <Interior/>
      <NumberFormat/>
      <Protection/>
      </Style>
      <Style ss:ID="s28">
      <Font ss:FontName="Geneva" ss:Size="12"/>
      <NumberFormat ss:Format="0.0"/>
      </Style>

  39. Real World Vs. Game by blahlemon · · Score: 5, Funny
    Truth be told the real disadvantage to this being the real world vs. a game is I can't set the level of difficulty to my liking...nor can I stop and speed up time.

    Or spy on other people from a God perspective. Damn you! Now I'll have to spend the rest of my day realizing how pathetically small my scope is...

    --
    It take more faith to believe in evolution than it takes to believe in God
  40. Re:Separating Content from Presentation a Good Thi by sporty · · Score: 2, Insightful

    Nononono. Word is all about presentation of data. Some of the data IS the presentation. Writing, "The bullet points below" with a list of bullets below.

    Taking the presentation out of data would be like making PSDs XML but putting the colour in some hidden away place. You'd have only the useless basics and nothing else.

    At least XLink in the "presentation layer" you are imagining, in a separate resource file... a la XSL or SOMETHING.

    --

    -
    ping -f 255.255.255.255 # if only

  41. Re:Separating Content from Presentation a Good Thi by Azghoul · · Score: 5, Insightful

    Your use of the tired "Bzzzzt" exclamation at the beginning of your post completely overwhelmed any potential interest in whatever it was that you were trying to say.

    Please, next time try to avoid the condescending tone, people might respond more constructively.

  42. Once again, MS gets slapped by FUD by Planesdragon · · Score: 2, Interesting

    From the article:

    "XML and Web services use, especially for content-driven applications, is still very much limited to basic use of XML as a data-exchange mechanism between systems -- primarily for internal integration approaches," he said. "When dealing with exchanging information internally, what is most important is not to bundle all collaborative features into making for a huge, cumbersome XML file that only certain applications can process, but rather to strip out all the presentation layer features and focus on just the data to be exchanged. In this case, I don't see how Microsoft is violating that. You can choose to save a document with all the rich presentation data left in, if you choose (and that data will only be processable by Office applications), or you can choose to save the XML with just the data in it. I don't see how that cripples anything."

    If MS is doing XML right, an XML export from word will only mark the text file with the necessary handles to bind to the formatting file. If you open the text file without the formatting file, you get rather plain text.

    The same thing happened with MSHTML. Yes, it's got a lot of proprietary comments in it (the "" tags), but the CSS and formatting designations are as standard as the crude hacks and random idiosyncrasies that a human web designer may do.

    Plus, it's only an "early beta." I hope that the authors of the article send their comments to MS, so MS can expand on what their XML exports can do.

  43. Seems reasonable to me... by TheLastUser · · Score: 2, Informative

    XML is no place for presentation markup. That should be done with XSL.

  44. Goldfarb's conjecture by RobotWisdom · · Score: 4, Informative
    I think the point is that if you save to their XML specification, you will lose all your document formatting.

    I think the root of the confusion goes back to Goldfarb's original theory for SGML-- that the styles in a document are secondary to the structures, and should be kept separate.

    This has been a religious conviction ever since, despite the fact that most authors are messy and intuitive, and SGML-etc are very, very rigid and unintuitive. The rationalisation is that messy authors can just represent their styles using 'fake' (ad hoc) XML, but if this turns out to be 90% of the real users of MS Office, then I think MS could indeed save valid XML, but it won't be portable in any useful sense.

  45. drm ? by Billly+Gates · · Score: 2, Informative

    Isn't Office 2k3 supposed to support DRM encryption? If so then this would make the file format useless since it will be encrypted.

    From a pure business standpoint (not technical), a closed proprietary file format is essential if you want consumer lock-in to keep prices sky high. If a competitor can write software that can read your files and format them properly then you lose your file format monopoly and would have to compete with everyone else.

  46. Did anyone RTFA?? by banka · · Score: 2, Informative

    the article says it all.
    to quote:


    However, Mark McWilliams, a software engineer and Office 2003 beta tester, said he has seen nothing to indicate that Office 2003 removes formatting information from files saved in .xml. He noted that he opened a heavily formatted .doc Microsoft Word file, saved the file as XML, and later opened the file in Word 2003.

    "The opened XML document looks exactly like the original .doc file," he said. "And if I open up the XML file in a text editor, I can see that all of the formatting is properly maintained in the XML file."

    He also noted that when saving a file, a user has the option of saving in a "data only" XML format which does remove formatting.

    1. Re:Did anyone RTFA?? by aluminum+boy · · Score: 2, Interesting
      I totally agree. In fact, a single XSLT would likely be required to convert the XP-XML into the OASIS model, with or without formatting! Sounds much more interoperable than it was previously.

      It would be nice if Microsoft used an open (OASIS) format, but it sure doesn't sound like people are locked into the format.
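
As a rough illustration of the "data only" save and of how mechanical such a conversion can be, here is a small Python sketch. The tag names (`<para>`, `<b>`, `<i>`, `<font>`) are hypothetical stand-ins and are not the real Office 2003 or OASIS schemas; a real conversion would more likely be an XSLT stylesheet, and plain Python is swapped in here only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <b>, <i> and <font> stand in for presentation tags.
# This is NOT Microsoft's actual schema, just an illustration.
FORMATTED = """<document>
  <para><b>Quarterly</b> report for <i>Joe in marketing</i></para>
  <para><font size="14">Second paragraph.</font></para>
</document>"""

def data_only(xml_text):
    """Return the document with only the text of each <para> kept."""
    src = ET.fromstring(xml_text)
    out = ET.Element(src.tag)
    for para in src.iter("para"):
        plain = ET.SubElement(out, "para")
        # itertext() walks all nested text, dropping the wrapping tags.
        plain.text = "".join(para.itertext())
    return ET.tostring(out, encoding="unicode")

print(data_only(FORMATTED))
```

The formatted source is left untouched, so both variants can be kept side by side, much like the two save options the article mentions.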

  47. And we would 'upgrade'.... WHY??? by GreenEggsAndSpam · · Score: 3, Insightful

    Without going into the evils of Microsoft and its Office products, or how there are better OSes and products out there on the market, I'd like to ask: WHY??? Office 95 to 97 was a substantial jump. 97 to 2000 was a fairly substantial jump: stability, document abilities, general ease of use. Most people were happy with 2000. Stable, if large. 2000 to XP: smaller install, activation / registration nightmare, some interface changes, but otherwise the application is the same. How documents are saved, their base format, has been changed, yet to the end user this should be transparent. XP to 2003: What are the major differences? I mean... yes, it's going to be new, in a new year, but why would Joe Schmoe, Enterprise User (or home user, for that matter) want to shell out a couple hundred dollars per license when the increase in functionality will be limited? Increased document collaboration would be good, yes, but is it truly worth the cost? How many users don't KNOW how to use the advanced features? I work as a sysadmin at a plastics factory, and the majority of the users barely know how to use a keyboard. I've worked in an insurance company, where I had to teach the correlation between moving the mouse and the pointer on the screen moving. I've done the dot-com thing, with users wanting more but not using it properly. What are the odds that an entire company would be utilizing the software to its fullest potential? And what percentage of a company would actually get an advantage out of using these features, compared to the time required to train an entire office? Half of it would backfire if some users didn't understand the base concepts, as most don't. Thoughts?

    --
    When all else fails, use fire.
  48. On XML file formats.. by PeekabooCaribou · · Score: 4, Informative
    I realize this is redundant by now, but I think this is important enough to warrant a few duplicated posts. For Microsoft's XML format to be useful (and even worth implementing), it's going to require some advantages above and beyond what plain text formatting offers. The only completely useless XML format would be:
    <document>
    This is my document.
    Second paragraph.
    </document>
    I make the assumption that at least some tags are applied, such as some sort of paragraph tags and the like. I may be going out on a limb here, but I would even assume that their final XML format will produce documents identical to .doc files. I would also assume that I could pass this file off to Joe in marketing, and he would see a document identical to the one I saw. What I'm getting at is that style has to be held somewhere. If the XML file has no style associated with it, then congratulations, Microsoft, you did it right. But if Word can display the right formatting, then so can anyone else. (Assuming Word doesn't store the styles in a proprietary format, which I don't think is beyond them.) But why am I even writing this? From the article:
    However, Mark McWilliams, a software engineer and Office 2003 beta tester, said he has seen nothing to indicate that Office 2003 removes formatting information from files saved in .xml. He noted that he opened a heavily formatted .doc Microsoft Word file, saved the file as XML, and later opened the file in Word 2003. "The opened XML document looks exactly like the original .doc file," he said. "And if I open up the XML file in a text editor, I can see that all of the formatting is properly maintained in the XML file."
    Time will tell.
    --
    "I'll say it again for the logic-impaired." -- Larry Wall.
  49. Not surprising by failedlogic · · Score: 2, Interesting

    I'm not surprised by any means that MS would choose a proprietary standard. It seems MS over the years has made importing/exporting .DOC documents much harder, locking people into using their apps. Which, for MS, is understandable from a business/revenue perspective. WordPerfect, OTOH, seems to have been much easier to convert/import, and most documents I've imported have been nearly flawless.

    Perhaps on a separate note, what format would be best to use to compose essays and large documents in a non-corporate environment? I compose a lot of documents as a student and I require something that I can easily format and safekeep electronically for many years. Other than POT ( no, not that but Plain OLD TEXT ), would some form of XML be better or TeX/teTeX..... ? It would be nice to standardize everything to one format and not have to worry many years later about not being able to retrieve it.

  50. Uh ... what? by Osty · · Score: 3, Interesting

    Furthermore, Office's new collaboration featres will only work with users who are also running Office 2003 (requiring Windows 2000 or 2003) that are connecting over XP servers.

    Excuse me if I don't take this article seriously, but the author apparently knows nothing about Windows. Office 2003 will only work on Windows 2000 or 2003? Not Windows XP? Maybe he meant that the collaboration servers require Windows 2000 servers or Windows Server 2003 servers, since there is no XP Server. And speaking of XP, what exactly does he mean by "connecting over XP servers"? That's simply impossible -- there is no server version of XP, only Home and Pro.


    As for Microsoft not supporting Office on the obsolete Win9x platforms, good for them. It's past time for Win9x to be killed off once and for all. Not supporting it in Office is a good step forward.

  51. Duh. by Tony-A · · Score: 4, Insightful

    How do you figure this is anti-trust?
    Microsoft has been judged a monopolist. Since past behavior is a good indicator of future behavior, there is a presumption that this is anti-competitive behavior until proven otherwise.

    This is simply a company who has the dominant product protecting their lead.
    For a monopolist, nothing is simply any more. In the absence of market forces to correct misbehavior, exactly how they attempt to protect their lead does matter.

    And quite honestly, I dont see anything wrong with that, as long as they confine their practices to their product (ie. they arent making Office the only suite that can run on windows) [emphasis added]
    As long as nothing in the Office Suite promotes the Desktop OS monopoly.
    As long as nothing in the Desktop OS monopoly promotes their own Office Suite.

    But this isnt a game, this is business.
    And screwing your customers is bad business.
    And screwing your suppliers is bad business.
    And screwing your investors is bad business.
    And screwing your employees is bad business.
    Even screwing your competitors is bad business.

    And since businesses are SUPPOSED to make money, they need to make sure people continue to buy MS Office.
    And General Motors needs to make sure people continue to buy Chevrolets.

    And making an office suite that shares documents with all the various third-tier office suites just doesnt do that.
    It just makes incomprehensible gibberish unless the recipient happens to have the exact same sooper-dooper magic decoder ring. Unless I can read my stuff, under circumstances of my own choosing, I have a problem. Unless I can send stuff to my correspondents and they can read it under circumstances of their own choosing, I have a problem. If my documents are hostage to the whims of a supplier, I have a problem.

    Why should my company buy MS Office if the documents it produces are exactly the same as those of FreeBeerOffice?
    New twist on Clippy?
    No reason they should. That's Microsoft's problem, not yours or your company's (unless you work for Microsoft;)

  52. Re:Microsoft's new file format is: by nenolod · · Score: 5, Funny

    Oops, i forgot to set the reply to "Code". Please note, your SAX parser probably wont be able to parse this, heh. It is however, theoretically proper XML.

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <!DOCTYPE document [
    <!ELEMENT document (document_properties, document_section)>
    <!ELEMENT document_properties (title, author, organization, department, job, generalsummary)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT author (#PCDATA)>
    <!ELEMENT organization (#PCDATA)>
    <!ELEMENT department (#PCDATA)>
    <!ELEMENT job (#PCDATA)>
    <!ELEMENT generalsummary (#PCDATA)>
    <!ELEMENT document_section (sectionsummary, proprietarybinary, unenhancedcrappytext)>
    <!ELEMENT sectionsummary (#PCDATA)>
    <!ELEMENT proprietarybinary (#PCDATA)>
    <!ELEMENT unenhancedcrappytext (#PCDATA)>
    ]>
    <document>
    <document_properties>
    <title>Crappydoc</title>
    <author>William H. Gates III</author>
    <organization>BORG</organization>
    <department>Unimatrix 0</department>
    <job>Secondary information processing adjunct</job>
    <generalsummary>Doc about crappy M$ things.</generalsummary>
    </document_properties>
    <document_section>
    <sectionsummary>Haha, you cant parse this and make it look perty, it's BINARY! You're still screwed!</sectionsummary>
    <proprietarybinary>firoiorfioeiojvonvonviniooiwnco ncooisoi39f940f9439 0f904390f94390fj904j90j3f09j4fj3490jf30jf040fj03j0 9fj9340fj043j90fj4903fj9043jfj0vjoirejvoojvoerjgoe jgojerogjoejoenmvotnhnoignoengotnhinringuinfi</proprietarybinary>
    <unenhancedcrappytext>Hehe, doesnt this text just look ugly? I bet it does, if you arent using M$ WORD!</unenhancedcrappytext>
    </document_section>
    </document>
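
Whether a parser will accept a document is easy to check mechanically. The Python sketch below uses the standard library's `xml.etree.ElementTree`; the two sample strings are made up for illustration and are deliberately tiny. It shows that a single mismatched closing tag is enough to make a conforming parser reject the whole document.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False on any parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Illustrative samples: the second closes <author> with </title>.
GOOD = "<document><author>William H. Gates III</author></document>"
BAD = "<document><author>William H. Gates III</title></document>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

Note that this checks well-formedness only. `ElementTree` does not validate against a DTD, so element order and content models go unchecked.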

  53. Article Content Conflicts with /. Posting by pclark999 · · Score: 2, Informative

    I took the trouble to go to Internet World and read the ENTIRE article. The portion of the article quoted on /. clearly implies the information being received about MS is second hand. "REPORTS ARE [emphasis added] that when saving to XML, [Office 2003] strips out the presentation and formatting information...". The person quoted is a representative of the OASIS OpenOffice XML Format Technical Committee, so there is a definite risk of bias, particularly when coupled with secondhand information. The article goes on to quote someone who actually is an Office 2003 beta-tester. He claims that saving in an XML format does not, in fact, strip out the formatting, and states the tests he ran to confirm this. The source of confusion may be in different XML formats supported by Office 2003. There are two, one of which strips out all of the formatting information, while the other does not. A lively debate then ensues over the pros and cons of both approaches.

  54. This may be a stupid question by nicotinix · · Score: 2, Interesting

    but can we not write a little add-on program to word/excel/powerpoint to allow File->Save As->Openoffice.sxw or File->Export->Openoffice.sxw???

    I am not a programmer, so I don't know how feasible that is. I know I would download and install something like that.

  55. Jesus TapDancing Christ by gsfprez · · Score: 2, Insightful

    people who are users like me just want a fscking file that i can open with Word, with OOo, with iWrite.. whatever... and then send it to other people. If it requires the use of pixie dust or ass cream - so long as it works, that's all anyone wants.

    Religious zeal with XML content being separated doesn't MEAN SHIT to users. And it doesn't get me anywhere when the fact remains that when i send in my business proposals to the government, they want it in Word-97 .doc format. Like i can even buy fuscking Office 97....

    wankers. However you want to make an open format - be our (the Joe Salesdepartment) guest... until there is something which is universal (.doc and .pdf) and editable (.doc only) we're stuck realistically with .doc... as bad as it is.

    --
    guns kill people like spoons make Rosie O'Donnell fat.
  56. interoperability should not depend on font size by cyril3 · · Score: 2, Interesting
    I thought xml was supposed to describe data so that an application could do what it wanted without having to follow the rules of the sending app. eg in finance, there will be a set of industry standard tags that an accounting program will set to data in an xml file and if i get that file my program will understand the tags and use the info in ways i want.

    it's not limited to allowing me to display a word document of a report in open office and have it look exactly the same. I want to be able to import the xml file and have my analysis software know that a particular record is a non-current asset etc.

    who cares what font was used. Interoperability should not depend on font size or colour.

    It's precisely the Microsoft specific bits of a file that should be stripped out. If a display property is only available on a ms platform then the xml file should not contain it.

    The big if is whether ms takes out more and leaves the xml file unusable because there is insufficient description.

    What you will find is that industries and user groups will begin to define xml schema for their data. WP will be different but xml will still have a place.
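
A minimal Python sketch of the kind of data-centric use described above. The `<ledger>` and `<asset>` tags and the `class` attribute are hypothetical, made up for this example, and no real accounting schema is implied. The point is only that a receiving program can act on agreed tags without knowing anything about fonts or layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag and attribute names; no real accounting schema is implied.
LEDGER = """<ledger>
  <asset class="non-current" name="Factory building">250000</asset>
  <asset class="current" name="Cash">40000</asset>
  <asset class="non-current" name="Moulding machine">85000</asset>
</ledger>"""

def total_by_class(xml_text, asset_class):
    """Sum the values of all <asset> records with the given class."""
    root = ET.fromstring(xml_text)
    return sum(
        int(asset.text)
        for asset in root.iter("asset")
        if asset.get("class") == asset_class
    )

print(total_by_class(LEDGER, "non-current"))
```

Nothing here depends on how the sending application displayed the figures, only on both sides agreeing what the tags mean.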

  57. Stop bashing because of Bullshit! by Alex_Ionescu · · Score: 2, Insightful

    I think it's about time someone points out that this article, or whoever those "testers" are, are full of sh*t, or have serious problems using a computer.

    Saving in XML format keeps 99% of all the formatting in a .DOC document. I saved a 20-page research with all kinds of pictures (stretched/cropped etc) and using bullets, italics, bold text, different sizes and fonts. I re-opened the document with Word, and it looked just like its .DOC counterpart.

    *FURTHERMORE*, Microsoft has even added an option called "Data Only", which will save only the Data itself in the XML file (-as the format was MADE FOR-). You can then choose to append an XSL file for the format.

    MS pleases both sides, both the strict-XML-Data-Only group, as well as the maximum-openness group, and yet over 550 posts are complaining about an article with no substance. I don't love MS, but don't bash them for something they've done right.

    The XML saving feature in Word is flawless and seems to be standard-compliant. Any XML reader should be able to display the document properly, under any OS.