Slashdot Mirror


Tim Bray on Microsoft Office

jgeelan writes "The co-inventor of XML, Tim Bray, has been talking about the newly XML-enabled version of Microsoft Office, code-named 'Office 11' and tells XML-Journal that 'when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.'"

495 comments

  1. Yay Evil Monopoly Of Doom! by Sneftel · · Score: 3, Interesting

    Wow, I was way off when I predicted that Microsoft would further obfuscate their Word format. This seems to be in all respects a Good Thing.

    StarOffice has used XML for their native file formats for some time now; I wonder if this means we'll see an even better-quality translator between the two formats?

    --
    The opinions stated herein do not necessarily represent those of anybody at all. Deal with it.
    1. Re:Yay Evil Monopoly Of Doom! by OrangeSpyderMan · · Score: 3, Insightful

      Wow, I was way off when I predicted that Microsoft would further obfuscate their Word format.

      They won't have to. Since they are going the SQL Server way for their filesystem, they can happily give away the hold they have on file formats, since they are going to have a stranglehold on accessing those files. You want an open file system? Here you go (and MS has a lot to gain by doing this - they instantly give Word access to most other data formats) - but don't think anything other than a Microsoft OS will actually be able to access the files - thanks to our new deliciously obfuscated method of storing data on a disk. Reverse engineering kernel level SQL data (with a bit of crypto, for DRM of course, thrown in) will probably be even harder than reverse engineering file formats was. And impossible to do legally (say hi to all those DMCA guys out there.)

      --
      Try NetBSD... safe,straightforward,useful.
    2. Re:Yay Evil Monopoly Of Doom! by Jeremiah+Cornelius · · Score: 4, Insightful
      I don't believe any of this crap is going to happen from MS. Not for a New York second.

      You'll be DMCA'd out of the loop for trying, and the format will validate itself with 'Palladium' features in software, or some such.

      However, the mind reels at the idea of managing PowerPoint and Excel files from emacs!

      --
      "Flyin' in just a sweet place,
      Never been known to fail..."
    3. Re:Yay Evil Monopoly Of Doom! by foniksonik · · Score: 1, Offtopic

      Mod this up. Pretty insightful look at MS approach. The kernel level SQL part especially... look out for the dangerous bits of Long Horns, eh?

      Good stuff.

      --
      A fool throws a stone into a well and a thousand sages can not remove it.
    4. Re:Yay Evil Monopoly Of Doom! by tonywestonuk · · Score: 5, Insightful

      So, what happens when someone wants to email an XML enabled Word document...... Does it somehow become encrypted on its way out of the database, remain scrambled on its way over the internet, and reassemble itself into nice XML once it arrives on the recipient's computer?.... Doesn't sound like XML to me?!

    5. Re:Yay Evil Monopoly Of Doom! by jsse · · Score: 5, Funny

      I don't believe any of this crap is going to happen from MS. Not for a New York second.

      Dark-masked B.Gates approaching you:
      "I find your lack of faith....disturbing."

    6. Re:Yay Evil Monopoly Of Doom! by DNS-and-BIND · · Score: 2, Troll

      They'll simply add "features" to XML, enabling a Microsoft extension of the standard. The new MSXML will be copyrighted by Microsoft.

      --
      Shutting down free speech with violence isn't fighting fascism. It IS fascism!
    7. Re:Yay Evil Monopoly Of Doom! by thelen · · Score: 5, Insightful

      Okay, so it'll be harder to mount a windows partition effectively, but this doesn't affect transmission of documents, especially if they're stored in an XML format. As for me, I think it's more valuable to have files that I can read outside of their native filesystem rather than have a readable filesystem filled with unreadable files.

    8. Re:Yay Evil Monopoly Of Doom! by passthecrackpipe · · Score: 5, Insightful
      No you were not. MS routinely uses XML to encapsulate (proprietary) binary data. In the case of the MSOffice file format, this is especially true, but to a lesser extent this also goes for stuff like BizTalk etc (that has a terrible license attached to it). If MS is *really* serious about using open formats, and using XML in their Office suite, they should put their money where their mouth is and join in the OpenOffice File format project. Most of the opensource players are working there already, and the EU is also set to join. I assure you that mature participation of Microsoft would be very welcome.

      Of course, this will never happen. Instead, MS will continue to push their own "open" XML based file formats. Microsoft Kerberos, anyone?

      --
      People who think they know everything are a great annoyance to those of us who do.
    9. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      But then you say "belittle", even though you wouldn't say "belot".

    10. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      ...

    11. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 2, Insightful

      No, and 8-bit binary .DOC files don't just become scrambled either.

      XML doesn't make things easier to parse; you still have to figure out what each element means, just as you would have to figure out 04 07 in a binary file.

    12. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      you wouldn't say belot because regardless of how you spell it, it ain't a word.

      A little and alot are both, uh, words, or whatever the word is for two words that make up one word. compound word? binary word? wordlet?

      People love to verb nouns.

    13. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Totally agree
      mod it up one more! :)

    14. Re:Yay Evil Monopoly Of Doom! by JohnFluxx · · Score: 2, Informative

      XML does support encryption of its data...

    15. Re:Yay Evil Monopoly Of Doom! by OrangeSpyderMan · · Score: 3, Interesting

      It will indeed be harder to mount the partition. It may also be harder to use that XML data, since what we may be talking about is XML encapsulation of binary, proprietary, encrypted file formats. Don't necessarily think you're going to receive at the other end a plaintext file with a few tags - what you will receive will have been through a closed kernel "request" to an encrypted database "filesystem", a proprietary DRM system (hardware and software) - and you genuinely believe they're just gonna bang it out as plaintext at the other end?

      --
      Try NetBSD... safe,straightforward,useful.
    16. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Exactly.

      I'd edit them in vi.

      .
      .
      ~
      ~
      ~
      :

    17. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Wow, you guys are truly masters of pessimism and negativity. True cynics at their finest.

    18. Re:Yay Evil Monopoly Of Doom! by AirFax · · Score: 1

      Office should support all versions of Windows. At least starting from ME.

    19. Re:Yay Evil Monopoly Of Doom! by vandemar · · Score: 2
      Dark-masked B.Gates

      Also known as Darth Fences of Microsith, sworn enemy of the Jedix, and commander of the NTie fighter squadrons.

    20. Re:Yay Evil Monopoly Of Doom! by fitten · · Score: 1

      Database file systems weren't dreamed up by Microsoft, afaik. I think there are several platforms that have been hinted at having dbfs in the future plans.

    21. Re:Yay Evil Monopoly Of Doom! by Perl-Pusher · · Score: 3, Insightful
      There is also the fact that Microsoft loves to put stuff in their EULA. I can also imagine anyone producing a reader for the "encrypted XML" running afoul of the DMCA.

      "Doesn't sound like XML to me?!"

      Sure it is! It's XML with Microsoft Security Extensions!

    22. Re:Yay Evil Monopoly Of Doom! by archen · · Score: 1

      Wow, I was way off when I predicted that Microsoft would further obfuscate their Word format. This seems to be in all respects a Good Thing.

      Wait until you have to buy a new version of MS Office to read them. Okay, maybe I'm too much into conspiracies, but I don't think it matters anymore to MS. They have enough dominance that they can push this "subscription" model onto Office. This means that it doesn't matter if they don't update the software, or improve their software in any way. You pay regardless. Changing their format just one more time is all they need if they can get everyone into the subscription model. (And we all know how the forced upgrade works because older versions can't read the new format)

    23. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Kerberos is a domain level authentication protocol, it has nothing to do with XML. Get your facts straight to avoid sounding like an idiot.

    24. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Finally, someone with some sense. Yes, you will be able to get your documents out of the database/fs and store them as formats lesser OSes, such as Linux, will be able to access. Only a completely paranoid moron would believe that you could only access your files on your one machine. But, seeing as that makes up the majority of the Linux community, I can see how that would be a prevalent thought.

    25. Re:Yay Evil Monopoly Of Doom! by passthecrackpipe · · Score: 1

      Hmm, I think you are the one sounding like an idiot. It is well known that when MS implemented Kerberos within Active Directory, they did so with several non-standard, proprietary extensions, in the same way they are doing with the Office XML file format. Have you ever opened an Office XML file in a text editor? Have you ever tried Active Directory Kerberos interoperability? Thought not.....

      --
      People who think they know everything are a great annoyance to those of us who do.
    26. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      I agree with the other poster, you really do sound like an idiot with your "Of course... blah blah" karma whoring. I almost signed up as a 'fan' of yours until I saw that.

    27. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Just two words... err... one word and one acronym:

      TCPA and Palladium

    28. Re:Yay Evil Monopoly Of Doom! by OrangeSpyderMan · · Score: 2

      Indeed, but then neither were OSes, word-processors, spreadsheets, browsers... etc. Never stopped Microsoft trying to bolt down their implementations of each one as hard as they can. The problem isn't with "dbfs" - it's a damn good idea, it's just that it also happens to be a damn good way of obfuscating the data if you don't want to play fair, and past experience has proved that MS don't. I am more than willing to be proven wrong.

      --
      Try NetBSD... safe,straightforward,useful.
    29. Re:Yay Evil Monopoly Of Doom! by passthecrackpipe · · Score: 2, Funny

      OH NO!!!! An anonymous poster, who I don't know and will never meet has decided to NOT pledge his/her alliance with me on an Internet-based forum. I am shocked! My self-esteem has plummeted! How will I survive this massive blow to my ego!?!

      --
      People who think they know everything are a great annoyance to those of us who do.
    30. Re:Yay Evil Monopoly Of Doom! by Old+Wolf · · Score: 1

      Why you?

    31. Re:Yay Evil Monopoly Of Doom! by vluther · · Score: 1

      so you're saying I won't be able to use scp to copy said file to linux box ?...just because a txt file created on an ntfs partition can't be written to while it's still on said ntfs partition, doesn't mean it can't be written to when moved to a different filesystem...

      or am i missing something even more sinister ?

    32. Re:Yay Evil Monopoly Of Doom! by Abreu · · Score: 2

      There is already a msxml.dll file in windows... It is used by Internet Explorer to parse xslt documents in a non-standard way, driving me nuts ...and forcing me (again) to program websites for two (sometimes three) platforms instead of one.

      --
      No sig for the moment.
    33. Re:Yay Evil Monopoly Of Doom! by dwgranth · · Score: 1

      yup... i cant see that happening... this is more likely from MS CDS Link ;)

    34. Re:Yay Evil Monopoly Of Doom! by GargoyleMT · · Score: 1

      Me, a name I call myself, Far, a long, long way to run
      Sew, a needle pulling thread, La, a note to follow So
      Tea, a drink with jam and bread, that will bring us back to Do...

    35. Re:Yay Evil Monopoly Of Doom! by darien · · Score: 0, Offtopic

      I say "bequiet."

    36. Re:Yay Evil Monopoly Of Doom! by badhack · · Score: 1
      If you were going to pick two numbers, I would have at least picked 47 46.


      And wouldn't XML be easier to parse? It's well documented, libraries (for linux) readily available, and it's highly organized.


      badhack
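
      To make the "libraries are readily available" point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element names (document, para, b) and the style attribute are invented for illustration; they are assumptions, not the actual Office 11 schema, which this thread does not show.

```python
import xml.etree.ElementTree as ET

# Hypothetical word-processing markup. The element and attribute names are
# invented for this sketch and are not taken from any real Office schema.
SAMPLE = """<document>
  <para style="Heading1">Quarterly report</para>
  <para>Sales rose in <b>March</b>.</para>
</document>"""


def paragraphs(xml_text):
    """Return a (style, plain text) pair for each <para> element."""
    root = ET.fromstring(xml_text)
    result = []
    for para in root.iter("para"):
        style = para.get("style", "Normal")  # default when no style attribute
        text = "".join(para.itertext())      # flatten nested inline markup
        result.append((style, text))
    return result


if __name__ == "__main__":
    for style, text in paragraphs(SAMPLE):
        print(style, "|", text)
```

      The library handles tokenizing and nesting, which is the part a binary format forces you to reverse engineer. Knowing that para means "paragraph" still has to come from documentation or guesswork.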

    37. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      So, what happens when someone wants to email an XML enabled Word document......

      To you it looks like an e-mail attachment.

      To me, it looks like http://WheresYourLicence.passport.microsoft.com?GetDoc.aspx?key=xxxxx

    38. Re:Yay Evil Monopoly Of Doom! by OrangeSpyderMan · · Score: 2

      What makes you think that a file system will just support SQL request from a client. Because it uses SQL engine? Christ you guys have learnt nothing from MS's behaviour over the last 10 years. At what stage have they ever modified anything to help 3rd party products integrate? One example, anyone? You think all that data is just gonna be sitting there, freely accessible to anyone who can write an SQL request? Palladium, anyone? The DMCA was designed from the ground up to make it worthwhile for large companies to implement encryption and DRM. What you describe is not a 'dbfs' but a database, and for it to work that way, you'd still need a filesystem to organise data physically on the disk. That is not what this is about - this is about using a db engine (not a db) to organise and access the physical data on the disk. This means the db ain't just gonna be a db, but it's going to have to implement whackloads of fs stuff too, just to write the data to the disk, and the chances of MS not taking this opportunity to obfuscate data are very very slim indeed.

      --
      Try NetBSD... safe,straightforward,useful.
    39. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 1, Insightful

      First of all, why would Microsoft bother creating an XML file format if it was just an "encapsulation of binary, proprietary, encrypted file formats"? What would be the point? A PR move to say that they use XML? Having not seen what Office 2002 generates in the way of XML, I can't really say how obfuscated it is or isn't; however, I can't think of any reason to adopt an XML format if it wasn't at least a little more open than the binary file formats they've been using.

      Also, how would a "binary, proprietary, encrypted file format" fit into everything else Microsoft is doing with .NET? Wouldn't Microsoft want the content of a document to be open enough so that it could be read and processed by applications using .NET's XML libraries? If they're going to sell the whole .NET XML concept, it would be a big advantage to say that you can process documents generated by the Office suite.

      Explain to me why Microsoft would want to prevent you from sending your self-generated Word documents to another computer? What possible sense does this make? Is it because they hate their customers and want to piss them off so they won't use Microsoft products any more? Has RedHat paid Microsoft to include technology that will piss off all Windows users?

      The whole point of Palladium is that the content provider chooses how the content can be distributed. Microsoft has no interest in protecting documents you've generated yourself. Palladium in and of itself doesn't do anything; it's the content providers (audio, video, software, and hardware providers) that will make the thing fail if they make the content controls too restrictive. I don't have a problem with the content industry trying it out as long as they're up front about any restrictions on the content itself. However, I do have a problem with making it illegal to reverse engineer and break encryption, but that's a different story.

      They've got their faults, but in the end both the content companies and Microsoft are businesses. They've got to respond to their customers (us), otherwise we'll go elsewhere. It's free as in country.

    40. Re:Yay Evil Monopoly Of Doom! by donutello · · Score: 5, Informative

      What a bunch of pseudo-technical garbage!

      I have a Masters in Computer Science with a focus on databases and storage technology and very little of what you said makes any sense to me. There's nothing easier than getting at data stored in SQL. Where I work, we've shipped a few products where we didn't document the schema because it was too complex and we didn't feel we could support it. Within weeks, almost all of our major customers had it reverse-engineered anyway. SQL is very easy to get at!

      kernel level SQL data

      There's no such thing. SQL data is stored in tables. You use queries to get at it. Period.

      Also, your story doesn't make any sense. The article says Office 11 is in Beta already. IIRC, the SQL Server and Palladium stuff in the OS doesn't come until Longhorn. Do you think they will actually release a version of Office which won't work until their next OS (who knows when that will be) is released and adopted? How will they make money off all the people who recently upgraded to Windows XP then?
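
      As a rough illustration of why an undocumented schema does not stay hidden for long, the sketch below asks a database to describe itself. It uses SQLite from Python's standard library purely as a stand-in (an assumption; the products discussed here would be SQL Server, whose catalog views differ), and the table names are made up.

```python
import sqlite3


def describe_schema(conn):
    """Map each table name to the list of its column names."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    schema = {}
    for (name,) in tables:
        # PRAGMA table_info yields one row per column; index 1 is the name.
        cols = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        schema[name] = [col[1] for col in cols]
    return schema


def demo():
    """Build a small 'undocumented' database and recover its layout."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables with deliberately unhelpful names.
    conn.execute("CREATE TABLE t_0041 (id INTEGER PRIMARY KEY, cust TEXT, amt REAL)")
    conn.execute("CREATE TABLE t_0042 (id INTEGER PRIMARY KEY, t41_id INTEGER)")
    return describe_schema(conn)


if __name__ == "__main__":
    for table, columns in demo().items():
        print(table, columns)
```

      This only works while a query engine is available to answer, which is the objection raised elsewhere in the thread about reading such data from another operating system.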

      --
      Mmmm.. Donuts
    41. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Where have you been the last 20 years? Do you actually think MS is just going to give away the keys to the castle? There is going to be a catch. And I am not a Linux zealot, just a casual observer.

    42. Re:Yay Evil Monopoly Of Doom! by FatherOfONe · · Score: 4, Insightful

      Not that I totally disagree with your point, but with ".Net" people will be discouraged, or it will be far more difficult to send the actual document. My guess is that some future version of Office will default to "Send the shortcut".

      Now they of course will change Office for the Mac to read from those servers... The data WILL be stored in XML on those servers, so coders will have an easy time with it.

      You bring up an interesting point about paranoid people and Microsoft. I have followed Microsoft fairly closely over the last ~18 years and feel comfortable saying that they have never worked with any "standard" out there. They have ALWAYS developed their own. Can you name an example of any "standard" software technology they have adopted and not changed? A perfect example of this would be ZIP. Why doesn't Microsoft use it instead of CAB files? There are many many more I could use as examples if you would like.

      Microsoft has an internal saying "If it is not ours destroy it".

      My point is this. A company that has for 18 years been trying to lock people in to their technology, will cause some people to be a bit paranoid.

      --
      The more I learn about science, the more my faith in God increases.
    43. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      we didn't document the schema because it was too complex and we didn't feel we could support it

      Within weeks, almost all of our major customrs had it reverse-engineered anyway.

      You might wanna switch businesses, mate. If you couldn't document something you implemented because it was too complex, but your customers could reverse engineer it in weeks, you've either got the brightest customers around, or the dumbest engineers.

    44. Re:Yay Evil Monopoly Of Doom! by sketerpot · · Score: 2, Insightful
      Sure it is! It's XML with Microsoft Security Extensions!

      That reminds me of something that MS has been doing for quite a while now: the file type reported for any HTML files is "Microsoft HTML file" (your system may vary). Will XML become Microsoft XML? I hope not.

      If everything about this really is kosher, though, then everybody give a great big "Thank You!" to MS!

    45. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      I feel another embrace and extend coming on..

    46. Re:Yay Evil Monopoly Of Doom! by spitzak · · Score: 3, Insightful
      why would Microsoft bother creating an XML file format if it was just an "encapsulation of binary, proprietary, encrypted
      file formats"? What would be the point? A PR move to say that they use XML?


      YES! Now you are starting to get it!

      I can't think of any reason to adopt an XML format if it wasn't at
      least a little more open than the binary file formats they've been using.


      How about for a "PR move to say they use XML". In addition it is obvious how to make an XML that is exactly as obscure, by putting the entire contents of the old format into a binary block.

      Also, how would a "binary, proprietary, encrypted file format" fit into everything else Microsoft is doing with .NET? Wouldn't Microsoft
      want the content of a document to be open enough so that it could be read and processed by applications using .NET's XML libraries?


      No, of course not. You would only read Word documents with the special "read a Word document" interface. It might use the XML libraries underneath, but big deal. Be assured you will be unable to reconstruct all the contents of the document by any kind of perverted arrangement of calls to the "read a Word document interface". (though not just a complaint about MicroSoft, I think .NET, DCOM, CORBA, KCOP, etc all pervert the idea of "object orientation" by making elaborate communication protocols which are only "object oriented" because they call some part of the protocol an "object". Real object-orientation means there is some commonality of functionality, and the only instances I can think of that really work are the original Unix where everything known then (terminals, printers, tapes, disks) used the same read/write/seek calls, and Plan9 which tries to extend this to networks and file systems).

      Explain to me why Microsoft would want to prevent you from sending your self-generated Word documents to another computer? What possible sense does this make? Is it because they hate their customers and want to piss them off so they won't use Microsoft products any more? Has RedHat paid Microsoft to include technology that will piss off all Windows users?

      Ha ha, very funny. Of course you will be able to send a Word document to another computer. It will still be an unreadable Word document. If they can obfuscate things so that the destination computer also has to be running Windows, all the better. You seem to be under the weird delusion that "other computer" meant "other computer running Windows" when in fact I'm sure every other poster here knew it meant the exact opposite, ie "other computer not controlled by MicroSoft".
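
      The "binary block" scenario described above is easy to picture. The sketch below wraps opaque bytes in a well-formed XML envelope using Python's standard library. The element name binaryData and the sample bytes are invented for illustration; nothing here is taken from an actual Microsoft format.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for a legacy binary document. The bytes are arbitrary.
LEGACY_BYTES = b"\x04\x07\x00\xff\x10\x20"


def wrap_in_xml(blob):
    """Wrap opaque bytes in an XML envelope (hypothetical element names)."""
    root = ET.Element("document")
    payload = ET.SubElement(root, "binaryData", encoding="base64")
    payload.text = base64.b64encode(blob).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap(xml_text):
    """Recover the original bytes; they are exactly as opaque as before."""
    root = ET.fromstring(xml_text)
    return base64.b64decode(root.find("binaryData").text)


if __name__ == "__main__":
    envelope = wrap_in_xml(LEGACY_BYTES)
    print(envelope)
    print(unwrap(envelope) == LEGACY_BYTES)
```

      Any XML parser will accept the envelope, yet a reader learns nothing new about the payload, so such a file would be XML without being any more open.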

    47. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 1, Funny

      Also, how would a "binary, proprietary, encrypted file format" fit into everything else Microsoft is doing with .NET?

      It would help binary, proprietary, encrypted MS apps interchange data within a framework.

    48. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 2, Insightful

      No, of course not. You would only read Word documents with the special "read a Word document" interface. It might use the XML libraries underneath, but big deal. Be assured you will be unable to reconstruct all the contents of the document by any kind of perverted arrangement of calls to the "read a Word document interface".

      I think you are getting near the point.

      A big problem with MS Office right now is that the file formats are such a mess that Nobody can parse the documents without MS Office, and that includes Microsoft.

      If MS wants to get into the content/groupware market, they NEED server-side processing that doesn't rely on running a single-threaded 15MB WINWORD.EXE process.

      Using an XML format allows Microsoft to build a clean C# component implementation of "Read a Word Document", or a "Save SQL Server Data As Excel" without being fucked by their own file formats.

    49. Re:Yay Evil Monopoly Of Doom! by Planesdragon · · Score: 1

      One example, anyone?

      How about two?

      1: Having a 'shell' line in Windows so a savvy user can pick their own shell.

      2: XP service pack 1, with its "set program access and settings" button.

      the chances of MS not taking this opportunity to obfuscate data are very very slim indeed.

      MS doesn't have a real good reason to take extra opportunity to "obfuscate" anything. They can just not worry about compatibility when they don't need to, and the "obfuscation" will take care of itself.

    50. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Oh, please. Microsoft is perfectly willing to let you integrate with their stuff, IF you bought a licence.

      The one big reason that MS has taken over in business is that they don't sell applications, they sell programming APIs, complete with planned obsolescence. If Yukon doesn't provide 100 new ways that you can lock your application into Windows, then it has fundamentally failed from MS's point of view.

      In short, your hatred of MS has blinded you from figuring out how they actually operate. Integration IS their key selling point.

      One example, anyone?

      You mean besides Windows (always a free SDK), Office, and Internet Explorer? Dumbshit.

    51. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Upgrade your MSXML. The newer versions are compliant, which makes sense because the XSLT spec is largely a MS invention.

      (In fact the old non-standard MS XSL processor is deprecated, so you shouldn't be writing stuff against it that will break in the next version.)

    52. Re:Yay Evil Monopoly Of Doom! by Abreu · · Score: 2

      Thanks for the info, I will investigate

      --
      No sig for the moment.
    53. Re:Yay Evil Monopoly Of Doom! by pizza_milkshake · · Score: 2

      I agree that MS will find some way to make it just as hard, if not harder, for non-MS apps to read from and write to Word docs. I was actually thinking they'd use lawyers -- they'll copyright or trademark the format (they've talked about doing this in regards to .NET services) and try to sue anyone that writes software that uses it. My $.02

    54. Re:Yay Evil Monopoly Of Doom! by spitzak · · Score: 2
      I would agree that their current format is too much of a mess. However I suspect they will not learn from this and will make a .NET interface rather than publishing low-level details of the file format. They will not do this because of some evil plan, but because of an actual misguided belief that they are making things easier with the high-level interface. Eventually the implementation in .NET will become such a mess that they will have to replace it again.

      I do find it shocking how myopic the MicroSoft defenders who post here are. They are convinced that a .NET or VB interface that runs only on Windows somehow makes the file format "open" and thus fail to see anything wrong with these .NET solutions. Sorry, if that was true then the fact that you can run Word on Windows and read the file would also define it as "open".

      "open" means I can interpret the bits without any proprietary software. If they want to provide some convenience routines to make it easier, that is fine, but there should not be a requirement to use these routines.

    55. Re:Yay Evil Monopoly Of Doom! by lostchicken · · Score: 3, Insightful

      XML can be whatever you want it to be. XML does have standards, but just standards for wrapping data with control codes, not what the control codes mean.

      While StarOffice may use an XML word processing format, it won't be what MSFT will use.
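
      A small sketch of that distinction, in Python with the standard library: both documents below are well-formed XML, but a reader written for one vocabulary finds nothing in the other. Both vocabularies are made up for this example and are not the real StarOffice or Microsoft schemas.

```python
import xml.etree.ElementTree as ET

# Two invented vocabularies carrying the same content.
DOC_A = "<text><p>Hello</p></text>"
DOC_B = "<body><para>Hello</para></body>"


def read_with_vocab_a(xml_text):
    """Collect paragraph text, assuming paragraphs are <p> elements."""
    root = ET.fromstring(xml_text)
    return [p.text for p in root.iter("p")]


if __name__ == "__main__":
    print(read_with_vocab_a(DOC_A))  # finds the paragraph
    print(read_with_vocab_a(DOC_B))  # parses fine, but finds no <p> elements
```

      A translator between two such formats still needs a mapping between the vocabularies; the shared syntax only removes the byte-level guesswork.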

      --
      -twb
    56. Re:Yay Evil Monopoly Of Doom! by EugeneK · · Score: 1

      "the XSLT spec is largely a MS invention."

      How do you figure that? The first XSLT draft lists only one MS employee as a contributor and neither of the two editors work for MS - one's Jim Clark, don't think he ever worked for MS, the other, Stephen Deach was working for Adobe.

    57. Re:Yay Evil Monopoly Of Doom! by hondo77 · · Score: 2

      Who said Microsoft is interested in using open formats? They're interested in using XML--not necessarily the same thing. What business reason is there for Microsoft to join the OpenOffice Source Project? They're the market leader, everybody else has to worry about working with them, not the other way around.

      --
      I live ze unknown. I love ze unknown. I am ze unknown.
    58. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      'There's no such thing. SQL data is stored in tables. You use queries to get at it. Period.'

      But only when there is a SQL server running. Where will it be running? Oh yeah, it is part of the OS, Windows to be exact. So, could you explain to me how on earth I am going to be able to mount my Windows partition when dual-booting Linux?

      To state things differently, it is easy to get information out of a SQL database when the server is running; how easy is it to get data out when the server is down? Answer - hard, very hard.

      The only way that I see this working easily is if the filesystem actually doesn't change, but Windows will simply use queries internally to access files, and the OS will simply save the files (in the form of database tables) in an NTFS or FAT32 filesystem.

      My advice to you : stick with your area of expertise - databases - and leave the engineering of OS's to someone who knows what they are doing.

    59. Re:Yay Evil Monopoly Of Doom! by Baikala · · Score: 1

      I know what you mean.
      A few weeks ago two coworkers and I were installing the latest version of our ERP in a w2k cluster. Believe it or not, on the "3rd party software checklist" a fully patched (and licensed, of course) copy of Office w2k/xp was required just for a few Excel reports.
      That put us 3 days behind schedule because our PHB didn't want to proceed without the actual Office license!

      --
      16,777,216 comments ought to be enough for any forum!
    60. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      You mean besides Windows (always a free SDK), Office, and Internet Explorer? Dumbshit.

      In what way have these anything to do with 3rd party products?

    61. Re:Yay Evil Monopoly Of Doom! by passthecrackpipe · · Score: 1

      Well, you make some good points, but MS has consistently marketed their usage of XML as their way of being "open" and "interoperable". We all *know* that MS is not interested in open formats. MS, however, keeps insinuating that they are. They have little choice of course. Would you buy from a business that tells you "Buy the ultimate in vendor lock-in now!"?

      --
      People who think they know everything are a great annoyance to those of us who do.
    62. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 1, Informative

      If this is "unreadable" or "obfuscated", then you've got your eyes closed.

    63. Re:Yay Evil Monopoly Of Doom! by Theatetus · · Score: 1

      Using Mingw32 or lcc-win32 or bcc55 I can use the free (beer) windows API and write windows programs. Contrast that with Carbon or Cocoa (in their earlier days).

      --
      All's true that is mistrusted
    64. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Oracle was pitching "Everything into the database" for a while.
      http://www.oracle.com/ip/deploy/database/features/index.html?ifs.html

    65. Re:Yay Evil Monopoly Of Doom! by Mr.+Firewall · · Score: 1

      Wow, I was way off when I predicted that Microsoft would further obfuscate their Word format. This seems to be in all respects a Good Thing.

      I wouldn't hold my breath waiting for it. I seriously doubt that the M$ leopard has changed any of its spots.

      It's safe to assume that M$ is still a carnivorous predator.

      --
      In times of universal deceit, telling the truth gets you modded -1 Troll
    66. Re:Yay Evil Monopoly Of Doom! by thelen · · Score: 2

      .NET, DCOM, CORBA, KCOP, etc all pervert the idea of "object orientation" by making elaborate communication protocols which are only "object oriented" because they call some part of the protocol an "object". Real object-orientation means there is some commonality of functionality, and the only instances I can think of that really work are the original Unix where everything known then (terminals, printers, tapes, disks) used the same read/write/seek calls, and Plan9 which tries to extend this to networks and file systems.

      I believe you mean "real-object orientation" not "real object-orientation". Object-oriented programming is characterized principally by the conjunction of data and behavior, which the protocols you denigrate adhere to rigorously. In contrast, your idea of object orientation appears to mean a unified manner of interacting with real objects. But grouping common functions is merely sane programming, not a particular paradigm, and certainly not what the rest of the world means by OOP.

    67. Re:Yay Evil Monopoly Of Doom! by Mr.+Firewall · · Score: 1

      Get your facts straight to avoid sounding like an idiot.

      Sorry, you're the one sounding stupid here. M$ made a lot of fanfare three years ago about Window$ 2000 working with Kerberos.

      Kerberos is an open standard... or, was an open standard before M$ "embraced and extended" it....

      Turns out that Kerberos on Window$ 2000 servers and workstations ONLY works if the KDC (basically, the Kerberos master server) is a Win2000 box. So it's not REALLY open.

      Passthecrackpipe's analogy to Kerberos is about M$' treatment of "open" standards, not about the differences in the roles of Kerberos vs. XML. Zir* point is that M$ will likely do the same thing with XML.


      *Genderless 3rd-person pronoun. Replaces "his/her"

      --
      In times of universal deceit, telling the truth gets you modded -1 Troll
    68. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      mod parent down, everything he says is pretty much retarded and uneducated. Thank you.

    69. Re:Yay Evil Monopoly Of Doom! by nybble_me · · Score: 0

      SELECT *.doc FROM [C:\Documents and Settings\JohnDohn\My Documents]

      --

      reenigne
    70. Re:Yay Evil Monopoly Of Doom! by mcc · · Score: 1
      but don't think anything other than a microsoft OS will actually be able to access the files

      in order for your theory to work, one of the following things will have to happen:
      1. The Office 11 XML file format will not be the same for the Macintosh and Windows versions.
      2. Apple will have to sign on to MS's Trusted Computing initiative.
      Which are you suggesting is going to happen? Neither seems likely to me. Unless the mac version of Office 11 also engages in the policy of secrecy, any attempt to keep people from "getting at" the files directly will have no teeth.

      And Microsoft cannot afford to marginalize the mac users. They can afford neither to make it so office 11/mac can't read the files from office 11/windows, nor to make it so that users are somehow restricted from transferring files between windows and non-windows boxes (because how are they going to be able to tell if you're e-mailing that file to a mac user or linux user?). There are situations and businesses where one of Office's biggest advantages is a standard, uniform platform for word processing between the macs and PCs in the office. Lose that advantage, and entire businesses could be convinced to switch to something else so that all of their computers can interoperate. And now that MS has eliminated all serious competitors to Office, MS's biggest fear is that some tiny niche (such as the mac users) could for some reason find themselves needing to switch to something other than MS Office.

      Remember, users are funding, and if all those mac users start funding Nisus or whoever, Nisus could potentially get enough money together to port to windows and become a potential competitor to Microsoft. Microsoft doesn't want that.
    71. Re:Yay Evil Monopoly Of Doom! by jimbolaya · · Score: 2
      Let's not get paranoid here! Regardless of how the low-level bit-by-bit format of the disk is set up, you can be sure that Microsoft will provide a library for, hmm, reading files off the disk. Unless you believe that Microsoft will be the only publisher of software that can read and write to the disk.

      I almost hate to mention a few of the ways you could get files off the Longhorn file system...FTP, HTTP, Samba, e-mail, ISO-whatever-it-is CD-ROM...is anybody really worried about this? Seriously?

      --

      There ain't no rules here; we're trying to accomplish something.

    72. Re:Yay Evil Monopoly Of Doom! by spitzak · · Score: 2

      That's a description of XML itself, not of how Word will use XML to store files.

    73. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Unless the table and data are encrypted, that is.

    74. Re:Yay Evil Monopoly Of Doom! by mbogosian · · Score: 2

      If everything about this really is kosher, though, then everybody give a great big "Thank You!" to MS!

      Let's see, if I remember correctly from high school logic...

      A: everything about this really is kosher
      B: everybody give a great big "Thank You!" to MS!
      Given: A is FALSE

      A -> B val
      F -> T T
      F -> F F

      So as long as we thank them anyway, we still have a true statement. It's our only course of action, so here goes:

      "Thank you sir, may I have another?"
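
      Under the standard definition of material implication, a false premise makes the whole statement true whatever B is, so the F -> F row would normally come out T as well. A minimal Python sketch of that definition follows; `implies` is a helper name made up here for illustration, not a library function.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material implication: A -> B is false only when A is true and B is false."""
    return (not a) or b

# Enumerate all four rows of the truth table.
table = {(a, b): implies(a, b) for a, b in product([False, True], repeat=2)}

for (a, b), value in table.items():
    print(f"A={a!s:5} B={b!s:5} A->B={value}")
```

      With A false, both rows evaluate to true, so the thank-you is optional as far as the logic goes.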

    75. Re:Yay Evil Monopoly Of Doom! by Corporate+Drone · · Score: 2
      donutello says:

      I have a Masters in Computer Science... There's no such thing (as "kernel level SQL data"). SQL data is stored in tables. You use queries to get at it. Period.

      I appreciate that you have a Masters in CS. I have a Bachelor's in CS. (What does that prove, btw?)

      Let's add one point that you managed to skip over, in your analysis: SQL data is stored in tables. You need authority to the tables in order to make any sense of the data. You use queries to get at it. Period.

      ok ... so, if you don't have access to the tables without permission from the OS (and you better believe that only a Microsoft OS will have permission), then you have two choices: (1) break the authentication scheme, or (2) translate the raw bits of a proprietary DB.

      so, if I'm MS, I give up my stranglehold on the Word format, replacing it with a stranglehold on the authentication to query via SQL. In other words, you have to play by my rules to have the right to use queries to get at data. Period.

      that being the case, what's your point, then?

      --
      mmm... yeah... You see, we're putting the cover sheets on all TPS reports now before they go out...
    76. Re:Yay Evil Monopoly Of Doom! by donutello · · Score: 2

      You're a fucking idiot. I have a Masters with a focus on databases and storage technology.

      And yes, you blooming idiot, you need authority over the tables and presumably you are a frigging sysadmin and therefore automatically have that authority. There isn't a database out there (barring storing encrypted data in the database which is decrypted OUTSIDE the database) which bars an admin from doing anything they want to with the data.

      You obviously understand nothing about databases so butt out.

      And there is still no such thing as Kernel level SQL data. That is just techno-babble used to fool idiots who don't understand what is being talked about - like yourself.

      --
      Mmmm.. Donuts
    77. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      I had a reply worked up, but then I realized you went into perlscript mode and your post is a straight "Bill Gates Ran Over My Dog" M$-bash and is entirely intellect free. Try again sometime.

    78. Re:Yay Evil Monopoly Of Doom! by Sneftel · · Score: 1

      You're missing the fundamental point. Have you been following ANYTHING that Microsoft has been doing? Sysadmins will no longer have unfettered access to their filesystem. Only the OS gets to have that.

      It's like this:

      localhost$ su
      Password:
      localhost# whoami
      root
      localhost# cat /etc/passwd
      Palladium trust error: fuck off and die, hacker
      localhost# chmod +r /etc/passwd
      Palladium trust error: no fucking way.

      This is why "kernel level SQL" makes sense. These are queries that ONLY the kernel is allowed to run. For that matter, non-Microsoft kernel drivers definitely won't be allowed to run them either.

      --
      The opinions stated herein do not necessarily represent those of anybody at all. Deal with it.
    79. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      I can use the free (beer) windows API and write windows programs.

      No you can't. You can use the part of the Windows API that MS chooses to tell you about. The other part you can only guess at.

    80. Re:Yay Evil Monopoly Of Doom! by Anonymous Coward · · Score: 0

      Quoting that you have a Masters in Computer Science doesn't mean squat. A lot of graduates with computer science degrees have difficulty with even the simplest computer-related task, e.g. sending emails.

    81. Re:Yay Evil Monopoly Of Doom! by Corporate+Drone · · Score: 2
      You're a fucking idiot. I have a Masters with a focus on databases and storage technology. And yes, you blooming idiot

      wow... i so get it now. I mean, I was totally oblivious to your point, but then you called me a fucking idiot, and now it's all clear!

      p.s., thanks for clearing up what I missed in grad school. apparently, your school taught that when someone disagrees with you, you wave your degree around vigorously, and then use ad hominem attacks. so, so effective.

      p.p.s., btw, i started re-arguing the point, but ya know what? Forget it. Yeah -- you're right. I concede. Now take your degree and your attitude, and go back to your sandbox, like a good little boy...

      --
      mmm... yeah... You see, we're putting the cover sheets on all TPS reports now before they go out...
  2. Incompatibilities Once Again by robbyjo · · Score: 3, Insightful

    .... I guess it's just MSXML rather than THE standard XML. But we can figure it out with some "intelligent guesswork" now because the file would be human-readable.

    --

    --
    Error 500: Internal sig error
    1. Re:Incompatibilities Once Again by JaredOfEuropa · · Score: 5, Insightful

      It's just like the old SGML module for Word they used to have about 6 years ago. My guess is that there will be some significant drawback to saving documents in XML, such as loss of some formatting information. That would convince users not to save in the XML format... but that isn't the important thing to Microsoft.

      More significantly, there might be small incompatibilities, or ways that Word-created XML documents deviate slightly from what is normal and proper in XML. Perhaps Word will make some (intentional) mistakes when reading back XML files generated in other applications, just like Word's old SGML module would choke on many proper SGML documents.

      Make no mistake: the fact that almost everybody is using Office and the associated file formats makes it very hard for a new contender to enter the office suite market. Microsoft must be aware of the power they have over the market with their Office file formats. Think of it: when you exchange files with other businesses, you have two realistic choices of file formats: Office or plaintext. And now Microsoft is introducing compatibility with an open and well-defined markup language, in favour of their proprietary language? I'll believe it when I see it.

      --
      If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
    2. Re:Incompatibilities Once Again by Qrlx · · Score: 5, Insightful

      Think of it: when you exchange files with other businesses, you have two realistic choices of file formats: Office or plaintext

      I think PDF is a viable (growing even) third option. Adobe is "evil" just like MS (remember Sklyarov)... regardless, PDF is nice and it works well, and the files are way smaller than Word docs.

    3. Re:Incompatibilities Once Again by DrXym · · Score: 2

      PDF is fine if you just want to print stuff out, but it contains absolutely none of the information of the original document that allows you to edit it.

    4. Re:Incompatibilities Once Again by SoSueMe · · Score: 1

      I would expect MicroSoft to implement Word/XML with exactly the same adherence to standards as FrontPage/HTML.

    5. Re:Incompatibilities Once Again by arkanes · · Score: 2

      If someone could point me to a free (as in beer or speech, I'm not a zealot), stable (slap acrobat), fast, and easy to work with application for viewing and editing PDF files, on Windows, I'll be a lot happier about PDFs.

    6. Re:Incompatibilities Once Again by greenhide · · Score: 2

      It's just like the old SGML module for Word they used to have about 6 years ago. My guess is that there will be some significant drawback to saving documents in XML, such as loss of some formatting information. That would convince users not to save in the XML format... but that isn't the important thing to Microsoft.

      Probably not, judging by my experience with documents translated to HTML. Although some features are lost (headers and footers, notably), MS Word has actually been pretty good about using styles and HTML comments to define all the remaining content--including Mail Merge and detailed styles and formatting.

      Most of the features that are lost were because those features don't exist on web pages at all (notably, headers and footers), so there wasn't a sensible reason for putting them into the output.

      But I think we can actually expect this XML format to offer almost all of the formatting of a standard Word document.

      And now Microsoft is introducing compatibility with an open and well-defined markup language, in favour of their proprietary language? I'll believe it when I see it.

      The key is behavior in reading those XML files. HTML is also an open format, and their documents can be saved in HTML format. But Word-specific features and many of their styles won't work except in Microsoft Word. Even if their documents are 100% xml compliant/compatible, that doesn't mean that the information contained in them is particularly useable in a non-Word application.

      --
      Karma: Chevy Kavalierma.
    7. Re:Incompatibilities Once Again by Anonymous Coward · · Score: 0

      xpdf is okay...it came with my slackware 8.1 distro

    8. Re:Incompatibilities Once Again by pomakis · · Score: 2
      PDF is fine if you just want to print stuff out, but it contains absolutely none of the information of the original document that allows you to edit it.

      I would say that 95% of the documents floating around on the web, being e-mailed to people, etc., are meant to be "read only", i.e., with no intention to be editable. PDF is a much better format for this purpose.

    9. Re:Incompatibilities Once Again by sir99 · · Score: 2, Informative

      GSView (requires Ghostscript) works pretty well on Windows. It's also free beer/speech, depending on which version you get (old versions get relicensed as GPL when a new version is released). As for editing, I don't know of anything besides acrobat that edits PDF directly.

      --
      The ocean parts and the meteors come down
      Laid out in amber, baby.
    10. Re:Incompatibilities Once Again by Anonymous Coward · · Score: 0

      My guess is that there will be some significant drawback to saving documents in XML, such as loss of some formatting information.

      My guess is that big "drawback" will be that users will have to click on File + Save As XML (option hidden due to the 'rollup' menus).

      Just like with the current Save As HTML, this will mean that, in practice, the Word docs you get in e-mail will never ever be in XML format.

    11. Re:Incompatibilities Once Again by Abreu · · Score: 2

      And that's a Good Thing(C) most of the times!

      --
      No sig for the moment.
    12. Re:Incompatibilities Once Again by alfredo · · Score: 2

      One friend on MSN/Hotmail cannot read PDFs in his mail for some reason.

      In OSX, PDFs are the way to go. I can turn any document into a PDF by hitting print, then hitting the PDF option.

      GSView is pretty neat too.

      Ignore MS and they will go away.

      --
      My Photostream
    13. Re:Incompatibilities Once Again by King+Babar · · Score: 2
      It's just like the old SGML module for Word they used to have about 6 years ago. My guess is that there will be some significant drawback to saving documents in XML, such as loss of some formatting information.

      Um... You'd better hope that saving to XML loses some formatting information, since that's the whole point of *ML approaches: to separate content from presentation. A more charitable reading of what you say is that the style sheet you need to apply to the XML to render a Word document might be crippled. Could be.

      Frankly, though, I suspect that the *opposite* thing will occur. The style sheets won't be crippled, rather, they will be absolutely wonderful. So good, in fact, that you would not want to do without them. So powerful, that you will want to re-do your entire website in Word XML just to use them, and allow users complete transparency in going to-and-fro. Regular HTML will become scarce on the web. PDF will also be seen as less necessary.

      Of course, there might be a catch or two...like, the style sheets will never be publicly available, and you will not be allowed to use them from a non-MS browser, and the XSLT that allows you to export to html and pdf won't work quite right...

      As always, you have to be very careful about what you ask for, since you might just get it.

      --

      Babar

    14. Re:Incompatibilities Once Again by surprise_audit · · Score: 1
      ... I guess it's just MSXML rather than THE standard XML. But we can figure it out with some "intelligent guesswork" now because the file would be human-readable.

      And that's exactly where they'll bite. MSXML will have some "incompatibilities" just like MS introduced with Sun's Java. It'll probably be some kind of security flaw(*), to provide them with an excuse to skew the standard just a little bit when they provide MSXML-SP1. The standard will be skewed just enough for other XML stuff to start breaking, at which point some MS bigwig will have a perfect opportunity to badmouth open source yet again.

      On top of that, MS will be able to DMCA-SLAPP people trying to unpack MS(DRM-enabled)XML files to find out what's broken.

      * Read 'stacked security flaws'. You know, the kind that mask each other so that only the top one shows, giving them a perfect excuse to issue multiple Service Patches that silently implement all kinds of other DRM stuff as they unwind the stack.

  3. wicked :) by oo7tushar · · Score: 2

    I've been waiting for this. It's gonna allow me to go to a full Linux system and not have to pay any money...I hope.

    I'm wondering, can MS charge for licences to write tools that parse the XML documents?

    1. Re:wicked :) by Mnemia · · Score: 3, Insightful

      I doubt it. XML is specifically designed around interoperability, and I don't think MS can charge for use of a standard they don't own. That's why I think that they will break standards compatibility somehow.

  4. Too good to be true by trustno_one · · Score: 1, Flamebait

    Why the fuck would MS give up their MS Office file format monopoly?

    1. Re:Too good to be true by Anonymous Coward · · Score: 0

      Maybe the lawsuit actually worked a little...

    2. Re:Too good to be true by Masa · · Score: 5, Insightful

      Because it doesn't matter if everyone is able to read, modify and generate Office-compatible files. People will use Office products in the future. Opening the file formats doesn't change anything.

      XML makes it easy to create programs that will depend on MS Office. So this only makes it easier to create programs which depend on Microsoft products.

    3. Re:Too good to be true by MrHanky · · Score: 5, Interesting

      Maybe they need a migration path away from the win32-based format they use now. .NET also seems to follow that path. Remember that MS needs access to other platforms than the i386/desktop in the future - mobile devices for instance. Keeping a format that is basically a binary image from a PC is good for locking out competition, but not when you have to start competing with yourself.

    4. Re:Too good to be true by bokmann · · Score: 5, Insightful

      Except I will look to xml.openoffice.org to write some xslt transformations to take Microsoft office documents and liberate them once and for all.

      Once I can move my team of 20 people to open office with no real worries or complaints about 'interchanging' files with lusers still using Microsoft, I will.

      BUT, have you ever looked at an HTML file generated by Microsoft word? It is a GREAT example of how they can pollute a standard into something unreadable.

      I suspect that they will copyright or otherwise lock up their DTD/Schema, and try to lash out at anyone that uses them in other than 'approved' ways.

    5. Re:Too good to be true by RPoet · · Score: 1

      Enough with your Slashdotian "damned if you do, damned if you don't", "we've never heard of 'benefit of doubt'", "I'ma gonna score me some easy karma" zealot attitude already.

      --
      "Oppression and harassment is a small price to pay to live in the land of the free." -- Montgomery Burns.
    6. Re:Too good to be true by Petronius · · Score: 1

      Because it would kill PDF.

      --
      there's no place like ~
    7. Re:Too good to be true by gpinzone · · Score: 2

      I agree with you, but... Considering that HTML really doesn't do a lot of the things that can be done in Word like headers, footers, etc., they couldn't have just used strict HTML 4.01 without losing a boatload of features. Yeah, you could come up with a way to get an HTML file to look like a Word document, but then you'd have to break away from the whole Word Processor paradigm and embrace a system more like PDFs. Too many incompatibilities.

    8. Re:Too good to be true by SurfTheWorld · · Score: 1

      Your point is well taken. Simply opening a file format does *not* change anything. If MS is truly interested in interoperability with third party developers / scripters, they will make a concerted effort to document their XML as well as keep it consistent.

      For example...

      I work on an xml driven middleware project. We have gone to great lengths (via xsl scripts) to generate documentation that matches our dtds. We really focus on backwards compatibility and not breaking it, so as to happily interoperate with the community (specifically with older clients). If MS wants to continue to protect its market position, they could *quite* easily "use xml" as their file format but obfuscate (via a volatile dtd) it enough that it becomes such a moving target that nobody *wants* to use it. Consider a bogus specification that states that the namespaces, element names, and overall structure completely change (unpredictably) if you have a regularly faced document vs a bold faced document.

      What's my point? Don't assume that MS is opening their file format because they are using xml. It's a clever ploy on their part to say to the world (and more importantly to the DoJ) - 'look, we're inter-operating with the world because we use xml', but then not publish a DTD, not document, break backwards compatibility with every patch release, etc etc etc ... In the end the only components interacting are the MS Word application and the MS Word'ed XML docs on your filesystem.
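
      To make the "moving target" problem concrete, here is a rough Python sketch. Both document layouts and every element name in it are invented for illustration; nothing here reflects what Office actually emits.

```python
import xml.etree.ElementTree as ET

# Two hypothetical revisions of the same document; all element names are made up.
DOC_V1 = "<doc><para>Hello</para><para>World</para></doc>"
DOC_V2 = "<document><p style='bold'>Hello</p><p>World</p></document>"

def extract_paragraphs(xml_text):
    """Return paragraph text, assuming the v1 layout (<para> children of the root)."""
    root = ET.fromstring(xml_text)
    return [p.text or "" for p in root.findall("para")]

print(extract_paragraphs(DOC_V1))  # works against the layout the script was written for
print(extract_paragraphs(DOC_V2))  # same content, renamed elements: nothing is found
```

      Both files are perfectly well-formed XML, yet the script written against the first layout silently finds nothing in the second. That is why a published, stable DTD matters more than the angle brackets do.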

      --
      Do it for da shorties
    9. Re:Too good to be true by Antos700 · · Score: 1

      It's not that, it's the actual source that is horrible. It's inconsistent too, as it will do the same thing in the document in more than one way in the html source. The layout is very poor as well. How hard would it be to have the code output with generally accepted indenting?

    10. Re:Too good to be true by Anonymous Coward · · Score: 0

      If microsoft's ideas on XML persist, their office schemas will be similar to w3c schemas but unreadable by other xml parsers/validators. Kind of like the BizTalk Schemas.

  5. Yeah... by sfraggle · · Score: 0, Redundant

    Yes Microsoft, Open Standards really are kind of cool, aren't they?

    --
    were you expecting to see a sig here? perhaps you'd rather see the inside of an ambulance!
  6. What will be the default save format? by leandrod · · Score: 5, Insightful

    The most important question, besides if the MS Word XML format will be well-documented enough, is if it will be the default saving format. Most MS Office users simply don't care enough to save MS Word documents in RTF, for example, even if it's more than good enough for the vast majority of the documents.

    Not the main issue in the article, but it is unfair to single someone out as the inventor of XML, which is just a streamlined version of SGML which is an evolution from IBM's GML.

    --
    Leandro Guimarães Faria Corcete DUTRA
    DA, DBA, SysAdmin, Data Modeller
    GNU Project, Debian GNU/Lin
    1. Re:What will be the default save format? by Anonymous Coward · · Score: 0

      If it is made the default one (and I'm not holding my breath), perhaps that silly utility to 'optimize' your MS Office files (that just changes their format to whatever is the default for the MS Office you have installed) would actually be useful for once...

    2. Re:What will be the default save format? by Rinikusu · · Score: 5, Funny

      Stop right there.
      If you continue with that line of reasoning, someone's gonna demand that it be called SGML/XML.

      Grr.

      --
      If you were me, you'd be good lookin'. - six string samurai
    3. Re:What will be the default save format? by reaper20 · · Score: 2

      Not the main issue in the article, but it is unfair to single someone out as the inventor of XML, which is just a streamlined version of SGML which is an evolution from IBM's GML.

      Why not? They list him as a co-inventor, meaning, he didn't do it all himself.

      I wouldn't simplify the comparison between XML and SGML. That's like saying the invention of the printing press was insignificant, since people already had a written language.

      XML makes SGML actually usable, and if this guy helped make it so, then he deserves a little bit of credit.

    4. Re:What will be the default save format? by good-n-nappy · · Score: 1

      I can't see this being the default because it will take longer to read/write and will create bigger files. How can an XML format ever compete with a binary format in terms of speed and size? On the other hand, maybe this is part of their Palladium plan and it's intended to encourage new PC sales.

      --
      Never underestimate the power of fiber.
    5. Re:What will be the default save format? by jkramar · · Score: 1

      Long live (G/SG/X/XHT)ML!

      --

      true && more || less
    6. Re:What will be the default save format? by shird · · Score: 2

      Something like XML would compress quite well though, so they would probably introduce some kind of 'compiled/compressed XML file' and the specs of how to decompress it. Similar to their compiled HTML format for help files. It would require an intermediate parsing step before it could be used in perl scripts etc though.

      --
      I.O.U One Sig.
    7. Re:What will be the default save format? by StefMeister · · Score: 5, Interesting
      According to this article on ZDNet, it will probably NOT be the primary file format:

      To make that happen, Microsoft is turning to what some analysts say is a risky strategy. The company is adopting Extensible Markup Language (XML) as a second file format in all Office applications, to enable better data exchange between the productivity suite and back-end software, such as databases.
      --
      "Son, in a sporting event, it's not whether you win or lose, it's how drunk you get" - Homer J. Simpson
    8. Re:What will be the default save format? by chthon · · Score: 1

      Well, what I do find strange is that Microsoft .doc files (binary format) are always larger than OOo .sxw files (text format).

    9. Re:What will be the default save format? by Anonymous Coward · · Score: 0


      Something like XML would compress quite well though, so they would probably introduce some kind of 'compiled/compressed XML file' and the specs of how to decompress it. Similar to their compiled HTML format for help files. It would require an intermediate parsing step before it could be used in perl scripts etc though.


      In fact, that's what SVG does. SVG files may optionally be gzipped. And the zlib library is readily available everywhere.
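
      As a rough illustration of that extra step, here is a minimal Python sketch using the standard gzip module (which wraps zlib). The document is a made-up, highly repetitive stand-in, so the ratio it prints says nothing about real office files.

```python
import gzip
import xml.etree.ElementTree as ET

# A made-up, repetitive XML document standing in for a real file.
xml_bytes = ("<doc>" + "<para>lorem ipsum dolor sit amet</para>" * 200 + "</doc>").encode("utf-8")

compressed = gzip.compress(xml_bytes)

# The intermediate step before normal XML processing: decompress, then parse as usual.
root = ET.fromstring(gzip.decompress(compressed))

print(len(xml_bytes), "bytes raw,", len(compressed), "bytes gzipped")
print(len(root.findall("para")), "paragraphs recovered")
```

      The decompression is one line in front of the parser, so a script loses very little by having to deal with a gzipped file.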

    10. Re:What will be the default save format? by Karellen · · Score: 2

      Don't you mean XML/SGML?

      As TCP/IP (TCP over IP; think 3/5 == three fifths == 3 over 5) is TCP running over IP, and GNU/Linux is the GNU toolchain running over a Linux kernel, surely XML is a document metaformat `running over' the earlier, more complex, SGML document metaformat.

      K.

      --
      Why doesn't the gene pool have a life guard?
    11. Re:What will be the default save format? by greenrd · · Score: 2
      sxw files aren't text format. They're zipped xml, hence compressed. But you're right - M$ == bloatware.
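
      For the curious, reading XML out of a zip container takes only a few lines. This is a sketch that builds a tiny stand-in archive in memory; the `content.xml` member name follows what OpenOffice.org uses, but the XML inside is simplified and invented, not the real sxw schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Build a tiny stand-in archive in memory; the XML here is simplified and made up.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("content.xml", "<doc><p>Hello from a zipped document</p></doc>")

# Reading it back: open the zip, pull out the XML member, parse it.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    root = ET.fromstring(archive.read("content.xml"))

print(names)
print(root.find("p").text)
```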

    12. Re:What will be the default save format? by AUsBandit · · Score: 1

      Risky?
      I don't think so.

      Remember Microsoft's policy. E^3
      Embrace (XML), Extend (XML), Extinguish (XML)

    13. Re:What will be the default save format? by Rupert · · Score: 1, Flamebait

      I forget who said it, but I liked this quote: "XML is that subset of SGML that Microsoft could understand".

      --

      --
      E_NOSIG
    14. Re:What will be the default save format? by evilpenguin · · Score: 2

      I'm just curious what it is about SGML that you think makes it "unusable?" SGML is perfectly usable. I don't think there is anything "easier" about XML for people writing documents. Writing a fully compliant parser is quite a bit easier for XML than for SGML, but that's a different story. XML just makes some simplifying assumptions, but since every valid XML document is also a valid SGML document, I'm not sure how SGML can be said to be "unusable."

  7. I doubt it. by theLOUDroom · · Score: 3, Insightful

    I really have my doubts about whether Microsoft will allow "any programmer with a Perl script and a bit of intelligence" to muck around with Office documents.
    I'm guessing their XML document format will be just as hard to decipher as the current Office formats.

    --
    Life is too short to proofread.
    1. Re:I doubt it. by sql*kitten · · Score: 5, Insightful

      I really have my doubts about whether Microsoft will allow "any programmer with a Perl script and a bit of intelligence" to muck around with Office documents.

      Why not? After all, the high-quality ActiveState port of Perl to Win32 exists because Microsoft paid for it, and you can download it for free. Not only that, but if you want to write your own code to manipulate Office documents, you have been able to do that for years in VBA - all the Office programs expose rich APIs. In fact, they are composed of Objects that you can instantiate and use in your own programs if you want - all MS care about is that there is a licensed copy of Office on the user's machine. One of the easiest ways to do charting is to simply reuse a bit of Excel, for example. From there it's a short hop via COM to any program you want.

      I'm guessing their XML document format will be just as hard to decipher as the current Office formats.

      The fact that Office documents have been in a proprietary format in the past is actually unimportant, since the interfaces to the applications (and hence their documents) are well documented (check MSDN or Barnes & Noble if you don't believe me). So the reason that Microsoft are doing this is that they lose nothing and gain from making the platform even more attractive to developers.

    2. Re:I doubt it. by spongman · · Score: 3, Funny

      what are you talking about? you can 'muck about' with office documents right now with whatever language you want, Perl included. You don't need XML to do it.

    3. Re:I doubt it. by jkramar · · Score: 2, Informative
      That's all fine and good if
      1. you don't mind having to buy Office just to modify Office files
      2. you're on Windows.
      Actually, the various APIs are probably there on Macs as well. However, if you're on Linux, then you're stuck. OpenOffice, Abiword, et al. do a reasonable job of reading Office files, but can't quite read everything perfectly, and the fact that Office documents are binary dumps instead of nice, legible XML doesn't help. This is, as I think many readers have realized, a significant advantage that an XML file format would lend. If they carry this out, then developers of other apps, such as competing office suites and member programs, whether free software or not, will have a much easier time reading and correctly interpreting these documents.
      --

      true && more || less
    4. Re:I doubt it. by ianezz · · Score: 5, Interesting
      I'm guessing their XML document format will be just as hard to decipher as the current Office formats.

      There are 2 problems with the current format of Microsoft Office files:

      1. Giving the correct interpretation to the bytes representing the document content, in order to import the Office document into some other office suite using a different representation.
        This is mostly solved (thanks to years of trial and error).
      2. Giving the correct interpretation to the bytes representing the document itself AND all the extra cruft having nothing to do with the document contents that the Microsoft Office suite puts in, in order to generate documents readable by the various versions of the Office suite.
        This is definitely more difficult, as nobody knows Office internals and how they expect such additional data to be. The StarOffice guys managed to do an acceptable job, at the price of years of trial and error. It's like looking at a dump of your computer's memory, guessing what's code, what's data, what's padding and the meaning of every byte...

      Now, does an XML format simplify things? Well, yes, just as an RTF text is easier to manage than a pure binary format, but nothing prevents putting extra cruft in an XML document. So instead of having to use a hex editor, you may now use a text editor, but giving a correct interpretation of tags and attributes is something that only Microsoft can do, unless it publishes the full specifications (present and future: after all, XML is eXtensible, right?)

      Personally, I think that:

      • Microsoft is realizing that the current Office formats are getting out of control, so it wants to get rid of them, because maintaining backwards compatibility is becoming too painful.
      • An XML-based format may be the right answer for Microsoft, in that all the subtleties of parsing binary data simply disappear, while it may still make it difficult for everyone else to understand the real meaning of the data. Let's say <obscuretag_42 foobarizer="xyzzy"/>
      • Microsoft was not giving out the specifications of the formats of its Office suite before: should we now suppose it's giving out the DTD/Schema AND a good explanation of how to interpret it? I'd hope the answer is yes, but given the company's precedents...
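
      The obscure-tag point can be sketched in a few lines: any XML parser recovers the structure, but nothing in the file says what the names mean. The wrapper elements around the example tag are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The tag from the example above, wrapped in a hypothetical document.
snippet = '<doc><obscuretag_42 foobarizer="xyzzy"/><p>Hello</p></doc>'
root = ET.fromstring(snippet)

# Structure is trivially recoverable: every tag and attribute is listed.
inventory = [(el.tag, dict(el.attrib)) for el in root.iter()]
for tag, attrs in inventory:
    print(tag, attrs)

# What foobarizer="xyzzy" is supposed to *do* is not in the file at all;
# that part still needs a published specification.
```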
    5. Re:I doubt it. by Bartmoss · · Score: 3, Insightful

      I think you are dead on. Plus: a) XML is a great buzzword; b) it makes MS *seem* more "open" and "standards compliant".

    6. Re:I doubt it. by Penguin · · Score: 2, Interesting

      ... in fact, Microsoft has code examples for perl in their Knowledge Base:

      http://support.microsoft.com/default.aspx?scid=kb;en-us;Q214797

      (furthermore I'm impressed that a reply like "They'll probably do something evil..." would be rated as "Insightful")

      --
      - Peter Brodersen; professional nerd
    7. Re:I doubt it. by Anonymous Coward · · Score: 2, Insightful

      "After all, the high-quality ActiveState port of Perl to Win32 exists because Microsoft paid for it"

      That port existed well before the MS involvement in ActiveState.

      Here's the original story on Microsoft's role:

      "6/2/1999 -- Microsoft Corp. and ActiveState Tool Corp. (www.activestate.com) signed a three-year Perl Open Source development and support contract.

      As part of the agreement, ActiveState will add features previously missing from Windows ports of Perl, as well as full support for Unicode on Windows platforms."

      Source: http://www.entmag.com/news/article.asp?EditorialsID=1633

      ActiveState has similar partnerships with many others: http://www.activestate.com/Corporate/Partnerships/

    8. Re:I doubt it. by commodoresloat · · Score: 2
      * Microsoft is realizing that the current Office formats are getting out of control, so it wants to get rid of them, because maintaining backwards compatibility is becoming too painful.

      Since when have they maintained backwards compatibility? I just got a MS Word document yesterday that I couldn't open. I was using MS Word.

    9. Re:I doubt it. by Anonymous Coward · · Score: 0

      "all the Office programs expose rich APIs."

      What a load of shit. Unless of course by using 'rich' you mean 'puny, half-assed, and broken with every new version'.

      But then, that's about what I'd expect from a Microsoft groupie like you. At least you've been consistent over the years.

    10. Re:I doubt it. by Anonymous Coward · · Score: 0

      Yes, but when all you get back is muck, it doesn't do any good.

    11. Re:I doubt it. by greenrd · · Score: 2
      OK - I call your bluff. Please can you point me to details on how to manipulate an embedded diagram in a MS Word 2000 file using Java on Linux - thanks.

    12. Re:I doubt it. by nmg · · Score: 1

      Welcome to Slashdot.

    13. Re:I doubt it. by ipjohnson · · Score: 1

      Have you ever actually tried to decode a binary file without the data definition? .... XML can only be a step in the right direction ....

      At least now you know the start and end point of fields within the file; you can see what goes where. With a binary you have to guess where the boundaries are.

      I think the bigger problem was pointed out in another thread ... this isn't going to be the default format, and anyone that knows to save it as XML knows to save it as straight text. We really aren't gaining any interoperability.
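
      The point about field boundaries can be sketched as follows. The binary layout and the field names are invented for illustration and have nothing to do with any real Office format.

```python
import struct
import xml.etree.ElementTree as ET

# A made-up binary record: 2-byte length, 5-byte name, 4-byte integer.
# Without this exact format string, a reader sees 11 anonymous bytes.
layout = "<H5sI"
blob = struct.pack(layout, 5, b"Alice", 42)
length, name, value = struct.unpack(layout, blob)

# The same data as XML: the boundaries and labels travel with the file.
xml_doc = "<record><name>Alice</name><value>42</value></record>"
fields = {child.tag: child.text for child in ET.fromstring(xml_doc)}

print(blob)
print(fields)
```

      Knowing where a field starts and ends is not the same as knowing what it means, but it removes the guessing about offsets and padding.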

    14. Re:I doubt it. by khuber · · Score: 5, Insightful
      The fact that Office documents have been in a proprietary format in the past is actually unimportant, since the interfaces to the applications (and hence their documents) are well documented

      So you can read Office documents with other programs as long as you have Office and MS dev tools?

      You do see the folly in that, right?

      -Kevin

    15. Re:I doubt it. by Hooya · · Score: 2
      all MS care about is that there is a licensed copy of Office on the user's machine

      and that's exactly why it doesn't mean diddly squat for me. you see, i don't use office. i do program a lot tho and it would be nice to be able to export my output in word/excel etc.. for others who do have office. if MS wants me to have a licensed copy of office as well even tho i have no use for it whatsoever, well i'll stick with PDF thank you very much. (yeah PDF is proprietary but the format is open; i don't need acrobat to be able to create PDFs now do i? check out FOP -- they've implemented a decent PDF lib -- in java tho. only thing 'missing' seems to be linearizing the object tree in the PDF for web viewing)

      i really don't care about XML to the extreme where i'd implement XML without any direct payoffs. so unless i can write the XML with a simple script (perl or otherwise) without the COM bindings, it's of no use to me. it doesn't help if the lock-in just moves from the document viewer to the program level.

      right now, i have a few apps that dump their output in either GIF/PNG/PDF as an option. i'd consider word/excel format if i could do that from the script without the 'proprietary' modules/COM components. otherwise this new event is a no-op. i'll stick with PDF for right now.

    16. Re:I doubt it. by tburkhol · · Score: 1
      I really have my doubts about whether Microsoft will allow "any programmer with a Perl script and a bit of intelligence" to muck around with Office documents.

      God I hope they don't. The last thing I want to have happen is for some perl/vbs/whatever worm to run amok on the departmental server changing all the "Dear Mr." into "Hey f_;kwad," or random "do"s into "do not."

    17. Re:I doubt it. by ianezz · · Score: 1
      Have you ever actually tried to decode a binary file without the data definition?

      Yes, in fact I said it would be easier (even for Microsoft) to handle an XML-based format instead of a binary-only format.

      But then, let's look at how Microsoft Word exports HTML: in order to avoid losing information in the conversion process (so you can open your exported HTML with Word and have exactly your document again), the generated HTML is full of special comments and Word-only CSS properties expressing the information from the original document that can't be directly mapped into HTML.

      So, as of today, good or bad, here we already have a purely textual representation of a Microsoft Word document with well-defined data boundaries. It is still easier to reverse engineer than the pure binary format, but IMHO still leaves several obscure points behind.

      What I'm arguing is that the XML representation Microsoft is going to adopt, given the precedents of that company, wouldn't be that much better than this "rich HTML" WRT precisely understanding its meaning, and that Microsoft isn't really interested in gaining real interoperability as much as throwing buzzwords at the audience.

      But then, if this XML-based format is not going to be the default, I agree that this whole discussion is pointless.

    18. Re:I doubt it. by buzzcutbuddha · · Score: 2, Insightful

      Oh I get it! We're beating on Microsoft for not opening up its file formats earlier because WordPerfect and Lotus products are so much more open...oh wait....

    19. Re:I doubt it. by Avumede · · Score: 2

      Office programs expose the API. To get the text out of an MS Word document, even if you have Windows and Office, you have to start up Word, which is really inefficient.

      Many programs that need to parse the documents still must resort to manual methods. If you were writing a program that needed to access the text from these files (a search engine, for example), you would want to crack it yourself. Using Word to do it would almost certainly be slower, and if you use the COM APIs, you are restricting your program to only run on Windows. In fact no search engine that I know of uses COM to crack a document.

      The XML will be a vast improvement.

    20. Re:I doubt it. by sql*kitten · · Score: 2

      Office programs expose the API. To get the text out of an MS Word document, even if you have Windows and Office, you have to start up Word, which is really inefficient.

      True, but you only need to do it once. You can use the API to extract the contents and then store them yourself in any format you want, then run the indexer over that.

    21. Re:I doubt it. by Reckless+Visionary · · Score: 2
      I just got a MS Word document yesterday that I couldn't open. I was using MS Word.

      Oh phooey, some Mac person probably put ".doc" on the end of a Quark file and sent it to you thinking that's how you changed a file type ;-)

      --
      I think I'll stop here.
    22. Re:I doubt it. by Anonymous Coward · · Score: 0

      The big problem with using COM to "crack" Office documents on the server-side is that the office programs are single-threaded and really designed for interactive desktop use only (ie, they sometimes will pop-up a message box).

      I imagine that Microsoft will ship some .NET classes that allow easy construction/destruction of Office data, and run efficiently within a (Windows) server environment, without requiring the entire applications installed. This will solve various common intranet problems.

    23. Re:I doubt it. by tshak · · Score: 2

      Microsoft was not giving out the specifications of the formats of its Office suite before: should we now suppose it's giving out the DTD/Schema AND a good explanation of how to interpret it? I'd hope the answer is yes, but given the company's precedents...


      What precedents? The fact that they've always documented everything very well, including APIs to get at those Office documents? Sure, they don't document the binary formats, because they give you APIs to get at the data. What about the precedents with .NET? Nothing to hide here, full DTDs and Schemas for all config files and for Web Services. What about IIS 6? All config is in well documented XML files. Also, what would be the point of preaching developer ease when they don't document their XML?

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
    24. Re:I doubt it. by theLOUDroom · · Score: 1

      I don't think you get it.
      Yeah, sure Microsoft gives away the APIs to interface with their products, so that you buy & program for their products.
      This article is about the file format not the API.
      If you had a good description of their file formats, you could develop your own, interoperable office suite.

      --
      Life is too short to proofread.
    25. Re:I doubt it. by khuber · · Score: 1
      Well gee other people shoot people so why can't I?

      You can make relativistic comments like that all day but it still doesn't change the fact that MS has locked people into their file formats!

      -Kevin

    26. Re:I doubt it. by spongman · · Score: 2

      i didn't say anything about linux, but you can do it with java.

    27. Re:I doubt it. by tshak · · Score: 1

      If you had a good description of their file formats, you could develop your own, interoperable office suite.


      Kinda like people are doing with .NET?

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
    28. Re:I doubt it. by greenrd · · Score: 2
      Um, if you can't do it on Linux, it's not really Java. Perhaps you don't know the difference between Javascript and Java?

    29. Re:I doubt it. by buzzcutbuddha · · Score: 1

      No, people have locked themselves into those formats. There are other tools that provide other formats and people have the CHOICE to use them.

      Lotus and Corel wouldn't have it any other way and wouldn't open up their file formats if they didn't see some value in it. Microsoft hasn't seen any value in starting to open the format until now.

      What a novel idea, a company that can adapt....

    30. Re:I doubt it. by Anonymous Coward · · Score: 0

      Actually, Microsoft already has problems maintaining backwards compatibility. Ever notice documents saved in Word 2000 are much larger than those saved in Word 97? Most of the formatting codes are duplicated, and Word 2000 formatting codes live alongside Word 97 codes.

      The Word 2000 codes are just extensions of the Word 97 codes, containing information about tables-in-tables, full RGB colours of borders, etcetera. Word 2000 prefers to read only the 'new' formatting codes; Word 97 reads the old ones. That they decided to create duplicates of already existing formatting codes suggests it had already become too hard for them to keep things working.

  8. Historical turningpoint? by haeger · · Score: 5, Interesting
    I just thought about someone saying that somewhere, when you look back in history, you can see some historical turning point where things just went wrong or right.

    One small such point is when IBM gave out the specs to their hardware for PC allowing everyone to clone it, while Apple did not.

    This could be such a point. Maybe in 10 years we'll look back at this and ask ourselves "Why the heck did MS XML-enable their Office app, releasing the hold that they had"

    Only time will tell I guess.

    .haeger


    I Play Hattrick

    --
    You are not entitled to your opinion. You are entitled to your informed opinion. -- Harlan Ellison
    1. Re:Historical turningpoint? by Anonymous Coward · · Score: 1, Troll

      WTF are you talking about? The BIOS in PCs had to be reverse engineered, and IBM took those who did it to court. IBM did not give this info out. They wanted to be the only people who sold PCs.

    2. Re:Historical turningpoint? by Bud · · Score: 2, Flamebait

      MEEP! Wrong! Phoenix Tech was first to license their reverse-engineered BIOS, opening up the PC-clone market.

      http://oufcnt5.open.ac.uk/~richard_lawton/Section%206%20Notes.html
      http://www.pbs.org/cringely/pulpit/pulpit19990930.html

      To address the point in your post, Microsoft has a huge penetration in the Office market and no amount of XML fidgetry is going to kick them out. Rather, they'll love it if a small sub-industry grows up around the MS Office XML standard. Then they will release the Office Document XML standard v1.1, then 1.2, then 2.0 and so on, releasing that information only to "trusted partners". No chance the StarOffice team is going to see the next version before it hits the market.

      THAT's what you learn if you look at history. (Which you apparently didn't do. Duh. You lose...)

      --Bud

    3. Re:Historical turningpoint? by BurritoWarrior · · Score: 4, Informative

      IBM released the framework in which to do so because of the governmental investigation they were under at the time.

      They didn't do it out of the goodness of their hearts, but they did indeed do it. It wasn't the complete BIOS though, so Compaq had two teams... one team looking at the specs, and another (that could never look) building a clean room implementation.

    4. Re:Historical turningpoint? by tekunokurato · · Score: 1

      Ten years is a long time.

      Thinking that no one will kick them out by a date at which technology will be markedly different is foolish. If they give up their stranglehold (which they haven't really done, but we all know it's slipping regardless), they'll be far from first in less than 10.

    5. Re:Historical turningpoint? by Anonymous Coward · · Score: 0

      (Which you apparently didn't do. Duh. You lose...)

      Your attitude is inversely proportional to your slashdot user number. So sad for such an early adopter of slashdot.

    6. Re:Historical turningpoint? by rot26 · · Score: 1, Offtopic

      MEEP! Wrong! Phoenix Tech was first to license their reverse-engineered BIOS, opening up the PC-clone market.

      I don't believe that's what he meant. When the PC was originally released you could buy the Technical Reference Manual (I've got one around here someplace) for a nominal fee. It thoroughly explained the architecture of the PC, and had a complete BIOS function call reference. You didn't necessarily have to "reverse engineer" anything; just duplicate the functionality from the documentation. The significance of clone BIOSes (and Compaq was first, by the way, not Phoenix) was that they used a "clean room" approach to make sure that there was no infringement on IBM's binaries. You could do no such thing with Apple... the lone Apple ][ clone, Franklin, actually used a bit-for-bit copy of Apple's BIOS (or at least portions thereof), which promptly got them sued out of the market.

      --



      To ensure perfect aim, shoot first and call whatever you hit the target
    7. Re:Historical turningpoint? by Gerry+Gleason · · Score: 2
      IBM released the framework in which to do so because of the governmental investigation they were under at the time.

      I don't think there was any connection to the IBM anti-trust case. They had been trying to get into the small and 'home' computer markets for years. The DisplayWriter had some limited success. I don't think they realized that the big market for the machines was business, not the home. If you look at the first offering, this is pretty clear. It had cassette tape support, not floppies nor any hint of a hard disk option. The configuration that was sold most often had most of the ISA slots filled so you could have: floppies, monochrome text plus printer, extra memory, and ??? (memory fade). The XT wasn't far behind, and it had all of that and a hard disk as a base configuration.

      All the talk at the time was that if IBM had really known what they had, they would have tried a lot harder to control things and lock up the standard. It probably wouldn't have been the success it was if they had, but it's hard to know since it didn't happen that way.

    8. Re:Historical turningpoint? by n-baxley · · Score: 2

      IBM gave out the specs to their hardware for PC allowing everyone to clone it, while Apple did not.

      I would argue that IBM gained from that move. Take a look at how the IBM PC market grew because of that openness. While Apple grew, it was not at the same rate. Granted, IBM's PC division is now not that much to write about, but that's because they didn't keep up. We may look back and see this as a big plus day for MS.

    9. Re:Historical turningpoint? by Anonymous Coward · · Score: 1, Informative

      IBM was under antitrust restrictions to licence hardware technology under a Reasonable and Non-Discriminatory licence.

      The BIOS was half the story, but IBM also held patents on "ISA", CGA, the disk interface, etc. Clone-makers just bought licences for these parts right from IBM (@ only about $5/PC).

      If it wasn't for the "plug-compatible" anti-trust wars in the mainframe market, the PC would have never been cloned.

    10. Re:Historical turningpoint? by Anonymous Coward · · Score: 0

      I am assuming you think IBM's decision was smart and Apple's was not. If you think this, just ask yourself where IBM's PC business is today vs. Apple's. IBM is a very vast company that has a PC arm and could have given all their PCs away for free and still made plenty of money off of, oh, S/390, AS/400, RS/6000, Lotus (after they bought 'em), Tivoli (after they bought 'em), WebSphere, IBM Global Services... the list goes on.

      Then again you could be saying that Apple made the right decision and is still bringing in Multi billions in the PC lines every year. Then you would be right.

    11. Re:Historical turningpoint? by poot_rootbeer · · Score: 2

      One small such point is when IBM gave out the specs to their hardware for PC allowing everyone to clone it, while Apple did not.

      Bull. If Compaq hadn't reverse-engineered the IBM PC BIOS, there wouldn't have been any Gateways or Dells selling cheap PCs -- you'd be buying them from Big Blue, and paying twice as much.

      The open standards allowed third parties to develop expansion cards and peripherals for the IBM PC, but the same is true in the realm of Macintosh.

    12. Re:Historical turningpoint? by Anonymous Coward · · Score: 0

      Parent isn't a troll. It's the truth. The PC BIOS had to be reverse engineered. It's a frickin' fact.

    13. Re:Historical turningpoint? by Anonymous Coward · · Score: 0

      Look moderators.

      The parent post here is just wrong.

      There are plenty of replies here that explain why, so mod it down already.

  9. *when* ? by Monty+Worm · · Score: 4, Funny
    when the huge universe of MS Office documents becomes available for processing by any programmer

    I beg your pardon? Smelly programmers can keep their hands off my documents. If I wanted you to have them, I'd have emailed them to you as plaintext. I wasn't aware that the Office license meant my documents were common property....

    --
    ... and today's pet project has ... been discarded for lack of time.
    1. Re:*when* ? by SmlFreshwaterBuffalo · · Score: 1

      I wasn't aware that the Office license meant my documents were common property...

      Give it a little time. This will be part of the EULA for this new version of Office. (But only as far as Microsoft is concerned, of course)

    2. Re:*when* ? by Anonymous Coward · · Score: 0

      I wasn't aware that the Office license meant my documents were common property....

      Then why have you been mailing them around, I got a mail from you yesterday titled "Hi, I send you this file to have your advice".

    3. Re:*when* ? by Anonymous Coward · · Score: 0

      Good luck creating a plain text document in office that is not warped in one way or another. When you do get around to it, send them over.

  10. The right time for MS by terminal.dk · · Score: 5, Insightful

    MS is trying to time this right.

    Right now they are seeing diminishing sales and possibly shrinking market share. Most of the Danish public sector is looking to save money using OpenOffice/StarOffice.

    MS needs to increase their compatibility with other options, as they would otherwise force customers to convert every single user away from MS at once, instead of OpenOffice coming in slowly.

    They can also hope, that their format is setting the standard, and the other companies will have to play catch-up rather than the other way around.

    1. Re:The right time for MS by Anonymous Coward · · Score: 0

      Part of the French administration is looking to save money using StarOffice/OpenOffice...

      I heard about 80000 computers all around France

      It sounds great

  11. Re:What?! by Anonymous Coward · · Score: 0

    Why are all anonymous comments at +3?

  12. imagination by selderrr · · Score: 5, Funny

    ...all sorts of wonderful new things can be invented that you and I can't imagine...

    When will MS ever learn that we don't WANT to imagine how wonderful the MS Office Universe is?

    1. Re:imagination by Anonymous Coward · · Score: 0

      speak for yourself. I want to see great software, and I don't care who wrote it.

    2. Re:imagination by dbrutus · · Score: 2

      Great software is all well and good but I care about whether it comes complete with lawyers with a bad attitude attached. If so, no thanks unless I can't get out of it.

  13. read through "EULA" in the XML? by GnomeKing · · Score: 1

    Would it be feasible?

    COULD it make it illegal to "reverse engineer" the document format?
    I can very easily see that if it could, Microsoft could include a clause that explicitly prohibits GPL programs from interpreting the XML...

    I wouldn't put anything past Microsoft when they're trying to keep their formats closed...

    Hrmmmm...

    1. Re:read through "EULA" in the XML? by McCall · · Score: 2, Interesting
      COULD it make it illegal to "reverse engineer" the document format? I can very easily see that if it could, microsoft could include a clause that explicitly prohibits GPL programs from interpreting the XML...

      No way. What happens when I receive an MS 11 XML Word document on my Linux system via email? I haven't accepted any sort of EULA, and I can start hacking out the DTD straight away - without which, I must point out, a complex XML document is close to worthless.

      They may prevent MS users from reverse engineering the documents on their MS OS's and I suppose they could even forbid users emailing their documents to other OS's (EULA's are great eh?) - but I doubt they will do this, it would cripple Microsoft Office.

      Andrew McCall.
    2. Re:read through "EULA" in the XML? by Kindaian · · Score: 1

      It isn't illegal to reverse engineer anything... regardless of what the EULAs say... It is just that it can be HARD to fight a 15-ton gorilla like M$... in court... (which nowadays is impossible, as the US Attorney found out the hard way)...

      --
      You know life is beautiful... after you ruin it...

    3. Re:read through "EULA" in the XML? by Bake · · Score: 2

      Did you read the article?

      There was no mention of DTD anywhere in it.

      XML Schema on the other hand .....

  14. WTF???? by jericho4.0 · · Score: 3, Informative
    from the article;
    The most important question, besides if the MS Word XML format will be well-documented enough,...

    WTF!? XML shouldn't need to be documented. The whole point is to create a human readable file that is parseable by computer. If MS Word delivers an XML file that I can't figure out, it's not XML.

    --
    "A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
    1. Re:WTF???? by lovebyte · · Score: 4, Insightful

      Have you ever seen some complex XML file? Without documentation it could be as difficult as binary to reverse-engineer!

      --

      I'll do it for cheesy poofs.

    2. Re:WTF???? by Anonymous Coward · · Score: 3, Insightful

      That really depends on your definition of XML and human readable.

      <?xml version="1.0"?>
      <document>
      jMyB38QAAMETWFjs7IQAAQEVkJBNq0jEAAW
      RvbGWTYBAADARUaGlzRG9jdW1lbnQ8nhAAC
      udGrTEAAC8BATwAAADMAv8AAgEABABIAAAA
      </document>

      is valid xml, just like a uuencoded file is valid ASCII and human readable.

      But if other M$ products are any indication it won't be that bad. I parsed some Visio stuff and the data was more or less readable. The drawing data (or previews, didn't care) was still encoded though. I expect it to go a little like M$ html did.
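
      A small sketch of the same idea, assuming a base64 payload standing in for whatever opaque binary state an application might dump. The document parses cleanly, yet the content is no more readable than the binary it wraps.

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for opaque application data (hypothetical).
payload = bytes(range(16))
encoded = base64.b64encode(payload).decode("ascii")

# Wrap it in a perfectly well-formed XML document.
doc = f'<?xml version="1.0"?><document>{encoded}</document>'
root = ET.fromstring(doc)  # parses without complaint

# The parser hands back text; making sense of the bytes is still up to us.
recovered = base64.b64decode(root.text)
print(root.tag, root.text)
```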

    3. Re:WTF???? by jericho4.0 · · Score: 2

      Yes I have. And that's exactly my point. I understand XML as a product of thinking "OK. We've got the storage, we've got the computing power, let's stop storing our data in binary and make it readable by humans.". If XML is unreadable, even without knowing what program wrote it, it fails to live up to its promise.

      --
      "A language that doesn't affect the way you think about programming, is not worth knowing" - Alan Perlis
    4. Re:WTF???? by anshil · · Score: 2

      That's absolute marketdroid nonsense; just because it is XML doesn't mean it needn't be documented. It has to be documented just as well as a binary format. The first step is the DTD for the XML, which describes some rules the XML follows, but even with a DTD you need documentation of what means what.

      This all applies to XML-RPC as well. It also has to be documented like traditional RPC. What's the advantage of XML-RPC over RPC? Hell if I know. Besides three times as much data being transferred, I guess the only advantage is a market buzzword thrown in, never mind the technical benefits/costs.

      --

      --
      Karma 50, and all I got was this lousy T-Shirt.
    5. Re:WTF???? by WhaDaYaKnow · · Score: 2

      Exactly.

      And even a document that conformed to an XML Schema could be just as hard to reverse-engineer as a binary file.

      We've all seen the obfuscated C contest (or the obfuscated JavaScript scripts in certain webpages).

      At the end of the day all that matters is whether the company _really_ wants to document the format or not.

    6. Re:WTF???? by Krach42 · · Score: 1

      Well, all I know is that I don't like XML, because it's entirely possible to produce "bad" XML where the computer begins one tag, then starts another, then closes the first, then closes the second. Thus you get this situation:

      <STUFF><THINGS> text </STUFF></THINGS>

      This is entirely human parsable, and understandable, but for the computer this becomes a nightmare, since you'd have to shift the THINGS tag out past the STUFF.

      One ought to use a markup-language where it's both human readable and strictly recursive (you can't close anything but the most recent tag)

      I use a format like this for my personal use (when it seems like everyone would use XML). I just use a # to indicate the beginning of a tag, then descriptive text, then { } enclosing the text, otherwise a ; if there is no text. Arguments are simply put in ( )'s... So the above example could only be #stuff{#things{text}} and could _NOT_ be expressed in a poorly formatted way.

      Funny thing is it's trivial to convert this to XML, but yet vice-versa isn't necessarily so easy. (because of the above example)

      --

      I am unamerican, and proud of it!
    7. Re:WTF???? by richie2000 · · Score: 2, Redundant
      I parsed some Visio stuff and the data was more or less readable.

      Visio was just recently bought by M$, they obviously haven't had time to corrupt the file format yet.

      --
      Money for nothing, pix for free
    8. Re:WTF???? by anshil · · Score: 2

      That's not true; you cannot shift tags in XML. Well, you can write ASCII text files containing "<STUFF><THINGS> text </STUFF></THINGS>", but this is NOT valid XML, simple as that.

      You will never get past an XML validation tool like xmllint with this.
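
      The rejection is easy to reproduce without xmllint. A minimal sketch, assuming Python's standard-library parser as a stand-in (it checks well-formedness only and does no DTD validation); the helper name is_well_formed is made up for illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The overlapping example from the parent comment: </STUFF> arrives while
# <THINGS> is still open, so the parser reports a mismatched tag.
overlapping = "<STUFF><THINGS> text </STUFF></THINGS>"

# The same content with the tags closed in the reverse order they were opened.
nested = "<STUFF><THINGS> text </THINGS></STUFF>"

print(is_well_formed(overlapping))  # False
print(is_well_formed(nested))       # True
```

      Any conforming XML parser has to refuse the first string, so the "shifting" problem never reaches application code.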

      As for "your" format, look at what RTF looks like; it's very similar, only a tag starts with \, and groups use { }.

      The point in using XML is not that the format is that superior to your own formats (well, it ain't bad, though), but that you can already use a whole set of tools on this format that you would have to write yourself for your own: a validator (xmllint), a technically exact spec of how the data semantics are encoded (the XML DTD), and finally an already-written parser/reader to read the syntax (my favorite is libexpat).

      --

      --
      Karma 50, and all I got was this lousy T-Shirt.
    9. Re:WTF???? by giel · · Score: 0

      I absolutely agree it should be READABLE without documentation. However to produce valid documents in a certain context one would need at least a DTD...

      --
      giel.y contains 2 shift/reduce conflicts
    10. Re:WTF???? by DGolden · · Score: 4, Insightful

      Here's my pet rant:

      I would say that XML isn't a markup language - a markup language would allow the "bad nesting", since a markup language should be "layers of virtual highlighter pen" applied to an underlying data stream. XML, since it requires "proper nesting", is just Lisp sexps reimplemented, but with terrible syntax. It's yet-another-tree-structured-data-format. Big Wow. A true markup language environment would facilitate part-structured data, like HTML used to be, rather than shoehorning everything into trees.

      Lisp sexps would just say (stuff (things "text"))

      In fact, that's pretty much all there is to lisp syntax right there. The above is (a) a potentially valid lisp program and (b) a valid lisp data structure.

      XML is a data format designed mainly to allow C and Java programmers to use vaguely Lisp-like processing techniques without realising it and/or admitting it to themselves.

      --
      Choice of masters is not freedom.
    11. Re:WTF???? by lovebyte · · Score: 1

      If XML is unreadable, even without knowing what program wrote it, it fails to live up to its promise.
      Duh! XML fails to live up to its hype. Yes, for sure. There is nothing new here.

      --

      I'll do it for cheesy poofs.

    12. Re:WTF???? by Krach42 · · Score: 1

      Yeah, I noticed the LISP-like processing myself also... I was just like "damn, stupid repeating stuff" and I decided that XML wasn't very worth it. My intention in "shoehorning" everything was to make it easily recursively parsable. (I call my format RPF, recursively parsable formatting.) I don't actually use it as much as the document embedding that I designed (which allows both HTML and XML to be put into it). The format is the same, only each document is headed with a % not a #.

      Again, parsing these is trivial, and that was my entire intention. I was also aware that XML required the proper nesting, thus my statement that it was possible to produce "improper" XML.

      I disagree with a lot in XML (from what I've seen from XHTML). I mean, selected='selected'? Screw that. I want a simple selected. *shrug* Eh, everyone has their own desires.

      --

      I am unamerican, and proud of it!
    13. Re:WTF???? by spongman · · Score: 2

      the bitmaps are stored as encoded data, the drawings are stored as standard VML.

    14. Re:WTF???? by Krach42 · · Score: 1

      There's no reason to write a validator for my format. It's impossible to write incorrectly formatted text. ###stuff{} ? That's a # followed by a stuff tag. The only key characters are #, { and } (OK, and () for parameters, but this is trivial). To get a } in regular text, you write #}; otherwise it closes the previous tag. Closed too many tags? :P Who cares? Not the recursive parser. Had too many tags open? Who cares? Not the recursive parser.

      OK, so you could validate for that: just lex and count the number of {'s versus the number of }'s (not prefixed by #). Done. I'd say that's just a bit easier to write than ANY XML validator.

      As for parsing/reading it? Making PHP use the XML parser to grab a document the way you want requires far more code than the entire set of code needed to completely parse my format.

      {
      while($char != '#' && $char != '}') print $char; next $char;

      if($char == '}') return;

      next $char

      while($char != '{') $tag .= $char; next $char;

      do_tag_code($tag);

      me();
      }
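
      For anyone who wants to try the idea, here is a runnable sketch of the same recursive descent, in Python rather than the pseudocode above. The rules are only the ones described in this thread (# starts a tag, { } enclose its content, ; ends an empty tag, ## and #} are escapes); parameters in ( ) are not handled and simply stay part of the tag name. The function names and the tuple-based node shape are assumptions made for illustration, not the original implementation.

```python
def parse(text, pos=0):
    """Parse from `pos` until an unescaped '}' or the end of input.

    Returns (nodes, next_pos). Each node is either a str (literal text)
    or a (tag_name, children) tuple.
    """
    nodes = []
    buf = []  # literal characters collected since the last tag

    def flush():
        if buf:
            nodes.append("".join(buf))
            buf.clear()

    n = len(text)
    while pos < n:
        ch = text[pos]
        if ch == "}":  # closes the tag we were called for
            flush()
            return nodes, pos + 1
        if ch != "#":  # ordinary text
            buf.append(ch)
            pos += 1
            continue
        if pos + 1 >= n:  # a lone trailing '#' is kept as text
            buf.append("#")
            pos += 1
            continue
        if text[pos + 1] in "#}":  # escapes: '##' -> '#', '#}' -> '}'
            buf.append(text[pos + 1])
            pos += 2
            continue
        end = pos + 1  # read the tag name up to '{' or ';'
        while end < n and text[end] not in "{;":
            end += 1
        name = text[pos + 1:end].strip()
        flush()
        if end < n and text[end] == "{":
            children, pos = parse(text, end + 1)  # recurse into the body
        else:
            children, pos = [], end + 1  # '#name;' or input ran out
        nodes.append((name, children))
    flush()  # unclosed tags are closed implicitly at end of input
    return nodes, pos


def parse_document(text):
    """Parse a whole document, silently skipping stray '}' at top level."""
    nodes, pos = [], 0
    while pos < len(text):
        part, pos = parse(text, pos)
        nodes.extend(part)
    return nodes


print(parse_document("#stuff{#things{text}}"))
# [('stuff', [('things', ['text'])])]
```

      As claimed, too many or too few closing braces never raise an error here; whether silently accepting them is a feature is exactly what the replies argue about.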

      --

      I am unamerican, and proud of it!
    15. Re:WTF???? by vidarh · · Score: 5, Insightful
      The point of XML is not for it to be human readable, but to allow easy automatic processing of various kinds.

      With XML Schema and DTDs, you can validate various aspects of the data without writing a custom validator.

      With XPath and XPointer you can refer to parts of an XML document without needing to understand what the document contains.

      With XSL you can translate all or parts of the document from one format to the other without your application needing to know the structure, and without needing to understand more of the format than the parts you are extracting.

      With SAX and the DOM you can programmatically traverse and extract information from an XML file without having to write a custom parser.

      With CSS an editor or viewer for instance can use a standard mechanism of applying styles to elements without hardcoding the style attributes for elements anywhere.

      With XML namespaces, you can intersperse data in various formats in the same file, and the components handling each of the vocabularies need not know anything about the other components - an example would be embedding SVG in HTML: The HTML renderer doesn't need to understand any of the SVG tags, only that it should delegate contents with other namespaces to another component. And the SVG renderer couldn't care less about the HTML.

      And this doesn't even touch on the benefits of all the various interchange formats that have been specified on top of these base technologies.

      The importance of XML is that it opens up the doors for building interchangeable components that operate on data without needing any hardcoded application specific knowledge of the data.

      Most of the time, you still have to write some code to tie it all together, but you don't have to build your own parsers, your own document object model, your own styling system, your own way of handling contained data of other types, your own way of transforming data between formats, etc.

      For me as a software developer XML delivered years ago. I use XML technologies daily, and it saves me work.
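
      To make the "no custom parser" point concrete, a small sketch using only Python's standard library. The memo document is invented for the example and has nothing to do with any real Office format; ElementTree supports only a limited subset of XPath, and the Counter class is a hypothetical handler written for this illustration.

```python
import xml.etree.ElementTree as ET
import xml.sax

# A made-up document; element names are illustrative only.
DOC = """<memo>
  <to>Tim</to>
  <body>
    <para>First paragraph.</para>
    <para>Second paragraph.</para>
  </body>
</memo>"""

# Tree-style access: address parts of the document with a path
# expression instead of hand-written parsing code.
root = ET.fromstring(DOC)
paras = [p.text for p in root.findall("./body/para")]
print(paras)


class Counter(xml.sax.ContentHandler):
    """Event-style access: count elements by name as they stream past."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


handler = Counter()
xml.sax.parseString(DOC.encode("utf-8"), handler)
print(handler.counts)
```

      Neither half knows anything about memos beyond the element names it asks for, which is the component-reuse argument in miniature.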

    16. Re:WTF???? by spongman · · Score: 2
      yeah, but you trivialize XML. you're whining about an insignificant syntactical aspect of XML. from what you've shown, your 'language' has no handling of encodings, namespaces or schema-based validation, all of which form the basis of what's really useful about XML.
      Funny thing is it's trivial to convert this to XML, but yet vice-versa isn't necessarily so easy. (because of the above example)
      it would be simple to write an XSLT stylesheet to render any XML document in your language. how is your example not equivalent to:
      <stuff>
      <things>text</things>
      </stuff>
      ?
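
      The XSLT stylesheet itself isn't shown in this thread. As a stand-in, here is a rough Python sketch of the same XML-to-#tag{} direction, assuming the escaping rules described earlier (## for a literal #, #} for a literal }). Attributes, namespaces and comments are ignored, and whitespace between tags is kept verbatim; to_hash_brace is a name made up for the example.

```python
import xml.etree.ElementTree as ET


def escape(text):
    """Escape the two characters that are special inside tag bodies."""
    # '#' must be doubled first, otherwise the '#' added for '}' would
    # itself get doubled.
    return text.replace("#", "##").replace("}", "#}")


def to_hash_brace(elem):
    """Render an element tree in the '#tag{...}' notation."""
    parts = []
    if elem.text:
        parts.append(escape(elem.text))
    for child in elem:
        parts.append(to_hash_brace(child))
        if child.tail:  # text that follows the child's closing tag
            parts.append(escape(child.tail))
    body = "".join(parts)
    return f"#{elem.tag}{{{body}}}" if body else f"#{elem.tag};"


print(to_hash_brace(ET.fromstring("<stuff><things>text</things></stuff>")))
# #stuff{#things{text}}
```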
    17. Re:WTF???? by ianezz · · Score: 3, Insightful
      WTF!? XML shouldn't need to be documented. The whole point is to create a human readable file that is parsable by computer. If MS Word delivers an XML file that I can't figure out, it's not XML.

      Uhm, it is also the point of source files in the programming language of your choice, I'd say... and still, you need good comments.

      XML is like Lisp, but with sharp parenthesis.

    18. Re:WTF???? by SlamMan · · Score: 2

      I'd throw a comment about obfuscated Perl in here, but I think that's Perl's default.

      --
      Mod point free since 2001
    19. Re:WTF???? by anshil · · Score: 1

      Because xmllint can do far more than just matching the start and end tags; it also checks whether the file matches, for example, the semantics specified in your DTD, like _how_ the data is constructed. And it absolutely does not matter how easy or difficult it is to write an XML validator, because you don't have to.

      If you look into how the SAX interface works you'll recognize that it isn't any more work than the code you provided.

      I'll say it again: your format is nothing else than RTF with '\' replaced by a '#'. You've merely reinvented the wheel, and honestly I've worked with RTF files (back in the days when Windows used .hlp files, before they caught up and used HTML, it required RTF). RTF looks like crap (personal opinion).

      The next thing you're missing is a nice unified transformation script for your data, like XSLT provides for XML, for example. Want to see an overview of your data in HTML? You need to write a program for this. With XML, just hack up an XSLT script.

      --

      --
      Karma 50, and all I got was this lousy T-Shirt.
    20. Re:WTF???? by Anonymous Coward · · Score: 0

      Well-formed (almost), but not valid :P

    21. Re:WTF???? by Anonymous Coward · · Score: 0

      I am a java programmer and am willing to take the lispoids' word and ADMIT that XML is "Lisp-like processing techniques".

      Now since I haven't seen a production lisp system outside of emacs in 15 years, that admission doesn't do me any fucking good what-so-ever, and any data in "Lisp sexps" is completely useless. The fact is that the XML tools are here and in use and the Lisp tools are not and never were.

      It's time that the Lisp people admit to themselves that they lost and move on.

    22. Re:WTF???? by Anonymous Coward · · Score: 0

      Are you in college? Try getting a real job where you have to deal with some Smart Guy's home-rolled interchange format. I guarantee if you personally are not the Smart Guy in question, it won't seem as elegant as it does on slashdot.

      Right now I'm dealing with an entire INDUSTRY that thought it would be smart to encode hierarchical data in CSV files and even after all these years, they are still all like XM-What? Everyone's made up their own Smart Guy solution for whitespace and linebreak handling, and none of them can even think about non-ASCII. And we get to handle all gazillion variations by hand-rolling parsers. It's a huge waste of time and it sucks.

      The nice thing about XML is that it eliminates (or, at least moves) the programmer pissing contest that is file-formats. Now, if you are the guy starting the pissing contest (and you are), then it may not seem so keen. But folks like you will be eliminated soon enough.

    23. Re:WTF???? by Anonymous Coward · · Score: 0

      I haven't seen a production Java system in about a year. Doesn't mean they don't exist. You may not have seen a production system based on Common Lisp (which wasn't even around in its present form 15 years ago anyway...), but maybe that's because you're still a java weenie who has never played with the big boys.

    24. Re:WTF???? by statusbar · · Score: 2

      It's time that the Lisp people admit to themselves that they lost and move on.

      Actually, the reverse is true, my friend.

      XML and XSLT are dirty tricks made by bitter lispers on the rest of the computer world! XML is just a way to do "LISP sexps" in a worse syntax. Everyone accepted it because it looked kinda like HTML! They were tricked!

      It is trivial to make a program that converts XML to and from LISP sexps.

      Quite often it is very worthwhile to convert the XML to sexps, do your processing algorithm in lisp, and convert the resulting sexps back into XML.
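
      A rough sketch of the XML-to-sexp half of that round trip, written in Python here only so it is easy to run. It is deliberately lossy: attributes are dropped and surrounding whitespace is stripped, which a real converter would have to preserve. to_sexp and quote are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET


def quote(text):
    """Render text as a double-quoted Lisp string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_sexp(elem):
    """Render an element as (tag child ...) with text as quoted strings."""
    items = [elem.tag]
    if elem.text and elem.text.strip():
        items.append(quote(elem.text.strip()))
    for child in elem:
        items.append(to_sexp(child))
        if child.tail and child.tail.strip():  # text after the child
            items.append(quote(child.tail.strip()))
    return "(" + " ".join(items) + ")"


print(to_sexp(ET.fromstring("<stuff><things>text</things></stuff>")))
# (stuff (things "text"))
```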

      --jeff++

      --
      ipv6 is my vpn
    25. Re:WTF???? by Nailer · · Score: 2


      That really depends on your definition of XML and human readable.


      That's a very good point. If your definition of human readable is illogical and bizarre, that could indeed be XML. Or a pair of pants.

  15. What is the format? by pubjames · · Score: 2

    There's lots of speculation here about MS doing stuff to create lock-in with this new format, but I want to actually see the format. Is there any documentation anywhere about it? Or does someone out there have a document in the new format that we can take a look at? Of course, being XML we should be able to just open it and take a look. That would put an end to all this speculation.

    1. Re:What is the format? by Witchblade · · Score: 2

      A starting place. No way to know really how close they'll stick to what they've done up to this point.

  16. interesting! by garglblaster · · Score: 1
    Well does that mean that MS Office will eventually be able to import Openoffice documents which are XML based?

    That might be a big benefit for them. One of the main reasons why I have never considered using MS Office yet is their miserable support for Import filters. They cannot even handle Lotus' fileformats correctly..

    --

    perl -e 'printf("%x!\n",49153)'

  17. XML takes away Microsoft's main advantage by Zeddicus_Z · · Score: 5, Interesting

    As far as I can tell, one of the major reasons many businesses refuse to change over from Microsoft Office to cheaper options is file compatibility. As our company's IT admin put it recently on the suggestion of using OpenOffice, "I get sent hundreds of Microsoft Word, Excel and Access documents a week. I need to know that I can open and access every single one of those without problems". An example of proprietary file formats helping Microsoft keep the monopoly.

    However, if Microsoft Office documents become "built around an open, internationalized standard", i.e. XML, would this not enable the people behind OpenOffice, StarOffice etc. to achieve total 100% file compatibility and thus negate Microsoft's largest advantage with Office?

    Of course, this could be yet another Microsoft "embrace and extend" tactic, à la Kerberos. Incorporate the standard in a bastardised form, claim standards compatibility, then pollute it so you must be using Microsoft technology to properly interact with it.

    --
    Janie took my gun...
    1. Re:XML takes away Microsoft's main advantage by Anonymous Coward · · Score: 0

      Think about that one for a minute. Microsoft struggling to maintain an -illegal monopoly-..would they do anything that "takes away Microsoft's main advantage?" Are the executives there that ridiculous?

      Microsoft doesn't care about StarOffice or OpenOffice. I mean really, look at what they've already -done- with standards out there..most of them, Kerberos for example, went from being an industry standard to a Microsoft "standard" once they got their hands on it. In Microsoft's mind you wouldn't have trouble opening Excel or Access documents across all your machines because you'd be running whatever version of Windows is in store for us in the coming years..whether you want to or not.

    2. Re:XML takes away Microsoft's main advantage by jgp · · Score: 2, Insightful

      Have you seen the HTML produced by the current "Save as webpage .." options in Word? shudder. The vast majority of semantics are actually embedded in XML islands hidden inside HTML comments. I see no reason why Microsoft would change their tune now (they'll simply change the DTD from one inappropriate document model to another one IMHO).

      <wordDocument>
      <!-- (document content here) -->
      <nonMicrosoftElement>I'm sorry, you don't appear to have a StandardsEnhanced(tm) word processor.</nonMicrosoftElement>
      </wordDocument> --
    3. Re:XML takes away Microsoft's main advantage by Anonymous Coward · · Score: 0

      They won't need to bastardise anything to make unreadable XML. XML was deliberately written so that it can be (ab)used for anything.

      This is a valid XML document:

      F00FC7C8

      It is not "readable" on anything other than an Intel Pentium or Pentium MMX, not even on a PPro or a P2/3/4.

      (Yes, I know, lousy example, nobody would want to "read" an XML-document that crashes your computer)

    4. Re:XML takes away Microsoft's main advantage by Anonymous Coward · · Score: 0

      Argh, where did my tags go?

      Isn't "plain old text" supposed to work without writing &lt;?

      The example should be:

      <asm arch="pentium">F00FC7C8</asm>

    5. Re:XML takes away Microsoft's main advantage by k4m3 · · Score: 1

      To my mind, one of the main reasons why businesses keep Office is that someday someone wrote a cheap macro that everybody uses now. The fact that the generated document has the same format as the document containing the macro is just a side effect.

    6. Re:XML takes away Microsoft's main advantage by GT_Alias · · Score: 2, Insightful
      I need to know that I can open and access every single one of those without problems

      Interesting point...when people start buying Office 11 and sending you those XML-saved Word documents, you will have no option but to go out and fork over some cash for an upgrade.

      Unlike now, I can send an Office XP formatted Word document and older versions can still open it. Of course...older versions can't open newer databases, that's been a wonderful source of headaches.

    7. Re:XML takes away Microsoft's main advantage by ryanvm · · Score: 2

      Shhhhhhhh - you're going to ruin it you fool.

      Remember, we tell them their plans suck after they implement them (see SDMI).

    8. Re:XML takes away Microsoft's main advantage by Anonymous Coward · · Score: 0

      Yeah, Office uses Extended HTML designed to round-trip DOC -> HTML -> DOC without losing fidelity.

      Personally, I think this is a great feature because you can make "native" MS Word documents for free with "REN *.HTML *.DOC"

      But there's a big difference between fugly HTML and a pure XML format. The W3C doesn't define what your parser should do with a <nonMicrosoftElement>, and there's no expectation that the XML file will do anything when opened in Netscape/IE 4.0, etc.

    9. Re:XML takes away Microsoft's main advantage by Planesdragon · · Score: 2

      Have you seen the HTML produced by the current "Save as webpage .." options in Word? shudder. The vast majority of semantics are actually embedded in XML islands hidden inside HTML comments. I see no reason why Microsoft would change their tune now (they'll simply change the DTD from one inappropriate document model to another one IMHO).

      Those are there so you can "round trip" a file from doc/xls/ppt to ms-htm and back, without losing all of your MS-only formatting.

      MS has a utility, which I use on a regular basis at work, to strip out the HTML for you. It's called HTML Filter.

      (Go to http://www.mvps.org/word/ for more useful bits about word.)

      XML, unlike HTML, actually can express everything that office does. MS will use it 99%, possibly adding their 1% extra back into the spec, and let their penetration and familiarity secure their market share.

      Trust me--if MS can get Wordperfect 2004 to use XML to "keep up," they'll be able to beat out their last real competitor in the few markets where it's still entrenched. The biggest problem with MS Word has been roundtrips to Wordperfect; XML can solve that problem if done properly, far better than HTML can.



      I'm sorry, you don't appear to have a StandardsEnhanced(tm) word processor.
      --


      Not going to happen. I could cut and paste some code from a MS-Office HTML file right into this slashbox, and you'd be able to read it just fine. (assuming the lameness filter doesn't get me first.)

    10. Re:XML takes away Microsoft's main advantage by Anonymous Coward · · Score: 0

      I get a few Microsoft documents every day; they gravitate towards /dev/null with the rest of the garbage. I'm thinking of rejecting the lot at the gateway, and explaining myself in an incomprehensible proprietary written language of my design.

    11. Re:XML takes away Microsoft's main advantage by hyperturbopete · · Score: 1

      Don't be fooled. The XML format is gonna be like 99.5% open and compatible, but there will be one small option that will screw it up... maybe something like "patented smart-page-width-technology", so that if you load the same xml-doc file with OpenOffice and Office the page width will be subtly different based on a secret formula, which will then make the whole document align differently.

      the community is foolish to expect anything less than the usual embrace, extend, etc.

  18. HTML from Word by e8johan · · Score: 5, Interesting

    Just look at an HTML file exported from Word2k. I would not call that compatible with any HTML I've ever learned. Most probably the XML file exported from Office 11 will be a Microsoft-specific file, specifying lots of Office-specific ActiveX (aka OLE) info that cannot be emulated. And, hey, they can probably store binary data in XML. The only change is that most competing products will emit files that Word can easily read, i.e. M$ will get the biggest benefits.

    1. Re:HTML from Word by pubjames · · Score: 5, Insightful

      Just look at an HTML file exported from Word2k.

      An excellent point sir. That's a great illustration of how Microsoft approaches 'open' file formats.

      People that think that MS Office is going to move to open, well documented file formats are just plain nuts. But look at many of the comments in this forum - it seems MS has even managed to persuade many Slashdotters that they are going to use open formats. Poor fools.

    2. Re:HTML from Word by superyooser · · Score: 5, Informative
      True. Just a couple days ago, I saved a doc as Web Page in Word (Office XP) hoping that some clipart would be saved in a web-friendly format. (This was originally made in Publisher, NOT by me.) It didn't work; it saved the images as .wmz! For the web?!

      Anyway, there was tons of gibberish in the file, but it displayed fine in IE6. It was a completely blank page in Mozilla! Nothing at all! We always knew the XP didn't stand for cross-platform, but I didn't know it was this bad.

    3. Re:HTML from Word by guybarr · · Score: 1, Troll


      But look at many of the comments in this forum - it seems MS has even managed to persuade many Slashdotters that they are going to use open formats. Poor fools.

      or hired voices ?

      --
      Working for necessity's mother.
    4. Re:HTML from Word by terraformer · · Score: 1

      How about the (’) and (“”)???
      For non windows users, that is (') and (")...

      --
      Who are you? The new #2 Who is #1? You are #617565. I am not a number, I am a free man! Muhahaha.
    5. Re:HTML from Word by CommandNotFound · · Score: 2

      it seems MS has even managed to persuade many Slashdotters that they are going to use open formats. Poor fools.

      Yeah, in less than five minutes on this thread I've seen the terms "rich API" and "framework", two of the biggest Microsoft-parrot terms of the last few years. BTW, I think it's safe to say that the word "framework" is on its way to be the official buzzword of 2003.

    6. Re:HTML from Word by jonbrewer · · Score: 2

      "Just look at an HTML file exported form Word2k. I would not call that compatible with any HTML I've ever learned"

      The trick is, does it validate to W3C specs? Last time I checked, though it was a disaster to look at, it did indeed validate.

      I frequently receive Word and Excel documents that need to be presented on the web. Generally I leave them as-is, storing them in a document management system and just serving metadata via the web, but on occasion I do a conversion. Though the HTML output from Word 2k is ugly, it is machine readable (for parsing and cleaning) and perfectly compliant.

    7. Re:HTML from Word by Anonymous+Custard · · Score: 1
      Just look at an HTML file exported from Word2k. I would not call that compatible with any HTML I've ever learned. Most probably the XML file exported from Office 11 will be a Microsoft-specific file, specifying lots of Office-specific ActiveX (aka OLE) info that cannot be emulated.
      Let's hope the word processor competitors (Staroffice, Corel) can make tools like Dreamweaver's "Clean Up Word HTML", only for cleaning up Word XML.

      Let's also hope that although MS Word may produce bloated XML, it can still read and process well-formed, simple XML, as good as (or better than) it can read and process non-word files, such as HTML or TXT files.
    8. Re:HTML from Word by e8johan · · Score: 2

      "Let's also hope that although MS Word may produce bloated XML, it can still read and process well-formed, simple XML, as good as (or better than) it can read and process non-word files, such as HTML or TXT files."

      Wouldn't this just give Word an edge over all other XML-producing word processors? It just keeps the one-way compatibility with M$ products, where they can read what others export to them, but no one can (fully) read what they produce.

    9. Re:HTML from Word by Anonymous+Custard · · Score: 1

      It just keeps the one-way compatibility with M$ products, where they can read what others export to them, but no one can (fully) read what they produce.

      Yeah, it probably will. But then again, for most simple to medium complexity Word and Excel files, Corel and others can read the formats well enough. And you have the option of outputting to CSV, TXT, RTF, etc. If your activity insists that you maintain a strict standard, then you should probably be using pure XML with XSL's or something, not just hoping that MS will do it. It'd be nice if they did, considering their market span, but there are alternatives.

    10. Re:HTML from Word by e8johan · · Score: 2

      IMHO:

      1) I think that all alternatives should try to read Word files properly.

      2) I think that all alternatives should support a proper open XML standard that will be truly interchangeable.

      3) I think that the alternatives need to provide viewers for this standard format in an easy-to-install and easy-to-use format for all Word users.

      4) I hope that M$ will not gain more advantages by polluting yet another standard format.

    11. Re:HTML from Word by fzammett · · Score: 2, Insightful

      Yeah, couldn't be that some people actually BELIEVE WHAT THEY WROTE, right??

      Why is it that every OSS zealot has to insist that any point of view contrary to their own is the result of a deranged mind?

      You want to try and convince me that Microsoft is evil and that I should shun absolutely anything coming out of Redmond and that I should embrace the OSS world? Fine, try and convince me. Do it logically and without insulting me. You'll find it's not that hard because I hate Microsoft anyway, but I don't hate every product they produce, in fact I very much like some of them (Win2K, Office in general as two examples).

      BUT DON'T FUCKING DO IT BY TELLING ME I'M A NUTCASE OR A PAID LACKEY OF SOME CORPORATE ENTITY BECAUSE I DON'T CURRENTLY AGREE WITH YOUR WORLD-VIEW!!

      Another group of people acted the way some of you people act... we fought a world war against them...

      --
      If a pion (n-) collides with a proton in the woods & noone is there to hear it, does lamdba decay into the source pa
    12. Re:HTML from Word by Anonymous Coward · · Score: 0

      Another group of people acted the way some of you people act... we fought a world war against them...

      A wonderful call for independent thought and rational debate.

    13. Re:HTML from Word by Pingster · · Score: 1

      Validation is a nice first step, but keep in mind that a valid document isn't necessarily a meaningful document.

      --?!ng

    14. Re:HTML from Word by tshak · · Score: 2

      Although your point makes for a nice +5 on /., it bears little intelligence. First, XML is not like HTML - it's strict. Second, everything MS has done with XML has been open and has strictly supported standards. Finally, did you even read the article? The whole point of the article is that Tim Bray, one of the leading XML gurus, has commented VERY POSITIVELY based on beta versions that he's seen.

      So, before you anti-MS troll, try reading the article and maybe even thinking.

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
    15. Re:HTML from Word by fzammett · · Score: 1

      Ah, rational debate on Slashdot? Surely you jest.

      And I'm sure you're not trying to imply that all Hitler was jealous of was independent thought, are you?!?

      --
      If a pion (n-) collides with a proton in the woods & noone is there to hear it, does lamdba decay into the source pa
    16. Re:HTML from Word by HiThere · · Score: 2

      I'm not sure who was using that first, but I know that it was on the Mac years before it was on the PC. And it was standard in printing before that.

      It's quite plausible that I misunderstand what you mean, but directional quotes have a long history and are considered more readable than non-directional quotes. Imagine, if you will, trying to read an expression that used non-directional parentheses. Ugh! Now I'll agree that this is a more extreme example than the example of quotes, but not as much as is frequently assumed. Part of it is that we have just become habituated to the non-directional quotes.

      And for text processors that render a double quote as two single quotes, I have only one "word": ugh!

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    17. Re:HTML from Word by Anonymous Coward · · Score: 0

      try changing the Save As HTML options. You can specify what browsers you want to be compatible with.

      and there are multiple files saved for a graphic. Not only is there a .wmz, but there is also a gif of your clipart.

    18. Re:HTML from Word by shnarez · · Score: 1
      No one is saying you have to change your world view, it's just that Microsoft has a poor track record of doing something `good' even when they've claimed they would. In writing. And speech.

      I can point you to all the `standards' that they have taken and bastardized only to have them Microsoftified. I can show you what they have done in the past when they claimed they were doing something `for the good of the users/citizens/etc'. But if that's not good enough, what is?

      Point is, when Microsoft says ``we'll give you '' that DOESN'T mean they'll just give you something to turn you away from their platform, they want to keep you using their products, it's just plain business sense. Look at the HTML produced by Word. That's not something anyone would post to the web (well, reasonable people wouldn't). Now, if their XML exporter is just as good, what use is it if it's Microsoft-XML?

      Note that it's an IF. When DOC2XML comes out and I see the beautiful output of XML That I can view in $EDITOR without any problems, then sure. However, in the past, Microsoft has not lived up to my expectations, nor what they've claimed. So they have to actually produce some quality this time, and before they ever do, all they say is moot.

    19. Re:HTML from Word by e8johan · · Score: 2

      Even though I've read the article, I remain a sceptic. It wouldn't be much fun if everyone just agreed with the main article, would it!

    20. Re:HTML from Word by guybarr · · Score: 2

      Yeah, couldn't be that some people actually BELIEVE WHAT THEY WROTE, right??

      could be, but I did not say \forall pro-microsoft posts are from hired pens, just that \exists some.

      And MS used this tactic of hired liars to overcome OS/2, so I see no reason why they shouldn't do it again.

      Why is it that every OSS zealot has to insist that any point of view contrary to their own is the result of a deranged mind?

      you find that in my post, where exactly ?

      You want to try and convince me that Microsoft is evil and that I should shun absolutely anything coming out of Redmond and that I should embrace the OSS world? Fine, try and convince me.

      I say you should read /. posts with a HUGE grain of salt. /. is a rumors site, and is as vulnerable to outside manipulation (through lobbyists) as any discussion room or conference. This is the meaning of my post.

      I said a short, single sentence, the amount of information you infer from it is quite, ahem, impressive.

      Do it logically and without insulting me.

      Where did I insult anyone in particular ? Or specifically you ? I didn't say you are a hired-pen (I don't know you at all), I said I believe MS hires some (as it did in the past)

      (fzammett said)
      BUT DON'T FUCKING DO IT BY TELLING ME I'M A NUTCASE OR A PAID LACKEY OF SOME CORPORATE ENTITY BECAUSE I DON'T CURRENTLY AGREE WITH YOUR WORLD-VIEW!

      like they write in chess: ?!

      again, you infer quite a lot of info regarding yourself from a small general sentence, then curse and shout and deny what you presume you've read.

      drink a glass of water, cool a bit.

      Another group of people acted the way some of you people act... we fought a world war against them...

      I don't know if this is more funny or alarming.

      1) Who are "these people" that I'm supposedly a part of ?

      2) Hey, you're equating me to the Nazis because I think there are liars and hired-pen on /. .

      no personal insult there ... also, IMHO, not a lot of common sense.

      --
      Working for necessity's mother.
  19. MS Office and ML's by FireMotion · · Score: 1

    So MS Office will use XML in the next versions?

    It might be XML yes, but...
    I have seen what MS Office did to HTML.

    And I'm scared.

    --
    http://www.inspirelight.net/
    1. Re:MS Office and ML's by Valluvan · · Score: 1


      MS Word created HTML to display one line : "MS-HTML"

      Starts like this

      ... * Lotsa junk here * ...

      /* Style Definitions */
      p.MsoNormal, li.MsoNormal, div.MsoNormal
      {mso-style-parent:"";
      margin:0in;
      margin-bottom:.0001pt;
      mso-pagination:widow-orphan;
      font-size:12.0pt;
      font-family:"Times New Roman";
      mso-fareast-font-family:"Times New Roman";}
      span.SpellE
      {mso-style-name:"";
      mso-spl-e:yes;}
      @page Section1
      {size:8.5in 11.0in;
      margin:1.0in 1.25in 1.0in 1.25in;
      mso-header-margin:.5in;
      mso-footer-margin:.5in;
      mso-paper-source:0;}
      div.Section1
      {page:Section1;}
      --> /* Style Definitions */
      table.MsoNormalTable
      {mso-style-name:"Table Normal";
      mso-tstyle-rowband-size:0;
      mso-tstyle-colband-size:0;
      mso-style-noshow:yes;
      mso-style-parent:"";
      mso-padding-alt:0in 5.4pt 0in 5.4pt;
      mso-para-margin:0in;
      mso-para-margin-bottom:.0001pt;
      mso-pagination:widow-orphan;
      font-size:10.0pt;
      font-family:"Times New Roman";}

      ... * more junk here *...

      and finally, MS-HTML.. !!

      --

      Science as a way of life.
  20. Palladium by tsa · · Score: 1, Redundant

    I guess they trust on Palladium to make sure that XML-files can only be read and written using MS software.

    --

    -- Cheers!

  21. Typical XML-proponent mistake by Baki · · Score: 5, Insightful

    Just because the file format, instead of binary, is "human readable", does not make it more open.

    For "any programmer with a Perl script and a bit of intelligence" it doesn't make a difference if you read bytes (binary) or XML structures.

    As long as you don't get a DTD with extensive comments on how to interpret the elements, along with some promise/guarantee that the DTD won't change every minor release, there is no real improvement at all.

    The fact that XML is human readable is irrelevant, since no human shall read the files, but programs such as perl scripts shall. For them it makes hardly any difference; it is only marginally easier since you can use an existent XML parser instead of rolling your own (which is no big deal using the right tools such as YACC).

    This 'openness' comes at a good time for Microsoft. They suggest openness at a time when they are criticized and attacked because of file-format lock-in. Many 'advisors' shall be misled, blinded by buzzwords such as XML as they are, and actually believe that this solves the issue.

    1. Re:Typical XML-proponent mistake by smallpaul · · Score: 5, Interesting

      As long as you don't get a DTD with extensive comments on how to interpret the elements, along with some promise/guarantee that the DTD won't change every minor release, there is no real improvement at all.

      Have you ever tried to reverse engineer a binary file format? And have you ever tried to do the same thing with an XML file format? I learned huge chunks of SVG yesterday _without_ opening an SVG book, just by mucking around in an existing SVG file and with an SVG viewer. Of course, Microsoft could do something clearly in violation of the spirit of XML, by making the whole thing one tag full of base64ed text or something. But as long as they use tags in a semi-sane way (which is the whole point, for integration with corporate systems), XML will be a big step forward.
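
      The "mucking around" approach can be partly automated. As a rough sketch (the sample document and the helper names are made up for illustration, not taken from any real Office file), a few lines of Python will list which elements an unfamiliar XML file uses and which attributes appear on each:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document; any well-formed XML text would do.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <rect x="10" y="10" width="30" height="30"/>
  <circle cx="70" cy="70" r="20"/>
  <rect x="50" y="10" width="10" height="10"/>
</svg>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def survey(xml_text):
    """Count element names and collect the attribute names seen on each."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    attrs = {}
    for elem in root.iter():  # iter() visits the root and every descendant
        name = local_name(elem.tag)
        counts[name] += 1
        attrs.setdefault(name, set()).update(local_name(a) for a in elem.attrib)
    return counts, attrs


counts, attrs = survey(SAMPLE)
for name, n in counts.most_common():
    print(name, n, sorted(attrs[name]))
```

      This only gives an inventory of names. Working out what each element means still takes experimenting with a viewer, as described above.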

    2. Re:Typical XML-proponent mistake by Baki · · Score: 3, Insightful

      One big difference: SVG was designed and is intended to be open and understandable. Office formats, using XML or not, are not. I do not believe MSFT would voluntarily cease their lock-in strategy.

      XML may be easier to reverse engineer, but need not be; that depends on how complex the DTD/Schema is and on whether the designer intended it to be easily understandable. Apart from that, as a purist I don't like reverse engineering, especially not if the subject of reverse engineering is from an uncooperative company known for its dirty tricks.

      A non-XML grammar/syntax, if accompanied by a decent and documented EBNF description of its grammar, is much better to base your program on than an undocumented XML.

    3. Re:Typical XML-proponent mistake by Anonymous Coward · · Score: 0

      I have reverse engineered multiple binary file formats back in the day. It is easy enough, if you have a bit of common sense. You just need a _good_ hexeditor. Most hexeditors you see are to a good hexeditor as notepad is to xemacs.

      And what's wrong with a non-xml textual fileformat? Just because Windows weenies have never heard of lex/yacc or Perl Parse::RecDescent, we have to cater to their insistence on using one particular tree-structured model? (That's become so bloated that its parsers are horrendous anyway, despite having no more actual expressivity than Lisp sexps?)

    4. Re:Typical XML-proponent mistake by Anonymous Coward · · Score: 0

      One caveat: It's easy enough, provided the binary format isn't actively encrypted or heavily compressed with a novel algorithm of course (you'll generally be able to recognise LZW chunks, though).

      If it's not encrypted, then it's just like the SVG-by example experiment. Change a byte, see what happens when you load the file.
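
      The same experiment works in the other direction: save the document twice with one setting changed, then see which bytes moved. A minimal sketch, assuming the two saves are the same length (the byte values below are invented, not from any real file format):

```python
def diff_bytes(before: bytes, after: bytes):
    """Return (offset, old_byte, new_byte) for each differing position.

    zip() stops at the shorter input, so the length change is returned
    separately instead of being reported as a run of differences.
    """
    changes = [(i, a, b) for i, (a, b) in enumerate(zip(before, after)) if a != b]
    return changes, len(after) - len(before)


# Hypothetical saves of one document, before and after changing one setting.
original = bytes([0x4D, 0x5A, 0x00, 0x0C, 0x00, 0x01])
modified = bytes([0x4D, 0x5A, 0x00, 0x0E, 0x00, 0x01])

changes, growth = diff_bytes(original, modified)
for offset, old, new in changes:
    print(f"offset 0x{offset:04x}: 0x{old:02x} -> 0x{new:02x}")
print("length change:", growth)
```

      A single changed byte at a stable offset suggests a plain field. Changes scattered across the whole file suggest compression or encryption, which is the caveat above.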

    5. Re:Typical XML-proponent mistake by Anonymous Coward · · Score: 0

      The fact that it's open doesn't mean it is easy to understand.

      Look at the file format specifications of OpenOffice.org, I tried yesterday (we are planning to create some Perl scripts that generate invoices in printable form) and the documents run to 500+ pages.

      You can barely get an idea of what is inside an OpenOffice file but I wouldn't bet on MS making things easy...

    6. Re:Typical XML-proponent mistake by TechnoVooDooDaddy · · Score: 1

      Many 'advisors' shall be misled, blinded by buzzwords such as XML as they are, and actually believe that this solves the issue.



      I don't think Timothy Bray is looking at XML as just a "buzzword"

  22. Re:What?! by sirius_bbr · · Score: 1

    "all sorts of wonderful new things can be invented that you and I can't imagine"

    What?! I for one thing can imagine a Beowulf cluster.


    Yeah, Imagine a Beowulf cluster of machines all running Windows XP and Office 11!

    ... the horror...

    --
    this sig has intentionally been left blank
  23. The world will be a better place... by slantyyz · · Score: 1

    ... when all you need to convert Office files from one application to another with a simple XSL transformation.

    I won't hold my breath.

  24. Proprietary file formats... by lanalyst · · Score: 2, Insightful

    It seems M$ has done their best over the years to protect their file formats... The implication now is Ballmer's enemy #1 (open office, ximian, koffice, star office, joe's office, etc) will be able to interchange documents seamlessly with M$ Office.

    I don't know about anyone else, but the reason companies hold onto M$ (like grim death) is they receive documents via email in M$ format - a de facto proprietary format.

    There has to be an angle here. This can't be construed as a tactic to hold market share.

    1. Re:Proprietary file formats... by lanalyst · · Score: 1

      my bad: redundant

  25. Dogbert the Evil software consultant by choka · · Score: 1, Funny

    Dogbert: Here's the new XML-enabled Micro$oft Office 11 our company should upgrade to.
    PHB: ...
    Dogbert: The suite is saving their documents in a format they have invented called M$ Xtremely Malformed Language (XML), and they are impossible to decipher and reverse-engineer for compatibility.
    PHB: (looks closely at the box Dogbert handed to him)...
    Dogbert: But that's okay, because you don't understand anyway.

  26. Stalling tactics?... by pubjames · · Score: 5, Insightful

    Perhaps these announcements of XML compatible office file formats are just stalling tactics? MS has done it before.

    MS now has a serious competitor in StarOffice/OpenOffice.org. And that competitor has two compelling advantages - it's cheaper/free, and open XML file formats. So when clued-up IT people say to their Pointy-Haired Bosses that they should use StarOffice/OpenOffice.org, PHBs can respond "but MS is doing that next year. We can avoid all the disruption of changing office suites just by waiting a bit and upgrading to the next version of MS Office. Besides, we're already paying for it." Then when MS actually releases Office 11, they will have used all sorts of devious and subtle devices to keep their lock-in of the file format, and MS and PHBs will be happy.

    1. Re:Stalling tactics?... by Anonymous Coward · · Score: 0

      Why has this post been moderated as insightful? There is nothing insightful about it at all.

    2. Re:Stalling tactics?... by tshak · · Score: 2

      Perhaps these announcements of XML compatible office file formats are just stalling tactics? MS has done it before.
      Did you actually READ the article? Of course not! I'm sick of moderators giving +5's to A) redundant posts in this thread and B) clueless posters who haven't even read the article. No, this is not a stalling tactic, Tim Bray and other 3rd parties have SEEN this and are very excited about it. This is not vapor-ware or market-ware, it's real, it'll be out, and you'll be able to parse it.

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
  27. Well Excel in Perl is pretty easy now by twoshortplanks · · Score: 5, Informative
    I've used the excel reading and writing modules for Perl with great success. They're easy to use and do the job. (there are also simpler interfaces if you want them too.)

    Or you could go the whole hog and use a SAX writer like XML::SAXDriver::Excel to create the documents from XML yourself.

    (This is not to say I don't think XML native formats are cool and will have many uses, I'm just pointing out what you can do now.)

    --
    -- Sorry, I can't think of anything funny to say here.
    1. Re:Well Excel in Perl is pretty easy now by Anonymous Coward · · Score: 0

      Why not just have Excel export the file as CSV?

      I guess I don't know what you put in them, but CSV is a pretty simple standard for spreadsheets...

    2. Re:Well Excel in Perl is pretty easy now by rbowen · · Score: 1

      And, if you really enjoy a challenge, you can use Win32::OLE to actually open Excel/Word/Etc and pass commands to it, which allows you to do things like export the file to a CSV, save it as HTML, or whatever you want. It's a little cumbersome, but it saves you from actually having to use the products yourself!

      --
      Apache guy, Open Source enthusiast, runner
    3. Re:Well Excel in Perl is pretty easy now by twoshortplanks · · Score: 3, Informative
      Why not just have Excel export the file as CSV?
      Oh, you can do that...but I've come across numerous problems while doing this. For a start, you lose the metadata about cells (i.e. if it's a formula or a string or a number with $foo number of decimal places.) You have problems associated with using multiple workbook spreadsheets (annoying if you've ever had to use them.) CSV is okay (and I've used it quite a bit) but it simply doesn't hold as much info as the original file.
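
      The loss is easy to see with a round trip. This is only a sketch using Python's standard csv module on an invented row, not Excel's own exporter (which, as far as I know, writes a formula's computed value rather than the formula), but the limitation is the same: every field comes back as a bare string.

```python
import csv
import io

# Hypothetical row: an integer, a price meant to show two decimals, a formula.
cells = [[1, 2.50, "=A1+B1"]]

buffer = io.StringIO()
csv.writer(buffer).writerows(cells)

# Read the same text back, as any consumer of the CSV file would.
buffer.seek(0)
restored = next(csv.reader(buffer))
print(restored)
```

      After the round trip nothing says that the second field was a price with two decimal places, or that the third was a formula rather than literal text.
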
      --
      -- Sorry, I can't think of anything funny to say here.
    4. Re:Well Excel in Perl is pretty easy now by twoshortplanks · · Score: 2

      Yes, I've seen people do this. We're using ParseExcel and WriteExcel as our servers aren't running a Microsoft OS (and aren't near ones that are) and both these modules work fine under FreeBSD.

      --
      -- Sorry, I can't think of anything funny to say here.
    5. Re:Well Excel in Perl is pretty easy now by RustyTaco · · Score: 1

      All well and good but it doesn't get you anything beyond a big table. You could probably do it much more easily with a simple perl array of array references and whatever the native dump method is called.
      I looked into it at one point but there was no way to throw out chart definitions so all it would get me is a conversion from one list of numbers to another. Again, with more complex code than a CSV dump.

      Yes, I have looked at ActiveState Perl and their Excel binding but couldn't find the free version on their site. I figure if they don't want me to find it I don't really need to bother looking.

      - Rusty Taco

  28. Been there.... done that... by Kindaian · · Score: 1

    With Open Office... one or two years ago...

    ---
    Nothing here to see... move along... move along...

    1. Re:Been there.... done that... by Anonymous Coward · · Score: 0
      With Open Office... one or two years ago...

      You're right. And with Open Office owning ~90% of the market this means..., oh yeah, never mind.

  29. Too good to be true? by varslot · · Score: 2, Interesting

    The article states that:

    "The important thing," he explains, "is that Word and Excel (and of course the new XDocs thing) can export their data as XML without information loss..."

    Does this mean that MSO will have the same support for XML as currently for RTF? In that case I'm not that excited. If the default will be to save as MS-word format, and not XML (or MS-XML as the case may be), then we are no better off. Only Microsoft is, as they are now able to import OpenOffice/StarOffice documents.

    It's sort of like when Word could read WordPerfect documents in the old days.

    --
    There arises from a bad and unapt formation of words a wonderful obstruction to the mind. (Francis Bacon)
    1. Re:Too good to be true? by Anonymous Coward · · Score: 0

      simple, all that's needed is to spread a virus that modifies the default save format in office.....

      amazes me that it hasn't been done already !

  30. What I heard.... by LarsBT · · Score: 3, Interesting
    I can't remember the reference, but I heard that they will embed binary code for different word-objects within XML tags e.g.

    <equation> 0100100100111101010011010101101010010 </equaition>
    which is allowed in XML (if I understand XML correctly). So not much gain if everything is still in a proprietary closed binary format.

    I think maybe it was the CEO of Microsoft Denmark. I'm NOT sure though

    1. Re:What I heard.... by AnEmbodiedMind · · Score: 2, Funny
      More like
      <?xml version="1.0" encoding="UTF-8"?>
      <!DOCTYPE msword PUBLIC "-//W3C//DTD WORD 1.0//EN"
      "http://www.microsoft.com/word11.dtd">
      < worddoc >
      <![CDATA[ ??????????You'd be lucky????????? ]]>
      </worddoc>
      ;)
    2. Re:What I heard.... by Anonymous Coward · · Score: 0

      Technically, no that's not valid XML, because you spelled the start and end-tags differently.

      Besides, "Binary data in XML" usually means base64-encoded, not a string of 1s and 0s.
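
      Both points can be checked with a stock parser. A small sketch in Python (the element name is borrowed from the parent post and the binary payload is invented):

```python
import base64
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Start and end tags spelled differently: rejected by the parser.
print(is_well_formed("<equation>0100</equaition>"))
# Matching tags: accepted.
print(is_well_formed("<equation>0100</equation>"))

# Binary data carried as base64 text inside an element.
blob = bytes([0x49, 0x3D, 0x4D, 0x5A, 0x00, 0xFF])  # hypothetical payload
elem = ET.Element("equation")
elem.text = base64.b64encode(blob).decode("ascii")
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)

# Decoding gets the bytes back, but says nothing about what they mean.
recovered = base64.b64decode(ET.fromstring(xml_text).text)
```

      So the wrapper is trivially parseable either way. The open question is whether the bytes inside are documented.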

    3. Re:What I heard.... by LarsBT · · Score: 1

      I found the reference. It's in the recent report from The Danish Board of Technology on Open Source in Government - here bottom page 23 - unfortunately in Danish only (abstract in English). Translation to English is underway.

  31. Skeptical by PizzaFace · · Score: 2, Funny
    Three questions about Word's XML format:
    • How's it encrypted?
    • Do I need a Passport account to open it?
    • Thank you, sir, may I please have another?
  32. The new Word XML document format: by Bazman · · Score: 5, Funny


    <uueWord2kDocument>
    M"@D)("!'3E4 @3$E"4D%262!'14Y%4D%,(%!50DQ)0R!,24-%3 E-%"@D)("`@
    M("`@(%9E7)I9VAT("A#*2`Q.3DQ
    M($9R96 4@4V]F='=A6]N92!I2!A;F0@9&ES=')I8G5T
    M92!V97)B871 I;2!C;W!I97,*(&]F('1H:7,@;&EC96YS92!D; V-U;65N="P@
    </uueWord2kDocument>

    1. Re:The new Word XML document format: by jsse · · Score: 2, Insightful

      Don't laugh yet. That's exactly what'd be happening.

      The new document just needs to have its meta-tags comply with XML, the rest could still be obscure junk as shown above.

    2. Re:The new Word XML document format: by Bazman · · Score: 2

      Plus I now realise my uuencoding is broke. There were some '>' signs in the output that didn't get through! Never mind. MS will probably use the same format as tnef attachments :)

      Oh by the way, it was an uuencoding of part of the GPL...

      Baz

    3. Re:The new Word XML document format: by Anonymous Coward · · Score: 0

      UUE? As in Unix-to-Unix Encoding? I'd think MS would rather use Base64... or do the usual trick and roll their own Binary-to-ASCII encoding.

    4. Re:The new Word XML document format: by lubricated · · Score: 1

      you mean

      M"@D)("!'3E4@3$E"4D%262!'14Y%4D%,(%!50DQ)0R!,24- %3 E-%"@D)("`@
      M("`@(%9E7)I9VAT("A#*2`Q.3DQ
      M($9R96 4@4V]F='=A6]N92!I2!A;F0@9&ES=')I8G5T
      Netscape Engineers are weenies
      M92!V97)B871I;2!C;W!I97,*(&]F('1H:7,@;&EC 96YS92!D;
      V-U;65N="P@

      --
      It has been statistically shown that helmets increase the risk of head injury.
    5. Re:The new Word XML document format: by Alsee · · Score: 2
      UUE?!? Christ man! Get with the times!
      <?xml version="1.0"?>
      <yEncWord2kDocument>
      ))=J*:tpsp *++**+*+**)(*Ts????R|SJtzoqJv?????£VJ ??????J[V_V^V]`*)*m*0./0/.00/011024:44334>896:A>B BA>@@DGOIDEMF@@JVJMPQSTSCIWZW
      RZORSR)*m+111424=} 44=}RD@DRRR
      </yEnc2kDocument>
      -
      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    6. Re:The new Word XML document format: by Anonymous Coward · · Score: 0

      Yup, and being MS, they'd use their own Base64.

      BTW, this isn't theoretical. Have a look at the WMA DRMv2 container format (if you can find a file using it). See those content keys? Non-standard Base64 encoded. Mad.

  33. XML Based? by hpavc · · Score: 1

    Is this XML with buttloads of encryption?

    I just cannot see their whole office suite being like StarOffice or Excel's export to XML... that's too good to be true.

    --
    members are seeing something, your seeing an ad
  34. Syntax vs. Semantics by mindriot · · Score: 5, Insightful

    Yes, the point of XML files is that their _syntax_ is simple and easily parseable by computers. But that doesn't tell you anything about the _semantics_ of a document. And as long as there is no proper documentation on what the mess of tags in your XML file means, there's hardly any way for you to hack together a Perl script to, say, extract plain text, or convert the Word XML file to an OpenOffice.org XML file, or whatever else comes to mind.
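
    To make the distinction concrete, here is a rough sketch in Python. The markup is invented and deliberately opaque (it is not the real Word format): the parser handles it without complaint and the character data can be pulled out, but nothing in the syntax says what p9 or k='3' are supposed to mean.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with undocumented tag names.
DOC = "<d><p9 k='3'><r>Quarterly</r><r> Report</r></p9><p2><r>Sales rose.</r></p2></d>"

root = ET.fromstring(DOC)

# Syntax alone is enough to collect the text inside each top-level block...
plain_text = ["".join(block.itertext()) for block in root]
print(plain_text)

# ...but only documentation could say whether <p9> is a heading, a footnote,
# or a deleted revision that should not be shown at all.
tags = [block.tag for block in root]
print(tags)
```

    Converting to another format needs exactly the part this sketch cannot supply: a mapping from each tag to its meaning.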

    1. Re:Syntax vs. Semantics by tshak · · Score: 1

      Which is why MS is known for Very Good Documentation. You may think I'm sarcastic, but visit MSDN and you can find almost anything you want regarding any technology. Of course, there are exceptions to every rule (parts of Win32), but with .NET, DirectX, "MS DHTML", etc. there's a ton of documentation, just like there will be on the new Office XML format.

      --

      There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
  35. hmmm... by Anonymous Coward · · Score: 0

    there is no money in software products anymore. microsoft sees it, so should you. that is why they are doing the .net services and palladium thing. who cares about file type monopolies when they can dictate how you use your computers and sell you your monthly subscriptions.

  36. Other MS products using XML by Anonymous Coward · · Score: 2, Insightful

    Other MS products that use XML (Visual Studio.net, for example) actually do it quite well. The VS.net generated XML, including project files, is clean and very readable.

  37. C'mon People by BurritoWarrior · · Score: 3, Insightful

    Office's MS-XML will be even less compatible with the spec than MS-Kerberos or MS-Java/J++. Office is their cash cow. It brings in 30-40% of their revenues all by itself.

    If you think there is even a remote chance in he-double L that MS will loosen their grip on this revenue stream, I have a bridge to sell you.

    You can call this flamebait if you want, but what in MS's history would lead me to believe they are suddenly going to change their historic behavior pattern AND risk a huge amount of revenue at the same time?

    1. Re:C'mon People by Anonymous Coward · · Score: 0

      This Comment is a -1 troll, but "Why the fuck would MS give up their Office file monopoly" is +5 interesting.

      We need something better than meta moderation in its current form, Rob.

  38. I can see it now... by dimator · · Score: 3, Funny
    <?xml version="1.0"?>
    <document type="word">
    <![CDATA[
    @%MYD<V@Q4VEA8^`!AX0>DN6UIJE=^1J;1F\ @! (P@@<$Y(@OL%AS`0B=$<S*
    4&A399HT2*S-@*+U&1)+KCS>J4 HJTZ=^F534G%_S8\6=YS7?#59_.U!YI[_^
    AU$`HOG^5/N3A9 9'<\V/YP`(T*'MZ6)3UVSCDYF&+B;0H?7I3'O7'(2/H(Z>
    U= ;N1:`!4*"4U/ATNK5GOO+^B\O?/\QK^3KE>KVYL"PN-3O2'/^9 3/U)I.PP
    FXG3%*.RR)0.R'/&N!?>U'*;4FK6B,U:B<4@-6O% 1D!!%Z/31&E(R*MCU,HH
    15RT`H`P2H$,O5FB!R,`"*`!J5FJ -4@TNEB5)E:'"D;AO4.?>-Z1FGVQN"3O
    VN6RANM76P&((F=# 3GYM05%C;E%C1F[MQ>P:*".O*3VW,<-9`T:D.^O2BE@*
    4N25 U@$0X#X!(B8*+H-1(3'!Y9'%ZF1B%7P9E#"^90&U72560M1E`R F$1;4$
    :%/(I$JY3"67*"&E5,4&X.2>R]!F@"#7VLH>;5`>@( "`!IX4A`FK)LG*7O%D
    P^$G)10Y"^L:FO_^\,GTP-"V:_R/GL %-,**[?^UIWRK2YT.;70-KW8.LG;)[
    ]]>
    </document>
    --
    python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s'; print x%(chr(34),repr(x),chr(34))"
    1. Re:I can see it now... by dmiller · · Score: 4, Redundant

      This is probably dead-on, except it will be:

      <document type="word">
      <ole><![CDATA[ (linenoise) ]]></ole>
      </document>

      I.e OLE blobs embedded in an XML container

    2. Re:I can see it now... by Jester99 · · Score: 2

      Hm. If that's what the office docs will look like, then I think it should be pretty easy for perl hackers to use them... it seems to me that they would just need to strip off the tags, and then they've got a valid perl script that formats the document automatically. ;)

    3. Re:I can see it now... by thatguywhoiam · · Score: 1
      Huh. Lookit that - same XML joke 3X in a row, and nary a Redundant in sight.

      We are SlashBorg. Humour is futile.

      --
      If Jesus wants me it knows where to find me.
  39. They already did this for two other products by pvera · · Score: 5, Interesting

    SQL Server has had an XML web gateway since version 2000. You can run any query and output it as xml or have an xml template pull the query and transform the results with XSL, all without one line of server side script.

    ASP.net uses XML for all the human-readable files, and the IIS in windows.net server finally uses Apache-style configuration files which are also XML.

    --
    Pedro
    ----
    The Insomniac Coder
    1. Re:They already did this for two other products by Anonymous Coward · · Score: 0

      Yes, it finally uses Apache code.
      Now you can pay for Apache! Grrrreat

  40. Yeah, right by Alex+Belits · · Score: 3, Insightful

    XML is a format with nearly infinite possibilities for obfuscation, convolutedness and poorly defined standards. The most we can expect is the possibility to validate a file to absolutely certainly determine if it is compliant with the new Word format or not.

    --
    Contrary to the popular belief, there indeed is no God.
    1. Re:Yeah, right by Anonymous Coward · · Score: 0

      Thank god someone said it.

      I've been using various open source tools that have switched to XML formats for their config files. After looking at the -ahem- "mess", I can only ask, how is this supposed to be BETTER than, say, a nice, logical Windows-style INI file?

      Even worse, there are certain programs (like Kismet) that output their result sets in XML. Well that's just great, except that there is no way to parse this and actually RENDER it in any sensible format. Why not HTML? At least then I can throw it into a web browser. Can't do a damn THING with XML though.
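
      For what it's worth, getting from an XML result set to something a browser will show takes little code. A sketch in Python, assuming a made-up layout (these element and attribute names are NOT Kismet's real schema):

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical result set; the names here are invented for illustration.
RESULTS = """<results>
  <network ssid="cafe" channel="6"/>
  <network ssid="lab &amp; office" channel="11"/>
</results>"""


def to_html_table(xml_text, row_tag, columns):
    """Render every <row_tag> element as a table row, one cell per attribute."""
    root = ET.fromstring(xml_text)
    header = "".join(f"<th>{html.escape(c)}</th>" for c in columns)
    lines = ["<table>", f"<tr>{header}</tr>"]
    for row in root.iter(row_tag):
        # Missing attributes become empty cells; values are escaped for HTML.
        cells = "".join(f"<td>{html.escape(row.get(c, ''))}</td>" for c in columns)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


table = to_html_table(RESULTS, "network", ["ssid", "channel"])
print(table)
```

      The catch is the same as elsewhere in this thread: you have to know which elements and attributes matter before you can write the column list.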

  41. In word 2000, by Anonymous Coward · · Score: 0

    They bloated out all your HTML with meaningless XML and Useless Stylesheets.

  42. M$ only cooks with water too. by Qbertino · · Score: 3, Insightful

    I'm working with that weedy Word 2k at the office. And we use Outlook as a standard communication Platform. Believe me, that their Software often is such a pain isn't that much of a greater plan to rule the world, but more the flat-out ineptitude of delivering products with a conceptual consistency.
    Looking at Frontpain and Word HTML and extrapolating XML from that tells me they're gonna do just as crappy a job as usual and really think they've done a great thing.
    Just like the people sending me source code additions and DB content as Wordfiles. Nothing but simple ineptitude, I say.
    Not that my System of choice, Linux, is that much more consistent. Mind you. With a bazillion Font methods, every single one of them looking crappier than the next and QT, GTK+, Motif, Lesstif, Inbetweentif, Swing, TK and whatnot and none of them following the same Clipboard behaviour it's just as weedy. Only it is under *my* control to change it.
    That way, the bottom line is: With OSS if it doesn't work, there's another way. With M$ it's 'Game Over' with the first "Error in module [fill in random hexcode here]".
    That's the simple difference.

    --
    We suffer more in our imagination than in reality. - Seneca
  43. what a cool codename by Anonymous Coward · · Score: 4, Funny

    code-named 'Office 11'

    awesome. Apparently the next version of the linux kernel is code named 2.6! Wow!

    1. Re:what a cool codename by Anonymous Coward · · Score: 1, Funny

      Yep and the release name will be linux 3.0 :-)

    2. Re:what a cool codename by dumbArtMajor · · Score: 1

      Think about it: Is it **really** the eleventh release of MS Office? I don't remember and frankly it doesn't matter.

      It's called "Office 11" so they can release it right before Apple releases OS 11, and thus will look "behind the times" compared to MS.

      Simple yet effective marketing technique.

    3. Re:what a cool codename by joshuac · · Score: 2

      ---snip
      It's called "Office 11" so they can release it right before Apple releases OS 11, and thus will look "behind the times" compared to MS.

      ---snip

      Nice theory, but it actually is their 11th release of Office. Trying to make Apple look "behind the times" probably did not even occur to them. Click help/about in an office application to see its version. Word 2000 is part of Office 9.

    4. Re:what a cool codename by Salsaman · · Score: 1

      Was there ever a system 8 or system 9 ?

    5. Re:what a cool codename by yoderm · · Score: 1

      But you see, this one is different. It goes up to 11.

      (With apologies to Spinal Tap)

      -Mike

      --
      This sig no verb.
  44. How to convert Word to XML by Korth · · Score: 5, Informative

    I've recently been reviewing a dozen different programs that convert from Word to XML.

    So far the best tool I found is upCast (free for personal use) from http://www.infinity-loop.de/ .

    To convert a Word file:
    * Use Word's AutoFormat feature to convert visual formatting to Word styles
    * Redefine all the text as Word styles
    * Run upCast to convert to XML using the "XML (content, no DTD)" filter
    * Run HTML Tidy from http://tidy.sourceforge.net/ with the parameters -xml -utf8 -clean -bare .
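
    The last step can be scripted. What follows is only a minimal sketch in Python, assuming a `tidy` binary is on the PATH and that upCast has already written its XML output to a file; the function names are made up for illustration and are not part of upCast or Tidy.

```python
import shutil
import subprocess

# The parameters listed in the step above.
TIDY_FLAGS = ["-xml", "-utf8", "-clean", "-bare"]

def build_tidy_command(xml_path, tidy_binary="tidy"):
    """Return the argument list for running HTML Tidy on one file (hypothetical helper)."""
    return [tidy_binary, *TIDY_FLAGS, xml_path]

def tidy_file(xml_path):
    """Run HTML Tidy and return the cleaned XML text, or None if tidy is not installed."""
    if shutil.which("tidy") is None:
        return None
    # Tidy exits non-zero on warnings, so the return code is not treated as fatal here.
    result = subprocess.run(
        build_tidy_command(xml_path),
        capture_output=True,
        encoding="utf-8",
    )
    return result.stdout
```

    Calling `tidy_file("chapter1.xml")` on a (hypothetical) upCast output file would return the cleaned-up text, which you can then write wherever you like.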

    Other tools that might be worth a second look:
    * Majix (Open Source) - http://www.tetrasix.com/
    * WorX SE - http://www.xyvision.com/
    * XML MarkupKit (in German) - http://www.eds.schema.de/download/MarkupKit/
    * DocSoft LLC Word-to-XML - http://www.docsoft.com/w2xml.htm

    1. Re:How to convert Word to XML by frank249 · · Score: 2

      To convert MS Word docs to XML, I just open the doc in WordPerfect 10, then save to XML. What's the big deal? WordPerfect has been able to do it for the past couple of years. BTW, it also publishes to PDF.

      --

      Today's vices may be tomorrow's virtues.

    2. Re:How to convert Word to XML by Anonymous Coward · · Score: 0

      The problem is the huge amount of junk and lack of structure you'll get inside the XML files.

    3. Re:How to convert Word to XML by croftj · · Score: 1

      If you have to use MS autoformat, then you lost the battle on step 1! I have yet to click on autoformat and have it produce any document suitable for use.

      Maybe this will work for the simplest of documents, but not for anything more complex than a letter home to mom.

      --
      -- Many men would appreciate a woman's mind more if they could fondle it
    4. Re:How to convert Word to XML by Anonymous Coward · · Score: 0

      Most Word users don't use any of the Word styles. They just change the font size and so on. While AutoFormat is far from perfect, it helps put things into some initial structure. This step is optional, but it will make your life easier, because it cleans up a lot of the Word junk in the process.

  45. Hype! Hype! Hype! by RobotWisdom · · Score: 5, Interesting
    This article is pure PR, with no new content. The XML-cult will keep waving their hands and promising great payoffs 'RSN' (real soon now) until people actually start trying to implement uniform semantic tags in their data and documents... at which point universal disillusionment will set in because the problem is way too hard even for trained AI-PhDs. [more]

    The thread a couple of weeks ago about the death of META headers will apply 1000 times worse for semantic tags-- if the semantic web is going to work at all it needs to start from headers describing the webpage as a whole.

    (Also, what's with XML-Journal's claim the article has three pages when it only has two?)

    1. Re:Hype! Hype! Hype! by greenrd · · Score: 2
      From your linked page:

      The central problem of AI is to find a finite vocabulary that can be used to express any idea.

      MS's promises have nothing whatsoever to do with "understanding" the semantics of a letter to your girlfriend or whatever and expressing your sentiments as an XML tree. If you think it is, you have failed to understand the article! It is not an attempt to mark up semantics, it is an attempt to convert things like bold, italic, font size into XML representations.

    2. Re:Hype! Hype! Hype! by Anonymous Coward · · Score: 0
      The XML-cult will keep waving their hands and promising great payoffs 'RSN' (real soon now)
      Er - I've been using XML for 3 years now, and it's given me great payoffs every single time. I read your page and it's nothing to do with anything I've ever tried to do with XML, so I don't think there'll be any problem from that. Why should we try to implement uniform semantic tags? All we need to do is implement a few tags for a particular class of "document" or data structure - there's no requirement for universality at all. I think your "trained AI-PhD's" need to climb down that ivory tower and go outside in the fresh air for a while.
    3. Re:Hype! Hype! Hype! by RobotWisdom · · Score: 2
      "MS's promises have nothing whatsoever to do with 'understanding' the semantics of a letter to your girlfriend or whatever and expressing your sentiments as an XML tree. If you think it is, you have failed to understand the article!"

      This is just the old XML bait-and-switch again. Bray writes of "all sorts of wonderful new things [that] can be invented". TimBL touts the Semantic Web as the immediate justification of XML.

      "It is not an attempt to mark up semantics, it is an attempt to convert things like bold, italic, font size into XML representations."

      No, you are simply wrong.

  46. Why do people get so excited about XML? by Viol8 · · Score: 2, Insightful

    Yes, so it's portable. Yes, so it's (mostly) human readable. So what? So is GWBASIC. XML is just a data description format (I won't grace it by calling it a language, it's not) and there have been plenty of portable DDFs in the past: PDF, PostScript (though the latter is actually a language). So why all the hoo-ha about XML? Seems to me that various marketing types have jumped on the bandwagon with this one and are going to ride it till the wheels fall off and take all the suckers along with them.

    1. Re:Why do people get so excited about XML? by Anonymous Coward · · Score: 0

      I agree it isn't magic. But what better format can you think of for structured data?

    2. Re:Why do people get so excited about XML? by Viol8 · · Score: 0

      Something a bit less long winded and ugly. Mimicking HTML using <tag> and </tag> is pointless and reduces readability. There are probably dozens of better control character combinations that could denote block starts & ends and would aid readability.

    3. Re:Why do people get so excited about XML? by Anonymous Coward · · Score: 0

      So go for it? Let's see some better code!
      I'm not saying you are wrong.
      If you can come up with something better
      maybe people will jump on that.

      Here's something. I am using square
      brackets instead of angle brackets.

      XML:

      [xml version="1.0"]
      [stuff attrib="1"]Hello world[/stuff]
      [/xml]

      An alternative:

      xml version=1.0
      {
      stuff attrib="1" {Hello World}
      }
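
      For comparison, here is roughly what the first form buys you today: with real angle brackets it can be read by a stock parser, no custom code needed. This is only a sketch using Python's standard library; it assumes the outer [xml version="1.0"] line was meant as the XML declaration rather than as an element, and the `describe` helper is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Angle-bracket form of the square-bracket example, with a proper closing tag.
DOC = '<?xml version="1.0"?><stuff attrib="1">Hello world</stuff>'

def describe(xml_text):
    """Return (tag name, attributes, text) of the document's root element."""
    root = ET.fromstring(xml_text)
    return root.tag, dict(root.attrib), root.text

print(describe(DOC))
```

      The brace syntax may well be nicer to read, but somebody would have to write and maintain a parser like this for it on every platform.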

    4. Re:Why do people get so excited about XML? by CodeMunch · · Score: 1

      Amen to that. It's glorified text delimiting. The hypewagon got us, but the "human readable" bit, I think, is the real kicker. No longer do we need to use smaller & more efficient single characters to delimit; we can use a whole word & "write" out a chunk of data fairly quickly. Also, using a whole word lets us make the data self-describing. So, yet again, we sacrifice efficiency for bloat & usability, but that may make it more efficient in the end, who knows!

  47. Bigger picture by Cheese+Cracker · · Score: 3, Insightful

    Look at the bigger picture of where Microsoft is heading. They're diversifying their line of business.
    In the past, MS Office was the cash cow at Microsoft, but the market for office packages is rather
    saturated... companies and governments are looking for cheaper alternatives etc. Not much room to
    grow. Now they can afford playing the good guys by opening up their file formats, since they got
    new markets to capture... mobile phones, handheld computers, home entertainment etc.

    1. Re:Bigger picture by Melantha_Bacchae · · Score: 2

      Cheese Cracker wrote:

      > In the past, MS Office was the cash cow at
      > Microsoft, but the market for office packages is
      > rather saturated... companies and governments are
      > looking for cheaper alternatives etc. Not much
      > room to grow.

      How quickly you all forget. Office 11 is to be on the subscription plan. Microsoft said so long ago, and Licensing 6 makes it reality.

      > Now they can afford playing the good guys by
      > opening up their file formats, since they got
      > new markets to capture... mobile phones,
      > handheld computers, home entertainment etc.

      Now they have new markets to subsidize. They need their cash cows more than ever. This Christmas season could be the demise of the X-Box, long before it is ever paid off.

      Of course the customers mostly saw Licensing 6 for what it was and two thirds of them refused to be extorted of "unearned profits" on a regular basis.

      That's the ironic part about thousand year kingdoms: when they barely last a day. ;)

      Shinoda: "The age of Millennium."
      Io: "What does that mean?"
      Shinoda: "A thousand year kingdom. It wants to create a home for itself. There is one flaw in its plan: Godzilla."
      "Godzilla 2000 Millennium" (Japanese version)

    2. Re:Bigger picture by Cheese+Cracker · · Score: 1

      Office 11 is to be on the subscription plan.

      Those who take this subscription plan had better sign up for a prescription plan for Prozac as well.
      The rest will look into Windows alternatives as well as moving everything over to Linux. ;)

    3. Re:Bigger picture by Melantha_Bacchae · · Score: 1

      Cheese Cracker wrote:

      > The rest looks into Windows alternatives as well
      > as moving everything over to Linux. ;)

      Some (not all) of the Office alternatives (for the benefit of the 40% that are looking for alternatives):

      Windows: Word Perfect, Smart Suite, Star Office

      Mac OS X: Apple Works, Think Free, Open Office Beta (I *think* it still requires an X Server), MS Office X (slightly less evil at a higher price)
      (Jaguar is a thing of beauty and a joy forever, but we do need better alternative office suite support.)

      Linux: Star Office/Open Office, KOffice, loads of other software I'm sure the Linux users will be glad to fill us in about ;)

      *BSD: not personally sure, but probably runs a lot that Linux runs.

      What happens when you embrace and extend Godzilla? Nuclear heartburn!
      See "Godzilla 2000" (released in Japan as "Godzilla 2000 Millennium") for details.

    4. Re:Bigger picture by Cheese+Cracker · · Score: 1

      Really nice list of alternatives! I'll keep these in mind when I move to Mac and Linux. I'm going for
      both since I need it for testing compatibility of my web applications. Thanks a lot! :)

  48. Information wants to be free by djupedal · · Score: 1

    Don't be so fast...how does he know what goes on in someone else's mind. I've been imagining all sorts of good things in that regard, for quite some time.

    And it's not the documents we're talking about...it's the content. The content that has been held captive by MS for so long.

    That big sucking sound you hear is all those U-Hauls leaving Seattle.

    1. Re:Information wants to be free by djupedal · · Score: 2

      ...like I said.

  49. What we need is a ISO standard by javilon · · Score: 5, Interesting

    The open office group should get together with the rest of the guys (AbiWord, KOffice and maybe WordPerfect) and work out a format that can be submitted to the ISO. Possibly based on the open office format.
    Then governments and corporations will adopt it for official documents so they can read their own documents in ten years.

    --


    When his defense asked, "Which computer has Jon Johansen trespassed upon?" the answer was: "His own."
    1. Re:What we need is a ISO standard by pubjames · · Score: 2

      The open office group should get together with the rest of the guys (AbiWord, KOffice and maybe WordPerfect) and work out a format that can be submitted to the ISO. Possibly based on the open office format.
      Then governments and corporations will adopt it for official documents so they can read their own documents in ten years.


      You are absolutely correct sir. This is something I've been ranting about for ages.

    2. Re:What we need is a ISO standard by AnEmbodiedMind · · Score: 2, Informative
      From OpenOffice:

      The OpenOffice.org XML project contains support for and implementation of the XML based file format.

      Mission
      Our mission is to create an open and ubiquitous XML-based file format for office documents and to provide an open reference implementation for this format.

      Core Requirements (these items are absolutely required)

      • The file format must be capable of being used as an office program's native file format. The format must be "non-lossy" and must support (at least) the full capability of a StarOffice/OpenOffice document. The format is likely to be used for document interchange but that use alone is not enough.
      • Structured content should make use of XML's structuring capabilities and be represented in terms of XML elements and attributes.
      • The file format must be fully documented and have no "secret" features.
      • OpenOffice must be the reference implementation for this file format.
      Core Goals (these items are highly desired)
      • The file format should be developed in such a way that it will be accepted by the community and can be placed under community control for future development and format evolution.
      • The file formats should be suitable for all office types: text processing, spreadsheet, presentation, drawing, charting, and math.
      • The file formats should reuse portions of each other as much as possible (so for example a spreadsheet table definition can work also as a text processing table definition).
      Standardization and Inter-Office Cooperation
      There is an office_standards mailing list hosted on this site, intended to foster cooperation between the various office suites. At this early state no results have been achieved, but we are certainly excited about the prospects. For details, look at http://xml.openoffice.org/standardisation/ .
      It's on its way... maybe
    3. Re:What we need is a ISO standard by lambda9999 · · Score: 0

      The ISO tried it. It was called ODA
      and was a complete failure.

    4. Re:What we need is a ISO standard by Anonymous Coward · · Score: 2, Informative

      This is already in hand. Sun are taking the OpenOffice XML file format to OASIS for standardisation. Something should be announced about the formation of a working group on this real soon now.

    5. Re:What we need is a ISO standard by pubjames · · Score: 5, Insightful

      The ISO tried it. It was called ODA
      and was a complete failure.


      So? Formats come and go all the time. Just because the ISO failed in the early nineties doesn't mean someone else would fail today.

    6. Re:What we need is a ISO standard by pubjames · · Score: 3, Interesting


      This may interest you:

      http://www.1dok.org/eng/index.html

    7. Re:What we need is a ISO standard by Jason+O'Neil · · Score: 2, Interesting
      That's actually a really good idea. If all the OSS Word Processors created a file format that worked seamlessly from program to program, it would be a major plus for all the smaller word processors.

      It would allow for competition in Linux word processors, without having to worry about file format compatibility problems.

      Then if someone just creates a script which converts MS Office docs (en masse, like every one inside the directory structure) to this wonderful new format (should be possible thanks to Open Office), it would be much easier to then switch to OSS.

      I personally have no problems with the current open office format, but if they made it human readable, so it can be created from plain text editors if necessary...

      Quick somebody suggest it to them
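
      The "every doc inside the directory structure" part is the easy bit. A rough Python sketch follows; note that the actual conversion is left as a callable you supply (`convert_one` is hypothetical, nothing here talks to Open Office itself), and `.sxw` as the target extension is just an assumption for the example.

```python
import os

def find_word_docs(root_dir):
    """Yield the path of every .doc file under root_dir, recursively."""
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if name.lower().endswith(".doc"):
                yield os.path.join(dirpath, name)

def convert_tree(root_dir, convert_one, new_ext=".sxw"):
    """Call convert_one(src, dst) for each Word file; return the (src, dst) pairs."""
    jobs = []
    for src in find_word_docs(root_dir):
        # Keep the converted file next to the original, with the new extension.
        dst = os.path.splitext(src)[0] + new_ext
        convert_one(src, dst)
        jobs.append((src, dst))
    return jobs
```

      Passing something like `lambda src, dst: print(src, "->", dst)` as `convert_one` gives a dry run that only lists what would be converted.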

  50. XML co-inventor- pleeease by Anonymous Coward · · Score: 1, Insightful

    The article gives Tim Bray XML "co-inventor" status. Come on. Ever since HTML was around people have been extending it with fake tags like , , etc. Sure XML is useful but hardly an invention.

    1. Re:XML co-inventor- pleeease by Anonymous Coward · · Score: 0

      ooops, I meant tags like [sarcasm] and [humor]

  51. Whats wrong with html/css2 ? by captainclever · · Score: 0, Troll

    What's wrong with HTML and CSS2 for all your word processing? Then it's totally cross platform and web-ready...

    just a thought.
    RJ

    --
    Last.fm - join the social music revolution
    1. Re:Whats wrong with html/css2 ? by Jugalator · · Score: 3, Informative

      Whats wrong with HTML and CSS2 for all your word processing?

      I don't think the new XML format is meant for documents you wish to publish on the web. Office has already supported the HTML format pretty well (with some extensions.. ahem) since Office 2000. HTML support works even better in Office XP since it allows you to save the document as "filtered HTML", where Office filters most of the Office-specific tags and attributes at the cost of losing some information in the document.

      I think the XML format is being added since XML represents the document with a much more meaningful structure that's easier to parse by third party software for use in electronic commerce and other automated systems, something that's inappropriate to use HTML code for, as it was designed to make pretty layouts, not to describe content for easy parsing.

      I think it's pretty obvious why MS would want to add XML support - to spread their Office document format and make Office useful in places such as web services where it wouldn't be as useful before.

      --
      Beware: In C++, your friends can see your privates!
    2. Re:Whats wrong with html/css2 ? by Anonymous Coward · · Score: 0

      Lack of WYSIWYG editors and an IN-FILE object embedding standard. HTML takes linking too far -

      I should be able to include an image, say, as a byte stream WITHIN the HTML document.

      Star Office solves this elegantly - a star office document is a compressed directory tree.
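
      To make the "compressed directory tree" idea concrete, here is a small Python sketch. It assumes the document is an ordinary zip archive; the member names (`content.xml`, `Pictures/logo.png`) and the demo package built in memory are stand-ins for illustration, not a real Star Office file.

```python
import io
import zipfile

def list_package(data):
    """Return the member names of a zip-packaged office document given its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as pkg:
        return pkg.namelist()

def make_demo_package():
    """Build a tiny stand-in package in memory (not a real Star Office file)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as pkg:
        # The text lives in an XML member; images travel alongside it as raw bytes.
        pkg.writestr("content.xml", "<doc>Hello</doc>")
        pkg.writestr("Pictures/logo.png", b"\x89PNG placeholder bytes")
    return buf.getvalue()

print(list_package(make_demo_package()))
```

      The image sits inside the same file as the text, but the XML stays clean because the bytes are a separate archive member rather than being inlined in the markup.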

    3. Re:Whats wrong with html/css2 ? by captainclever · · Score: 1

      Ahh yes I can see that not being able to embed an image into a file would be a real pain in the arse.. fair enough. Didn't really think about that.

      --
      Last.fm - join the social music revolution
    4. Re:Whats wrong with html/css2 ? by whovian · · Score: 2

      I don't think the new XML format is meant for documents you wish to publish on the web.

      Just being curious here and not a troll, I thought mozilla supported XML. Take a look at this page, where it appears that XML style sheets can be used to impart some BibTeXian features to information perhaps meant for the web. It looks potentially very useful.

      --
      To-do List: Receive telemarketing call during a tornado warning. Check.
    5. Re:Whats wrong with html/css2 ? by Lil'wombat · · Score: 1
      Office has already supported the HTML format pretty well (with some extensions.. ahem) since Office 2000. HTML support works even better in Office XP since it allows you to save the document as "filtered HTML", where Office filters most of the Office-specific tags and attributes at the cost of losing some information in the document.

      Excellent point and true. If you read the documentation, MS Word does not claim to produce HTML for the web only. They claim to produce HTML that allows for round-trip processing: Word to Web and back into Word without loss of presentation information.

      --

      Truth: If it's not one thing, it's another

    6. Re:Whats wrong with html/css2 ? by Jugalator · · Score: 2

      Yeah, Mozilla supports XML and I even think it supports XSLT for any layout needs. However, I still think XML in Office 11 is meant for parsers and automated systems, since I don't think MS would bother with it when they have already implemented HTML for web purposes. But I can't be completely sure of what Steve Ballmer & co are planning. :-)

      --
      Beware: In C++, your friends can see your privates!
  52. Nice, but redundant statement by cwernli · · Score: 4, Funny

    any programmer with a Perl script and a bit of intelligence

    and I thought intelligence was a prerequisite to be able to handle perl ? :)

    1. Re:Nice, but redundant statement by Anonymous Coward · · Score: 0

      any programmer with a Perl script and a bit of intelligence

      and I thought intelligence was a prerequisite to be able to handle perl ? :)

      Hmm, depends on the perl script. I've got one here that decrypts DVD copy protection, and I've got the intelligence not to use it on a word doc.

      What he should have said is "any perl programmer with a bit of intelligence and 20 XML-handling perl modules, an XSLT utility and a large jug of strong coffee".

    2. Re:Nice, but redundant statement by tanveer1979 · · Score: 2
      any programmer with a Perl script and a bit of intelligence
      and I thought intelligence was a prerequisite to be able to handle perl ? :)

      But you can have a perl script even if you don't know how to handle perl. This is the beauty of perl: intuition. A person who doesn't know how to handle perl, but has used a bit of regex (read: intelligence), can actually make a perl script work.

      --
      My Aurora : http://www.youtube.com/watch?v=o91ZsGwJYyg
      FB : https://www.facebook.com/TanveersPhotography
    3. Re:Nice, but redundant statement by poot_rootbeer · · Score: 2


      You obviously haven't had to work with the kind of legacy code I have...

  53. Incompatible XML? by Dexter77 · · Score: 2, Redundant

    It's very easy to make an XML document that can't be processed with any common parser library. It will make programmers work extremely hard if they have to write a different XML parser for M$-XML.

    Now if the M$-XML isn't compatible with the standard XML what's the use? You still have to save it in M$-XML format to be able to use it with Word. If most coders want to use M$-XML it might even break down the XML standard since there are more Word documents in the world than all XML documents put together!

    1. Re:Incompatible XML? by Coriolis · · Score: 1

      Nice straw man argument/troll there. Possibly you ought to wait and examine the documents it puts out before leaping to wild conclusions?

      Tim Bray's comments seem to indicate that the XML is indeed valid and parseable by anything.

      Please mod the parent down.

      --
      Rgasuya aata! : I have been coding Perl and cannot tell where my fingers are now!
    2. Re:Incompatible XML? by Anonymous Coward · · Score: 0

      It's not easy to make XML documents that XML parsers can't process - such a document wouldn't be XML, by definition.

      Also, Microsoft's current XML parsers (MSXML4 and System.Xml) wouldn't be able to read the document.
      And that's not going to happen.
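
      The "by definition" point is easy to check for yourself: a conforming parser either accepts the text as well-formed XML or rejects it outright. A minimal sketch with Python's standard library (the `is_well_formed` helper is made up for illustration, and it checks well-formedness only, not validity against any schema):

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<data>blahblah</data>"))  # properly nested and closed
print(is_well_formed("<data>blahblah"))         # missing closing tag
```

      Whatever Office 11 writes either passes a check like this in every parser, or it isn't XML and their own parsers would choke on it too.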

  54. Re:MS moderators? by pubjames · · Score: 2


    Why has this post been moderated as a troll? There is nothing trollish about it at all.

  55. Is he crazy? by Lumpy · · Score: 2, Redundant

    when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.'"

    so a new software release will "magically" convert every document ever made to XML? I don't think so. The fact that they will finally have compatibility with the rest of the planet is nice, but I'll bet $100.00 that they will bastardize XML to their liking just like they did with IE and HTML.

    --
    Do not look at laser with remaining good eye.
  56. Political XML by Wheelie_boy · · Score: 2, Insightful

    Looks like M$ has found a way to placate those various governments that are beginning to insist on open file formats for data storage.

  57. I don't believe it. by Anonymous Coward · · Score: 0

    Obfuscation of the Word file format (and FUD that StarOffice/OpenOffice etc won't read their files correctly) is, for a lot of people, the only thing keeping them from switching from M$ to Linux. M$ just wouldn't throw away a central part of their business model.

    I don't believe it, and I won't until I see an XML file of a complex word document that is actually completely understandable.

  58. There is some documentation of Office XML already. by frleong · · Score: 5, Informative
    Here at MSDN

    It is simply not what others are claiming: <?xml version="1.0"?><data>blahblah</data>

    --
    ¦ ©® ±
  59. Yes! So portable! by Anonymous Coward · · Score: 0

    This will be so useful!

    <xml format="ms">
    USODIFU(@#*&$*&@#*&@#($&*FHS*H(*SDYF (*SDF8234587*& 348734
    </xml>

    I cannot wait!

  60. Imagining Wonderful Things by superyooser · · Score: 1
    "... the newly VBA-enabled version of Microsoft Office, code-named 'Office 1.1' and tells VB-Journal that 'when the huge universe of MS Office documents becomes available for processing by any programmer with a VB script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.'"

    XSL - the new MS vulnerability?

  61. MS-standards (oxymoron?) by billybob2001 · · Score: 2, Flamebait
    There appears to be some confusion surrounding M$ using the acronym XML.

    Rather than assuming this relates to eXtensible Markup Language, consider the following insider information:

    M$ have been basing their business model on XML for years.


    It stands for Kiss My License!


    X

  62. Re:Too good to be true -- WRONG by Anonymous Coward · · Score: 1, Insightful

    it doesn't matter if everyone is able to read, modify and generate Office-compatible files.

    For many businesses, the ONLY thing keeping them using MS is file compatibility. They can't change because it's the industry standard, and they need to be able to share docs with their suppliers and customers.

  63. Here is my idea of what they are trying to do by rikkards · · Score: 1

    I think this is a case of where they are abusing their monopoly situation.
    I think they realize that companies are planning on moving from Windows to Linux. If they can placate the masses with more accessible XML file formats for their Office suite, while the suite still works only with their OS (especially with the SQL database that is going to be incorporated), it becomes more difficult for users to move.

    Don't know if this makes sense....

  64. What about DRM ? by Anonymous Coward · · Score: 0
    The last big thing they announced was "DRM-in-your-pants", so I truly wonder how they'll keep the access rights locked away from Joe Sixpack and his freeware XML editor...



    John Smith
    WriteOnly
    ReadWrite
    None ...

  65. XML the Microsoft way by Anonymous Coward · · Score: 0



    "%$RGTKFDBGUIT&%TGBHG(%TGJ
    ETGWESBJYSDGDRU%"$QGBTHJWE&QATGHAQT

    Yes sure wonderful things could happen... :-)

  66. Offtopic by JohnFluxx · · Score: 1

    I used to love that quote - I would put it after every program I wrote. I used to get several patches a day on some of the more popular software to remove it. It drove the users nuts. :)

  67. It's not going to be multi platform by Anonymous Coward · · Score: 0

    They'll do something to ensure only MS Office 11 can read these files. Strong encryption perhaps?

  68. What are you all complaining about? by nmg196 · · Score: 5, Insightful

    Microsoft is switching from a proprietary file format, to XML, and the first 100 comments are all flaming MS. WTF does it take to make you people happy?

    They've already shown with .NET that they can make an entire programming framework (and at least 3 associated languages) into an open standard and even have them ratified by the ECMA and maybe even ISO. Because of this people have already managed to port Perl, Python and many other languages to this framework before it even came out of beta! The guys at Ximian have even managed to port quite a bit of the framework itself as part of the Mono Project.

    So perhaps instead of perpetually slating Microsoft, you could get off your arse and do something useful instead.

    Nick...

    1. Re:What are you all complaining about? by Goose+In+Orbit · · Score: 1

      Two words...

      Leopards
      Spots

    2. Re:What are you all complaining about? by Anonymous Coward · · Score: 0

      No, they DIDN'T show with .NET that they can make an ENTIRE programming framework into an open standard, just portions of it (small ones, actually). And, about MONO, we'll see. I've read somewhere that one of the Microsoft officials mentioned all kinds of IP rights hooked to .NET and their intentions to use those to protect their interests.

    3. Re:What are you all complaining about? by nmg196 · · Score: 2, Interesting

      "somewhere" - that really good reliable source of information.

      "about MONO, we'll see" - go and see then - you only had to click the fscking link that I put there for you. Even a Windows user should be able to manage that.

      "all kinds of IP rights" - and you reckon Sun doesn't have those for Java?

    4. Re:What are you all complaining about? by TummyX · · Score: 3, Informative


      They've already shown with .NET that they can make an entire programming framework (and at least 3 associated languages) into an open standard and even have them ratified by the ECMA and maybe even ISO.


      That's not true. Only C# has been submitted to ECMA. VB and JScript.NET have NOT.

      The CLI submissions are only a small subset of the .NET framework. This is for a good reason, most of the .NET framework relies on Windows services (System.DirectoryServices, System.Windows.Forms, System.EnterpriseServices, ...).

      C# and the CLI do NOT make up a platform like Java. It's more like C. Both C# and C provide a basic set of libraries. Anything more 'advanced' is provided through extension libraries that may or may not be cross platform (just like C). You could write a sound library for C# that uses DirectX and it would only work on Windows. On the other hand, you could write a sound library for C# that uses OpenAL. It would work on all platforms where OpenAL is supported.

      Many features that Java has such as GUIs, Telephony, Speech, Sound, 3D etc aren't supported by .NET and certainly won't be standardised. Sound support will be added by Microsoft in the future but it will use DirectX (obviously NOT cross platform).

      The cross platform hopes for C# pretty much lie in OSS hands. It is up to the OSS community to write 'standard' cross platform libraries for C# (just like we have for C). C# interfaces nicely with C so it is likely that many cross platform libraries for C# will use the corresponding C libraries.

      As you can see, the CLI is much more like C+GLIB than the "Java Platform".

      Java is a meta-operating system. It provides a huge set of APIs consistently on all platforms.

      C#/CLI does not always provide a consistent API on all platforms but it allows and encourages you to rely on and exploit the native APIs available on the underlying operating system.

      Which is better? It really depends on what you want. Java is obviously the only choice for cross platform development (atm). C# however appears to be a good replacement for C -- especially on the client side. It complements the underlying operating system whereas Java tends to hide it. That's why you will see a lot of C#/GTK# applications for Gnome in the future but not many Java/GTK applications.

    5. Re:What are you all complaining about? by Anonymous Coward · · Score: 0
      > Microsoft is switching from a proprietary file format, to XML, and the first 100 comments are all flaming MS. WTF does it take to make you people happy?

      Yes, how dare we use our accumulated memory and experience!

      >They've already shown with .NET that they can make an entire programming framework (and at least 3 assocated languages) into an open standard

      Yes, Microsoft claiming this really makes it so! Did you also know they invented GNU and the whole open source phenomena?

      >So perhaps instead of perpetually slating Microsoft

      *gasp* Young man, I'm going to wash out your mouth with soap!

    6. Re:What are you all complaining about? by Anonymous Coward · · Score: 0

      Microsoft is switching from a proprietary file format, to XML, and the first 100 comments are all flaming MS. WTF does it take to make you people happy?


      Actions, not words. We'll see what happens WHEN it happens. I.e. the actual implementation. Not the promises. It's Microsoft. Experience shows they are not to be trusted. Clear?


      F.

    7. Re:What are you all complaining about? by King+of+the+World · · Score: 2

      The only parts submitted to ECMA are C# and the CLI which excludes Windows Forms, Web Forms, and many base libraries which are proprietary and owned by Microsoft. As Nelson would say, HA! HA!

    8. Re:What are you all complaining about? by Anonymous Coward · · Score: 0

      Great troll, you'll go far. And it's been even scored as +5 Insightful, instead of -1 Troll -- I'm really impressed. You must have read my /. troll HOWTO, good move. Keep up the great work my friend. We need more people like you.

  69. Re:What?! by Anonymous Coward · · Score: 0

    Why are all anonymous comments at +3?

    Cowboy Neal finger trouble. This one's OK - isn't it?

  70. a middle road? by Simon · · Score: 1
    but giving a correct interpretation of tags and attributes is something that only Microsoft can do, unless it publishes the full specifications (present and future: after all, XML is eXtendible, right?)

    I think you are right here. Actually I have a feeling that MS are aiming at a middle point between fully open and fully closed w.r.t. the exact format: open enough that people can index, summarise and process Word docs for content and document management systems, while at the same time closed enough that passing Word documents around is painful for those not using Office. (Think formatting and printing.)

    1. Give the correct interpretation to the bytes representing the document content, in order to import the Office document in some other office suite using a different representation. This is mostly solved (thanks to years of trials and errors).

    ...and moving to another poorly defined format (even XML) moves the competition back to square one w.r.t. understanding all the quirky behaviour/interpretations Word has when it reads its documents.

    --
    Simon

  71. Well, that's "Embrace" taken care of... by Queuetue · · Score: 2

    I presume we can expect "Extend" at Office 11's release, and then we can pencil in "Extinguish" sometime late next year?

    Is that good for everybody?

  72. perversion of standards by crm114 · · Score: 1
    Look at the HTML format currently output by the MS Word application... it is a read-only and unnecessarily complex implementation due to massive bloating of embedded formatting.

    Clearly, any adoption by ms of open standards is an attempt to co-opt the standard.

  73. arg! by photon317 · · Score: 1, Redundant


    I've been wanting to process word docs with my perl scripts for years, and they fscking know it. They don't have to have some down the road conversion to XML to allow me either, all they have to do is open their fscking standards. What I wouldn't give for a microsoft word document api on linux that was reliable instead of what we have: reverse-engineer peices of cruft that enever get things quite right.

    Since they haven't opened up in the past, I don't expect them to know either. Either (1) the project will get buried, (2) Microsoft will use a subverted MSXML standard somehow to make sure it's not usable by us, or (3) the xml documents will be encrypted and protected by Palladium so that your only hope of realizing this perl promise is to use a licensed copy of Microsoft Visual Perl#++.

    --
    11*43+456^2
    1. Re:arg! by photon317 · · Score: 1, Offtopic


      Wow, it's funny how many grammar and spelling mistakes I can make per second this early in the morning.

      --
      11*43+456^2
  74. Re:umm by Arimus · · Score: 2, Insightful

    As he is one of the people responsible for XML and Office 11 is going to be using XML as its native file format have you spotted the link (hint think of three letters...)

    That aside, if MS do adopt XML as their file format AND they don't screw it up the way they did the HTML formatted output, then it is about time, and I would imagine that the people who came up with XML are going to be happy to see their work being put to good use.

    --
    --- Users are like bacteria -> Each one causing a thousand tiny crises until the host finally gives up and dies.
  75. Word 11 by jasonditz · · Score: 1

    Since its HTML capable why don't they call it "Word X11"

  76. Export to HTML... by Anonymous Coward · · Score: 2, Insightful

    ...to see where they're going with this. Word has been exporting to HTML, which is really some funky XML/XHTML with stylesheets that IE can read and display, for a while.

  77. the most wonderful thing... but it's not happening by g4dget · · Score: 3, Interesting
    The most wonderful thing that would happen would be that people can finally dump that messy piece of software and move to a better toolset.

    Unfortunately, Microsoft won't let it happen. The data may be "in XML", but that doesn't mean you can read it or generate it well. Instead, Microsoft will give you just enough to serve their business interests and nobody else's.

    How? Office will probably stick undocumented base64 encoded binary stuff into the output, containing formatting information. You can use the document content, for example, with a database, but you can't load it into another word processor and preserve all the formatting. And in the other direction, sure, you can generate simple documents that Office will import, but you can't generate arbitrary Word documents--they will, again, have weird, undocumented tags and binary stuff.

    In short: don't hold your breath. Microsoft isn't stupid.
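    The base64 scenario described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names `document`, `formatting` and `text` are invented here, and nothing is known about what Office 11 will actually emit. It shows that a file can be perfectly well-formed XML while the formatting stays an opaque byte blob.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical element names; not taken from any real Office format.
def build_document(text, formatting_bytes):
    """Wrap readable text and an opaque binary blob in well-formed XML."""
    payload = base64.b64encode(formatting_bytes).decode("ascii")
    root = ET.Element("document")
    ET.SubElement(root, "formatting").text = payload
    ET.SubElement(root, "text").text = text
    return ET.tostring(root, encoding="unicode")

def read_document(xml_text):
    """Any XML parser recovers the text; the formatting is just bytes."""
    root = ET.fromstring(xml_text)
    text = root.findtext("text")
    blob = base64.b64decode(root.findtext("formatting"))
    return text, blob

doc = build_document("Hello World", bytes([0x01, 0xFE, 0x10, 0x00, 0x7F]))
text, blob = read_document(doc)
print(text)        # the content is trivially extractable
print(blob.hex())  # the formatting still needs a separate, undocumented spec
```

    Decoding the base64 is the easy part; interpreting the bytes is the same reverse-engineering job as with a binary format.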

  78. Comment removed by account_deleted · · Score: 2

    Comment removed based on user account deletion

  79. Lose the fight, win the war by PackMan97 · · Score: 2, Insightful

    Sure, IBM lost control of the PC market...but is that better than what's happened to Apple?

    Let's go back in time to 1985 and you can choose which company to invest in...IBM or Apple. Hmmm...tough choice isn't it? Their stocks have both appreciated almost the same amount since then! Shocking isn't it.

  80. Simple way to use XML and hide your document by Anonymous Coward · · Score: 0

    Here's a valid XML document
    <document>
    <formatting>FASDFASDASDASDASDASDASDASDWQWE</formatting>
    <text1>Hello World</text1>
    <text1>Testing 123</text1>
    <text1>This is a test of the emergency broadcasting association</text1>
    </document>

    You'd be able to extract the text from the document, but you can't format it without knowing what the "formatting" tag means. It would be a huge step backwards for OpenOffice import of MS documents.

    Let's say that they eventually manage to decrypt the formatting and interpret it, chances are that the formatting text would decipher into VB.NET calls into the MS Win32 operating system. You couldn't format the document unless you emulated .NET, VB.NET, and Win32. That basically means, OpenOffice would have to find a way of including both Mono, VB.Mono, and WINE and have to deal with all the compatibility issues that result from all the re-engineering.

  81. Exactly, you can embed platform specific code by Anonymous Coward · · Score: 0

    Exactly, you can embed platform specific code in the XML. In particular, you can embed VB.NET, .NET, and Win32 calls in XML tags. To interpret what the other tags actually refer to, you need to run on a platform that supports all of these (i.e. MS Windows.NET).

  82. kill PDF? you mean that bloated pig will die? by Anonymous Coward · · Score: 0

    what a shame that PDF will be killed considering what a bandwidth pig it is.

  83. Misleading by Anonymous Coward · · Score: 0

    First of all XML requires a DOCTYPE, which I am pretty sure MSFT will closely guard through copyright, patents et al.

    Second, you can't attempt to understand the XML intentions because that would be in violation of the DMCA. Knowledge is Death.

    Third, XML for identification of tags does not in any way imply an open format for the documents themselves. I can create XML documents using trademarked DOCTYPEs and wrap them into a PGP encrypted file and still claim Open Standards use of XML.

    Until I can open a .DOC file in notepad or less and be able to read it completely, I will never for an instant believe this kind of statement. It's either another FUD joke, or MSFT has truly repented and will forever more do everything in the open. Yeah right. And I have a 16" penis!

  84. XML file on Windows Messenger by Elementalor · · Score: 1

    Hola :)

    Playing a bit with Windows Messenger, I found an option that lets you save your contact list under the "File" menu.

    It creates a .ctt file that looks like this:

    XXXXXXX@hotmail.com
    XXXXX@hotmail.com
    XXXXXXXXXXX@hotmail.com

    Looks pretty interesting :)

    Best wishes from Valencia (by the Mediterranean Sea in Spain / España)...

  85. Late to the party, but... by mmcshane · · Score: 1

    It is unimportant that any average Word doc can be exported to xml because the average Word doc does not carry semantic meta-information. It carries stuff like "make this line bold and indent it 4 pixels" That kind of info is pretty much useless unless you're Google and you spend your days writing algorithms that parse semantics out of display information. The best case scenario for legacy Word documents would be the ability to save as FO.

    The key feature is "It seems Word can also edit arbitrary XML languages under the control of an XML Schema". This, coupled with IE5+'s "Web Folders" (really WebDAV), means that I can point my users to a schema/stylesheet combination, let them create a compliant XML document in WYSIWYG mode in Word and then save it directly to my webserver over HTTP. On the server side, I do ACLs, versioning, etc.

    XML content creation has long been the missing link in CMS software. XMLSpy has been doing this for a while now but they're f'ed now because they never quite got it right and now the 500 lb. gorilla is about to sit on them.

    1. Re:Late to the party, but... by mmcshane · · Score: 1

      I should learn to preview. That's http://www.webdav.org

      The scenario gets even better as Subversion moves forward.

  86. XML not transparent by Anonymous Coward · · Score: 1, Insightful
    XML does not mean an open format. You can invent tags that mean things only to you, and you can wrap an existing binary document in an XML file.

    Why not a tag like
    <902358r9838239hjfs98>Data</902358r9838239hjfs98>?
  87. Re:There is some documentation of Office XML alrea by eetu · · Score: 2, Interesting

    The document at MSDN doesn't seem to have anything to do with MS Office 11 or the new "built around XML" Office file formats. It simply explains how files can be imported to/exported from Access and Excel of MS Office XP.

    --
    "If I can't have a revolution, what is there to dance about?" - Albert Meltzer
  88. Perl Marco Fun by jiminim · · Score: 1

    So now more powerful viruses will entertain the masses!

  89. [OT] Re:XML file on Windows Messenger by aderuwe · · Score: 1
    Best wishes from Valencia (by the Mediterranean Sea in Spain / España)...
    What's the weather there like? I'm going to Barcelona this Saturday for a week. ;)
  90. One word, DMCA by N8F8 · · Score: 2

    They haven't opened the Office document standards, they might just make them more parsable. You would still be breaking the law if you built a product with the ability to parse an Office document without paying an MS royalty.

    --
    "God fights on the side with the best artillery." - Napoleon, Marshal of France - speaking truth to power
  91. You omitted one by Black+Perl · · Score: 2
    Check this out:

    YAWC Pro (http://www.yawcpro.com/)

    This can output XML according to any DTD (by default it uses the Simplified DocBook DTD).

    --
    bp
    1. Re:You omitted one by Anonymous Coward · · Score: 0

      I had problems with the buggy VBA. It didn't work for me.

    2. Re:You omitted one by Black+Perl · · Score: 1

      I had problems with the buggy VBA. It didn't work for me.

      Did you try their support?

      --
      bp
  92. Man, I hope by sirshannon · · Score: 1

    Microsoft never tells you guys that you can't breathe under water... "it was as if a million voices cried out at once, and then drowned". Btw, using Yukon as the file system will only help accessibility because every file on the system will be as accessible as a SQL Server database is now.

    1. Re:Man, I hope by arkanes · · Score: 2

      You mean I just have to logon using "sadmin" and I'll have total access to the file system?

    2. Re:Man, I hope by darylp · · Score: 1

      You mean like logging on as 'root' gives you total access to a Linux file system?

  93. Hello World.MSXML by Tsali · · Score: 1

    <xml>
    <clippy autoinstance="true" kill="false">
    Hi, I'm Clippy! I'm inserted into every
    XML document to help you migrate to Microsoft
    products...
    </clippy>
    <virus>
    RunNimda()richedit.dll&
    </virus>
    <virus>
    RunNimda()richedit.dll&
    </virus>
    <staroffice runat="false">
    Shouldn't you use Microsoft products?
    </staroffice>
    <linux runat="false">
    See above.
    </linux>
    <datacheck>
    <DMCA>
    www.fbi.gov/reportmusictheft
    </DMCA>
    <MS>
    www.microsoft.com/reporteulaviolation
    </MS>
    </datacheck>
    <clippy autoinstance="true" kill="false">
    Hi, It's me again... Clippy! I'm inserted
    into every XML document to help you migrate
    to Microsoft products...
    </clippy>
    <doc>
    Yep... there's a ton of possibilities for Joe
    user with this one. Where can I buy me a copy
    of Office 11?
    </doc>
    </xml>

    --
    This space for rent.
  94. Competition from below forces Microsoft to open up by Jeppe+Salvesen · · Score: 2

    OpenOffice is XML-based, and extended into suite compatibility by StarOffice. Its format is, to the best of my knowledge, easily parseable and well documented.

    That alone is a unique feature that adds a lot of value to OpenOffice in the medium to long term. Microsoft would certainly not risk one of their big cash cows by clinging too tightly to their paradigms. They are many things, but they are not complete idiots.

    So, opening up the format would remove some of the reasons why customers might want to migrate to other systems.

    It's a defensive move, really. A rather good one for all parties, too, especially if they refrain from their anti-open-source licensing. If they allow open source projects to process their documents, we will add value to their product. I certainly hope they will see it this way, though I'm not convinced.

    --

    Stop the brainwash

  95. XML = gigabyte documents by edxwelch · · Score: 1

    XML is a text format and therefore isn't suitable for encoding huge chunks of data. That's why JPEG and MPEG are binary. Users with 100 page documents are going to have to store them in the old Word format, or else contend with gigabyte documents. You could try compressing, but that would be a huge performance hit every time you save or open a document.

    1. Re:XML = gigabyte documents by pavera · · Score: 1

      I dunno,
      I have multiple documents that I open in Word and OOo, and save them in their native formats... OOo's files are consistently 90% smaller than Word's, even large files (granted my largest file is about 60 pages not the 100 you mention, but even it is about 85% smaller in OOo than in Word).
      So I don't quite see this happening.

    2. Re:XML = gigabyte documents by edxwelch · · Score: 1

      I never heard of OOo, but this only shows that OOo saves data in a more efficient manner than Word. Documents saved in Word XML will be bigger than normal Word documents

    3. Re:XML = gigabyte documents by pavera · · Score: 1

      OOo = Open Office,
      its file format is an XML file format,
      and it is significantly smaller with the same text/formatting as word's binary format.
      So I disagree with the idea that XML is less efficient than Word's binary format
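      One reason the OOo files come out small is that OpenOffice's native files are zip archives holding the XML, so the repetitive markup gets compressed away. A rough sketch of the effect, using an invented markup (the `p` and `run` tags are made up for illustration, not OOo's real schema):

```python
import zlib

def build_sample_xml(paragraphs: int) -> bytes:
    """Generate deliberately verbose, repetitive XML (hypothetical tags)."""
    parts = ["<document>"]
    for i in range(paragraphs):
        parts.append(
            f'<p style="body"><run bold="false">Paragraph number {i}</run></p>'
        )
    parts.append("</document>")
    return "".join(parts).encode("utf-8")

raw = build_sample_xml(2000)
packed = zlib.compress(raw, 9)   # same deflate family used inside zip files
ratio = len(packed) / len(raw)

print(len(raw), len(packed), f"{ratio:.2%}")
```

      Real documents are less repetitive than this sample, so the exact ratio will differ, but tag names and attributes repeat heavily in any XML document and compress well.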

  96. Bye bye excel by plopez · · Score: 2

    Who needs XML?

    use Win32::OLE;

    my $handle= new Win32::OLE('Excel.Application.9') || die "died: $!\n";
    #version 9 is ofc 2k, version 10 is office xp.
    if($source_file=~/\.xls$/i)
    {
    $handle->Workbooks->Open($source_file);
    my $worksheets_count=$handle->Sheets->Count;
    #print "Count: $worksheets_count\n";
    #note that a) excel sheet tabs are numbered from '1'
    #(YAR VBA should not be considered a real programming
    #language)
    #and b) for my purposes the first 3 were garbage. Season to taste.
    for(my $i=4;$i<=$worksheets_count;$i++)
    {
    my $sheets=$handle->ActiveWorkbook->Worksheets($i);
    $sheets->Activate;
    my $temp=$source_file;
    $temp=~s/\.xls$//i;
    my $target_file= $temp . "S$i" . '.' . "txt";
    #-4158 is the MS magic number for tab delimited.
    $handle->ActiveWorkbook->SaveAs($target_file,-4158);
    #not quite sure what the line below does any more.
    $handle->{XLSaveAction}=2;
    push @target_names,$target_file;
    }
    $handle->ActiveWorkbook->Close(0);
    }

    This is one of the things I put under the rubric of 'Stupid Perl Tricks'. Once saved as text, data locked in a spreadsheet can be easily imported into any database. After assorted data munging to normalize it, of course...

    --
    putting the 'B' in LGBTQ+
    1. Re:Bye bye excel by plopez · · Score: 2

      Damn the tiny /. interface hosed my script. the for loop line should read:

      for(my $i=4;$i<=$worksheets_count;$i++)

      --
      putting the 'B' in LGBTQ+
  97. OpenOffice is XML, now! by magi · · Score: 5, Informative

    Doing XML stuff with OpenOffice is supergreat. It took me half-an-hour to study the format enough to write a XSLT parser that extracts all strings from an OO document.

    Now I wrote, just for demonstration, the following XSLT example in just a few minutes, useable directly with xsltproc in Linux.

    The example prints all the Heading paragraphs in a OO Writer document, indented according to the header level.

    <?xml version='1.0'?>
    <xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:office="http://openoffice.org/2000/office"
    xmlns:style="http://openoffice.org/2000/style"
    xmlns:text="http://openoffice.org/2000/text"
    xmlns:table="http://openoffice.org/2000/table"
    xmlns:draw="http://openoffice.org/2000/drawing"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:number="http://openoffice.org/2000/datastyle"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:chart="http://openoffice.org/2000/chart"
    xmlns:dr3d="http://openoffice.org/2000/dr3d"
    xmlns:math="http://www.w3.org/1998/Math/MathML"
    xmlns:form="http://openoffice.org/2000/form"
    xmlns:script="http://openoffice.org/2000/script"
    version='1.0'>

    <xsl:output method="text" encoding="ISO-8859-1"/>

    <!-- Print all headings, indented. -->
    <xsl:template match="text:h">
    <xsl:value-of select="substring('                    ', 1, (@text:level - 1) * 2)"/>
    <xsl:text>* </xsl:text>
    <xsl:value-of select="text()"/>
    <xsl:text>&#xa;</xsl:text>
    </xsl:template>

    <!-- Don't output any other text. -->
    <xsl:template match="text()">
    </xsl:template>
    </xsl:stylesheet>

    The result would be something like:

    * Top-level heading such as a chapter
      * Second-level heading (section)
      * Another section
        * Subsection
          * Subsubsection
      * Yet another section

  98. Microsoft invented open formats! by cyba · · Score: 1

    "So it seems to me," he concludes, in delightfully prophetic mode, "that when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine."

    Actually I can imagine. I've been doing this with HTML files for years, thanks to the openness of the HTML format.

  99. But you need to buy it! by Anonymous Coward · · Score: 0

    If you're in love with this new XML support from Microsoft, don't forget you have to purchase an upgrade or buy the new version to get that XML support! Don't send your money to M$! OpenOffice and other such are a wiser choice. Come on - let's just forget about M$ and do without them.

  100. No, it doesn't by alispguru · · Score: 3, Interesting

    Look up at this. Putting information in XML makes the first baby step of reverse engineering easier, nothing else.

    XML helps only if the creator of the document wants the information to be easily accessible by programs other than their own.

    --

    To a Lisp hacker, XML is S-expressions in drag.
    1. Re:No, it doesn't by JordanH · · Score: 2
      • Look up at this [slashdot.org]. Putting information in XML makes the first baby step of reverse engineering easier, nothing else.

      I think calling it a baby step is an exaggeration.

      It's far far easier parsing XML documents with tags vs. some binary format. Without XML, you have no idea, whatsoever, of the size of the fields of data you are dealing with.

      For the purposes of reverse engineering, it's the roughly the difference between having source code and a binary executable.

      Some people have experimented with Source Code obfuscators, sometimes called Shrouds, but have found that these are always reverse engineered rather quickly due to the availability of Source Code parsing tools. While binary executables are sometimes reverse engineered, I would hardly characterize the difference as being just baby steps away.

      In any case, it will be very difficult for MS to justify purposeful obfuscation of the XML. If they do this, it will give competitors more ammunition that MS talks open but really is lock-in.

      There are already really good tools that will diff XML docs for you. A fairly junior programmer, working alone, with those tools, could discover the meaning and data types of all the tags with a little exploration.

      If MS does come out with XML documents, they will be reverse engineered really quickly, I'd bet. The binary Word formats, by comparison, often take quite a while to reverse engineer and there're often problems with the conversions.
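      The "diff two saves and see what changed" approach described above can be sketched as follows. This is a toy illustration, not one of the real XML diff tools: the `doc`/`run` markup and the `weight` attribute are invented, and the comparison ignores tail text and sibling order. The idea is to save the same document twice with one formatting change and look at which element or attribute moved.

```python
import xml.etree.ElementTree as ET

def flatten(elem, path=""):
    """Turn a tree into a set of (path, kind, value) facts."""
    here = f"{path}/{elem.tag}"
    facts = {(here, "text", (elem.text or "").strip())}
    for name, value in elem.attrib.items():
        facts.add((here, f"@{name}", value))
    for child in elem:
        facts |= flatten(child, here)
    return facts

def xml_diff(before, after):
    """Return (facts only in 'before', facts only in 'after')."""
    a = flatten(ET.fromstring(before))
    b = flatten(ET.fromstring(after))
    return sorted(a - b), sorted(b - a)

# Two hypothetical saves of the same sentence, before and after bolding it.
PLAIN = '<doc><run weight="normal">hello</run></doc>'
BOLD = '<doc><run weight="bold">hello</run></doc>'

removed, added = xml_diff(PLAIN, BOLD)
print("removed:", removed)
print("added:  ", added)
```

      Repeating this for each feature of the word processor is tedious but mechanical, which is the point being made about junior programmers and tools.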

  101. I'm not so pessimistic by phasm42 · · Score: 1

    It seems that this should be regarded as a good thing, but a lot of opinions here seem to regard the whole thing as an evil scheme. I don't think openness is their whole motive for moving to XML, but that doesn't make it a bad move. It may be easier for them to create and maintain Office's code if the format is XML rather than a binary format. Since storage space isn't at such a premium these days, programs can afford the luxury of a file format that trades efficiency for ease of development.

    --
    "No one likes working in a hamster wheel, and your shop smells of cedar shavings from here." - TaleSpinner
  102. office HTML by avandesande · · Score: 2, Insightful

    Anyone looked at the HTML output from an Office program? It's terrible. Do you think their XML will look any better?

    --
    love is just extroverted narcissism
  103. Hmmm.... by BuffJoe · · Score: 2, Insightful

    I have a feeling that Microsoft "XML" will use Microsoft "Unicode." That is, the range 0x82 to 0x95, which Unicode reserves for extra control characters, will be littered with "smart" quotes, em dashes, and other proprietary extensions that ensure nothing else works with it. I ran into this problem when I tried converting FrontPage generated HTML into XHTML so I could do conversions with XSLT. Needless to say, it took a lot of effort, even with HTML Tidy, to get Microsoft's generated HTML converted into XHTML! HTML Tidy constantly complained about the HTML, and looking at what FrontPage generates, it's not hard to see why.

    I ran across the demoroniser, which fixes Microsoft Unicode problems, but it still doesn't fix the invalid HTML that FrontPage generates.

    Microsoft XML? Hah! I'll believe it when I see it.
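    The character problem described above is easy to reproduce. Bytes in the 0x80 to 0x9F range are printable punctuation in the Windows-1252 code page, but if the same bytes are read as Latin-1 they land on Unicode's C1 control characters. A small sketch (the sample bytes are made up; the `repair` helper is a hypothetical name and assumes every byte is defined in Windows-1252):

```python
import unicodedata

# 0x93/0x94 are curly quotes and 0x85 is an ellipsis in Windows-1252.
raw = b"\x93smart quotes\x94\x85"

as_cp1252 = raw.decode("cp1252")    # what the authoring tool meant
as_latin1 = raw.decode("latin-1")   # what a naive reader ends up with

def repair(text: str) -> str:
    """Undo a Latin-1 misreading of Windows-1252 bytes."""
    return text.encode("latin-1").decode("cp1252")

print(repr(as_cp1252))
print(repr(as_latin1))
print(unicodedata.category(as_latin1[0]))  # category of the misread quote
print(repair(as_latin1) == as_cp1252)
```

    This is roughly the kind of cleanup the demoroniser performs, though that tool handles more cases than this sketch.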

  104. Microsoft's intentions by affenmann · · Score: 2

    Well, one consequence is that many people will be forced to upgrade to the new office, since all the Word-attachments will require the new word to be readable (and editable)... Now, this is a good motivation for M$.

  105. Tim here with a bit more background by tbray · · Score: 5, Informative

    I've seen the native Word XML format (alpha mind you, so it might get changed). It isn't exactly pretty, and if I had to write code to extract all the paragraphs that contained the word "foo" in bold it would give me a bit of a headache, but I could do it.

    The word "foo" in bold single-underline looks something like

    <r>
    <rf>
    <rp class="bold" />
    <rp class="underline" lines="1" />
    </rf>
    foo</r>

    Yeah, it's pretty verbose.
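    For the curious, the "paragraphs that contain foo in bold" extraction can be sketched in Python against markup shaped like the snippet above. The `r`, `rf` and `rp` elements come from that snippet; the enclosing `doc` and `p` elements are assumptions made for this example, since the alpha format's paragraph markup isn't shown here.

```python
import xml.etree.ElementTree as ET

# <doc> and <p> are hypothetical wrappers; <r>/<rf>/<rp> follow the snippet.
SAMPLE = """
<doc>
  <p><r><rf><rp class="bold" /><rp class="underline" lines="1" /></rf>foo</r><r> plain text</r></p>
  <p><r>foo but not bold</r></p>
  <p><r><rf><rp class="bold" /></rf>bar</r></p>
</doc>
"""

def is_bold(run):
    """A run is bold if its <rf> block carries an <rp class="bold"/>."""
    return any(rp.get("class") == "bold" for rp in run.findall("rf/rp"))

def run_text(run):
    """Collect the run's text, which sits after the closing </rf> tag."""
    return "".join(run.itertext()).strip()

def paragraphs_with_bold_word(xml_text, word):
    root = ET.fromstring(xml_text)
    hits = []
    for para in root.iter("p"):
        runs = para.findall("r")
        if any(is_bold(r) and word in run_text(r).split() for r in runs):
            # Normalise whitespace so the paragraph prints on one line.
            hits.append(" ".join("".join(para.itertext()).split()))
    return hits

print(paragraphs_with_bold_word(SAMPLE, "foo"))
```

    The mild headache is visible even here: the text lives after the formatting block rather than inside a dedicated element, so the code has to gather it with `itertext()`.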

    Near as I can tell, it is 100% round-trip-able, i.e. you save as that file format, you read it in again, you hit ctl-S and it saves again; about as good as a native format. Now someone needs to write some script-ware to run Word in batch mode to xml-ify server directories with zillions of Office docs.

    I think the reason MS is doing this is obvious. Look at their financials - they *really* need people to upgrade to the new version of Office. End-users don't buy Office any more, CIOs and the like do. These people are just not gonna be impressed by another new word-processing feature, but they might be motivated to upgrade if they thought that they were opening up all their data to re-use by other programs.

    I expect that with any luck we'll get a secondary industry built around doing cool unexpected stuff to Office docs. Don't want to sound over-excited here, but a huge amount of all the intellectual capital in the world is sitting around in Office docs, and this makes it noticeably more re-usable. Has to be a good thing.

    Cheers, Tim

    1. Re:Tim here with a bit more background by Anonymous Coward · · Score: 2, Funny

      Hey, no fair injecting actual information, from a primary source, no less, into a /. discussion! That's totally cheating! #@$! karma whore...

    2. Re:Tim here with a bit more background by Compuser · · Score: 2

      Round-trip-able is fine and all but is _any_
      formatting lost between XML version and binary
      format? In so many words, from what you have
      seen, is there a point of writing a script to
      run Word in batch-convert mode? Is the XML
      version more faithful to original formatting
      than, say, OO import filter?

    3. Re:Tim here with a bit more background by donutello · · Score: 3, Informative

      I think the reason MS is doing this is obvious. Look at their financials - they *really* need people to upgrade to the new version of Office. End-users don't buy Office any more, CIOs and the like do. These people are just not gonna be impressed by another new word-processing feature, but they might be motivated to upgrade if they thought that they were opening up all their data to re-use by other programs.


      Uhh.. from this article.

      Information Worker turned in healthy revenue growth of 26 percent, reflecting customer adoption of Microsoft Office XP through multi-year licensing programs. Customers acquiring Office this quarter included ChevronTexaco, Lockheed Martin, MetLife, Newell Company (Rubbermaid) and the US Department of the Army, Program Executive Office, Aviation.

      and

      Microsoft Corp. today announced revenue of $7.75 billion for the quarter ended Sept. 30, 2002, a 26 percent increase over revenue of $6.13 billion for the same quarter last year. Operating income for the first quarter was $4.05 billion, compared to $2.90 billion in the same period last year. Net income and diluted earnings per share for the first quarter of fiscal year 2003 were $2.73 billion and $0.50, which included an after-tax charge for investment impairments of $291 million or $0.05. For the same period of the previous year, net income and diluted earnings per share were $1.28 billion and $0.23, which included an after-tax charge for investment impairments of $1.22 billion.

      "Results for the first quarter were exceptionally strong, exceeding our expectations. During the quarter, we saw broader customer adoption of our licensing programs than we anticipated, as customers recognized the value of entering into long-term licensing agreements for our products. This strength in licensing led to solid growth for Windows® XP, Office XP and .NET Enterprise Servers," said John Connors, chief financial officer at Microsoft. "Consistent with our view at the outset of this year, the global economic outlook continues to be uncertain, however we remain committed to making the investments necessary to drive long-term product innovation and customer value across our businesses."

      --
      Mmmm.. Donuts
    4. Re:Tim here with a bit more background by iggymanz · · Score: 1

      Tim, if Microsoft does this the *right* way, "intelligent programmers with Perl scripts" will make alternative 100% compatible, free, open source office tools that will kick Bill Gates's cheating, manipulative, lying asshole right out of the corporate office tool market. (I thus don't believe Microsoft's XML "standard" will be good or too useful)

  106. Adopt and extend by drxenos · · Score: 1

    So what happens when MS starts changing XML?

    --


    Anonymous Cowards suck.
  107. Ummmm... diminishing sales? by Drakonian · · Score: 1
    From the article:

    Microsoft Corp. today announced revenue of $7.75 billion for the quarter ended Sept. 30, 2002, a 26 percent increase over revenue of $6.13 billion for the same quarter last year. Operating income for the first quarter was $4.05 billion, compared to $2.90 billion in the same period last year. Net income and diluted earnings per share for the first quarter of fiscal year 2003 were $2.73 billion and $0.50, which included an after-tax charge for investment impairments of $291 million or $0.05.

    --
    Random is the New Order.
    1. Re:Ummmm... diminishing sales? by aaarrrgggh · · Score: 2

      Elsewhere, you will find these sales largely attributed to the new license terms. Sales of Office were supposedly down. Most analysts expect the revenue growth to slow.

  108. "Codename Office 11" by verloren · · Score: 1

    Whew, the boys in marketing must have had a hell of an all-nighter coming up with that one!

    1. Re:"Codename Office 11" by hey · · Score: 2
      Your comment is +5 Funny - I'd mod you up if I had the points. Maybe the real release will be "Office XI"

      I like the theory that as soon as a product passes version 10 it has lived too long. There is probably something else that can do the jobs better. Time for some lateral thinking. Probably the case with MS Office.

  109. Structure vs Presentation? by 4of12 · · Score: 3, Insightful

    MS Office saving its data in XML format is a great start.

    But will this really be enough?

    Previous complaints about how versions of Office didn't disclose the format were often referred to a specification that Microsoft made available to describe what was in a Word document.

    The key problem, IIRC, was that the description was not sufficient for one to predict how the Word document was actually formatted and rendered on the page.

    Because XML is very much like SGML or TeX, it has the potential for much more exhaustively describing document structure. But whether the new Word XML format (or OpenOffice format, for that matter) contains sufficient information for developers to reproduce the "right" format is a different issue.

    I hope I'm wrong and that the format is specified comparably to the level you'd find in say PostScript or PDF.

    Maybe MS is willing to let rendered Office documents change, just as HTML rendered documents change whenever one resizes the browser window.

    But I doubt it.

    --
    "Provided by the management for your protection."
  110. M$ XML by thesfinx · · Score: 1

    Micro$oft already does some things with XML. They (sort of) EXTENDED the XML spec (I'm not sure here) to make sure they could embed binary data in it.

    This way they can put a M$ Word file inside an XML body, but still be a binary file.

    This is what I think is likely to happen.
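    For what it's worth, carrying binary data inside XML does not require extending the spec: base64 text in an ordinary element is enough. A minimal sketch, assuming made-up element names (`document`, `binaryData`) and a few placeholder bytes standing in for a real Word file:

```python
import base64
import xml.etree.ElementTree as ET

# Stand-in for a binary .doc payload; these bytes are illustrative only.
binary_payload = bytes([0xD0, 0xCF, 0x11, 0xE0, 0x00, 0xFF, 0x3C, 0x26])

# Hypothetical wrapper element names; not taken from any Microsoft schema.
root = ET.Element("document")
blob = ET.SubElement(root, "binaryData", encoding="base64")
blob.text = base64.b64encode(binary_payload).decode("ascii")

xml_text = ET.tostring(root, encoding="unicode")

# The result parses as well-formed XML, yet the content stays opaque
# until it is decoded and handed to something that understands the bytes.
parsed = ET.fromstring(xml_text)
recovered = base64.b64decode(parsed.find("binaryData").text)
```

    The file would be "100% XML" in this scenario while telling a third-party tool nothing about the document inside.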

    1. Re:M$ XML by JDBrechtel · · Score: 1


      Yea...that makes a lot of sense.

  111. I need to drink my coffee before reading slashdot by Servo · · Score: 1

    My half asleep brain managed to come up with what sounded quite logical...

    "Tiny Brain in Microsoft Office"

    --
    A slip of the foot you may soon recover, but a slip of the tongue you may never get over. -Benjamin Franklin
  112. Wonderful new things? by SnarfQuest · · Score: 1

    all sorts of wonderful new things can be invented that you and I can't imagine.

    I'm guessing that the Anti-Virus groups have finally been able to catch most Word virii, so MicroSoft now needs something new to be able to generate the quantity and quality of this type of software that they demand.

    --
    Who would win this election: Andrew Weiner vs Andrew Weiner's weiner.
  113. Serialized Objects. by pH7.0 · · Score: 0

    You can always serialize a set of objects into xml. Now, how to use that xml without the original code is left as an exercise for the reader. In that case, most likely you need a bug-for-bug compatible MSOffice clone.

  114. Insightful my ass! by Anonymous Coward · · Score: 1, Interesting

    You haven't got a clue about this have you?

    Your post is just a bunch of paranoid, slashbot FUD. No wonder you got modded up!

  115. Hook, line and sinker by Anonymous Coward · · Score: 0

    Let's wait until we SEE the finalized XML document format before we declare this a good thing. For all we know, it's going to be WhK39AHE@KEH+=J9017ELDHJH+! -- totally unintelligible yet 100% XML!

  116. Exactly by Anonymous Coward · · Score: 0

    What do you think Palladium and .NET are all about?

  117. XML had an inventor? by blair1q · · Score: 2

    It's a stupid nested name-value pair text databasing system.

    That'd be like inventing the sentence.

  118. Wow! by grub · · Score: 2



    ..code-named 'Office 11'

    It must have taken Microsoft months to come up with that ultra-secret code name.

    --
    Trolling is a art,
  119. Scripts, not GUI by captaineo · · Score: 2

    My guess is, the XML format will make it much easier to manipulate Office documents from scripts, but it will still be very difficult to construct an actual WYSIWYG editor for them.

    e.g. Say that there is a table tag with extremely complex, undocumented, formatting and display rules. It might be easy to add or remove things from table tags, but only Office would actually know how to *display* a table correctly.

    This would allow MS to say "we have an open file format" without really endangering their core business, GUI document editing tools.

  120. 4 Words by M$+Mole · · Score: 1

    "Software As A Service"

    Anyone remember hearing this term from M$ before? That's where they're going with this. They want to be able to offer the word application as little more than a front end to a series of web services that they'll be offering for a fee. This makes an XML-based file format much more attractive to MS because it's more efficient to send data that is already in an XML-based format to a WS than it is to take a binary format, serialize it to SOAP, and then send it to the WS and have to deserialize said object.

    Do I believe that MS will actually use a real XML format? Sure. Do I believe for one second that this is to be more open? Hell no.

    --
    Karma: Non-existant. Due mostly to the fact that you smell funny and nobody likes you.
  121. Totally meaningless by ChaosDiscord · · Score: 2
    The important thing...is that Word and Excel...can export their data as XML without information loss.

    Oooooh, yay, Microsoft added another export filter to Word and Excel. The world is a better place.

    The reality is that unless the XML format is the default format, this change is useless to most users. The cry against new word processors is always, "If it doesn't import every single Word file ever created with every single feature supported, it's worthless." Unfortunately the insane complexity of the Word file format, the lack of documentation, and the constant churn as new versions of Word come out mean that you'll never see perfect conversions, yet too many people whine that it has to be perfect. (Completely ignoring the fact that most users would never notice the (in most cases) ever so slightly inaccurate translation, the minority that push the documents to their limits refuse to admit any value in an imperfect translation.) An XML format would make the translation easier, but it's useless if it's not the default. Microsoft's monopoly on office productivity software is based on the massive numbers of existing Word and Excel files.

  122. MS should be more careful... by jonadab · · Score: 1

    If they continue to allow trade secrets like this to leak out, who knows what could happen. I mean, if the world knows that MS Office uses XML-based file formats, that could be a huge disaster! If MS doesn't act quickly to stifle this leak, cross-platform software developers might copy this innovation and take away their competitive advantage!

    --
    Cut that out, or I will ship you to Norilsk in a box.
  123. Government Contracts Might be The Reason by bobaferret · · Score: 4, Interesting

    I think the reason that they are switching over is probably due to the trend in emerging foreign markets, Peru being a prime example. Countries are starting to enact legislation that requires any government procurement of software to be only for software that uses an open file format, due to the long-term storage problems.
    This is tied to the fact that US sales are going to slow down, or already are, due to the complete inundation of PCs. They need new markets, and unless they use an open format they won't be able to get them. I'd be panicked: Linux and Java are eroding their server market, and governments are eroding their Office market. The only way they can grow is to add value.

  124. Wow. by mindstrm · · Score: 1

    Does everyone miss that MS will have XML, certainly.. and they will have tons of proprietary data in between those xml tags, that they are under no obligation to document for anyone.

    You will be able to see the structure of the file, but not make sense of it.

    That's what I'd do, if I was ms.

  125. Genuine XML? by J.+Random+Software · · Score: 4, Interesting
    Good in theory, but HTML support in Office 2000 was such a debacle that there are third-party tools designed just to unmangle the markup. They completely ignored Processing Instruction syntax, which is intended to do just what they wanted, and
    <![if !supportEmptyParas]>
    wasn't even well-formed SGML.
  126. Re:MS moderators? by Mad+Bad+Rabbit · · Score: 2

    Yesterday, when I attempted to moderate something as "Interesting", the confirmation page showed a moderation of "Overrated" instead. I'm pretty sure I selected the right value from the pulldown list, and suspect there may be a bug in the moderation system.

    --
    >;k
  127. He answers this in the article by drew_kime · · Score: 2
    "The important thing," he explains, "is that Word and Excel (and of course the new XDocs thing) can export their data as XML without information loss."
    Emphasis added. So the answer to What will be the default save format? is a resounding not XML.
    --
    Nope, no sig
  128. funny url, althought invalid by Baikala · · Score: 1

    Funny, but that url is invalid because of the second "?" (it should be a "&").
    I assume that the space in "Ge tDoc.aspx" is a result of the previously mentioned long-word separator script. I've read a lot of complaints about that "bug"; when are the Slashcode maintainers going to disable it for URLs?
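    To illustrate the point about the second "?", here is a rough sketch using Python's standard `urllib.parse`. The URL, host and parameter names below are invented for the example and are not the URL from the joke:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URLs; the host, path and parameter names are made up.
bad = "http://example.com/GetDoc.aspx?id=42?format=xml"
good = "http://example.com/GetDoc.aspx?id=42&format=xml"

# Only the first "?" starts the query string. A second "?" is not a
# separator, so it ends up inside the value of the first parameter.
bad_params = parse_qs(urlsplit(bad).query)
good_params = parse_qs(urlsplit(good).query)

print(bad_params)
print(good_params)
```

    With the "&" in place the parser sees two parameters; with the second "?" it sees one parameter whose value swallows the rest of the string.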

    --
    16,777,216 comments ought to be enough for any forum!
    1. Re:funny url, althought invalid by Soul-Burn666 · · Score: 1

      Dunno why, but first time I read "Ge tDoc.aspx" it looked like "goatse.cx"..

      Hrmm!

      --
      ^_^
    2. Re:funny url, althought invalid by Anonymous Coward · · Score: 0

      Had to mess up my joke with a typo. It probably should be https:// as well.

    3. Re:funny url, althought invalid by King+of+the+World · · Score: 2

      The "bug" is by design to stop page wideners. If they didn't apply it when it was an url the page wideners would just write long and fake urls. Right?

  129. Advisable concern. by Perianwyr+Stormcrow · · Score: 2

    Perhaps, when confronted with carrying a snake across the river on our backs, we are properly wary?

    --

    What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey

  130. but for semantics by rodentia · · Score: 2

    A non XML grammar/syntax, if accompanied by a decent and documented EBNF description of its grammar, is much better to base your program on than an undocumented XML.

    Except that an undocumented XML file is in an exhaustively EBNF-documented syntax already. Not to mention that the constraints upon QNames mean that the semantics of the schema will be available for disclosure via existing tools even if obfuscated. The same cannot be said for an arbitrary syntax, ANTLR notwithstanding.
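    A small sketch of what "available via existing tools" can mean in practice: any stock XML parser will list the qualified names in a file it has never seen before. The sample document, namespace URI and element names here are invented stand-ins for an undocumented format:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Made-up document standing in for an undocumented format; the namespace
# URI and element names are hypothetical.
sample = """<x:a1 xmlns:x="urn:example:obfuscated">
  <x:b7 q="1"><x:c3>Hello</x:c3></x:b7>
  <x:b7 q="2"><x:c3>World</x:c3></x:b7>
</x:a1>"""

root = ET.fromstring(sample)

# ElementTree reports each tag as "{namespace-uri}local-name", so the
# element vocabulary and nesting are recoverable without any documentation.
tag_counts = Counter(elem.tag for elem in root.iter())
for tag, count in sorted(tag_counts.items()):
    print(count, tag)
```

    This only recovers the vocabulary and structure, of course; what each element means to the application still has to be worked out.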

    --
    illegitimii non ingravare
  131. Inventing by Anonymous Coward · · Score: 0

    'all sorts of wonderful new things can be invented that you and I can't imagine.'

    Too bad most of the things will come from lawyers.

  132. You may be onto something... by megaduck · · Score: 2

    Does it somehow become encrypted on its way out of the database, remain scrambled on its way over the internet, and reassemble itself into nice XML once it arrives on the recipient's computer?....

    I think you just described Palladium.

    --
    This .sig for rent.
  133. MSWord - OO - Save File by fferreres · · Score: 2

    That's the easiest way, really. And the benefit of having nicely documented DTDs. OO is the true compatibility XML file format for office files.

    That's why MS need to have their own. Because if they don't do it, many companies will use OO as a gateway (many not just yet, but soon).

    So they have to do XML, plus MS is wanting to integrate Office + Windows Programming + WEB Frontends + EVERYTHING in an interoperable way. They can dictate the standard of what the WORLD will have to use in the future.

    They will always be in the middle, and their revenue models will adapt to it just fine. The MS layer if you want to call it.

    --
    unfinished: (adj.)
  134. Thank you thank you thank you by Anonymous Coward · · Score: 0

    This HTML filter is so useful! Now I can actually make web pages out of Word docs! Thanks again! Someone mod this guy up informative!

  135. Tim Bray - inventor of XML????? by Anonymous Coward · · Score: 0

    This is a huge stretch.

    XML derived from SGML.

    Tim is a disgrace to the community - with all this marketing spin.

    1. Re:Tim Bray - inventor of XML????? by Anonymous Coward · · Score: 0
  136. Re:MS moderators? by jandrese · · Score: 1

    Do you navigate the page with the arrow keys? If so it is very easy to choose a moderation from the pulldown box, and then forget to click on the page before hitting down a couple of times and changing your moderation. I've almost done that a few times myself.

    --

    I read the internet for the articles.
  137. Why MS can't win that one by Anonymous+Brave+Guy · · Score: 2
    A company that has for 18 years been trying to lock people in to their technology, will cause some people to be a bit paranoid.

    But, as the saying goes, it's only paranoia if they're not out to get you. ;-)

    The business world is a harsh and, ultimately, fickle one. Microsoft got to the top by doing good things, but you can't abuse your position for long or people will start to notice. As the world comes to depend ever more (rightly or wrongly) on IT to get its business done, following standards and maintaining open sources of information will become ever more important. Even a company as big as Microsoft won't survive by locking people in forever. They rose to the top in a remarkable period of time, and they now have near 100% market share in certain fields, but convincing companies to continue upgrading is becoming harder and harder.

    One of the major drives to get new versions of many products today is the promise of greater power to get data from A, where it is, to B, where you want it. If everyone else is playing ball (because, being minor players, they have to just to stay on the scale) and Microsoft doesn't, then sooner or later, Microsoft will lose market share to everyone else. No company survives by not giving its customers what they want, and right about now, Microsoft's customers most want the two things they can't, or won't, give: security and interoperability. All the UI reworks in the world aren't going to change that, and they know it.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    1. Re:Why MS can't win that one by sirsnork · · Score: 1

      How can you possibly say they got to the top by doing the right things?? Did you never try to run any version of Windows before 95 on an OS that wasn't Microsoft's? It'd crash and burn for no reason, and they programmed it to do exactly that. Not to mention OS/2: yeah, we'll hook up with IBM and build a nice OS, then we'll steal all the good bits and make Windows and hire a crack marketing team to get all the hardware manufacturers to write drivers for our OS.

      My god I could go on and on /turns and leaves, wondering how so many people could be so naive

      --

      Normal people worry me!
    2. Re:Why MS can't win that one by Anonymous+Brave+Guy · · Score: 2
      How can you possibly say they got to the top by doing the right things??

      Because, for quite a long period of time, they really did produce some of the best mainstream software around.

      Windows -- in its 3.1 days -- was a catalyst for many of the good things we now take for granted on a PC. Whether or not MS ripped the ideas or not, it was MS that produced the product so many people used, and they used it because it let them do things they couldn't do before. To hell with the instability, it wasn't really that bad, and today it isn't really that much better. It got jobs done that otherwise couldn't be, as far as Joe User was concerned.

      Much the same could be said of early versions of Winword and Excel. Early versions of both made genuine, useful advances over the other software of the day (DOS- or Windows-based). Wordperfect, Lotus and co. dropped the ball, while MS ran with it, for a period of several years.

      Sling rocks all you like, but VB isn't one of the most popular development environments on the planet today by accident, either. For its time, it was revolutionary, and it's hardly fair to accuse MS of ripping off ideas without noting the number of other "visual" development environments that sprang from that one. Too bad they couldn't do it themselves with VC++, and Borland had to do it for them, but hey, you can't have everything. ;-)

      I honestly believe that MS have, in the past, produced some very good software. If it weren't for certain attitudes they currently exhibit, I would still say they produce software that is among the best in the world on many counts. The problem is that now, they have serious competition for the first time in a while, and the advances in useful features and UI tweaks aren't enough to make up for the security flaws, the lack of interoperability, the poor performance, the licensing concerns and that damned paperclip. Why would people upgrade perfectly usable Office 97 apps to Office XP, with all the downsides Microsoft have caused that to entail? If it ain't broke...

      The only way MS will continue to be successful in the face of good opposition from the open source community, as even Mr Ballmer himself has noted, is to provide genuinely more useful/useable applications that are worth paying for. They'll try legal moves to buy themselves time, but in the long term, both screwing your customer base, and buying politicians and asking them to screw their electorate, are losing strategies. They aren't stupid, and they know this. The interesting question is whether they'll actually act upon it effectively before they lose the faith of their customer base with the delaying actions.

      --
      If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
    3. Re:Why MS can't win that one by Charm · · Score: 1
      Sling rocks all you like, but VB isn't one of the most popular development environments on the planet today by accident, either. For its time, it was revolutionary, and it's hardly fair to accuse MS of ripping off ideas without noting the number of other "visual" development environments that sprang from that one. Too bad they couldn't do it themselves with VC++, and Borland had to do it for them, but hey, you can't have everything. ;-)

      Microsoft didn't create VB
      This guys company did. Cooper.com
      Microsoft just bought it off them.

      --
      -- RTFM:Slackware::Beer:Saturday
  138. possibly by Anonymous Coward · · Score: 0

    that describes the use of XML in office XP and 2k. and while the ability to export to xml is nice, it is not what we are talking about. in Word you can export to plain text or html. this does not stop the default and "recommended" format being the beast known as ".doc". my dad uses office all the time but does not want to "mess up his files with weird formats". this is a common mentality about office formats. Microsoft would not allow their ".doc" format to be read by other programs after they have worked so hard to stop that.

  139. Whoa (Re:They already did this ) by Doctor+Faustus · · Score: 1

    SQL Server has had an XML web gateway since version 2000. You can run any query and output it as xml

    I hadn't even thought about that. If the XML format is anything reasonable, you could get query results as XML, toss them through a little XSLT, and have your report as a Word or (probably) Excel file. That would be pretty slick.
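    A rough sketch of that pipeline, under loud assumptions: the shape of the query output and the target element names (`Sheet`, `Row`, `Cell`) are invented, since the real Office format is not known here, and the transform is written with Python's `xml.etree.ElementTree` rather than XSLT so that it runs with the standard library alone:

```python
import xml.etree.ElementTree as ET

# Hypothetical query output; real SQL Server XML output may look different.
query_result = """<rows>
  <row name="Widget" qty="3"/>
  <row name="Gadget" qty="5"/>
</rows>"""


def to_sheet(source_xml):
    """Map each <row> to a <Row> of <Cell> elements (made-up target names)."""
    sheet = ET.Element("Sheet")
    for row in ET.fromstring(source_xml).iter("row"):
        out_row = ET.SubElement(sheet, "Row")
        for column in ("name", "qty"):
            cell = ET.SubElement(out_row, "Cell")
            cell.text = row.get(column)
    return sheet


sheet = to_sheet(query_result)
print(ET.tostring(sheet, encoding="unicode"))
```

    An XSLT stylesheet would express the same row-to-row mapping declaratively; the point is only that the report step becomes a plain XML-to-XML transformation.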

    1. Re:Whoa (Re:They already did this ) by Anonymous Coward · · Score: 0

      Fuck. I would rather use VBA or anything than trying to write it in XSLT.
      Have you seen XSLT in "action"? - Perl looks like a fucking Python compared to that.

    2. Re:Whoa (Re:They already did this ) by pvera · · Score: 2

      It could save a lot of time to asp programmers. Instead of taking an html template and adding asp code to pull the recordset and loop thru it, you can do this:

      1. Put your query in an xml file and drop it into the xml gateway folder at the iis server. This xml file is tiny, since it only holds the query and a link to the xsl.

      2. Use XSL to make your template.

      3. Done!

      --
      Pedro
      ----
      The Insomniac Coder
  140. Great and wonderous things, indeed. by Eneff · · Score: 1

    I send this to you to have your advice.

  141. Microsoft reasons for doing this by kune · · Score: 2

    (1) Strategic value of proprietary Word format decreases. Most texts written today are E-Mails not Word-Documents. Word becomes more and more an editing format. Documents are published as ASCII texts, HTML and PDF. Word documents can't be combined with Web services, I've never seen a Web application creating Word documents.

    (2) Microsoft can't create a new proprietary format, that can't be read by Word 97. Everybody will accept that Word 97 doesn't read XML. If you want XML, you have to buy the new Office.

    (3) Outlook and Internet Explorer are examples how Microsoft can dominate a market starting with standard formats and protocols.

  142. A new "use" for XSLT by Brent_DS · · Score: 1

    Does this mean we'll see the very first XSLT Virus soon? I mean, VirusBasic scripts are getting so tiresome...

  143. How come slashdot comments are duplicated in link? by crush · · Score: 2

    If you follow the XML Journal link and look at the "feedback" at the bottom it appears to be the comments that are appearing here on slashdot. Is there some sort of reciprocal exchange of comments going on between the two sites? Is this kosher?

  144. Why Microsoft will do the right thing with XML by marhar · · Score: 2
    If they do the "right thing", then they will be able to position Office 11 as a generic frontend for any XML you happen to generate, regardless of the source. Imagine how convenient it would be to generate a nice spreadsheet from your backend perl script or nicely formatted form letters from your database application.


    The logic is: Everybody goes to XML, and Office becomes the universal front-end for everything XML.


    If on the other hand they screw it up, then that leaves a potential "underserved market" for somebody to step in and get some leverage in the newly created "xml frontend" segment of the business.

  145. Why is MS doing this...featuresets by InnovATIONS · · Score: 3, Insightful
    By taking the initiative in this MS can create an XML schema that neatly includes ALL of the featureset and terminology of MS Word/Excel/etc.

    Which then by virtue of market share becomes standard. It is actually in their best interest to publish it clearly. Then the other potential competitors will feel strong pressure to fit their software to match MS and have no real excuse why they can't. If MS waited there would be some other standard emerging and MS would be pressured by customers to adopt it. Then it would be MS having to shoehorn its document logic into some other form and not the other way around.

    While other potential competitors are playing catch-up with making their documents fit into the MS schema MS can be busy thinking about the next thing to do.

    So frankly I expect the word document xml (and excel and the rest) to actually be quite clear and documented but very aligned to how MS Word sees a document, which will likely impress others as obtuse.

  146. Yeah. Right. by rice_burners_suck · · Score: 2

    Yeah. And when Microsoft embraces and extends XML so it only works with Windows by obfuscating the format to the extent that nobody wants to parse it except the 20,000 monkeys beating away at Microsoft's very own 20,000 keyboards, nothing good will come of it. Oh well.

  147. Dreamy by abdulla · · Score: 1

    'when the huge universe of MS Office documents becomes available for processing by any programmer with a Perl script and a bit of intelligence, all sorts of wonderful new things can be invented that you and I can't imagine.'

    He also promised me the tooth fairy would pay me $2 for my tooth!

  148. Microsoft documents well-formed?? by BeforeCoffee · · Score: 1

    HAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHA!!! OMG, I almost fell outa my chair this is so funny. This isn't April fools day.

    Seriously though, with the change in file formats could come some decent structure and less loosey-goosey WYSIWYG! (Yay structure!)

    Unfortunately, office will essentially be a version 1.0 product, buggy as all hell. Even worse, now that the Microsoft documents will be saved in an open, self-describing, and flexible data format, I'm sure we can look forward to a new level of sophistication in the macro viruses that will attack this new platform. Life will get very interesting after this new Office comes out. (Boo viruses!)

  149. [OT]:Weather in the Mediterranean Spanish coast... by Elementalor · · Score: 1

    Hola!

    Weather is quite hot during these days. In fact I'm going to the University with t-shirts and short trousers. The temperature is about 25 Celsius degrees and there are only some white clouds in the sky.

    Nights are a bit colder (around 10-15 Celsius degrees :)

    Hope this helps!

    Best wishes from Valencia (by the Mediterranean Sea in Spain / España)...

  150. Re:[OT]:Weather in the Mediterranean Spanish coast by aderuwe · · Score: 1

    You were right, weather was great. ;) The Dali museum in Figueres is magnificent, visit it if you haven't yet.