California Joins Open Document Bandwagon
Andy Updegrove writes "A legislator in California has decided that it's time for California to get on the open formats bandwagon. If all of the bills filed in the last few weeks pass, California, Texas, and Minnesota will all require, in near-identical language, that 'all documents, including, but not limited to, text, spreadsheets, and presentations, produced by any state agency shall be created, exchanged, and preserved in an open extensible markup language-based, XML-based file format.' What type of formats will qualify? Again, the language is very uniform (the following is from the California statute): 'When deciding how to implement this section, the department in its evaluation of open, XML-based file formats shall consider all of the following features: (1) Interoperable among diverse internal and external platforms and applications; (2) Fully published and available royalty-free; (3) Implemented by multiple vendors; (4) Controlled by an open industry organization with a well-defined inclusive process for evolution of the standard.'"
Minnesota is also considering open documents.
Why not just require that documents be in ANY published standard format? "XML" by itself is meaningless, and "extensible" is a loaded term (and a very bad idea when the whole point is to keep things compatible). Why do lawmakers always have to over-specify things until the purpose of the law is lost?
-- 'The' Lord and Master Bitman On High, Master Of All
<user="wwwillem">
<subject>we should do this too</subject>
<content>
What is good for government documents is also good for Slashdot posts.
</content>
</xml>
Browsers shouldn't have a back button!! It's all about going forward...
Format is irrelevant - since these documents will contain legal-speak, they'll be unreadable anyway. ;)
biopowered.co.uk - catalytically cracking triglycerides for home automotive use since 2008. Just say no to big oil!
N00b: Hey we have this data representation problem, we'll use XML!
Greybeard: Son, now you have two problems.
I want to delete my account but Slashdot doesn't allow it.
In other news, Microsoft is hurriedly subsidizing three small companies to write quick, meaningless plug-ins that take OOXML as input, just so it can pretend that its format is "Implemented by multiple vendors" and on "diverse (...) platforms" (i.e., Windows 98, Windows ME, Windows 2000, Windows XP *and* Windows Vista)...
"Sufficiently advanced satire is indistinguishable from reality." - [Tips: 1DrYakQDKCQ6y52z6QbnkxHXAocMZJE61o ]
There's only one reason, and that's because the higher-ups think that EVERYTHING should be XML based. Of course, they have no idea what this actually means. They just know that it needs to have XML in it, because that's what the other guys are doing.
This reminds me of my boss, who keeps saying that we need to publish things in XML, but can't give me any reason why we should. Then again, two years ago I kept on hearing about how our company needed a blog, again with no justification as to how it would help us. Thankfully, that passed. Eventually, the XML thing will, too. Of course, this isn't meant to belittle the things out there that actually can benefit from utilizing an XML format.
This guy's the limit!
Well, obviously it doesn't need to be XML, but XML does have one nice self-documenting property that plain text lacks: the character encoding.
If you've looked at Project Gutenberg texts, you can see why this is a problem. Not a huge problem, but a problem. When a source text has a non-ASCII character in it, they have to substitute some sequence of ASCII characters that suggests what the glyph is supposed to be. This doesn't really preserve the information in the source document, nor does it make the document easy to read.
So, you could have a trivial text XML format that has only one defined tag. It's still useful:
<?xml version="1.0" encoding="us-ascii"?>
<text>
This is my text. It has no wacky glyphs so ASCII is fine.
</text>
vs.
<?xml version="1.0" encoding="utf-8"?>
<text>
This is my text. It has wacky glyphs therefôre ascii sücks for it!
</text>
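To make that concrete, here's a rough Python sketch (my own illustration, nothing from the bills). It assumes the made-up one-tag <text> format above; make_doc and read_text are hypothetical helper names. The standard library parser reads the encoding declaration and decodes the bytes accordingly, so the same text stored as ISO-8859-1 or as UTF-8 comes out identical, while a reader who merely guesses the encoding of the raw bytes can fail.

```python
import xml.etree.ElementTree as ET

BODY = "This is my text. It has wacky glyphs therefôre ascii sücks for it!"

def make_doc(encoding: str) -> bytes:
    """Serialize the one-tag document as bytes in the given encoding."""
    doc = f'<?xml version="1.0" encoding="{encoding}"?>\n<text>\n{BODY}\n</text>\n'
    return doc.encode(encoding)

def read_text(raw: bytes) -> str:
    """Parse raw bytes; the parser picks the decoder from the declaration."""
    return ET.fromstring(raw).text.strip()

utf8_bytes = make_doc("utf-8")
latin1_bytes = make_doc("iso-8859-1")

# The byte sequences differ, but the recovered text is the same.
same_text = read_text(utf8_bytes) == read_text(latin1_bytes) == BODY

# Without the declaration a reader has to guess; guessing UTF-8 for the
# ISO-8859-1 bytes fails outright.
try:
    latin1_bytes.decode("utf-8")
    guessed_ok = True
except UnicodeDecodeError:
    guessed_ok = False
```

The declaration only helps if it is truthful, of course: it records which encoding the writer used, it doesn't check it.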
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Just specifying XML doesn't mean much, really:
... more binary crap...
<document>
Description of MS Open Format
<![CDATA[
37642364 78346478 23465789 34657834 65783465 78934653 47895634 78563478 65347856
56347825 63478256 34786578 34567893 45678934 65783456 78465783 46578346 57834567
34895723 48957348 90578934 75890347 58934758 93475892
]]>
</document>
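A quick way to see the point, as a hedged Python sketch: a stock XML parser happily accepts a document whose payload is an opaque blob. Well-formedness is all the parser checks; it says nothing about whether anyone can interpret the contents. The shortened sample document and the looks_opaque heuristic are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the document above (hypothetical sample data).
SAMPLE = """<document>
Description of MS Open Format
<![CDATA[
37642364 78346478 23465789 34657834
56347825 63478256 34786578 34567893
]]>
</document>"""

def looks_opaque(text: str) -> bool:
    """Crude, hypothetical heuristic: most non-space characters are digits."""
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and sum(c.isdigit() for c in chars) / len(chars) > 0.5

root = ET.fromstring(SAMPLE)   # parses without complaint: it is well-formed XML
payload = root.text            # the CDATA section is merged into ordinary text
opaque = looks_opaque(payload)
```

So "XML-based" on its own is satisfied by a wrapper around anything, which is presumably why the bill language adds the other criteria (fully published, multiple implementations, and so on).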
- For the complete works of Shakespeare: cat
I am, of course, talking about Microsoft. They refuse to accept the Open standard.
Until that happens, there will be problems. Yes, you could have .odt documents sent internally, but what if someone has to send a document to someone outside the company? Microsoft Office does not recognize .odt, and if you think that you can train someone to remember to send .doc files to outside users and keep internal documents in .odt, then I have a bridge to sell you.
Let's stop dilly-dallying and just change "-1: Overrated" to "-1: Disagree" or "-1: Doesn't Subscribe to Groupthink".
XML means it is readable by humans. You don't even NEED any kind of program to get the text.
I don't get what all the hoo-haw is and why we need courts or lobbying for any of this. I find it very difficult to write anything when my term paper or [insert your document here] isn't open. Sounds like a bunch of people just need to learn how to double-click.