U.S. House of Representatives Makes Resolutions in XML

Yee haw! Crappy laws in better format! by shodson · 2002-07-04 06:30 · Score: 2

Now we can all make our own crappy laws using XML! More downloads for Xerces.

Ugh. DTDs?!? by Aquaman616 · 2002-07-04 06:32 · Score: 2, Insightful

I guess that's the government for ya... why in the *hell* would you use DTDs when XML Schemas are so much better???

Oh well... at least it's a step forward - I'll applaud them for that.

--
A|Q|U|A

DTD is sooo 1999. by km790816 · 2002-07-04 06:32 · Score: 3, Insightful

This is the government for you.

When every tool under the sun is using XML schemas, the House is announcing their support for DTDs.

I guess it's still a step forward.

--
A speech...

Re:DTD is sooo 1999. by ftobin · 2002-07-04 07:21 · Score: 2

When every tool under the sun is using XML schemas, the House is announcing their support for DTDs.

Jeezus, why would you even consider using Schemas when there is Relax-NG, a much better, simpler, theory-based system? Note the author of that document I gave; it's James Clark; if you are using an XML parser, chances are good it was written by him (expat). Heck, there is not even any normative spec for XML Schema!
Re:DTD is sooo 1999. by SirSlud · 2002-07-04 07:41 · Score: 5, Insightful

Your government must make an attempt to stick to standards when they are dealing with accessibility. They have to use technologies that have had some time to settle. By virtue of you pointing out that DTDs are 3 years old and you consider them obsolete, you reinforce the point that by selecting bleeding-edge formats/technologies/etc, they might be investing time and some of your money into something that won't be around in a year or two.

And then in a year or two, you'd just complain how the government can't choose their technologies right.

Start thinking about where you're getting this 'government is stupid/terrible/lazy/blah/blah' message from - a lot of it is from private interests that enjoy the freedom and lack of public accountability to select their technological infrastructure based on higher denominators than your government should. While the 'savvy' factor will always be higher in the private sector, don't *always* take this as an indication that government must be technologically inept (although, like anybody whose core competency isn't technology, they frequently are) ... often they are doing something much smarter than private interests give them credit for. All of this is moot, of course, when discussing moves the government makes on _behalf_ of powerful private interests, but that's another argument and does not apply in this situation.

It's like being a private teacher vs public. Private teachers can probably be more 'progressive', but at the cost of maybe teaching in ways that might soon be proven to be ineffectual or bad, while public systems generally must move slower in order to ensure that the ideas have been vetted and that everyone has a moderately equal opportunity to access the fruits of the system.

Like parents, sysadmins, anybody who has an onus to cater to the greater good rather than the richer good, sometimes you have to make decisions that are going to be publicly derided even if it's for the common good. Sometimes you have to just give the benefit of the doubt, though I realize this kind of attitude is in short supply these days.

Ok, rant off.

--
"Old man yells at systemd"

Uhhh.... by Verizon+Guy · 2002-07-04 06:37 · Score: 4, Interesting

Going to http://xml.house.gov/Members/mbr107.xml renders a perfectly viewable directory of representatives in Internet Explorer, but Mozilla dumps it all as raw text in one giant paragraph. What gives?!?

--

Aw, fuck it. Let's go bowling. - The Big Lebowski

Re:Uhhh.... by josh+crawley · 2002-07-04 06:41 · Score: 2

Maybe because IE supports the xml STANDARD more than mozilla.
Re:Uhhh.... by llamalicious · 2002-07-04 06:43 · Score: 2

<?xml:stylesheet type="text/xsl" href="member-sorter-vb.xsl"?>
<?xm-well_formed path="m:\xmltech\billres1\00-11-01\Members\mbr107.dtd"?>
<ushousemembers xmlns="x-schema:member-schema.xml">
Re:Uhhh.... by jaaron · 2002-07-04 06:46 · Score: 2

No, it's because of the way they use the XSL stylesheet. IE does not support the XML "standard" any more than Mozilla. Quit posting FUD.

--
Who said Freedom was Fair?
Re:Uhhh.... by MiTEG · 2002-07-04 06:51 · Score: 5, Informative

It's all screwed up with Opera 6.01 also.

--
The future isn't what it used to be.
Re:Uhhh.... by perlfool · 2002-07-04 08:31 · Score: 2, Informative

The main reason it doesn't render in Mozilla is they used an old XSLT Working Draft namespace, "http://www.w3.org/TR/WD-xsl". The XSLT 1.0 namespace should be: "http://www.w3.org/1999/XSL/Transform"
See the "Unofficial MSXML XSLT FAQ" for some info about the old Working Draft, XSLT 1.0 and Internet Explorer.
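For anyone who wants to check a stylesheet before blaming the browser, here is a rough Python sketch that classifies a stylesheet by the namespace of its root element, using the two namespace URIs quoted above. The function name and the two sample stylesheets are made up for illustration; this is not how Mozilla or IE decide anything internally.

```python
import xml.etree.ElementTree as ET

WD_XSL_NS = "http://www.w3.org/TR/WD-xsl"             # old Working Draft namespace
XSLT_1_0_NS = "http://www.w3.org/1999/XSL/Transform"  # XSLT 1.0 namespace


def stylesheet_dialect(xsl_text):
    """Classify a stylesheet by the namespace of its root element."""
    root = ET.fromstring(xsl_text)
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    else:
        namespace = ""
    if namespace == XSLT_1_0_NS:
        return "XSLT 1.0"
    if namespace == WD_XSL_NS:
        return "Working Draft XSL"
    return "unknown"


# Hypothetical, stripped-down stylesheets used only as test input.
old_style = '<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"/>'
new_style = ('<xsl:stylesheet version="1.0" '
             'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>')

print(stylesheet_dialect(old_style))
print(stylesheet_dialect(new_style))
```

A stylesheet that comes back as "Working Draft XSL" is the kind that would need porting to the 1.0 namespace (and probably other syntax changes) before a standards-only XSLT engine could use it.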

How Slashdot-like by jaaron · 2002-07-04 06:39 · Score: 5, Funny

So the government tries to update their use of technology to use an open format like XML and publish the DTDs, and inevitably the first 10 slashdot posts complain that the government is too behind the times because they don't use new (and better) XML schemas! Talk about geeks! :)

--
Who said Freedom was Fair?

DTDs by Citizen+of+Earth · 2002-07-04 06:40 · Score: 2

It reports that the HR has made 100 DTDs and uses Microsoft Word and a special converter to do the job.

But if they really want an intractable problem, they should use XML/Schema!

Stylesheet issues... by jaaron · 2002-07-04 06:44 · Score: 5, Informative

It's because of the XSL style sheet they use. You can find it at http://xml.house.gov/Members/member-sorter-vb.xsl. (Use view source to see the actual XSLT). Notice that they use VBScript!

--
Who said Freedom was Fair?

They DO use schemas... by jaaron · 2002-07-04 06:49 · Score: 2

Check out the source for http://xml.house.gov/Members/mbr107.xml and then the corresponding schema: http://xml.house.gov/Members/member-schema.xml

--
Who said Freedom was Fair?

Re:They DO use schemas... by jaaron · 2002-07-04 06:58 · Score: 2

Good point, I didn't notice that when I first posted. Still though, they're using namespaces, which aren't part of the DTD definition. So the issue isn't that they're using outdated technology, it's that they're using proprietary extensions.

--
Who said Freedom was Fair?

Lawmakers who don't understand the law by kuroth · 2002-07-04 06:49 · Score: 4, Interesting

From the cited page...

Pursuant to Title 17 Section 105 of the United States Code, these DTDs are not subject to copyright protection and are in the public domain.
...
These DTDs can be redistributed and/or modified freely provided that any derivative works bear some notice that they are derived from it, and any modified versions bear some notice that they have been modified.

Sorry, cupcakes, that's not how the public domain works. If you release it into the public domain, you no longer have *any* control whatsoever upon the modification, reuse, or redistribution of the work. The required notice clause listed above is invalid.

Cite, cite (#3), cite.

Kuroth

Re:Lawmakers who don't understand the law by foniksonik · 2002-07-04 19:14 · Score: 2

I was thinking GPL myself... public domain with copyright. Wouldn't that be interesting if the US Gov started using GPL for all documents? Just put it in the metadata and a quick notice at bottom.

hmmm makes me think I want to do that with all my documents. Is there a license attribute for meta-data tags in html... if not I'll make one.

--
A fool throws a stone into a well and a thousand sages can not remove it.

I say... by numbuscus · 2002-07-04 06:55 · Score: 2

...even if they are using what some on this site would consider 'suboptimal' technology, the government's incorporation of ANY technology is better than none at all. Hell, the Senate doesn't allow laptops on the Senate floor! Hopefully, as the 'mainstream' government begins to use more open-standards technology and technology in general, they will be more willing to defend it against M$ and any other company that tries to 'embrace and extend' it.

My $0.02

Example of the new markup by crucini · 2002-07-04 07:00 · Score: 5, Funny

<bill status="proposed" name="CBDTPA">
  <sponsor name="Fritz Hollings" constituency="Disney"/>
  <violatesAmendment number="1"/>
  <violatesAmendment number="4"/>
  <contribution donor="Disney" amount="24500.00"/>
  <contribution donor="AOL" amount="33000.00"/>
  <contribution donor="National Association of Broadcasters" amount="25000.00"/>
  <excuse>Promote broadband adoption</excuse>
  <excuse>Save the arts from extinction</excuse>
</bill>

Re:Example of the new markup by SirSlud · 2002-07-04 07:09 · Score: 2

> Save the arts from extinction

That's the best part! I always hated that excuse, especially considering how insulting it should be to artists.

Stop and think about this - claiming the arts will die if Hollywood dies is like saying the habit of breathing oxygen will die if the SCUBA industry goes belly up.

--
"Old man yells at systemd"
Re:Example of the new markup by Guppy06 · 2002-07-04 07:51 · Score: 2

You forgot the default value of "Save the children" in your tags there...
Re:Example of the new markup by Megane · 2002-07-04 08:23 · Score: 2

Don't forget the usefulness of the <pork> tag.

--
#naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
Re:Example of the new markup by danro · 2002-07-04 21:19 · Score: 3, Interesting

Neat idea...
Just write an HTTP proxy that applies an XSLT to the document. Generate the tag-values from the opensecrets.org database (if they have one).
Could probably be done by one person in a week or two, if opensecrets keeps a reasonably usable database, and is willing to cooperate.

If I were an American I would be tempted to write the thing myself...
It would be great to just go to a website and see all bills with a header that indicated which elected officials were involved, and their voting record and ties to special interests.

Hell, if anyone wants to do this, I am willing to contribute just because it's cool...

--

"First lesson," Jon said. "Stick them with the pointy end."

Re:Was there any doubt they wouldn't be free? by Ivan+Raikov · 2002-07-04 07:01 · Score: 2

If the government creates something original for its use how can there be any argument as to if it should be available to the people..?

Considering the current government's flirtations with Big Business (not to be confused with Big Brother), I'm actually surprised that they didn't just publish their bills as Word documents.

And looking at the XML documents, it does appear that they're using some non-W3C, Microsoft-like XML stylesheet format. I'd argue that this is favoring one commercial product (Internet Exploder) at the expense of all others.

--
Bush Lies Watch

They are using WordPerfect Too by frank249 · 2002-07-04 07:06 · Score: 3, Informative

It reports that the HR has made 100 DTDs and uses Microsoft Word and a special converter to do the job.

The article actually says: "It shows how each line, name and term has an identifying tag, created by exporting the document from a word processor such as Microsoft Word or Corel WordPerfect into a special XML template."

That would make sense since most of the US government still uses WordPerfect. WordPerfect comes with extensive XML publishing functions including making your own DTDs.

BTW Corel just announced that a new version of Ventura Publisher is coming out in the fall with cross platform XML publishing built in. The next version of WordPerfect is also going to have a much better XML publisher now that they bought XMetaL.

--

Today's vices may be tomorrow's virtues.

don't even validate by Steve+X · 2002-07-04 07:07 · Score: 2, Interesting

heh, their XML documents don't even come close to validating. they say it's all beta, but wow, that's impressive. good to know my taxes are being put to good use - high-quality design. i think nsgmls says it best about their design:

value of attribute "regeneration" cannot be "yes"; must be one of "yes-regeneration", "no-regeneration"
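For anyone who hasn't run into that class of error: the DTD declares "regeneration" as an enumerated attribute, so only the listed tokens are legal values. Below is a rough Python sketch of the same check done by hand. The allowed values come from the nsgmls message above; the element name "section" and the sample document are invented for illustration, and this is no substitute for a real validating parser.

```python
import xml.etree.ElementTree as ET

# Allowed values copied from the nsgmls message quoted above.
ALLOWED_REGENERATION = {"yes-regeneration", "no-regeneration"}


def bad_regeneration_values(xml_text):
    """Return every 'regeneration' value that the enumeration rejects."""
    root = ET.fromstring(xml_text)
    rejected = []
    for element in root.iter():
        value = element.get("regeneration")
        if value is not None and value not in ALLOWED_REGENERATION:
            rejected.append(value)
    return rejected


# "section" is a made-up element name used only for this illustration.
sample = ('<doc>'
          '<section regeneration="yes"/>'
          '<section regeneration="no-regeneration"/>'
          '</doc>')

print(bad_regeneration_values(sample))
```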

DTDs, Schema, and XDR by jaaron · 2002-07-04 07:11 · Score: 4, Informative

Actually, if you check the source, you'll see that they are using XML namespaces and schemas. Actually, they're using something called XDR (XML-Data-Reduced) which was developed by Microsoft and is upwards compatible with XML schema. I'm familiar with schema but not XDR. For more information, you may want to check out these links:

And thanks to this poster for pointing it out.

--
Who said Freedom was Fair?

Re:DTDs, Schema, and XDR by smallpaul · 2002-07-04 11:29 · Score: 2

In what sense is XDR "forwards compatible" with XML Schema? In the sense that you can rewrite all of that Microsoft-proprietary stuff into XML Schema if you care to put in the effort?
Re:DTDs, Schema, and XDR by deblau · 2002-07-04 16:35 · Score: 2

Just so no one is confused, that's Microsoft's XDR, not the real XDR.

--
This post expresses my opinion, not that of my employer. And yes, IAAL.
Re:DTDs, Schema, and XDR by vidarh · 2002-07-04 23:25 · Score: 2

No, in the sense that there is a publicly available XSL stylesheet that will do the conversion for you. XDR was a stopgap thing for Microsoft to get schema support out the door before the XML schema spec was finished.
Re:DTDs, Schema, and XDR by smallpaul · 2002-07-04 23:53 · Score: 2

I was speaking with a Microsoft employee on the Schema team today. He reacted in horror to the view that XDR is "upwards compatible" with XML Schema.

Great! by Rombuu · 2002-07-04 07:12 · Score: 5, Funny

And it looks like the DTDs will be free to use and distribute!

Great, now I can make my own crazy laws! Yipee!

--

DrLunch.com The site that tells you what's for lunch!

Re:Great! by mdemeny · 2002-07-04 07:21 · Score: 2

Great, now I can make my own crazy laws! Yipee!
Actually it's so that lobbyists can make their own crazy laws. Yipee, indeed.

Re:It's the XSLT by Abcd1234 · 2002-07-04 07:18 · Score: 2

Ummm... what about Transformiix? That would be the Mozilla XSLT engine, which is built right into Moz 1.0. Check out the project website here.

Another Use for Microsoft crap by codeguy007 · 2002-07-04 07:18 · Score: 3, Insightful

I thought the US Government was starting to learn that Microsoft software was to be avoided. By finding more uses for it, I am afraid that it is obviously not true.

Re:Another Use for Microsoft crap by DunbarTheInept · 2002-07-04 09:30 · Score: 3, Funny

Yes, they are using MS software, but this once they are using it to export things into a well documented, open format that could be made to work with anything (unlike a Word document). Sure, maybe different browsers aren't good at reading the XML because the government is putting it out in the way that makes IE most comfortable, but at least it is in a DOCUMENTED format this time, one that the open source community can respond to and implement fairly quickly if there's incentive to (and I think having all major US government stuff in that format would be a big enough incentive.)

Is it still biased in favor of IE users right now? Absolutely, I won't deny that. But if it is actually a properly documented format for once then that bias won't last. This isn't a perfect situation, but it's a major step up from publishing things in proprietary binary word processor formats like they did in the past.

--
Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

What part about public domain don't they get? by ClarkEvans · 2002-07-04 07:19 · Score: 5, Insightful

Dig the notice at xml.house.gov -- The document type definitions (DTDs) presented on this site were developed at the U.S. House of Representatives by employees of the Federal Government in the course of their official duties. Pursuant to Title 17 Section 105 of the United States Code, these DTDs are not subject to copyright protection and are in the public domain. These DTDs are in draft form. The U.S. House of Representatives assumes no responsibility whatsoever for their use by other parties, and makes no guarantees, expressed or implied, about their quality, reliability, or any other characteristic. These DTDs can be redistributed and/or modified freely provided that any derivative works bear some notice that they are derived from it, and any modified versions bear some notice that they have been modified. (emphasis mine)

Either these DTDs are copyrighted and they can place restrictions upon distribution or they aren't. This need people have to control everything is just driving me crazy. The whole reason for Title 17 Section 105 is so that the Government can't put restrictions on this kind of stuff (bills, laws, etc.) ...

Re:What part about public domain don't they get? by Maserati · 2002-07-04 10:34 · Score: 2

They can't enforce it, but they can ask, preferably nicely. I can't think of any reason to steal it and distribute it without attribution (not that someone else couldn't) so I'm not real worried at this point. Besides, stealing from Congress torques them off, they hate the competition.

--
Veteran, Bermuda Triangle Expeditionary Force, 1992-1951
Re:What part about public domain don't they get? by ClarkEvans · 2002-07-04 11:42 · Score: 2

I can't think of any reason to steal it and distribute it without attribution (not that someone else couldn't) so I'm not real worried at this point. (emphasis mine)

And how could I possibly steal something that is in the public domain? Just because they wrote it they own it? The framers of the Constitution rejected natural-rights thought with regard to intellectual property. Who owns it anyway? The public of the U.S. paid for it, so don't we own it? If I copy it and use it for my own purposes why would this make me a thief?

I think you have fallen into the group-think that the RIAA wants everyone to succumb to.

Schema war is not over...W3C XML-Schema is bloated by ClarkEvans · 2002-07-04 07:23 · Score: 3, Insightful

Why use DTDs?

Have you ever tried to use XML Schema? It's a bloated piece of ****. Relax is tons better. And for the government's purposes, DTDs work much better and are an ISO standard.

So does this mean... by neonzebra · 2002-07-04 07:31 · Score: 2, Funny

.... that the president can use an XSLT to make a bill into law?

ddt free to use? huh??? by CProgrammer98 · 2002-07-04 07:38 · Score: 3, Insightful

"And it looks like the DTDs will be free to use and distribute"

Ummmmm if you're using a validating xml parser, you HAVE to have access to the dtd!!! All DTDs have to be free to use!

--
And the people shall be oppressed, every one by another, and every one by his neighbour Isaiah 3:5

Indeed, it's not free by twitter · 2002-07-04 07:45 · Score: 3, Informative

The mention of M$ Word put me on alert, as have previous stories here which have demonstrated that XML will simply be a container for proprietary data formats like M$ Word. Closer examination, however, reveals a much more horrible arrangement.

XML is dependent on Unicode, as the US Government site's reference states. Follow the W3C to Unicode:

Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646.

Unicode is owned by Unicode Incorporated and all of its documents and standards are issued under a restrictive license with a unilateral change clause:

Modification by Unicode: Unicode shall have the right to modify this Agreement at any time by posting it to this site. The user may not assign any part of this Agreement without Unicode's prior written consent.

Dare I compare this evil arrangement to ASCII and other predecessors? To have IBM, M$, Sun and others OWN the very format your data takes and to be able to change it and break previous implementations at whim, when YOU may not? Who wants to bet a plump nickel that anything vaguely resembling Unicode in the future will be called a "derivative" and its distribution halted? Is this not a collusion of commercial software vendors to control information at its most basic representation? Does anyone else here see this as the ultimate extension of copyright? Evil, Evil, Evil.

I'd rather see the US government continue to publish in the American Standard Code for Information Interchange. This extensible standard is no standard at all.

--

Friends don't help friends install M$ junk.

Re:Indeed, it's not free by smallpaul · 2002-07-04 11:26 · Score: 2

Unicode is owned by Unicode Incorporated [unicode.org] and all of its documents and standards are issued under a restrictive license [unicode.org] with a unilateral change clause:

Have you looked at the copyrights for most standards? Try to get a free copy of the SGML or EDI standards? Unicode is wide open comparatively. Plus, if you're going to complain about vendor-owned consortia, you might as well whine about the W3C itself.
Re:Indeed, it's not free by RennieScum · 2002-07-04 19:33 · Score: 3, Insightful

Paranoia.
It shows how each line, name and term has an identifying tag, created by exporting the document from a word processor such as Microsoft Word or Corel WordPerfect into a special XML template
They're using a *tool* to help convert .doc and .wpd files to XML. They're just leveraging their assets (MSW*rd being an, ahem, asset) so that secretaries and regular folk can do the work of text entry in tools they are familiar with, which then gets converted into a usable format.

Settle down, they're not trying to use MSXML engines to do the work. Sheesh.

--
...Time is the best teacher, unfortunately it kills all of its students.

Re:Schema war is not over...W3C XML-Schema is bloa by ClarkEvans · 2002-07-04 08:16 · Score: 2

Using XML to describe XML simply makes sense.

In this case RELAX is far superior, it has both an XML and a non-XML representation and is built on top of a clean model by some brilliant fellas.

XML Schema, OTOH, is just a bloated mess.

DTD's are antiquated

Perhaps, but they are readable. XML Schema is anything but readable.

and I can't even transform against them for meta-meta-data tasks

Oh, now that's something you do every day. Using XML syntax for everything is just plain stupid. IF you have to do transforms, use RELAX, it has a cleaner model anyway... doing transforms on XML Schema is like pulling teeth.

Why didn't they just use standard HTML? by moncyb · 2002-07-04 08:47 · Score: 2

Standard HTML is just as searchable as long as you use the tags properly. One does have to wonder if M$ "encouraged" them to use this format.

Re:Why didn't they just use standard HTML? by Ravagin · 2002-07-04 09:41 · Score: 2

Why not html? Because they're not just describing text here. There're all sorts of data associated with a piece of legislation, and an extensible - not a hypertext - markup language is the best way to do it.

--
Karma: T-rexcellent.
Re:Why didn't they just use standard HTML? by moncyb · 2002-07-04 13:07 · Score: 2

What is this mysterious data that can't be expressed in HTML???? Blipverts!!!??!!?? Maybe they'll put cartoons into the bill--to help explain why they passed it. Oooo...maybe they can put in complex equations so everyone will think they are smart.
I think some people just believe XML is some sort of magical file format that should be used no matter what. I expect MPEG 5 will be in XML, then they'll wonder why the files are so much larger and take 10x the processing time and memory to decode.
XML may be useful in some places, but not everywhere. Replacing binary formats with it is bad because it will unnecessarily increase the filesize and the resources needed to decode them. Using it for config files will require all programs to run an XML parser and make the config files less human readable. Using it to express laws will just make them inaccessible to the common person by requiring them to have expensive proprietary software (or software made by an illegal monopoly) to even view them.
If they want bills to be searchable, they should be designing database tables for them, and allow the public to export the database (or subsets of it) in a standard database format. For online viewing, they could easily export the data into HTML (or XML) using PHP.
Using "Microsoft Word and a special converter to do the job" is just stupid. Creating a program that allows some intern to key the data into the database would probably be easier and more effective in the long run.
Re:Why didn't they just use standard HTML? by moncyb · 2002-07-05 11:32 · Score: 2

Oh yeah, just make up some contrived obviously biased answer! Do you make infomercials???? Or maybe you just don't know anything about html.
The html version of your "example" would probably look more like this:
<p><a name="para1">(1)</a> blah, blah, blah
...and for your information, browsers already search that way--the paragraph in question can be referenced by appending a #para1 to the document's url.
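To make the fragment trick concrete, here is a small Python sketch that pulls the named anchors out of a page and builds the matching URLs. The base URL is a placeholder and the class and function names are my own; only the anchor markup comes from the example above.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the name attribute of every <a name="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "name" and value:
                    self.names.append(value)


def anchor_urls(base_url, html_text):
    """Return one fragment URL per named anchor found in html_text."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return [f"{base_url}#{name}" for name in collector.names]


# Placeholder URL; the markup is the one-line example from the post.
page = '<p><a name="para1">(1)</a> blah, blah, blah'
print(anchor_urls("http://example.org/bill.html", page))
```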

I get this in Netscape 7 Preview: by ImaLamer · 2002-07-04 09:02 · Score: 2

I get separate paragraphs (yet mashed together), yet I can paste the data to notepad or this text box and it looks even worse.

I can't post it because of this error:

Your comment has too few characters per line (currently 6.2)

--
Get your Unix fortune now!

Check this with IE though: by ImaLamer · 2002-07-04 09:04 · Score: 2

http://xml.house.gov/hr100_eh.xml
http://xml.house.gov/hr6_ath.xml
http://xml.house.gov/hr10.xml

all just code

--
Get your Unix fortune now!

Re:Check this with IE though: by ImaLamer · 2002-07-04 14:01 · Score: 2

No, IE shows code which is just ghey.... who wants to go surfing the net reading HTML the whole time?

--
Get your Unix fortune now!

Re:Just use IE6 by DunbarTheInept · 2002-07-04 09:22 · Score: 2

Why not just use IE? Because it only works if you are using a shitty Operating System underneath it, and the OS you use affects a lot more stuff than just your web browser. There are reasons completely unrelated to web browsing that make me want to be running Linux most of the time except for the occasional game. I think that this is the primary reason for the IE hostility a lot of geeks have. To use it we have to dumb-down *everything* we use (which is what it feels like to use Windows after being used to Unix), just to get a particular web browser. If I.E. was produced by a company other than the one that has a vested interest in keeping the Windows monopoly in place, it wouldn't be a problem because they would make a Linux version.

--

Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

HR has made 100 DTDs by Ilan+Volow · 2002-07-04 09:45 · Score: 5, Funny

Congress has always been full of lyahs and chetahs. That it's now full of schemas is really no surprise.

--
Ergonomica Auctorita Illico!

The Importance of DTDs by The+Monster · 2002-07-04 10:37 · Score: 2

DTDs are obsolete by now

They may not be bleeding edge, but what's important about this is that the House is making
a commitment to open data formats. Even where we don't get open source code, this guar-
antees that we don't get the most virulent form of 'vendor lock-in', where failure to pay the
latest rent increase means we can't even access our own data anymore.

---
Fight Page Widening! Make your own line <br>:reaks.

--

[100% ISO 646 Compliant]
SVM, ERGO MONSTRO.

Re:Was there any doubt they wouldn't be free? by feronti · 2002-07-04 14:33 · Score: 2, Informative

Um, did you read the source? Or did you just open it up in IE? Because the source is clean (though not prettily formatted:), pure, 100% XML. In fact, there's only one namespace declaration in the entire thing (XLink, which they use to embed hyperlinks between various parts of the documents). All in all, this is some of the cleanest XML I've ever seen (including XML I've written myself by hand:)

But if you opened it up in IE, IE applies a stylesheet to all xml documents which gives you a nice collapsible view of the document tree (which is often easier to read than the source:)

Even HTML would be a HUGE improvemt by ahfoo · 2002-07-04 18:18 · Score: 2

--anything with links is essential to reforming legal texts into something useful. In the US, the laws are written in English. It should be the case that anybody with a high school education could read them and understand them with ease. The main reason lawyers get so involved in anything that has the slightest concern with the law is that the twisted textual markup currently used makes the documents incomprehensible and extremely difficult to understand in full because of the need to obtain the hundreds of essential external references. This is wonderful news.
Even the stilted style of language referred to as legalese is partly a product of the need for a meta context within legal writing. This is long overdue, but awesome nonetheless.

Re:DTD may be old by foniksonik · 2002-07-04 19:22 · Score: 2

Well that is if you don't count the Bill of Rights and the rest of the AMENDMENTS to the Constitution.

Seems to me like it's been at 2.0 RC X.x for quite some time.

--
A fool throws a stone into a well and a thousand sages can not remove it.

XML creaps in another place by thogard · 2002-07-04 22:19 · Score: 2

Didn't any of the XML supporters ever study parsing in their CS classes? Or are they just web control freaks that didn't bother with anything past high school? Oh wait, I'm talking about w3c so of course they are control freaks. At least most people ignored them.

The problem with XML is that it diverges into two distinct worst cases. One requires an infinite amount of memory, the other an infinite amount of time. Both of these are bad things and much study of algorithms is about avoiding both of these conditions. Odd thing is most people in the IT field today have no clue about why this happens or even that it can happen. Of course these are the same programmers that couldn't describe a quicksort if they had to or describe something in BNF grammar. And we wonder why most programmers today just produce garbage.

Re:XML creaps in another place by vidarh · 2002-07-04 23:16 · Score: 2

Can you elaborate? I can't see what part of parsing XML you are referring to - parsing XML for the most part seems relatively simple, though I haven't written a complete XML parser or spent the time to read through the complete specification.
Re:XML creaps in another place by vidarh · 2002-07-05 01:22 · Score: 2

I don't see the problem. If the closing tag is missing and you are using a SAX parser the only effect is one more scope indicator, and the parser will plod along happily until you try to close the surrounding tag at which point it will know right away that it should give an error.
Whether it will allow you to try to recover or not at that stage would be up to the parser.
Recovering from malformed input is regardless a difficult task, and typically you don't want to go there - that's not a parsing issue, but an issue of trying to predict how an error should be recovered.
For a DOM parser, the parser would do the same thing, and just fail and free the tree once it found the surrounding tag (or the end of the file). However using a DOM parser with a scenario like the one you suggested would be plain stupid.
In either case, handling a missing closing tag is trivial with XML, and I certainly can't see any justification for the claim that you'd either need unlimited memory or unlimited time based on that.
Anyway, you've just given an example of a case where ANY grammar based on nested blocks will have to have thought put into it when it is fed bad data, with no justification for why it should make XML bad from a parsing standpoint.
Do you have a better example?
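To show what I mean by "one more scope indicator", here is a rough Python sketch using the event-driven expat bindings. The only state it keeps is a depth counter, and the parser itself reports the error as soon as it reaches the closing tag of the surrounding element. The sample document and the function name are invented for illustration.

```python
import xml.parsers.expat


def parse_and_report(xml_text):
    """Stream-parse xml_text; return (status, error_line, deepest_nesting)."""
    depth = 0
    deepest = 0

    def start(name, attrs):
        nonlocal depth, deepest
        depth += 1                      # one more open scope
        deepest = max(deepest, depth)

    def end(name):
        nonlocal depth
        depth -= 1                      # scope closed, state released

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as err:
        return ("error", err.lineno, deepest)
    return ("ok", None, deepest)


# Made-up document with a missing </para> before </section>.
broken = "<bill>\n<section>\n<para>text\n</section>\n</bill>"
print(parse_and_report(broken))
```

The parse stops at the mismatched tag, having kept only a counter; nothing about the failure needs unbounded memory or time.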
Re:XML creaps in another place by vidarh · 2002-07-05 04:29 · Score: 2

This is a very different problem from what you suggested in the other message, and can be just as real with REAL documents.
So what you are really saying is that your problem is with ANY system that allows scoping, and where state is required for each scope until the scope is closed?
The problem with that is that scoping is useful and makes it a lot easier to represent a whole lot of data in a structured form that seems natural to humans.
In other words, an XML parser may require more resources than a parser for a grammar without scoping. But the scoping is allowed for a reason - it provides structure that is hard to provide without it.
The reason you can't make a file that breaks grep is that grep doesn't care about structure. You can easily work on XML files without running into the problem as well if you ignore structure. But then you are also losing a whole lot of advantages.
I still don't see this as a problem. You need to handle resource limits regardless. If you have 1MB available, as you originally used in your example, then when you have used that 1MB you have to fail gracefully. If the only case where you use the whole 1MB is a broken document, then whether you fail because the parser detects it or fail because you don't have more memory is irrelevant - the parse failed.
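As a rough sketch of failing gracefully at a resource limit, assuming nesting depth is the resource you choose to cap (the limit, the exception, and the parse_bounded() helper are all hypothetical, not part of any parser's API):

```python
import xml.sax


class DepthLimitExceeded(Exception):
    """Raised by the handler when the document nests deeper than allowed."""


class BoundedHandler(xml.sax.ContentHandler):
    def __init__(self, max_depth):
        super().__init__()
        self.max_depth = max_depth
        self.depth = 0

    def startElement(self, name, attrs):
        self.depth += 1
        if self.depth > self.max_depth:
            # Give up here instead of waiting to run out of memory.
            raise DepthLimitExceeded(f"more than {self.max_depth} open tags")

    def endElement(self, name):
        self.depth -= 1


def parse_bounded(document: bytes, max_depth: int = 64) -> str:
    """Parse with a nesting cap and report the outcome as a short string."""
    try:
        xml.sax.parseString(document, BoundedHandler(max_depth))
    except DepthLimitExceeded as exc:
        return f"rejected: {exc}"
    except xml.sax.SAXParseException as exc:
        return f"malformed: {exc.getMessage()}"
    return "ok"
```

Either way the caller gets a clean failure: a run of never-closed tags is rejected at the cap, and an ordinary mismatched tag is reported as malformed by the parser itself.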
If you need to give more specific error messages, you can do that fairly easily: when you've filled memory, scan the remainder of the document to determine whether any of the outer tags will EVER get closed.
If you want to recover from unclosed tags, the standard way of doing that for HTML and XML is to define, for each start tag, which types of open tags it should autoclose.
This is a straightforward mechanism that works well, in particular in the presence of a schema or DTD, where you can easily determine the places where leaving a tag open means the document is malformed but where it may possibly be well-formed if the tag is closed.
I haven't implemented it for XML, but I have implemented it in an HTML filter that needed to handle particularly broken HTML.
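The autoclose idea can be sketched with Python's standard html.parser. This is not the filter mentioned above, just an illustration: the AUTOCLOSE table and the repair() helper are made up, attributes are dropped, text is not re-escaped, and void elements such as br are not special-cased.

```python
from html.parser import HTMLParser

# Hypothetical rule table: the start tag on the left implicitly closes
# any of the open tags listed on the right.
AUTOCLOSE = {
    "li": {"li"},
    "p": {"p"},
    "td": {"td"},
    "tr": {"tr", "td"},
}


class AutoCloser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []  # names of currently open tags
        self.out = []    # repaired output fragments

    def handle_starttag(self, tag, attrs):
        closes = AUTOCLOSE.get(tag, set())
        while self.stack and self.stack[-1] in closes:
            self.out.append(f"</{self.stack.pop()}>")
        self.stack.append(tag)
        self.out.append(f"<{tag}>")  # attributes dropped in this sketch

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignore it
        # Close everything opened inside the tag, then the tag itself.
        while self.stack:
            top = self.stack.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def close(self):
        super().close()  # flushes any buffered text first
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")


def repair(markup: str) -> str:
    parser = AutoCloser()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

With these rules, repair("<ul><li>one<li>two</ul>") closes each list item before the next one opens and before the list ends.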
In the real world this is a problem only if you don't think about it and design your software to handle it, just as not thinking your design through in general leads to broken software.
Re:XML creaps in another place by vidarh · 2002-07-07 20:13 · Score: 2

The scoping issue and the stack depth issue are the same, and the solutions I described are solutions in common use.
And I'm used to dealing with users on the input side. The company I work for operates the .name TLD. Registrars interact with us via XML. Our subcontractors interact with us via XML. We're dealing with far from perfect XML, and errors need to be communicated.
We did use to have an ASCII based format, and we had more problems with that. The advantage of XML is that the users can validate the XML generated pretty well on their side by running it through an XML parser with schema validation support.

Re:Just use IE6 by DunbarTheInept · 2002-07-05 15:30 · Score: 2

1. The only people who give a flying fuck about the fact that linux isn't technically legally allowed to be called unix are lawyers and trolls like you and that "Rev Don Cool" idiot on usenet.

2. IE support on the few unixen where it does run is awful and the thing is too bloated to be practical (since instead of porting IE to unix APIs they ported parts of the Windows API and put IE on top of that, the executable is gigantic on unix.)

3. You did say "IE 6", which even on the few unixes where IE 6 exists, it doesn't go up to that version number, so clearly you are lying.

--

Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

Re:Just use IE6 by DunbarTheInept · 2002-07-05 15:32 · Score: 2

Err, delete that "6" from the second "IE 6". The dangers of cutting and pasting.

--

Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.
