Slashdot Mirror


U.S. House of Representatives Makes Resolutions in XML

RennieScum writes: "The House of Representatives is turning to technology with their test of XML for use with resolutions according to this article. It reports that the HR has made 100 DTDs and uses Microsoft Word and a special converter to do the job. Testing has begun and their goal is to start using it in January of next year. See also http://xml.house.gov/ And it looks like the DTDs will be free to use and distribute!"

164 comments

  1. Yee haw! Crappy laws in better format! by shodson · · Score: 2

    Now we can all make our own crappy laws using XML! More downloads for Xerces.

  2. And now get rid of the legacy by Anonymous Coward · · Score: 0

    Great, DTDs are obsolete by now and Schemas have taken over.

    1. Re:And now get rid of the legacy by cifey · · Score: 1

      Schemas can be developed with backwards compatibility to the DTDs. When implemented they would just find errors in the existing documents to be corrected.

      --
      Hello Cruel World
  3. Was there any doubt they wouldn't be free? by smcavoy · · Score: 1

    If the government creates something original for its use, how can there be any argument as to whether it should be available to the people..? (top secret, national security stuff aside)??

    1. Re:Was there any doubt they wouldn't be free? by Ivan+Raikov · · Score: 2

      If the government creates something original for its use, how can there be any argument as to whether it should be available to the people..?

      Considering the current government's flirtations with Big Business (not to be confused with Big Brother), I'm actually surprised that they didn't just publish their bills as Word documents.

      And looking at the XML documents, it does appear that they're using some non-W3C, Microsoft-like XML stylesheet format. I'd argue that this is favoring one commercial product (Internet Exploder) at the expense of all others.

    2. Re:Was there any doubt they wouldn't be free? by feronti · · Score: 2, Informative

      Um, did you read the source? Or did you just open it up in IE? Because the source is clean (though not prettily formatted:), pure, 100% XML. In fact, there's only one namespace declaration in the entire thing (XLink, which they use to embed hyperlinks between various parts of the documents). All in all, this is some of the cleanest XML I've ever seen (including XML I've written myself by hand:)

      But if you opened it up in IE, IE applies a stylesheet to all xml documents which gives you a nice collapsible view of the document tree (which is often easier to read than the source:)

  4. better laws? by crossconnects · · Score: 0

    will it help create better laws?

    --
    no big sig
  5. Ugh. DTDs?!? by Aquaman616 · · Score: 2, Insightful

    I guess that's the government for ya... why in the *hell* would you use DTDs when XML Schemas are so much better???

    Oh well... at least it's a step forward - I'll applaud them for that.

    --
    A|Q|U|A
    1. Re:Ugh. DTDs?!? by Anonymous Coward · · Score: 1, Insightful

      Well, DTDs are just a less-expressive form of Schemas, correct?

      Why couldn't you just take all of their DTDs and rewrite them as schemas? You could then donate that back to them, and i'm sure they'd be happy to offer it as a download option.

      Hell, maybe someone could make an XSL stylesheet to turn DTDs into schemas :)

      -- super ugly ultraman

    2. Re:Ugh. DTDs?!? by Bazz · · Score: 0

      XSL stylesheets won't work on DTDs because DTDs do not utilize XML syntax.

      BTW - DTDs are not obsolete either. They are far easier to read, use, and maintain than W3C XML Schema.

      Take a look at RELAX NG. RELAX NG is far more elegant (e.g. easier to learn, use and maintain) than W3C XML Schema. RELAX NG provides the capabilities of W3C XML Schema - and more.
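
      A minimal Python sketch of the syntax point, using only the standard library: a DTD declaration is not well-formed XML, so an XML parser (and therefore any XSLT processor sitting on top of one) rejects it before a stylesheet could run. The element names are made up for illustration, and the schema is only a rough equivalent of the DTD fragment.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD fragment; the element names are illustrative only.
DTD_TEXT = "<!ELEMENT bill (sponsor+)>\n<!ELEMENT sponsor (#PCDATA)>"

# Roughly the same content model written in XML syntax (W3C XML Schema).
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="bill">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="sponsor" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def is_well_formed_xml(text: str) -> bool:
    """Return True if an XML parser accepts the text as a document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed_xml(DTD_TEXT))  # DTD syntax is not XML
print(is_well_formed_xml(XSD_TEXT))  # a schema is itself an XML document
```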

  6. DTD is sooo 1999. by km790816 · · Score: 3, Insightful

    This is the government for you.

    When every tool under the sun is using XML schemas, the House is announcing their support for DTDs.

    I guess it's still a step forward.

    1. Re:DTD is sooo 1999. by Anonymous Coward · · Score: 0

      Emacs still sticks to DTDs in its XML editing mode, at least the one that is in the distro. I am glad that the govt sticks to the same good old standard.

    2. Re:DTD is sooo 1999. by Anonymous Coward · · Score: 0

      Oh, give it a freaking rest. Given my experience with our lovely government, I'm absolutely shocked that anyone in the House even knows what XML *is*.

      Not only does someone there know what it is, but they're actually pushing toward the implementation of an open standard. That's the more important step.

    3. Re:DTD is sooo 1999. by ftobin · · Score: 2

      When every tool under the sun is using XML schemas, the House is announcing their support for DTDs.

      Jeezus, why would you even consider using Schemas when there is RELAX NG, a much better, simpler, theory-based system? Note the author of that document I gave; it's James Clark; if you are using an XML parser, chances are good it was written by him (expat). Heck, there is not even any normative spec for XML Schema!

    4. Re:DTD is sooo 1999. by SirSlud · · Score: 5, Insightful

      Your government must make an attempt to stick to standards when they are dealing with accessibility. They have to use technologies that have had some time to settle. By virtue of you pointing out that DTDs are 3 years old and you consider them obsolete, you reinforce the point that by selecting bleeding-edge formats/technologies/etc, they might be investing time and some of your money into something that won't be around in a year or two.

      And then in a year or two, you'd just complain how the government can't choose their technologies right.

      Start thinking about where you're getting this 'government is stupid/terrible/lazy/blah/blah' message from - a lot of it is from private interests that enjoy the freedom and lack of public accountability to select their technological infrastructure based on higher denominators than your government should. While the 'savvy' factor will always be higher in the private sector, don't *always* take this as an indication that government must be technologically inept (although, like anybody whose core competency isn't technology, they frequently are) ... often they are doing something much smarter than private interests give them credit for. All of this is moot, of course, when discussing moves the government makes on _behalf_ of powerful private interests, but that's another argument and does not apply in this situation.

      It's like being a private teacher vs public. Private teachers can probably be more 'progressive', but at the cost of maybe teaching in ways that might soon be proven to be ineffectual or bad, while public systems generally must move slower in order to ensure that the ideas have been vetted and that everyone has a moderately equal opportunity to access the fruits of the system.

      Like parents, sysadmins, anybody who has an onus to cater to the greater good rather than the richer good, sometimes you have to make decisions that are going to be publicly derided even if it's for the common good. Sometimes you have to just give the benefit of the doubt, though I realize this kind of attitude is in short supply these days.

      Ok, rant off.

      --
      "Old man yells at systemd"
  7. Uhhh.... by Verizon+Guy · · Score: 4, Interesting

    Going to http://xml.house.gov/Members/mbr107.xml renders a perfectly viewable directory of representatives in Internet Explorer, but Mozilla dumps it all as raw text in one giant paragraph. What gives?!?

    --

    Aw, fuck it. Let's go bowling. - The Big Lebowski

    1. Re:Uhhh.... by josh+crawley · · Score: 2

      Maybe because IE supports the xml STANDARD more than mozilla.

    2. Re:Uhhh.... by llamalicious · · Score: 2


      <?xml:stylesheet type="text/xsl" href="member-sorter-vb.xsl"?>
      <?xm-well_formed path="m:\xmltech\billres1\00-11-01\Members\mbr107.dtd"?>
      <ushousemembers xmlns="x-schema:member-schema.xml">

    3. Re:Uhhh.... by bobtheprophet · · Score: 0

      Since they're making the files with Microsoft products on Windows machines, from their point of view it doesn't make sense to test anything but IE. Since IE is used by most of the people on the web now, seeing if it renders the pages being designed is more logical than testing them on Mozilla or Netscape, even though Mozilla and Netscape are far cooler than IE. Such is life.

      --
      Don't give me none of this "nature theme" business.
    4. Re:Uhhh.... by jaaron · · Score: 2

      No, it's because of the way they use the XSL stylesheet. IE does not support the XML "standard" any more than Mozilla. Quit posting FUD.

      --
      Who said Freedom was Fair?
    5. Re:Uhhh.... by Anonymous Coward · · Score: 0

      Maybe not.

      The reason is that they don't use XSLT (which *is* the W3C-recommended language and which *is* supported by IE6 and Mozilla).

      Rather, they use an outdated, Microsoft-proprietary dialect.

    6. Re:Uhhh.... by evalhalla · · Score: 1

      I think that's because IE uses a default stylesheet for XML documents, while Mozilla strictly complies with the standard and just shows the contents of the tags, without any style.

    7. Re:Uhhh.... by Anonymous Coward · · Score: 0

      Nah!

      It's just that they use an "XSL" language variant which isn't a W3C recommendation and only exists in Microsoft products. Even *Microsoft* has deprecated its use (it's not supported in MSXML4 anymore).

      FAQ on http://www.netcrucible.com

    8. Re:Uhhh.... by MiTEG · · Score: 5, Informative

      It's all screwed up with Opera 6.01 also.

      --
      The future isn't what it used to be.
    9. Re:Uhhh.... by Anonymous Coward · · Score: 0

      Maybe it's because you're a stupid cunt.

    10. Re:Uhhh.... by perlfool · · Score: 2, Informative
      The main reason it doesn't render in Mozilla is they used an old XSLT Working Draft namespace, "http://www.w3.org/TR/WD-xsl". The XSLT 1.0 namespace should be: "http://www.w3.org/1999/XSL/Transform"

      See the "Unofficial MSXML XSLT FAQ" for some info about the old Working Draft, XSLT 1.0 and Internet Explorer.
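
      To check which dialect a given stylesheet uses, something like the following Python sketch should do. It relies only on the standard library and on the two namespace URIs quoted above; the function name and the sample stylesheets are mine, not anything the House published.

```python
import xml.etree.ElementTree as ET

# Namespace URIs quoted in the comment above.
WD_XSL_NS = "http://www.w3.org/TR/WD-xsl"             # old Working Draft dialect
XSLT_1_0_NS = "http://www.w3.org/1999/XSL/Transform"  # XSLT 1.0 Recommendation


def stylesheet_dialect(xsl_text: str) -> str:
    """Classify a stylesheet by the namespace of its root element."""
    root = ET.fromstring(xsl_text)
    # ElementTree spells qualified tags as "{namespace-uri}localname".
    if root.tag.startswith("{"):
        namespace = root.tag[1:].split("}", 1)[0]
    else:
        namespace = ""
    if namespace == WD_XSL_NS:
        return "working-draft"
    if namespace == XSLT_1_0_NS:
        return "xslt-1.0"
    return "unknown"


old_style = '<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl" language="VBScript"/>'
new_style = '<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"/>'

print(stylesheet_dialect(old_style))
print(stylesheet_dialect(new_style))
```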

    11. Re:Uhhh.... by 1000101 · · Score: 0, Flamebait

      that's because mozilla sux ass

    12. Re:Uhhh.... by Anonymous Coward · · Score: 0
      Maybe because IE supports the xml STANDARD more than mozilla.

      It's amazing how a remark like that containing nothing but complete FUD is given a mod rating of 2. The guy offered no proof, no explanation.. IE actually supports a standard??

    13. Re:Uhhh.... by Anonymous Coward · · Score: 0

      stupid cunt

    14. Re:Uhhh.... by Anonymous Coward · · Score: 0

      XSLT isn't a standard, it's a W3C *recommendation*.

      And yes, XSLT support in IE6 actually is better than in Mozilla.

  8. How Slashdot-like by jaaron · · Score: 5, Funny

    So the government tries to update their use of technology to use an open format like XML and publish the DTDs, and inevitably the first 10 slashdot posts complain that the government is too behind the times because they don't use new (and better) XML schemas! Talk about geeks! :)

    --
    Who said Freedom was Fair?
    1. Re:How Slashdot-like by idletask · · Score: 1

      > the government is too behind the times because they don't use new (and better) XML schemas!

      Well, this is an administration, you know... So actually they can be credited for having been aware of XML at least a year ago. Had they been aware of XML schemas then it'd have taken another 6 months before the site got up, don't you think?

      I'm quite confident that nowadays the average PHB doesn't even know what XML stands for and is used for...

  9. DTDs by Citizen+of+Earth · · Score: 2

    It reports that the HR has made 100 DTDs and uses Microsoft Word and a special converter to do the job.

    But if they really want an intractable problem, they should use XML/Schema!

  10. Oh Boy! by rbeattie · · Score: 1, Offtopic


    Free DTDs!!! I LOVE DTDs! Wooohoo! We definitely don't have enough of those already!

    And who says a Republican government is only out to help the big guys. Free DTDs for all!

    Happy 4th everyone! Damn I'm proud to be an American today. Free DTDs!!

    -Russ

    --
    Me
    1. Re:Oh Boy! by p3d0 · · Score: 1

      Oh man, this is the funniest thing I have read in a while. You almost made me burst out laughing out loud here at work, which would have been very embarrassing...

      --
      Patrick Doyle
      I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
    2. Re:Oh Boy! by Anonymous Coward · · Score: 0
      XML Schema is too innovative a format and as such cannot be disclosed and/or exported to European, African, Latin-American, Australian, Asian and any other enemy countries.

      Publishing old DTDs is in the strategic interests of the USA. Can't you understand? It's all done to confuse enemies. Don't protect Earth from the greenhouse effect, ignore the International Criminal Court and stay with old standards - that's the way to win!

  11. Stylesheet issues... by jaaron · · Score: 5, Informative

    It's because of the XSL style sheet they use. You can find it at http://xml.house.gov/Members/member-sorter-vb.xsl. (Use view source to see the actual XSLT). Notice that they use VBScript!

    --
    Who said Freedom was Fair?
  12. They DO use schemas... by jaaron · · Score: 2

    Check out the source for http://xml.house.gov/Members/mbr107.xml and then the corresponding schema: http://xml.house.gov/Members/member-schema.xml

    --
    Who said Freedom was Fair?
    1. Re:They DO use schemas... by Anonymous Coward · · Score: 0

      That's not XML Schema. It's "XDR", another Microsoft-proprietary (and indeed deprecated) technology.

    2. Re:They DO use schemas... by jaaron · · Score: 2

      Good point, I didn't notice that when I first posted. Still though, they're using namespaces, which aren't part of the DTD definition. So the issue isn't that they're using outdated technology, it's that they're using proprietary extensions.

      --
      Who said Freedom was Fair?
  13. Lawmakers who don't understand the law by kuroth · · Score: 4, Interesting

    From the cited page...

    Pursuant to Title 17 Section 105 of the United States Code, these DTDs are not subject to copyright protection and are in the public domain.
    ...
    These DTDs can be redistributed and/or modified freely provided that any derivative works bear some notice that they are derived from it, and any modified versions bear some notice that they have been modified.

    Sorry, cupcakes, that's not how the public domain works. If you release it into the public domain, you no longer have *any* control whatsoever upon the modification, reuse, or redistribution of the work. The required notice clause listed above is invalid.

    Cite, cite (#3), cite.

    Kuroth

    1. Re:Lawmakers who don't understand the law by user32.ExitWindowsEx · · Score: 1

      Well, hearing about the above clause makes me think of the BSD license. Same principle.

      --
      "Evil will always triumph because good is dumb." -- Dark Helmet
    2. Re:Lawmakers who don't understand the law by jordan_a · · Score: 1

      No, since the BSD license doesn't say "This is public domain" anywhere in it. Very different principles.

    3. Re:Lawmakers who don't understand the law by Anonymous Coward · · Score: 0

      that's because you're an idiot. are you, by any chance, the guy who plans to launch himself into space?

    4. Re:Lawmakers who don't understand the law by foniksonik · · Score: 2

      I was thinking GPL myself... public domain with copyright. Wouldn't that be interesting if the US Gov started using GPL for all documents? Just put it in the metadata and a quick notice at bottom.

      hmmm makes me think I want to do that with all my documents. Is there a license attribute for meta-data tags in html... if not I'll make one.

      --
      A fool throws a stone into a well and a thousand sages can not remove it.
  14. CmdrTaco - US flag desecrator and Anti-Delawarian! by Anonymous Coward · · Score: 0
    As noted on the Smithsonian Institution's site, the first official American flag had thirteen stars and thirteen stripes, each representing one of the thirteen original states.

    The flag icon for Slashdot's 'United States' section is missing its first stripe - the stripe that represents Delaware, the first state admitted to the Union. While a simple oversight could be forgiven, it should be known from here on out that Slashdot is in fact aware of the missing stripe, and even worse, refuses to do anything about it!

    This vulgar flag desecration and rabid anti-Delawarism must be put to a stop. Let the Slashdot crew know that we will not accept a knowingly mutilated flag or the insinuation that Delawarians deserve to be cut out of the union. I ask you, what has Delaware done to deserve this insolence, this wanton disregard, this bigotry?

    This intentional disregard of a vital national symbol is unpatriotic. Why, the flippant remarks CmdrTaco made about our flag border on terrorism! I urge you to join the protest in each 'United States' story. Sacrifice your karma for your country by pointing out this injustice. Let's all work together to get our flag back. Can you give your country any less?

  16. I say... by numbuscus · · Score: 2

    ...even if they are using what some on this site would consider 'suboptimal' technology, the government's incorporation of ANY technology is better than none at all. Hell, the Senate doesn't allow laptops on the Senate floor! Hopefully, as the 'mainstream' government begins to use more open-standards technology and technology in general, they will be more willing to defend it against M$ and any other company that tries to 'embrace and extend' it.

    My $0.02

  17. Example of the new markup by crucini · · Score: 5, Funny


    <bill status="proposed" name="CBDTPA">
    <sponsor name="Fritz Hollings" constituency="Disney">
    <violatesAmendment number="1">
    <violatesAmendment number="4">
    <contribution donor="Disney" amount="24500.00">
    <contribution donor="AOL" amount="33000.00">
    <contribution donor="National Association of Broadcasters" amount="25000.00">
    <excuse>Promote broadband adoption</excuse>
    <excuse>Save the arts from extinction</excuse>
    </bill>

    1. Re:Example of the new markup by SirSlud · · Score: 2

      > Save the arts from extinction

      Thats the best part! I always hated that excuse, especially considering how insulting it should be to artists.

      Stop and think about this - claiming the arts will die if hollywood dies is like saying the habit of breathing oxygen will die if the SCUBA industry goes belly up.

      --
      "Old man yells at systemd"
    2. Re:Example of the new markup by Guppy06 · · Score: 2

      You forgot the default value of "Save the children" in your tags there...

    3. Re:Example of the new markup by Megane · · Score: 2

      Don't forget the usefulness of the <pork> tag.

      --
      #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    4. Re:Example of the new markup by zrodney · · Score: 1

      this is meant as a joke, but it's a good idea!

      someone could have a metaserver which puts these
      additional tags into the official descriptions of
      the bills.

      each could have links to the sponsoring groups of
      lobbyists or grassroots which in turn could be
      crosslinked to show which bills are obviously
      just kickbacks, and which are really concerned with
      issues.

    5. Re:Example of the new markup by Anonymous Coward · · Score: 0

      shouldn't it be:

      <bill status="proposed" name="CBDTPA">
      <sponsor name="Fritz Hollings" constituency="Disney"/>
      <violatesAmendment number="1"/>
      <violatesAmendment number="4"/>
      <contribution donor="Disney" amount="24500.00"/>
      <contribution donor="AOL" amount="33000.00"/>
      <contribution donor="National Association of Broadcasters" amount="25000.00"/>
      <excuse>Promote broadband adoption</excuse>
      <excuse>Save the arts from extinction</excuse>
      </bill>

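      The difference is easy to check with a few lines of Python (standard library only). This is just a sketch on trimmed copies of the two snippets: the original leaves its empty elements open, so a parser reports a mismatched tag at the closing bill tag, while the corrected version parses and can be queried.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the original joke markup: empty elements are left open.
ORIGINAL = """<bill status="proposed" name="CBDTPA">
<sponsor name="Fritz Hollings" constituency="Disney">
<contribution donor="Disney" amount="24500.00">
<excuse>Promote broadband adoption</excuse>
</bill>"""

# Trimmed copy of the corrected markup: empty elements end with "/>".
CORRECTED = """<bill status="proposed" name="CBDTPA">
<sponsor name="Fritz Hollings" constituency="Disney"/>
<contribution donor="Disney" amount="24500.00"/>
<contribution donor="AOL" amount="33000.00"/>
<excuse>Promote broadband adoption</excuse>
</bill>"""


def parses(text: str) -> bool:
    """Return True if the text is well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(parses(ORIGINAL))
print(parses(CORRECTED))

# Once the document parses, its data is easy to query.
bill = ET.fromstring(CORRECTED)
donors = [c.get("donor") for c in bill.findall("contribution")]
print(donors)
```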
    6. Re:Example of the new markup by danro · · Score: 3, Interesting

      Neat idea...
      Just write an HTTP proxy that applies an XSLT to the document. Generate the tag-values from the opensecrets.org database (if they have one).
      Could probably be done by one person in a week or two, if opensecrets keeps a reasonably usable database, and is willing to cooperate.

      If I were an American I would be tempted to write the thing myself...
      It would be great to just go to a website and see all bills with a header that indicated which elected officials were involved, and their voting record and ties to special interests.

      Hell, if anyone wants to do this, I am willing to contribute just because it's cool...
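
      A rough Python sketch of the annotation step, to show how little code the core would need. Big assumptions: the lookup table is a hypothetical stand-in for whatever data a campaign-finance site might expose (the figures are just the ones from the joke markup), and it uses plain tree manipulation rather than XSLT, because Python's standard library has no XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a campaign-finance lookup; a real tool would
# query an external data source instead. Figures are from the joke markup.
CONTRIBUTIONS = {
    "Fritz Hollings": [("Disney", "24500.00"), ("AOL", "33000.00")],
}


def annotate_bill(bill_xml: str, contributions: dict) -> str:
    """Append a <contribution> element for each known donor of each sponsor."""
    bill = ET.fromstring(bill_xml)
    # Snapshot the sponsors first, since the tree is modified in the loop.
    for sponsor in list(bill.findall("sponsor")):
        for donor, amount in contributions.get(sponsor.get("name"), []):
            ET.SubElement(bill, "contribution", donor=donor, amount=amount)
    return ET.tostring(bill, encoding="unicode")


plain = '<bill name="CBDTPA"><sponsor name="Fritz Hollings"/></bill>'
print(annotate_bill(plain, CONTRIBUTIONS))
```

      The proxy part would just fetch each bill, run it through a function like this, and serve the result.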

      --

      "First lesson," Jon said. "Stick them with the pointy end."
  18. They are using WordPerfect Too by frank249 · · Score: 3, Informative
    It reports that the HR has made 100 DTDs and uses Microsoft Word and a special converter to do the job.

    The article actually says: It shows how each line, name and term has an identifying tag, created by exporting the document from a word processor such as Microsoft Word or Corel WordPerfect into a special XML template.

    That would make sense since most of the US government still uses WordPerfect. WordPerfect comes with extensive XML publishing functions including making your own DTDs.

    BTW Corel just announced that a new version of Ventura Publisher is coming out in the fall with cross platform XML publishing built in. The next version of WordPerfect is also going to have a much better XML publisher now that they bought XMetaL.

    --

    Today's vices may be tomorrow's virtues.

  19. America's Army / Senate / Resumes by Anonymous Coward · · Score: 0

    Don't you get it? Think about it.. the Army creates a computer game and releases it July 4 in an attempt to get good will with the 31337 QU4K3 D00dz..

    This is just the same thing: the House of Representatives releases support and DTDs for an awesome, buzzword-compliant, flashy XML technology on July 4 in an attempt to get good will with the hacker nerds :)

    A quick question: the url just talks about the House of Representatives. Will the Senate be using these as well? Why not? Wouldn't you think that Congress would want uniform reporting requirements between both houses?

    Either way, Senate or no, newest-standards-compliance or not (be polite and maybe they'll even fix that), this is really, really, cool. Maybe this is a first step toward making it so that at times that the government wants you to submit electronic documents (for example, resumes, or comments during the public comment period of an antitrust trial) they will accept DocBook.

    VOTE

    -- super ugly ultraman

  20. don't even validate by Steve+X · · Score: 2, Interesting

    heh, their XML documents don't even come close to validating. they say it's all beta, but wow, that's impressive. good to know my taxes are being put to good use - high-quality design. i think nsgmls says it best about their design:

    value of attribute "regeneration" cannot be "yes"; must be one of "yes-regeneration", "no-regeneration"
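
    For anyone without nsgmls handy, here is a hand-rolled Python sketch of that one check. It is not real DTD validation (the standard library's ElementTree does not validate); it only tests the single enumerated attribute named in the message, and the sample document and its element names are invented.

```python
import xml.etree.ElementTree as ET

# Allowed values as reported in the nsgmls message quoted above.
ALLOWED_REGENERATION = {"yes-regeneration", "no-regeneration"}


def bad_regeneration_values(xml_text: str) -> list:
    """Return every "regeneration" attribute value outside the allowed set."""
    root = ET.fromstring(xml_text)
    bad = []
    for element in root.iter():
        value = element.get("regeneration")
        if value is not None and value not in ALLOWED_REGENERATION:
            bad.append(value)
    return bad


# Hypothetical document; the element names are invented for illustration.
sample = '<doc><part regeneration="yes"/><part regeneration="no-regeneration"/></doc>'
print(bad_regeneration_values(sample))
```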

  21. It's the XSLT by Anonymous Coward · · Score: 1, Informative
    in the second line of the xml:
    <?xml:stylesheet type="text/xsl" href="member-sorter-vb.xsl"?>
    in the 6th line of the above-referenced xsl document being used to transform the xml:
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl" language="VBScript">

    basically, they're using the MSXML parser to do their XSLT on the client-side. I've been working with this stuff for a while, and there are a lot of advantages to doing this. The MSXML parser is a lot more mature & well documented than whatever comes built into NS6 & Mozilla(if you know better, please point me to some good resources for working with client-side XSLT on these browsers-- i've looked everywhere).

    But it seems to me that public accessibility to these documents should preclude this, and demand that the parsing be done on the server-side.

    Beyond that, the fact that they're using VBScript instead of JavaScript for their scripting is indicative of the fact that the people in charge of this initiative are hardcore MS-Heads -- there's no reason for it, you can do some extremely complex stuff with the MSXML parser and JavaScript.

    I know this is paranoid, but my past experience has been that even people inside MS use JScript if they can avoid VBScript... unless they're forced to use it for marketing reasons. Wonder who's in charge of this initiative.

    1. Re:It's the XSLT by Verizon+Guy · · Score: 1

      I know this is paranoid, but my past experience has been that even people inside MS use JScript if they can avoid VBScript... unless they're forced to use it for marketing reasons. Wonder who's in charge of this initiative.

      IIRC, the ASP pages on microsoft.com use JScript; VBScript is great because if you know VB, you can learn VBScript in an hour.

      --

      Aw, fuck it. Let's go bowling. - The Big Lebowski

    2. Re:It's the XSLT by Abcd1234 · · Score: 2

      Ummm... what about Transformiix? That would be the Mozilla XSLT engine, which is built right into Moz 1.0. Check out the project website here.

  22. Open standard, yes. by Anonymous Coward · · Score: 0

    Open tools? Not likely:

    <Schema xmlns="urn:schemas-microsoft-com:xml-data"
    xmlns:dt="urn:schemas-microsoft-com:datatypes">

  23. save a buck or two by jrs+1 · · Score: 1

    i think that they could have saved a buck or two by using open office. although, if it's not their money that they're spending, i doubt they care.

    1. Re:save a buck or two by Anonymous Coward · · Score: 0

      i think that they could have saved a buck or two by using open office. although, if it's not their money that they're spending, i doubt they care.

      Unfortunately congresscritters, like a lot of the citizenry, now think that "free country" now means "free lunch" rather than "freedom of action".

      Freedom in our lifetime

    2. Re:save a buck or two by jrs+1 · · Score: 1

      right, but are they aware that open office is free as in software?

  24. DTDs, Schema, and XDR by jaaron · · Score: 4, Informative
    Actually, if you check the source, you'll see that they are using XML namespaces and schemas. More precisely, they're using something called XDR (XML-Data-Reduced), which was developed by Microsoft and is upwards compatible with XML Schema. I'm familiar with schema but not XDR. For more information, you may want to check out these links:

    And thanks to this poster for pointing it out.
    --
    Who said Freedom was Fair?
    1. Re:DTDs, Schema, and XDR by Anonymous Coward · · Score: 0

      It's not upwards-compatible. It's just older and shares some features (mainly to use XML as schema format, that's it).

    2. Re:DTDs, Schema, and XDR by smallpaul · · Score: 2

      In what sense is XDR "forwards compatible" with XML Schema? In the sense that you can rewrite all of that Microsoft-proprietary stuff into XML Schema if you care to put in the effort?

    3. Re:DTDs, Schema, and XDR by deblau · · Score: 2

      Just so no one is confused, that's Microsoft's XDR, not the real XDR.

      --
      This post expresses my opinion, not that of my employer. And yes, IAAL.
    4. Re:DTDs, Schema, and XDR by vidarh · · Score: 2

      No, in the sense that there is a publicly available XSL stylesheet that will do the conversion for you. XDR was a stopgap thing for Microsoft to get schema support out the door before the XML schema spec was finished.

    5. Re:DTDs, Schema, and XDR by smallpaul · · Score: 2

      I was speaking with a Microsoft employee on the Schema team today. He reacted in horror to the view that XDR is "upwards compatible" with XML Schema.

  25. Great! by Rombuu · · Score: 5, Funny

    And it looks like the DTDs will be free to use and distribute!

    Great, now I can make my own crazy laws! Yipee!

    --

    DrLunch.com The site that tells you what's for lunch!
    1. Re:Great! by mdemeny · · Score: 2
      Great, now I can make my own crazy laws! Yipee!

      Actually it's so that lobbyists can make their own crazy laws. Yipee, indeed.

    2. Re:Great! by Anonymous Coward · · Score: 1, Funny

      No problem. My crazy law is that no-one (especially not the RIAA) can make crazy laws except me.

  26. Another Use for Microsoft crap by codeguy007 · · Score: 3, Insightful

    I thought the US Government was starting to learn that Microsoft software was to be avoided. By finding more uses for it, I am afraid that it is obviously not true.

    1. Re:Another Use for Microsoft crap by DunbarTheInept · · Score: 3, Funny

      Yes, they are using MS software, but this once they are using it to export things into a well documented, open format that could be made to work with anything (unlike a Word document). Sure, maybe different browsers aren't good at reading the XML the government is putting it out in the way that makes IE most comfortable, but at least it is in a DOCUMENTED format this time, one that the open source community can respond to and implement fairly quickly if there's incentive to (and I think having all major US government stuff in that format would be a big enough incentive.)

      Is it still biased in favor of IE users right now? Absolutely, I won't deny that. But if it is actually a properly documented format for once then that bias won't last. This isn't a perfect situation, but it's a major step up from publishing things in proprietary binary word processor formats like they did in the past.

      --

      Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

  27. What part about public domain don't they get? by ClarkEvans · · Score: 5, Insightful

    Dig the notice at xml.house.gov -- The document type definitions (DTDs) presented on this site were developed at the U.S. House of Representatives by employees of the Federal Government in the course of their official duties. Pursuant to Title 17 Section 105 of the United States Code, these DTDs are not subject to copyright protection and are in the public domain. These DTDs are in draft form. The U.S. House of Representatives assumes no responsibility whatsoever for their use by other parties, and makes no guarantees, expressed or implied, about their quality, reliability, or any other characteristic. These DTDs can be redistributed and/or modified freely provided that any derivative works bear some notice that they are derived from it, and any modified versions bear some notice that they have been modified. (emphasis mine)

    Either these DTDs are copyrighted and they can place restrictions upon distribution or they aren't. This need people have to control everything is just driving me crazy. The whole reason for Title 17 Section 105 is so that the Government can't put restrictions on this kind of stuff (bills, laws, etc.) ...

    1. Re:What part about public domain don't they get? by Maserati · · Score: 2

      They can't enforce it, but they can ask, preferably nicely. I can't think of any reason to steal it and distribute it without attribution (not that someone else couldn't) so I'm not real worried at this point. Besides, stealing from Congress torques them off, they hate the competition.

      --
      Veteran, Bermuda Triangle Expeditionary Force, 1992-1951
    2. Re:What part about public domain don't they get? by spotter · · Score: 1

      IIRC the law only applies in the US (or for US citizens, forget which). Outside the US, they are still copyrighted.

    3. Re:What part about public domain don't they get? by ClarkEvans · · Score: 2

      I can't think of any reason to steal it and distribute it without attribution (not that someone else couldn't) so I'm not real worried at this point. (emphasis mine)

      And how could I possibly steal something that is in the public domain? Just because they wrote it they own it? The framers of the Constitution rejected natural-rights thought with regard to intellectual property. Who owns it anyway? The public of the U.S. paid for it, so don't we own it? If I copy it and use it for my own purposes why would this make me a thief?

      I think you have fallen into the group-think that the RIAA wants everyone to succumb to.

    4. Re:What part about public domain don't they get? by Maserati · · Score: 1

      Steal it may have been an overbroad statement. If anyone has fallen into the RIAA groupthink it's the HoReps who put the notice in in the first place.

      --
      Veteran, Bermuda Triangle Expeditionary Force, 1992-1951
    5. Re:What part about public domain don't they get? by Anonymous Coward · · Score: 0

      Yeah, you're not a victim of any sort of "group-think" [COUGH SLASHDOT COUGH] are you?

      Nah, you're just an ass.

    6. Re:What part about public domain don't they get? by thogard · · Score: 1

      The only people I know who could benefit by stealing this is a different government.... maybe a shadow government?

  28. Schema war is not over...W3C XML-Schema is bloated by ClarkEvans · · Score: 3, Insightful

    Why use DTDs?

    Have you ever tried to use XML Schema? It's a bloated piece of ****. Relax is tons better. And for the government's purposes, DTDs work much better and are an ISO standard.

  29. happy july 4th! by Anonymous+Pancake · · Score: 0, Insightful

    I'd like to wish a happy july 4th to the country that funds Israel's terrorism, created the DMCA, and generally wipes it's ass on the rest of the world.

    Happy July 4th you filthy pig fuckers.

    1. Re:happy july 4th! by Anonymous Coward · · Score: 0

      Ha ha! The rest of the world can suck my big hairy seppo dick!

  30. So does this mean... by neonzebra · · Score: 2, Funny

    .... that the president can use an XSLT to make a bill into law?

  31. DTD may be old by bsDaemon · · Score: 1

    But so is the Constitution, and no one much complains about upgrading that to version 2.0

    1. Re:DTD may be old by foniksonik · · Score: 2

      Well that is if you don't count the Bill of Rights and the rest of the AMENDMENTS to the Constitution.

      Seems to me like it's been at 2.0 RC X.x for quite some time.

      --
      A fool throws a stone into a well and a thousand sages can not remove it.
  32. DTD free to use? huh??? by CProgrammer98 · · Score: 3, Insightful

    "And it looks like the DTDs will be free to use and distribute"

    Ummmmm if you're using a validating xml parser, you HAVE to have access to the dtd!!! All DTDs have to be free to use!

    --
    And the people shall be oppressed, every one by another, and every one by his neighbour Isaiah 3:5
  33. Indeed, it's not free by twitter · · Score: 3, Informative
    The mention of M$ Word put me on alert, as have previous stories here which have demonstrated that XML will simply be a container for proprietary data formats like M$ Word. Closer examination, however, reveals a much more horrible arrangement.

    XML is dependent on Unicode, as the US Government site's reference states. Follow the W3C link to Unicode:

    Unicode is required by modern standards such as XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646.

    Unicode is owned by Unicode Incorporated and all of its documents and standards are issued under a restrictive license with a unilateral change clause:

    Modification by Unicode Unicode shall have the right to modify this Agreement at any time by posting it to this site. The user may not assign any part of this Agreement without Unicodes prior written consent.

    Dare I compare this evil arrangement to ASCII and other predecessors? To have IBM, M$, Sun and others OWN the very format your data takes and be able to change it and break previous implementations at whim, while YOU may not? Who wants to bet a plump nickel that anything vaguely resembling Unicode in the future will be called a "derivative" and its distribution halted? Is this not a collusion of commercial software vendors to control information at its most basic representation? Does anyone else here see this as the ultimate extension of copyright? Evil, Evil, Evil.

    I'd rather see the US government continue to publish in the American Standard Code for Information Interchange. This extensible standard is no standard at all.

    --

    Friends don't help friends install M$ junk.

    1. Re:Indeed, it's not free by Anonymous Coward · · Score: 0

      XML is dependent on unicode
      Whoops, wrong! It is true that XML uses Unicode as its default character set, but you can choose whatever encoding you like. See this part of the XML spec.
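
      To illustrate the point about encodings, here is a minimal Python sketch, assuming a made-up one-element document (the `title` element and its text are hypothetical, not taken from the House DTDs). It stores the bytes as ISO-8859-1 and lets the parser pick the encoding up from the XML declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element name and text are made up for illustration.
text = '<?xml version="1.0" encoding="ISO-8859-1"?><title>Résolution</title>'

# The bytes "on disk" are Latin-1, not UTF-8: the é is the single byte 0xE9.
data = text.encode("iso-8859-1")

# The parser reads the declared encoding from the XML declaration
# and decodes the content accordingly.
root = ET.fromstring(data)
print(root.tag, root.text)
```

      Whatever the encoding on disk, the application ends up with Unicode strings, which is roughly the sense in which XML is defined in terms of Unicode characters without forcing one byte encoding on the author.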

    2. Re:Indeed, it's not free by Maserati · · Score: 1

      The article mentions WordPerfect as well. And so long as the DTD is available, anything else that reads and writes XML will work fine.

      --
      Veteran, Bermuda Triangle Expeditionary Force, 1992-1951
    3. Re:Indeed, it's not free by smallpaul · · Score: 2

      Unicode is owned by Unicode Incorporated [unicode.org] and all of its documents and standards are issued under a restrictive license [unicode.org] with a unilateral change clause:

      Have you looked at the copyrights for most standards? Try to get a free copy of the SGML or EDI standards? Unicode is wide open comparatively. Plus, if you're going to complain about vendor-owned consortia, you might as well whine about the W3C itself.

    4. Re:Indeed, it's not free by RennieScum · · Score: 3, Insightful
      Paranoia.
      It shows how each line, name and term has an identifying tag, created by exporting the document from a word processor such as Microsoft Word or Corel WordPerfect into a special XML template
      They're using a *tool* to help convert .doc and .wpd files to XML. They're just leveraging their assets (MSW*rd being an, ahem, asset) so that secretaries and regular folk can do the work of text entry in tools they are familiar with, which then gets converted into a usable format.

      Settle down, they're not trying to use MSXML engines to do the work. Sheesh.
      --
      ...Time is the best teacher, unfortunately it kills all of its students.
  34. MOD PARENT UP by GeckoX · · Score: 1

    A rare piece of insight indeed.
    Listen up kiddies.

    --
    No Comment.
  35. Re:Schema war is not over...W3C XML-Schema is bloa by malakai · · Score: 1

    Using XML to describe XML simply makes sense. DTD's are antiquated, and I can't even transform against them for meta-meta-data tasks.

  36. I second that. by Futurepower(R) · · Score: 1, Redundant

    I agree. Mod parent up.

  37. Re:Schema war is not over...W3C XML-Schema is bloa by ClarkEvans · · Score: 2

    Using XML to describe XML simply makes sense.

    In this case RELAX is far superior, it has both an XML and a non-XML representation and is built on top of a clean model by some brilliant fellas.

    XML Schema, OTOH, is just a bloated mess.

    DTD's are antiquated

    Perhaps, but they are readable. XML Schema is anything but readable.

    and I can't even transform against them for meta-meta-data tasks

    Oh, now that's something you do every day. Using XML syntax for everything is just plain stupid. IF you have to do transforms, use RELAX, it has a cleaner model anyway... doing transforms on XML Schema is like pulling teeth.

  38. You forgot Antarctica. by Anonymous Coward · · Score: 0

    You forgot Antarctica. It's an enemy, too. Everything that isn't ignorant, good-old-boy Republican is an enemy.

  39. Finish the job. by Futurepower(R) · · Score: 1

    Give the parent post that 5th point. He's right.

  40. Guess what part of the equation... by Anonymous Coward · · Score: 0

    Isn't free?

  41. Why didn't they just use standard HTML? by moncyb · · Score: 2

    Standard HTML is just as searchable as long as you use the tags properly. One does have to wonder if M$ "encouraged" them to use this format.

    1. Re:Why didn't they just use standard HTML? by Ravagin · · Score: 2

      Why not html? Because they're not just describing text here. There're all sorts of data associated with a piece of legislation, and an extensible - not a hypertext - markup language is the best way to do it.

      --

      Karma: T-rexcellent.

    2. Re:Why didn't they just use standard HTML? by moncyb · · Score: 2

      What is this mysterious data that can't be expressed in HTML???? Blipverts!!!??!!?? Maybe they'll put cartoons into the bill--to help explain why they passed it. Oooo...maybe they can put in complex equations so everyone will think they are smart.

      I think some people just believe XML is some sort of magical file format that should be used no matter what. I expect MPEG 5 will be in XML, then they'll wonder why the files are so much larger and take 10x the processing time and memory to decode.

      XML may be useful in some places, but not everywhere. Replacing binary formats with it is bad because it will unnecessarily increase the file size and the resources needed to decode them. Using it for config files will require all programs to run an XML parser and make the config files less human readable. Using it to express laws will just make them inaccessible to the common person by requiring them to have expensive proprietary software (or software made by an illegal monopoly) to even view them.

      If they want bills to be searchable, they should be designing database tables for them, and allow the public to export the database (or subsets of it) in a standard database format. For online viewing, they could easily export the data into HTML (or XML) using PHP.

      Using "Microsoft Word and a special converter to do the job" is just stupid. Creating a program that allows some intern to key the data into the database would probably be easier and more effective in the long run.

    3. Re:Why didn't they just use standard HTML? by Anonymous Coward · · Score: 0

      It's always good to see that no matter how somebody does something, there's some jackass standing in the back leaning against the wall saying how he could do it better.

    4. Re:Why didn't they just use standard HTML? by simonj · · Score: 1

      Or, for online viewing, they can leave it in XML and use a stylesheet.

      Also what here looks easier to parse (and therefore to search) to you?

      This :-

      <table>
      <tr><th>Paragraph Number</th><th>Paragraph</th></tr>
      <tr><td>1</td><td>blah, blah, blah</td></tr>
      </table>

      Or this :-
      <paragraph num="1">
      blah, blah, blah
      </paragraph>
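
      As a rough illustration of why the second form is easier to search, here is a minimal Python sketch. The `bill` wrapper element and the second paragraph are assumptions added so the fragment is a complete document; the attribute value is quoted because XML requires that.

```python
import xml.etree.ElementTree as ET

# Hypothetical bill fragment; <bill> and the second paragraph are made up.
doc = """<bill>
  <paragraph num="1">blah, blah, blah</paragraph>
  <paragraph num="2">more blah</paragraph>
</bill>"""

root = ET.fromstring(doc)

def find_paragraph(root, num):
    """Return the text of the paragraph with the given number, or None."""
    node = root.find(f"paragraph[@num='{num}']")
    return node.text.strip() if node is not None else None

print(find_paragraph(root, 1))
```

      The lookup asks for the paragraph by what it is (a paragraph numbered 1) rather than by where it happens to sit in a table layout.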

    5. Re:Why didn't they just use standard HTML? by moncyb · · Score: 2

      Oh yeah, just make up some contrived obviously biased answer! Do you make infomercials???? Or maybe you just don't know anything about html.

      The html version of your "example" would probably look more like this:

      <p><a name="para1">(1)</a> blah, blah, blah

      ...and for your information, browsers already search that way--the paragraph in question can be referenced by appending a #para1 to the document's url.

    6. Re:Why didn't they just use standard HTML? by simonj · · Score: 1

      No I don't make infomercials, nor am I clueless about HTML. I just happen to disagree with you, which, given your stupid mindless insults, is obviously something you don't like happening to you very much. My answer is "biased"? Well yeah I guess it is, biased towards my own point of view. If you don't like that, tough.

      Anyway, the fact remains that HTML is a format designed to *display* information in a human readable way.

      XML is a format designed to *represent* data in a human readable way.

      Sure, you could lay out your HTML version in the way you described and it would work. But wouldn't it be easier if you used a more appropriate format?

  42. I get this in Netscape 7 Preview: by ImaLamer · · Score: 2

    I get separate paragraphs (yet mashed together), yet I can paste the data to notepad or this text box and it looks even worse.

    I can't post it because of this error:

    Your comment has too few characters per line (currently 6.2)

  43. Check this with IE though: by ImaLamer · · Score: 2

    http://xml.house.gov/hr100_eh.xml
    http://xml.house.gov/hr6_ath.xml
    http://xml.house.gov/hr10.xml

    all just code

    1. Re:Check this with IE though: by Verizon+Guy · · Score: 1

      At least IE has the decency to delimit and color code it in a collapsible tree, unlike moz which mashes it all together.

      --

      Aw, fuck it. Let's go bowling. - The Big Lebowski

    2. Re:Check this with IE though: by ImaLamer · · Score: 2

      No, IE shows code which is just ghey.... who wants to go surfing the net reading HTML the whole time?

  44. Just use IE6 by Anonymous Coward · · Score: 0

    Internet Explorer 6 worked fine for me. Why not ditch Mozilla, Netscape, and Opera and go for a real web browser. There is a reason that IE is the #1 browser on over 93% of its user base. It is proven and *gasp* works.

    1. Re:Just use IE6 by DunbarTheInept · · Score: 2

      Why not just use IE? Because it only works if you are using a shitty Operating System underneath it, and the OS you use affects a lot more stuff than just your web browser. There are reasons completely unrelated to web browsing that make me want to be running Linux most of the time except for the occasional game. I think that this is the primary reason for the IE hostility a lot of geeks have. To use it we have to dumb down *everything* we use (which is what it feels like to use Windows after being used to Unix), just to get a particular web browser. If IE were produced by a company other than the one that has a vested interest in keeping the Windows monopoly in place, it wouldn't be a problem because they would make a Linux version.

      --

      Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

    2. Re:Just use IE6 by Verizon+Guy · · Score: 1

      I use IE6... =) I was just curious what it would look like in moz... I only use moz for browsing for pr0n... all those tabs!

      --

      Aw, fuck it. Let's go bowling. - The Big Lebowski

    3. Re:Just use IE6 by Verizon+Guy · · Score: 1

      Please go suck a putrid, herpes-infected dick you zealot cocksucker.

      IE runs on real Unixes, like Solaris and HP-UX. Grow some pubes, take a shower, and get a life.

      --

      Aw, fuck it. Let's go bowling. - The Big Lebowski

    4. Re:Just use IE6 by Anonymous Coward · · Score: 0

      Solaris and HP-UX give unix a bad name.

      Try AIX or some other real unix.

    5. Re:Just use IE6 by Bert64 · · Score: 1

      IE6, as mentioned in the post topic.. does NOT run on unix, 5.0sp1 is the newest available for Solaris/HPUX and has no support for the new Solaris 9, and no support for solaris on x86 systems. It's also incredibly heavy on the network if you're running X remotely, sending an average of 40mbit/sec over my lan when running.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    6. Re:Just use IE6 by DunbarTheInept · · Score: 2

      1. The only people who give a flying fuck about the fact that linux isn't technically legally allowed to be called unix are lawyers and trolls like you and that "Rev Don Cool" idiot on usenet.

      2. IE support on the few unixen where it does run is awful and the thing is too bloated to be practical (since instead of porting IE to unix APIs they ported parts of the Windows API and put IE on top of that, the executable is gigantic on unix.)

      3. You did say "IE 6", which even on the few unixes where IE 6 exists, it doesn't go up to that version number, so clearly you are lying.

      --

      Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

    7. Re:Just use IE6 by DunbarTheInept · · Score: 2

      Err, delete that "6" from the second "IE 6". The dangers of cutting and pasting.

      --

      Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.

  45. Happy 4th! by Pinball+Wizard · · Score: 1, Offtopic
    To recognize our great country on its birthday, I present you with an XML representation of the American flag:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <flags>
    <flag type="American">
    <symbol type="Stars">
    <count>50</count>
    <background>navy</background>
    <color>white</color>
    </symbol>
    <symbol type="Stripes">
    <stripe no="1">
    <stripeval>Deleware</stripeval>
    <color>red</color>
    </stripe>
    <stripe no="2">
    <stripeval>Pennsylvania</stripeval>
    <color>white</color>
    </stripe>
    <stripe no="3">
    <stripeval>New Jersey</stripeval>
    <color>red</color>
    </stripe>
    <stripe no="4">
    <stripeval>Georgia</stripeval>
    <color>white</color>
    </stripe>
    <stripe no="5">
    <stripeval>Connecticut</stripeval>
    <color>red</color>
    </stripe>
    <stripe no="6">
    <stripeval>Massachusetts</stripeval>
    <color>white</color>
    </stripe>
    <stripe no="7">
    <stripeval>Maryland</stripeval>
    <color>red</color>
    </stripe>
    <stripe no="8">
    <stripeval>South Carolina</stripeval>
    <color>white</color>
    </stripe>
    <stripe no="9">
    <stripeval>New Hampshire</stripeval>
    <color>red</color>
    </stripe>
    <stripe no="10">
    <stripeval>Virginia</stripeval>
    <color>white</color>
    </stripe>
    <stripe no="11">
    <stripeval>New York</stripeval>
    <color>red</color>
    </stripe>
    <stripe no="12">
    <stripeval>North Carolina</stripeval>
    <color>white</color>
    </stripe>
    <stripe no="13">
    <stripeval>Rhode Island</stripeval>
    <color>red</color>
    </stripe>
    </symbol>
    </flag>
    </flags>

    Note: I'm from New Mexico, so I know what it feels like when a state gets left out. Rest assured, my flag includes Deleware!

    --

    No, Thursday's out. How about never - is never good for you?

    1. Re:Happy 4th! by csguy314 · · Score: 1

      Jeez, what a way to honour your country...
      by misspelling *Delaware*!

      --
      This is left as an exercise for the reader.
    2. Re:Happy 4th! by Anonymous Coward · · Score: 0

      Is New Mexico any better than the original? I hope you guys got rid of all those dirty Hispanic whores.

  46. HR has made 100 DTDs by Ilan+Volow · · Score: 5, Funny

    Congress has always been full of lyahs and chetahs. That it's now full of schemas is really no surprise.

    --
    Ergonomica Auctorita Illico!
    1. Re:HR has made 100 DTDs by Anonymous Coward · · Score: 0

      Cong'erss has alluhs bin fullah lyahs 'n cheatahs. That it's naow fullah schemas ain't no kinda suhprahs.

    2. Re:HR has made 100 DTDs by Anonymous Coward · · Score: 0

      Fucking hoser.

  47. yep by Mikkel_bob · · Score: 1
    And it looks like the DTDs will be free to use and distribute!

    No, this doesn't mean you can make your own laws. =P

    --
    Mmmm. Sig.
  48. The Importance of DTDs by The+Monster · · Score: 2
    DTDs are obselete by now
    They may not be bleeding edge, but what's important about this is that the House is making
    a commitment to open data formats. Even where we don't get open source code, this guar-
    antees that we don't get the most virulent form of 'vendor lock-in', where failure to pay the
    latest rent increase means we can't even access our own data anymore.

    ---
    Fight Page Widening! Make your own line <br>:reaks.

    --

    [100% ISO 646 Compliant]
    SVM, ERGO MONSTRO.

  49. we need open source software by Trailer+Trash · · Score: 1

    They're already using vb-script in their xsl stylesheet, I can see Microsoft trying to weasel their way in here (or some Microsoft-based consulting company). We need to get some open source software that can be of use to them, and hopefully to state governments as well. Anyone game?

  50. Not the real issue by Jedi+Creed · · Score: 1

    The real problem is that XML itself is too new. DTDs turned out to be too clumsy and limited, so schemas replaced them. What Congress really needs to do is wait 2-5 years for XML to settle down. By jumping in prematurely, Congress is running into pitfalls like the use of DTDs.

    --
    Ready are you? What know you of ready? For eight hundred years have I trained Jedi. - Yoda
    1. Re:Not the real issue by Prof.Phreak · · Score: 1

      XML is unlikely to go away (you'll still be able to read XML docs 50 years from now, even if basic formats like JPEG, etc., are totally replaced).

      Not to mention in case of any major changes, it doesn't take long to create an XSLT script to convert your XML into anything.

      --

      "If anything can go wrong, it will." - Murphy

    2. Re:Not the real issue by Jedi+Creed · · Score: 1

      No arguments there. Actually, I'm convinced you'll have no problem reading JPEG either, but that's another story. But it's still smart to wait until the standard has settled just a little, so you're not aiming for a moving target.

      --
      Ready are you? What know you of ready? For eight hundred years have I trained Jedi. - Yoda
  51. Even HTML would be a HUGE improvement by ahfoo · · Score: 2

    --anything with links is essential to reforming legal texts into something useful. In the US, the laws are written in English. It should be the case that anybody with a high school education could read them and understand them with ease. The main reason lawyers get so involved in anything that has the slightest concern with the law is that the twisted textual markup currently used makes the documents incomprehensible and extremely difficult to understand in full, because of the need to obtain the hundreds of essential external references. This is wonderful news.
    Even the stilted style of language referred to as legalese is partly a product of the need for a meta context within legal writing. This is long overdue, but awesome nonetheless.

  52. U.S. Senate Responds... by cburley · · Score: 1
    ...by making resolutions in CommonLISP S-expressions.

    --
    Practice random senselessness and act kind of beautiful.
  53. XML creaps in another place by thogard · · Score: 2

    Didn't any of the XML supporters ever study parsing in their CS classes? Or are they just web control freaks that didn't bother with anything past high school? Oh wait, I'm talking about w3c so of course they are control freaks. At least most people ignored them.

    The problem with XML is that it diverges into two distinct worst cases. One requires an infinite amount of memory, the other an infinite amount of time. Both of these are bad things and much study of algorithms is about avoiding both of these conditions. Odd thing is most people in the IT field today have no clue about why this happens or even that it can happen. Of course these are the same programmers that couldn't describe a quicksort if they had to or describe something in BNF grammar. And we wonder why most programmers today just produce garbage.

    1. Re:XML creaps in another place by vidarh · · Score: 2

      Can you elaborate? I can't see what part of parsing XML you are referring to - parsing XML for the most part seems relatively simple, though I haven't written a complete XML parser or spent the time to read through the complete specification.

    2. Re:XML creaps in another place by Anonymous Coward · · Score: 0

      The problem with XML is that it diverges into two distinct worst cases. One requires an infinite amount of memory, the other an infinite amount of time.

      Yeah, yeah, and since it's impossible to tell if a program will halt or run forever, it must be impossible to write a debugger.

      Who cares about theoretical worst cases?

      What's important is the fact that escaping works, Unicode works and markup works. XML should be used as a glorified ASCII.

      It's not just used that way, and that's the real problem. People are trying to turn XML into a database standard. See this article, this one, this one, this one, or this. All are from the excellent dbdebunk.com site.

    3. Re:XML creaps in another place by thogard · · Score: 1

      Let's say you have an application that has 1 MB of memory. Now assume you need to process a file that is 10 MB. Now assume that the file is broken in some common way (say a closing tag is gone).
      How do you parse the file in this case?

      For some clues, read up about how TeX works, since its parser has the same issues, and Dr. Knuth has written about why that's bad. TeX deals with it by dumping the stack and going interactive on the user. Non-interactive programs don't have that option, and when the job description includes malformed input, there are difficult problems to solve.

    4. Re:XML creaps in another place by thogard · · Score: 1

      Ever run a debugger that wasn't designed to be used interactively? It's amazing what you can do when you have a person with the ability to make a choice.

      You can hit a theoretical worst case every time you run low on memory or get malformed input. Real-world production systems deal with these two conditions all the time. So it's OK to have your bank's payment system grab all the memory because the latest version of some popular program forgot a closing tag?

    5. Re:XML creaps in another place by vidarh · · Score: 2
      I don't see the problem. If the closing tag is missing and you are using a SAX parser, the only effect is one more scope indicator, and the parser will plod along happily until you try to close the surrounding tag, at which point it will know right away that it should give an error.

      Whether it will allow you to try to recover or not at that stage would be up to the parser.

      Recovering from malformed input is regardless a difficult task, and typically you don't want to go there - that's not a parsing issue, but an issue of trying to predict how an error should be recovered.

      For a DOM parser, the parser would do the same thing, and just fail and free the tree once it found the surrounding tag (or the end of the file). However using a DOM parser with a scenario like the one you suggested would be plain stupid.

      In either case, handling a missing closing tag is trivial with XML, and I certainly can't see any justification for the claim that you'd either need unlimited memory or unlimited time based on that.

      Anyway, you've just given an example of a case where ANY grammar based on nested blocks will have to have thought put into it when it is fed bad data, with no justification for why it should make XML bad from a parsing standpoint.

      Do you have a better example?
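
      The SAX behaviour described above can be sketched in Python with the standard `xml.sax` module. This is only an illustration under assumptions: the `batch`/`record` element names and the `DepthHandler` class are hypothetical, and the handler keeps nothing but the stack of open element names.

```python
import xml.sax

class DepthHandler(xml.sax.ContentHandler):
    """Tracks only the stack of open element names, not the content."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.open_tags.append(name)
        self.max_depth = max(self.max_depth, len(self.open_tags))

    def endElement(self, name):
        self.open_tags.pop()

# Hypothetical input: the second <record> is never closed.
broken = b"<batch><record>1</record><record>2<record>3</record></batch>"

handler = DepthHandler()
try:
    xml.sax.parseString(broken, handler)
    result = "ok"
except xml.sax.SAXParseException as exc:
    # The parser notices the problem when </batch> arrives while a
    # <record> is still open, and refuses the document at that point.
    result = f"rejected: {exc.getMessage()}"

print(result, "| max depth:", handler.max_depth)
```

      The cost of the missing end tag in this sketch is one extra entry on the stack until the enclosing element tries to close; what to do after the rejection is an application decision rather than a parsing one.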

    6. Re:XML creaps in another place by thogard · · Score: 1

      Missing tags are an issue when they create the case where they build up stack state. This very same thing is a major reason Netscape (prior to 6) crashed.
      Assume you have a language that uses { and } to define scope. What happens when they get lost:
      { record 1
      { record 2
      { record 3

      when it should have been
      { record 1 }
      { record 2 }
      { record 3 }

      If your language allows nesting like
      {record 1 { some_other_record 2 { some_other_record_3 }}}
      then you hit the problem.

      My problem with XML is I can always make a file that will break programs that expect XML. I can't make a file that breaks grep.
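
      The brace example can be made concrete with a small Python sketch. The function name `max_open_scopes` is made up for illustration; it scans the text once and reports how deep the nesting got and how many scopes were still open at the end.

```python
def max_open_scopes(text):
    """Scan once; return (deepest nesting seen, scopes still open at EOF)."""
    depth = deepest = 0
    for ch in text:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("closing brace without an opening one")
    return deepest, depth

good = "{ record 1 }\n{ record 2 }\n{ record 3 }\n"
bad = "{ record 1\n{ record 2\n{ record 3\n"

print(max_open_scopes(good))  # (1, 0)
print(max_open_scopes(bad))   # (3, 3)
```

      With the closing braces lost, every record adds one level of pending state, so the state a nesting-aware reader has to hold grows with the number of records instead of staying constant.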

    7. Re:XML creaps in another place by vidarh · · Score: 2
      This is a very different problem from what you suggested in the other message, and it can be just as real with REAL documents.

      So what you are really saying is that your problem is with ANY system that allow scoping, and where state is required for each scope until the scope is closed?

      The problem with that is that scoping is useful and makes it a lot easier to represent a whole lot of data in a structured form that seems natural to humans.

      In other words, an XML parser may require more resources than a parser for a grammar without scoping. But the scoping is allowed for a reason - it provides structure that is hard to provide without it.

      The reason you can't make a file that breaks grep is that grep doesn't care about structure. You can easily work on XML files without running into the problem as well if you ignore structure. But then you are also losing a whole lot of advantages.

      I still don't see this as a problem. You need to handle resource limits regardless. If you have 1MB available, as you originally used in your example, then when you have used that 1MB you have to fail gracefully. If the only case where you use the whole 1MB is a broken document, then whether you fail because the parser detects it or fail because you don't have more memory is irrelevant - the parse failed.

      If you need to give more specific error messages, you can do that fairly easily: once you've filled memory, scan the remainder of the document to determine whether any of the outer tags will EVER get closed.

      If you want to recover from unclosed tags, the standard way of doing that for HTML and XML is to define which start tags you want to autoclose which types of open tags for.

      This is a straightforward mechanism that works well, in particular in the presence of a schema or DTD, where you can easily determine when leaving a tag open makes the document malformed but closing it might make it well-formed.

      I haven't implemented it for XML, but I have implemented it in an HTML filter that needed to handle particularly broken HTML.

      In the real world this is a problem only if you don't think about it and design your software to handle it, just as not thinking your design through in general leads to broken software.
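
      The auto-close idea mentioned above (declare which start tags implicitly close which open tags) might look roughly like this in Python. Everything here is a hypothetical sketch, not the HTML filter referred to in the comment: the `AUTOCLOSE` table, the `repair` function, and the `batch`/`record` names are assumptions, and the regular expression only understands bare tags without attributes.

```python
import re

# Hypothetical rule table: a new start tag (key) implicitly closes any
# open tag listed in its set, e.g. a new <record> ends the previous one.
AUTOCLOSE = {"record": {"record"}}

TAG = re.compile(r"<(/?)(\w+)>")  # bare tags only, no attributes

def repair(text):
    """Insert missing end tags according to AUTOCLOSE; return repaired text."""
    out, stack, pos = [], [], 0
    for m in TAG.finditer(text):
        out.append(text[pos:m.start()])  # copy character data unchanged
        pos = m.end()
        closing, name = m.group(1) == "/", m.group(2)
        if closing:
            if name in stack:
                # Close anything left open inside the element being ended.
                while stack[-1] != name:
                    out.append(f"</{stack.pop()}>")
                stack.pop()
                out.append(f"</{name}>")
            # A stray end tag with no matching start tag is dropped.
        else:
            while stack and stack[-1] in AUTOCLOSE.get(name, ()):
                out.append(f"</{stack.pop()}>")
            stack.append(name)
            out.append(f"<{name}>")
    out.append(text[pos:])
    while stack:  # close whatever is still open at end of input
        out.append(f"</{stack.pop()}>")
    return "".join(out)

print(repair("<batch><record>1<record>2<record>3</batch>"))
```

      Because each new record closes the previous one, the stack in this sketch never grows past two entries for the sample input, which is the point of the rule table: the recovery policy is decided up front instead of being guessed at parse time.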

    8. Re:XML creaps in another place by thogard · · Score: 1

      There are just so many problems with XML that I tend to ignore it. I wasn't even thinking about the scoping issue when I wrote that; I was thinking about stack depth. In that case, if your code had to figure out whether the current record could fit in the current scope or the prior one, the decision matrix would also grow without bounds. Unless you have the option of fully rejecting any malformed XML file, you're asking for trouble.

      I think you're well versed in the world of users dealing with the output. Most of the stuff I do, the users are way at the other end of the pipeline. Any error message must be passed out of band to the user, and that makes the systems excessively complicated (aka buggy).

      At work we process batches of transactions. We insist on one of two simple ASCII formats. So far we find that about 90% of the people can't get their first test batch to us correctly. About 50% can't convert from a test tag to their tag without messing up something else (think s/test/their_uid/ ). If these people can't build ASCII files, how are they going to build XML files? Our business requires that we process as much of the file as we can, so we have some strange rules about failing gracefully.

    9. Re:XML creaps in another place by vidarh · · Score: 2
      The scoping issue and the stack depth issue are the same, and the solutions I described are solutions in common use.

      And I'm used to dealing with users on the input side. The company I work for operates the .name TLD. Registrars interact with us via XML. Our subcontractors interact with us via XML. We're dealing with far from perfect XML and errors need to be communicated.

      We did use to have an ASCII based format, and we had more problems with that. The advantage of XML is that the users can validate the XML generated pretty well on their side by running it through an XML parser with schema validation support.

  54. Re:Yee haw! Crappy laws in better format! by macdaddy357 · · Score: 1

    Old men with VCRs flashing 12:00 won't be able to use this. Too complicated. They will need people to do it for them. I'll do it if the pay is good.

    --
    How ya like dat?