Data Sharing, Government Style
rowama writes "The Department of Homeland Security and the Justice Department have been collaborating to develop an XML-based model for data sharing. After less than a year since the initial release, in October 2005, the National Information Exchange Model (NIEM) 1.0 Beta is out. It's big, really big. There are no less than 9 namespaces and plans for future expansion. Contact your local government contractor, with resume in hand, and you may be one of the lucky developers to implement NIEM-capable software."
Neato. Maybe now they'll make fewer errors in that Terrorist Screening Database they have. You know, the one that has the names of over 250,000 people tagged as terrorists, used in everything from no-fly lists to border crossings ever since the administration wanted all such watchlists to be consolidated into a single big one. That one the NSA probably uses. That one that, according to Department of Justice Inspector General reports, may be riddled with errors.
Read the Department of Justice and Department of Homeland Security Inspector General reports. They redact sensitive information in some cases, but based on context you can identify information in some places they've failed to redact in others. Keep on reading and you'll remember things to fit together a bigger picture.
Actually that statement is pretty clear to someone with domain knowledge. Like any other knowledge domain, it's probably very abstruse to outsiders. Remember, Feynman was famous not only for being a physicist, but for being a physicist who could make himself understood to those outside his domain of expertise (cf. Feynman's lectures).
It's actually a very concise and clear explanation of that part of the data plan. The problem for you is that you do not have the context, nor subject matter expertise, so it appears to make no sense to you. I, on the other hand, have handled and created classified compartmented documents "back in the day", so its meaning is perfectly clear to me. It's also quite obvious this is from a section about how to carry across message-handling markings ("Classification" and "Dissemination" restrictions & caveats) from one agency to another, or even intra-agency stuff. This indicates to me that you probably pulled it from the Intelligence part of the namespace.
Basically, the part you quoted says, in more colloquial English:
To control who gets to see this portion of data, the document is marked overall AND portions are marked individually. A portion of a document is usually a paragraph: some paragraphs in a document may be "secret", some may be "unclassified", some may be US-only, some may be releasable to NATO, or various and sundry combinations of these types of things. To designate these "portion classifications, caveats and dissemination controls" and properly "mark" this portion of the document, there is either (a) a single abbreviated term, or else (b) a list of abbreviated terms delimited by spaces. These terms can be found in a document called the "CAPCO Register". The only exception to this rule is the "REL" term, which means "Releasable To". Therefore, the values normally found after the REL term in a portion of a document should be put into the "releasableTo" attribute of this portion of a document, instead of the normal dissemination control data block part of the document.
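For anyone who finds code easier to read than spec prose, here is a rough Python sketch of that rule. Only the "releasableTo" attribute name and the "REL" term come from the quoted text. The element name "Portion", the attribute names "classification" and "disseminationControls", the helper name mark_portion, and the sample values "S", "USA" and "GBR" are my own placeholders, not taken from the NIEM schema or the CAPCO Register. I am also assuming the REL term itself stays in the control list while its values move out, which the quoted definition does not spell out.

```python
import xml.etree.ElementTree as ET


def mark_portion(text, classification, controls=(), releasable_to=()):
    """Build one marked portion (hypothetical element/attribute names).

    controls      -- abbreviated dissemination-control terms
    releasable_to -- the values that would normally follow the REL term
    """
    # REL values only make sense when the REL term itself is present.
    if releasable_to and "REL" not in controls:
        raise ValueError("releasable_to values require the REL term")

    portion = ET.Element("Portion")  # hypothetical element name
    portion.set("classification", classification)  # hypothetical attribute
    if controls:
        # One term, or several terms delimited by single spaces.
        portion.set("disseminationControls", " ".join(controls))  # hypothetical
    if releasable_to:
        # The exception: REL values go in their own attribute,
        # not in the dissemination-control list.
        portion.set("releasableTo", " ".join(releasable_to))
    portion.text = text
    return portion


para = mark_portion(
    "Example paragraph.",
    "S",  # placeholder classification abbreviation
    controls=["REL"],
    releasable_to=["USA", "GBR"],  # placeholder values
)
print(ET.tostring(para, encoding="unicode"))
```

The point of the split is that a consumer can read the release list straight from one attribute instead of parsing it back out of the control string.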
That's a lot of context that isn't needed by someone reading a spec, government or otherwise. The spec assumes a given level of subject matter and domain expertise. To dumb it down would be wrong - that is the best way to lard up and bloat a spec, or else allow a spec so loose as to be useless in constraining the data properly. And, as you mention, "XML is supposed to make it easier to manipulate data by providing unambiguous definitions". The quoted text in your post is an example of a *very* _un_ambiguous definition of a data field. And contrary to your belief, it's not just government that creates such hard-to-scan (for outsiders) documents/specs. I've seen banks, health companies, telecom companies, aerospace [and other places that cannot afford a "loose" data type] write very similar specifications that contain similar definitions.
You'll see much of the same once you get out into the world.
HTH.
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo! http://goo.gl/J9bkO