On The CopyLeft Of DTDs
"Writing a DTD is a challenge in itself (my company had never tried to go to the Web before, and never heard of XML until my project). To make the system work, we should then write software to adapt our supplier's data model to ours: for n suppliers we would need 2(n-1) correspondences (import and export) from their data model to ours which gets to be expensive on a large scale. Having a common model would help, especially for small companies not on the Web yet (those which rely only on paper data sheets for instance). My opinion, as there is no standard on our industry like RosettaNet, is that we could speed up things, and avoid babelization of XML tags by releasing our model with a Copylefted licence, lowering the cost and hassle for others on our market to build electronic publishing tools. Of course, there is a lot of money invested in our DTD, so what if competitors try to steal it?
Would the Copyleft of our DTD be a good idea?"
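To put rough numbers on the cost argument in the question, here is a minimal sketch of the arithmetic. It assumes n counts every company involved (your own included) and that each mapping is one-directional; the function names are made up for illustration and do not come from the original post.

```python
def per_company_private(n: int) -> int:
    """Mappings one company maintains when every partner has its own model:
    one import and one export for each of the other n - 1 companies."""
    return 2 * (n - 1)


def market_private(n: int) -> int:
    """Distinct one-way mappings across the whole market with private models:
    one per ordered pair of different companies."""
    return n * (n - 1)


def market_shared(n: int) -> int:
    """Distinct one-way mappings when everyone maps to and from one shared model:
    one import and one export per company."""
    return 2 * n


for n in (3, 10, 50):
    print(n, per_company_private(n), market_private(n), market_shared(n))
```

Under these assumptions the private-model count grows quadratically across the market while the shared-model count grows linearly, which is the "expensive on a large scale" point made above.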
XML would make things a lot easier in the long run, as long as you and the other data models can keep up with the changes that take place in the future...
Sarcasm is the recourse of a weak mind...
--
There are things that cost money. How about compatibility problems and other things that will require some sort of tech support? How will you pay for that? Or will your company be able to absorb the cost in the hopes that everyone begins to use your DTD?
Goat sex free since 2001
Once you have done the analysis and have a working system, put your XML DTD forward as the standard. Make this your business model - be the first to market. Get the kudos of being the ones who wrote the standard. Let the other companies "steal" your DTD - if they do, you have created a bigger, less segmented market in which you are the leaders. Try to be the best implementors and supporters of the system that you have written the standard for. What licence should a standard be released under? It does not really matter - the only problem is if another company takes the standard and uses an "embrace and extend" policy that makes old implementations incompatible. The role of the standards body is to check whether implementations conform to the standard. If you can't find the right standards body to do that, do it yourself - you need a "brand" to make this work.
You might have a look at the freely available DocBook DTD. If you can use that, you don't have to roll your own.
If you copyleft your DTD, with a GPL-like license, nobody can steal it, because it's free. You might even create a standard, if it's a usable DTD. And you could share the load for data conversion, by asking your contributors to format the data according to your open DTD before submitting it.
I'm not really sure if there are really any downsides, unless your DTD is in some way your critical moneymaking resource (although I can't imagine how).
Just my $0.02.
Roland
Never ascribe to malice that which is adequately explained by incompetence.
Whether or not this is a good idea depends mainly on your company's approach to business.
In the grand scheme of things, it won't help your competition much, as they'd just spend the time to develop their own in-house solutions anyways when they felt the need. The practical effect of releasing the spec is that you've made a fixed, one-time donation of manpower to your competitors (they no longer have to develop their own versions of this spec).
On the other hand, there is little direct benefit to you in releasing the spec. Some groups will adopt it, others won't, and you'll still have to spend a lot of time beating on your customers to use it properly. The good news is that a) free/open tools to perform conversion to/from common formats may become available, which reduces your support load to your customers (you'd otherwise have to provide the tools yourselves), and b) the spec may be extended by others when shortcomings are noticed. This is a benefit - you get R&D for free.
In practice neither effect is likely to be large unless you get lucky/unlucky. Your competitors will probably develop their own in-house specs tailored to their own needs anyways, and unless this is spectacularly useful, the Open Source and Free Software communities are unlikely to glom on to it to the extent required for free (beer) tools and an improved spec to appear.
What will determine whether management approves/disapproves this idea is a) whether they're optimistic about the OSS/FS community's ability to spontaneously produce tools, and b) how cagey they are about their "intellectual property". Most likely scenario: They'll see no benefit and some potential loss, and more importantly see a chunk of their IP hanging out there for the world to see. Project not approved.
But, IMO it's still worth a shot, as long as you state your justifications carefully and do your research.
Wasn't DTD banned in 1972 for causing Bad Things™?
A DTD is supposed to standardize data formatting, isn't it? Think less "copyleft" and more "standardized". This is one situation where the Artistic license makes sense, because it requires non standard versions to be labeled as such.
The Artistic license is so vague, though, that you might want to have your legal department draft something based on the BSD license, with a clause that hacked versions would have to be relicensed under a different name. That would give developers maximum freedom without compromising the standard. In other words, they could steal your code but they couldn't steal your brand name; similar to Red Hat.
A GPL'd DTD would compel other developers to release refinements, but it would do nothing to protect your brand. Brand theft would be far more damaging than code theft.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
No one ever made money on selling DTDs -- formats exist to be used, and keeping a "secret" DTD makes no sense as it can be easily reverse-engineered if someone really needed it. The only people who benefit from closed formats are ones that make their whole business model around selling software that implements them -- a model that is counterproductive for actual use of information.
Contrary to the popular belief, there indeed is no God.
Perhaps the GNU Free Documentation License at http://www.gnu.org/copyleft/fdl.html would suit your needs for a copyleft license.
What, me worry?
Now if we could get someone to show that closed formats cause a loss of money, then maybe my boss would actually consider open source.
If you are working on a new project use XML Schema rather than DTDs. DTDs are a hangover from the days of SGML and do not allow you much control on the content of your documents.
If you use XML Schema then you can specify exactly the format and content of your fields and validate the document much more precisely than just PCDATA / CDATA permits.
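As a rough illustration of that difference, here is a sketch under stated assumptions rather than real schema validation. A DTD can only declare an element such as a price as `(#PCDATA)`, meaning any text at all, whereas an XML Schema can type it as `xs:decimal`. The hand-written check below imitates that one constraint with a regular expression that approximates the decimal lexical form. The element names and the `price_is_decimal` helper are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical product record; element names are invented for this example.
DOC = """<product>
  <name>Widget</name>
  <price>12.50</price>
</product>"""

# Approximation of the xs:decimal lexical form: optional sign, digits,
# optional fractional part, no exponent.
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def price_is_decimal(xml_text: str) -> bool:
    """Return True if the document has a <price> child holding a decimal number.

    This is the kind of rule a schema type can state and a DTD cannot."""
    price = ET.fromstring(xml_text).find("price")
    if price is None or price.text is None:
        return False
    return DECIMAL.fullmatch(price.text.strip()) is not None


print(price_is_decimal(DOC))
print(price_is_decimal(DOC.replace("12.50", "call us")))
```

Both documents would pass a DTD that declares `price` as `#PCDATA`; only the first would satisfy a decimal-typed schema.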
Go and have a look at the W3C site before you commit yourself, it is an easy change at the start of a project but will be much harder later.
Description of XML schema can be found at http://www.w3.org/XML/Schema .
I've both used and written a number of DTDs, and releasing one under the GPL would really make no sense. The GPL freely allows anyone to modify your code, which is the last thing you want with a DTD. Since a DTD is a formal specification, you need to keep control over it. Ideally, once defined, it should never be allowed to change. If people can modify it as they like, then it becomes useless, since your XML documents may not conform to the modified versions.
If I were you, I would use something very similar to the DocBook copyright notice:
Copyright 1992-2000 HaL Computer Systems, Inc.,
O'Reilly & Associates, Inc., ArborText, Inc., Fujitsu Software
Corporation, Norman Walsh, and the Organization for the Advancement
of Structured Information Standards (OASIS).
$Id: docbookx.dtd,v 1.12 2000/08/27 15:15:26 nwalsh Exp $
Permission to use, copy, modify and distribute the DocBook XML DTD
and its accompanying documentation for any purpose and without fee
is hereby granted in perpetuity, provided that the above copyright
notice and this paragraph appear in all copies. The copyright
holders make no representation about the suitability of the DTD for
any purpose. It is provided "as is" without expressed or implied
warranty.
If you modify the DocBook DTD in any way, except for declaring and
referencing additional sets of general entities and declaring
additional notations, label your DTD as a variant of DocBook.
HH
How difficult would it be for a competitor to "steal" the DTD anyway? I mean, copy your ideas whilst renaming tags, restructuring the DTD a bit, and so on, till it wasn't provably derived from your DTD? The only point of you having a non-free license to defend your DTD is if this kind of defense might work. If your DTD would be easy to duplicate anyway, then you're not getting any security from a non-free license.
As to whether copylefting the DTD would help your company, I think the answer largely depends upon who you are, and your relationship with your suppliers. If you are having problems persuading your suppliers to use your DTD, then being able to point to the open license might help: "this is poised to become the standard". On the other hand, if all your suppliers are happy to use the DTD already, then you won't make any short-term gain. You might make long-term gain if future suppliers would be more willing to use a copylefted DTD; but that depends on what your industry's like and what kind of stance your suppliers are likely to take.
perl -e 'fork||print for split//,"hahahaha"'
On the other hand, don't expect the copyleft to protect your DTD. If anyone wants to use the data format in a proprietary application, well, they might not be able to use your DTD directly, but they can clone it and the result would probably not be considered a derivative work.
There are a few rights that we want to protect for the good of Free Software. We don't want API copyrights to be enforceable. We want to have the right to reverse-engineer for purposes of compatibility. We don't want to have a Microsoft come along and say "You can't make word processors that are Word-compatible, the file-format is copyrighted". Asserting the copyleft on a file format isn't compatible with this. However, a DTD isn't a file format, just its description. Thus, go ahead and copyleft your DTD, but be aware of the limitations.
Thanks
Bruce
Bruce Perens.
For starters, it wouldn't take a team of rocket scientists to clean-room clone the DTD to a level of functionality that'd satisfy most anyone.
For seconders, there are already a bijillion incompatible DTDs out there. The world doesn't need more.
And most importantly, requiring your suppliers and/or customers to conform to a closed-source DTD *COSTS THEM RESOURCES.* You shoot yourself in the foot when you do that: as soon as someone with a cheaper solution comes along, kiss your contract goodbye.
The best thing you can do is work *with* your competition to develop a *single* DTD that saves all your suppliers/customers money. Compete on the basis of service, of added-value, or something else that counts. Competing based on proprietary DTDs is just utterly stupid.
--
--
Don't like it? Respond with words, not karma.
Unless you want your data to be inaccessible to anyone else. What would be the point of a company declaring ``We're Open! We use XML!'' and then tying up the use of the data with some silly license attached to the DTD?
I'd love to see something big happen to XML. But then I had high hopes for EDI way back when. It turned into a total mess where every implementation was a custom job, so it was doomed to fall on its face, and far fewer companies wanted to take advantage of it. Each job was custom because no one could agree on things like what ``customer code'' meant. It is hard enough to get two divisions of the same company to agree on that, let alone two separate companies. Along comes XML, and it just might fall on its face for similar reasons.
--
CUR ALLOC 20195.....5804M
... is a DTD? It sure doesn't sound good. Maybe it's a digitally transmitted disease? Why not just call it a virus?
I can see how the fear of your work being stolen would give you pause for thought. There is every chance that a competitor will rip off your work, and even more worrying are those that would employ the E3 approach against you (Embrace, Extend, Extinguish). A while ago I gave a considerable amount of thought to protecting myself from competitors, and I gradually came to realize that a reasonably effective defense is a release programme. Basically, you use software life cycle development techniques to control the release of your DTDs into the public domain. Unfortunately this does mean a lot of extra work in the setup, but it will pay dividends over the long term. Hope this helps and good luck with the project.
I guess a DTD partially defines a document interface, say a document object model (not fully though).
...
But then, it does not seem to catch on. This whole W3C XML1.0v4 thing seems to evolve into some kind of niche, in which some people want to play and many more others don't. We've all looked at it, and are not really impressed. I would dare to say: It fails to become "hot".
And then we've got those little idiosyncratic languages like XSLT and stuff, that aren't very impressive either. I'll stick with Java, if I need to convert complex data structures.
I'm rather convinced that releasing DTDs won't convince that many extra people to play the DTD game.
You could release instead some novel martian poetry in the wild and hope that people will read it
Well if it's released under a free license they can't really 'steal' it -- it's already theirs.
Andy Armstrong
I can't believe that DTDs are a serious subject on Slashdot. Next we'll be arguing about fonts.
This sucks.
And that proves what? The level of progress that these pre-Scots ever got to was probably nowhere close to the level of sophistication that the Romans had. Also, note that the country of England wasn't founded until over a thousand years after they attempted to take the land. They met with stiff resistance (see Hadrian's Wall).
I'm currently engaged in some XML efforts where I work. The hard part is not the XML. Almost all database engines can generate XML wrappers for data objects based on their schema or generate data objects from XML streams.
The difficulty comes from getting two sets of people to agree on what the object definitions are or are going to be. That requires collaboration and cooperation, two things that are not going to come from any software effort.
All software developers tend to treat the invention of fire as their exclusive intellectual property, and you can eat your meat cold and bloody or pay them for the privilege of cooking your steak.
The effort will have to come from consortia of clients and related firms who use data processing but aren't in that business.
That said, yes, you can publish the DTD specifications arrived at by the consortia, and they will be adequately covered by the document copyright.
Though I think that using copyleft would allow you to avoid stupidity like the RAMBUS debacle.
Newton said, "I see far because I stand on the shoulders of giants." Linus Torvalds, RMS, et alia are giants. Bill Gates is a big dip in the level playing field. Emulate Linus and you stand a chance. Emulate Bill and your effort will degenerate into a pack of wild dogs tearing at a haunch.
MSBPodcast.com The opinions expressed here are my own. If you don't like 'em... Think up your own stuff.
If you're asking questions like this, your company probably isn't ready for GPL'ing your work. Here's the deal: if your competitors use (umm, that'd be 'steal' in your parlance) your DTD, their suppliers/vendors/partners will likely do so too. How many of these suppliers/vendors/partners do you have in common with your competitors? I think you can smell what I stepped in here...
The whole purpose behind GPL'ing something should be to encourage/enable its use (spare me the ethical/moral lectures please, I'm talking in practical terms here) by others, be they friend or foe. In the long run, the more companies that use your DTD, the fewer you'll have to write custom code for.
Think outside the... Hey, where'd the friggin' box go?
If anyone is a bit unsure of what a DTD is, you may be interested to see how Slashdot (and the Slash code in general) use XML and DTDs.
Slashdot (again, Slash if it's set up to) produces all headlines in a convenient, machine-readable format. It can be found at www.slashdot.org/slashdot.xml .
At the same time, the DTD for this file (called 'Backslash' and can be found at www.slashdot.org/backslash.dtd) essentially describes to an XML parser what is and what is not allowed in the file. It essentially defines what constitutes a "valid" document; a document is valid when, compared against the DTD, it conforms to the definition.
"Well-formed" is another XML term, which means the document is at least formatted correctly according to the XML definition (for example, empty-element tags end in a forward slash, as in <br/>).
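The distinction between well-formed and valid can be sketched in a few lines. This is only an illustration: the content model below is invented and is not the real Backslash DTD, and the hand-written `is_valid` check stands in for a validating parser (it checks which children are allowed, not their order or count).

```python
import xml.etree.ElementTree as ET

# Hypothetical content model, standing in for what a DTD would declare.
# It is NOT the actual backslash.dtd.
ALLOWED_CHILDREN = {
    "backslash": {"story"},
    "story": {"title", "url"},
    "title": set(),
    "url": set(),
}


def is_well_formed(xml_text: str) -> bool:
    """True if the text parses as XML at all (tags nest and close properly)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def is_valid(xml_text: str) -> bool:
    """True if the text is well-formed and every element appears only
    where the content model above allows it."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    if root.tag != "backslash":
        return False
    stack = [root]
    while stack:
        elem = stack.pop()
        for child in elem:
            if child.tag not in ALLOWED_CHILDREN[elem.tag]:
                return False
            stack.append(child)
    return True


GOOD = "<backslash><story><title>Hi</title><url>http://example.org/</url></story></backslash>"
NOT_VALID = "<backslash><story><author>someone</author></story></backslash>"
BROKEN = "<backslash><story></backslash>"

for doc in (GOOD, NOT_VALID, BROKEN):
    print(is_well_formed(doc), is_valid(doc))
```

The second document is well-formed but not valid against this made-up model; the third is not even well-formed, so no DTD could accept it.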
If you're interested in learning about XML and this DTD stuff, as well as all the latest proposals that are meant to replace DTDs (such as XML Schemas), check out the official W3C site at www.w3.org/XML/.
Pete
As many others have already pointed out, try to make yourself known that you are the ones who got this standard out.
However, you have to be careful that the standard you are pushing is a very good one. Once you're sure, go out and make yourselves heard. Coz if it turns out not-as-good-as-you-thought, get ready for some shit big time.
I've had a very similar experience last year in my old job. My project was to export an in-house VHDL compiler's internal data structures onto XML and then visualise them in HTML (as an application for the format).
What we did (after we'd finished it) was sit down and write a paper on it, submitted it to a conference, it got accepted and there we were getting ourselves some +ve publicity. So, that's definitely something to keep in mind.
Now, having gone through this research, here are a few hints for later down the road:
Trian
(off to get some rest and a beer coz I've spent a few hours too many in front of this machine)
I'm no longer fed up with MS Windows: I got rid of them
I went to a M$ XML conference a month or so ago (hey, work paid for it and I got a semi-decent XML book for free too), and what they spent the most time touting was their new BizTalk setup. Basically, there's a repository at biztalk.org or somesuch, where biztalk schemas live (they wrote their own DTD for the schemas, of course). Various companies are supposed to post their schemas here, and eventually any given industry is supposed to be able to develop a single schema. In the meantime, M$ provides the tools necessary (using "The Amazing msxml3.dll") to make conversion from format to format "easy".
There's nothing here that can't be done open-source, as far as I know. And even if you don't want to go BizTalk right away, definitely consider the implications of what they've done before going and implementing anything.
It must be said that every once in a while M$ does something actually kinda innovative. They are doing a lot of cool stuff with XML that no one else is doing, so you have to give them credit for that much. Their OS's, on the other hand...
GStreamer - The only way to stream!
There's a good chance that a DTD wouldn't be subject to copyright.
Copyright protects the expression of ideas, and not ideas themselves (that's what patents are for). There's a copyright law concept called "the merger doctrine" that says (more or less) that you cannot copyright a work that represents the only possible expression of an idea -- to do so would result in copyright protecting the idea along with its expression, and that's beyond a copyright's power. The idea and its only expression are said to merge, and thereby fall out of the scope of copyright protection.
(The case that set this idea out was Baker v. Selden, which was decided in 1879 and had to do with a book of accounting forms -- the expression of the form was its idea, and as a result people were free to copy the layout of the form.)
This is the reason right-thinking people believe that APIs cannot be copyrighted -- by definition, the API is the only accurate expression of the idea represented by the interface, and the merger doctrine applies.
A DTD would likely be subject to the same reasoning.
XML only has a real use in b2b if people know how your XML is structured. If they don't, then you are stuck with your documents, because no one can send you an XML document that you can understand.
;)
Besides, publishing your DTD will give you positive feedback. It doesn't really matter if the competition tries to start using your XML structure. You are the ones who already have all their systems developed to support that kind of XML document. In fact, if your competitors also start trying to use your DTD, then more and more customers will start to use it. But your competitors will start from zero, while you have the knowledge, the prestige from having invented it, the actual systems already working, etc.
However, if you are planning to do this, don't use a DTD; DTDs are probably sentenced to a quick death. XML Schema definitions are better, not only because they allow you to describe your XML document more precisely, but because they are XML themselves.
Actually, if you can convince your customers to send you their data according to your DTD, then you are already doing a good thing in bringing all those people to the beautiful world of XML.
Santiago
As I understand it, a DTD is an interface. As such, it should be completely unrestricted. No proprietary licenses, no copyleft, just plain old BSD, MIT or like license.
I believe that copyright law says that you cannot prevent anyone from using an interface. Any license that restricts access to the interface is taking *away* a right that the user already possesses. This is a pretty big step for copyleft to take, and I don't know that it is legally valid without an end user license.
Another option would be a "weak" copyleft, that guarantees access to the original DTD, but does not restrict any software that uses the DTD. Sort of an LGPL for DTDs. I know you guys want a world where the people you don't like don't exist, but you twist the meaning of "freedom" beyond recognition when you dictate the license that other people's XML documents must be under. (I'm not leveling this solely at the copyleft community, but also at the commercial firms that do the same with proprietary licenses).
A Government Is a Body of People, Usually Notably Ungoverned
I would show you an example of a huge loss that was mostly wasted effort: reimplementing things multiple times and extending the format/protocol in ways that it was not able to accommodate, because no one else designed their systems to be compatible with it. But the problem is... the format is still closed, so how can I publish anything about it?
Contrary to the popular belief, there indeed is no God.
Under certain circumstances you can use a DTD to create an XML document, and then send someone the document without the DTD, because XML may still work without it.
XML doc control is different from open-source programming in that a DTD is not a program, it's more like a config file, and there is no point in keeping it secret.
If your DTD is a useful tool, it makes sense to allow others to use it; if it's really useful it may even become a de facto standard.
But there are so many useful DTDs out there already that creating your own should only be done after a document analysis has demonstrated that none of the existing ones will fit the bill.
///Peter
--
Either way, if I were you, I'd learn Schemas. They have just become a Candidate Recommendation and are much more powerful in what they can define and do. And, more importantly, Schemas are XML files themselves so they can be transformed by XSLT if need be, or they can be parsed and processed, for instance, to provide the contents in a drop down list.
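Because a schema is itself an XML file, any XML parser can read it. As a small sketch of the drop-down idea, the code below pulls the allowed values out of an enumerated type. The schema fragment and the `enumeration_values` helper are made up for illustration, and the namespace used is the one from the final W3C Recommendation, which may differ from the Candidate Recommendation drafts mentioned above.

```python
import xml.etree.ElementTree as ET

# Namespace of the final XML Schema Recommendation, in ElementTree's {uri}tag form.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment; the type name and values are invented.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="currency">
    <xs:restriction base="xs:string">
      <xs:enumeration value="EUR"/>
      <xs:enumeration value="USD"/>
      <xs:enumeration value="GBP"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""


def enumeration_values(schema_text: str, type_name: str) -> list:
    """Return the enumerated values of the named simple type, in document order.

    Returns an empty list if the type is not found."""
    root = ET.fromstring(schema_text)
    for simple_type in root.iter(XS + "simpleType"):
        if simple_type.get("name") == type_name:
            return [e.get("value") for e in simple_type.iter(XS + "enumeration")]
    return []


# These values could feed the options of a drop-down list in a data entry form.
print(enumeration_values(SCHEMA, "currency"))
```

Doing the same with a DTD would need a separate DTD parser, since DTD syntax is not XML.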
This whole question is irrelevant.
First, you should ask, can you COPYRIGHT a DTD?
Answer: NO
Then the copyleft or GPL issue never comes up.