Open Source License For Databases?
Myddrin asks: "Recently there has been a lot of discussion of databases, and who owns them. The US either is considering or has passed a law saying a database (and the info contained therein) is owned by the creating person/company. [I honestly can't remember.] At any rate, this got me thinking of the (possible) need for a Database GPL (DGPL). Basically the same as the LGPL, but adding that the database host (i.e. the owner of the server hosting the specific instance of the db) can put restrictions on access, allowing them to offset the cost of hosting the machine (administration, i'net connection, etc)." Any data in a database is content, just like information on a web page. Maybe an Open Content License might be a better idea? Thoughts?
"...Examples of acceptable restrictions would be:
- any program accessing this database must display the advert. provided,
- a cost of $.000000001 per record returned
- a nominal monthly subscription fee...
Is there a license that allows this kind of thing, or should I be working on one?"
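For a sense of scale, the per-record fee in the example above can be worked through with a minimal sketch. Only the $.000000001 figure comes from the post; the function names and the sample query size are made up for illustration.

```python
from decimal import Decimal

# Per-record fee quoted in the post: $.000000001 per record returned.
PER_RECORD_FEE = Decimal("0.000000001")


def query_cost(records_returned: int) -> Decimal:
    """Dollar cost of a query that returns the given number of records."""
    return PER_RECORD_FEE * records_returned


def records_for_one_cent() -> int:
    """Number of records that must be returned before the fee reaches one cent."""
    return int(Decimal("0.01") / PER_RECORD_FEE)


if __name__ == "__main__":
    # Hypothetical query size, chosen only to show the arithmetic.
    print(query_cost(1_000_000))
    print(records_for_one_cent())
```

At that rate a host would have to return ten million records to earn a single cent, so the figure reads more like a placeholder for "some metered fee" than a workable price.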
With all the crap we've seen on NSI's Whois database, I'd say this is a damn good idea - why shouldn't something created by the public (yes, all of our registrations created this database!) be owned by the public?
Or more like a restriction. Our personal information that is already floating around can not be resold but merely modified with changes? :)
Ok.. but at least a legal way to prevent us from being in public or sellable databases. I'm so tired of getting calls from phone-spammers.
"And how can this be? For he is the
They can't have it both ways.
If such a law is passed, this means that anybody who creates a database of song lyrics owns it!
If copyright law interferes with this, then I'm going to copyright my personal information.
It sounds like this might be a good idea. Anything we as the open source community can do to keep the government and control out of our business is generally a positive step. It does bring to light some rather interesting questions about public domain databases. Even private ones under a GPL (DGPL) would make an interesting study. Having worked for a company that did database work for Linux, it would certainly present an interesting case as to how we would charge for our databases.
Windows is going the way of phlogiston...
Another license?
Is it needed? Isn't that what copyright is for?
---
120
chars is barely sufficient
Hand me that airplane glue and I'll tell you another story.
I cannot imagine the FSF would sanction a license (at least I'm assuming you would want DGPL to be sanctioned by the FSF, based on the suggested name) that would require advertisement. Although, in the web-context, I suppose advertisements are the closest thing to a common currency. I still think that'd be the real sticking point, though.
Christopher A. Bohn
cb
Oooh! What does this button do!?
IMHO, it would be great to have a generic "copyleft" scheme, which covered everything, but for now, something for each of the significant special cases (eg: code, documentation, art, databases, etc.) is a good start.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
--
The Free Software Foundation (http://www.gnu.org/) has been working up a license to cover documentation. Not exactly the same as what's being discussed here, but maybe close, if you think that information is information is information. Perhaps with some minor changes it would do the job, or a similar variant could be derived.
This is a work in progress (correct me if I'm wrong). At least I don't yet see it on the Free Software Foundation's license list (http://www.gnu.org/philosophy/license-list.html)
I'm sure the authors would have more appropriate input than myself. Just my two cents.
--Lawrence Lessig for Congress!
In my experience (as someone who has set up databases and interfaces for commercial ventures), the major concern of for-profit database owners is not that they won't be able to make money off the database (even if it's just ad revenues) but that someone will be able to grab all their information and resell it better than they can. I'd imagine that the major concern of not-for-profit database owners/creators is that someone will fragment the database through irregular mirroring.
The concerns of for-profit database owners are not paramount to a DGPL, but the copying/mirroring of data should still be the focus. To that end, the DGPL should address both dynamic and static databases and give owners as much reason to use this license as the LGPL does.
A few comments:
I would like to see licenses concentrating on the data (content) rather than the whole database (the collection of data) - that would let you modify it, resell it, etc. -- much like the US census data or USGS geographic datasets.
This question is very interesting, especially for geographic data (for GIS -- Geographic Information Systems). The situation in the US is like a dream, where all the USGS data is distributed without any tough restrictions (a BSD-ish license for data). The datasets are very expensive to create and a valuable asset.
In comparison, the situation in most of Europe (for example the UK or Sweden, where I'm from) is that the mapping agencies are recovering most of the costs associated with creating digital geographic datasets. They are incredibly expensive!!! Thus the use of GIS is much more restricted (as is development in the field) in this part of the world.
Another interesting point is the license under which NASA distributes the new Landsat 7 digital imagery. The images are a lot cheaper than before (a few hundred $$$) and the license is 100% non-restricted (even here a BSD-ish license). In comparison, earlier Landsats and the current competitors are an order of magnitude more expensive, and in most cases they require you to license the USE of the data, not the 'ownership' of the data. That way you had to buy one license to use a satellite image for education/classes and another license to use the same image for an analysis... Landsat is run by the US government, so it is you taxpayers that are paying for this give-away (they are obviously not recovering all the costs of the operation).
Nowadays there are people that have bought (and used) the new Landsat images and are making them available for download (for free!). Of course this is under great debate (imagine the competitors to Landsat).
So... More talk about data and databases!
// Fraxinus
>If copyright law interferes with this, then I'm
>going to copyright my personal information.
Hmm, that's actually a fairly interesting idea. Could this be done to thwart having your phone number resold, etc.? Copyright your name, phone number, and address and then sue people who sell it for infringement? Hmm. The idea that personal information has value already seems to be established, so it seems logical that the person whose information it is should be considered the owner. If companies have to pay ME for my phone number/mailing address, I can set the price high so it's not worth the effort for them to spam me with advertisements.
I'm working to create a text-based musical notation copy of a music book that has fallen into the public domain. I was wondering what license I could release it under; technically, the files are information, not program logic. The music format is human-readable, but also parsable by graphical notation programs. Hmm, any suggestions?
I would say that whoever creates the database owns it; but what do they own? If the database contains personal information surely the individual records "belong" to whoever they concern?
Possibly creating a database means you own the right to control access to it; and to assign control over the data in it.
Anyone know if the Data Protection Act in the UK says anything about ownership of databases and the data within them? I know it says that you have the right to view any personal data held on file (either physical file or stored in a computer system) and something about being able to correct that data if it's incorrect.
Yeah, I had a sig once; I got bored of it.
The problem is that it can be hard work to research and compile these facts even if the result has no originality. I think we believe that people should be able to obtain benefit from their work. Database protection schemes try to create a copyright-like right against the substantial extraction and reuse of facts from a database. Thus, someone who contributes to a publicly licensed database wants to be sure he can access the additions of others in the future in payment for his work (rather than the corporate-generate-cashflow model for benefit.)
Licenses are important to accomplish that right to later access because they can work even where you don't have a 'right' to copyright. Thus, if I license a CD to you with all the phone numbers in the U.S., I can license it to you as long as you don't put it where multiple people can use it. After all, fair is fair, we have a contract, and I am just making sure I can sell my work to other people, and not have you, my customer, becoming my competitor just for having bought my product once.
A public license on a database would really only be useful if databases DERIVED from the original had to be made available for copying. Consider a list of all the music CDs ever made. It has to be updated, since new product comes out all the time. Can someone go into the business of providing these databases by taking the old, updating it, and calling the new database proprietary? Not if you have a public license. (All of this assumes that shrinkwrap or clickwrap licenses are good. They aren't in many countries.)
As long as the resultant database is available to be copied, in whole, then the charge for accessing the server, whether to take the whole thing at once, or one record at a time ought to just fall under a reasonable distribution charge. Heck, the record-by-record access might as well be charged at any rate the provider wants since they are providing interface as well as content. If someone wants to roll their own, let them download the database.
I think a public database license would be a good thing because it will allow public databases to grow and be distributed in a fair way when database protection laws are passed.
I did a report for a tech-english class last semester. It ended up being about ownership and the Internet, most specifically who it is that owns the whole shebang. Not an easy project, and I did not end up finding what I thought I would find when I first started. The paper overall ended up being one on copyrights. So I'll say the same thing that I ended up saying in that paper.
You cannot treat the digital world the same as the print world.
It just cannot be done. Everybody that reads slashdot with any frequency knows the lunacy of walking down that path. So let me take that argument and apply it here.
You cannot treat an online database the same as one you might have as a hardcopy database (read: proprietary, closed, or a Rolodex on a desk) in an office. You cannot charge access to it in the same manner. You cannot oversee the users in the same manner. And most importantly, you cannot expect people to value the data that is stored therein the same.
With that said, how can anybody expect to make a profit by putting such a beast online? I have two thoughts.
#1: Do as the search engines do. Find some other way to profit. I have no idea what product Yahoo makes, but for some reason people invest in it, and somebody, somewhere is making money. It has been done once, and it can be done again.
#2: Do it ebay style. Auction the info off. Highest bidder gets the ability to negotiate a use license. No cost to find out if it exists, just a cost to read it. The more people demand rare info, the higher the price goes up.
Anybody else got a suggestion?
The fact that you want to enforce restrictions on the use of your database and its data (probably for valid commercial reasons) makes your ideal license closer to a typical commercial database license than to the (L)GPL licenses, which have no restrictions except the requirement, paraphrased liberally here, for everyone to distribute the content freely under the same license.
Plagiarism has a long history, and I saw several students get the boot from the University where I went to school for violating University guidelines.
:-)
If you are going to do new work on a previously examined topic, you must cite your sources, have a variety of sources cited, and NOT provide a sense that the owners of the cited work have been plagiarized.
For example, I can write a book about "Snoop Doggy Dogg", provide about 100 citations (books, webpages, mag. articles, TV/Radio programs), provide my condensed "personal take" on the rapper, and publish. That's legal; it's the foundation of all new work -- deriving from the old.
But when I cross the line (doing a rehash of an existing SDD book), and call that work my own with no citations, or with a "sense" of plagiarism, I open myself up to legal trouble.
I think the "fair use" rules, as they apply to books, will eventually dominate this issue. People using data from webpages WILL have to cite their sources, use a variety of sources, and verbatim copiers will be penalized/threatened, etc.
What am I missing here? This just sounds like another failure of the legislative process to provide sane solutions to a fairly simple, well-known problem. Is this just a scheme to provide incompetent lawyers with phat salaries for years to come?
I see no fundamental difference between pages on the web and pages in the library. They both convey information to the observer in virtually the same manner. The earliest animations were just flipping paper pages anyway.
New Year's Rocked. Love you all
Treatment, not tyranny. End the drug war and free our American POWs.
See my user info for links.
There already is a license for open content. Check out www.opencontent.org.
--The knowledge that you are an idiot, is what distinguishes you from one.
I hope RMS updates the GPL to deal with this issue more specifically soon....
Female Prison Rape in NY
WHO CARES!!?
I sure as hell do not.
I don't think the implications of this law are completely understood. Current copyright law, through various legal precedents, grants copyright protection to the format of a collection of data. It also applies only to the exact organization, and only if that organization is not obvious.
The classic example is a phone book. A phone book is a collection of data, i.e. names, phone numbers, and addresses, organized in alphabetical order. As it turns out, under current copyright law this has minimal protection. Alphabetical ordering is obvious, and the rest of the directory is information which by law is public domain and not protected by copyright.
A law protecting databases and their content could easily extend to a copyright on information. Basically, a database should be covered just like a phone book. Any content in the database would be owned by the creator of that content, but any information would have to continue to be public domain.
Basically, this means that the databases of internet search engines can be extracted and reorganized into a new database, simply because URL and page titles are information and therefore are not and should not be protected.
Dastardly
P.S. Arguably a page title could be considered the property of the creator of the original, but the URL is really public domain information and not protected by copyright.
The easy way around this, of course, is to change some bit of the information. Go from "Apt. 202" to "Suite 202". But this also works both ways -- they could change some irrelevant thing like your middle initial, and presto-changeo: a new entry, a new copyright.
Since you probably wouldn't be able to copyright your address or phone number (e-mail address? Maybe...), it should be relatively easy for a marketer to take a valid entry and make all of the common permutations (Street, Lane, Drive, etc; All middle initials; Ann, Anne, Annie), then copyright the whole schmeer. They'd have to be cross-referenced, but they'd probably be able to brute-force some combination that you hadn't thought of.
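The brute-force idea above can be sketched in a few lines of Python. Everything here is hypothetical: the function name, the surname, and the address are invented, and the variant lists simply reuse the examples given in the comment.

```python
from itertools import product
import string


def entry_variants(first_names, surname, house_number, street, suffixes,
                   initials=string.ascii_uppercase):
    """Yield one directory entry per combination of name, initial, and suffix."""
    for first, initial, suffix in product(first_names, initials, suffixes):
        yield f"{first} {initial}. {surname}, {house_number} {street} {suffix}"


# Hypothetical entry: three spellings of the first name, all 26 middle
# initials, and three street suffixes.
variants = list(entry_variants(
    ["Ann", "Anne", "Annie"], "Example", 12, "Elm", ["Street", "Lane", "Drive"]
))
print(len(variants))  # 3 * 26 * 3 = 234
```

Even this tiny set of substitutions produces 234 distinct entries for one person, which is the commenter's point: enumerating variants is cheap for whoever holds the list.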
--------------------
Earth first? Oooh, and I was thinking of paying the rent.
yes! scalded balls!
6 pairs please.
A legitimate function is to provide better access to existing data, e.g., by indexing it and providing various views (both data-wise and GUI-wise). But again, it is easy to compete in this, so some will try to fence in the raw data for their exclusive use, and lobby/plead/wheedle for contracts that provide exclusivity. These ploys should not just be resisted whack-a-mole style, they should be eliminated as a species.
Where a database is created by public contributions (e.g., slashdot, or amazon book customer comments, or newsgroups, etc.) the default assumption should be that the creator of the comment owns it and is offering it into the public domain for non-exclusive presentation. Other arrangements can of course be made by contract agreements. There is already a lot of precedent in look-and-feel aspects of presentation, but any copyrights or (ugh) patents there should not be able to restrict free flow of the original contributor's public offering.
Where the database is created by us (the government), the same applies, but data-restricting exclusivity clauses in contracts should have to be explicit, and justified individually with respect to high principles (e.g. Constitution), and not just slipped in as standard contractual boilerplate.
Privacy laws must take precedence, but must not become a vehicle for attempts at exclusivity having nothing to do with privacy.
This has a lot to do with who does own a database... If I go out, measure the rainfall over a period of a year at 10 different places, and then put that into a database, it's mine. I don't think anyone but mother nature can contest that (unless I put it in an Access database, then MS might contend ;)).
But if I go and put all the information I know about everyone I know into a database, who does the database belong to? Can I go and sell the information? The 1991 Privacy Act in New Zealand says that if I am a company, and I collect information about ppl, one of the things I must do is allow ppl access to view/modify their record. (Within reason, ppl can't demand to modify their bank balance ;)). I also must state what I plan to do with the information, including whether I plan to sell it. IANAL, but I don't think it prohibits me from selling it to anyone I want.
There's a good reason for this: our electoral rolls (lists of ppl who are enrolled to vote, names, addresses, etc) are available for purchase. (Incidentally, in order to have my record unavailable, I have to have a "good" reason, e.g. I'm being stalked and I have a restraining order, etc. I can't opt out of it just because I want to.)
This means that my database of your personal habits I noticed is mine. And I can do with it what I want. (Note: there is an option for various personal defamation laws here if I say false things.)
Now that that's settled, what DO I want to do with my database of your habits? Well, I believe in free speech, my programs are GPL, so I want to make it free.
I will license my database under a "free" license. This license is NOT designed to allow ppl to make money off of my database, so the same rights must be transmitted to the user of the database. So, the license must allow a user to "copy" the database one record at a time if they like.
Now, the big thing, cost. Simple, same as the GPL, a distribution fee. ie you can charge a reasonable fee for the distribution of the database in whole to the user.
Ahh, but what about accessing records, e.g. via a web database, or phone, whatever? That's fine, you can charge me whatever; that is outside the scope of the license. But what is in scope is that you MUST offer the entire database for a reasonable cost.
"What!" you cry, "This is no good for me". Fine, then don't use the license, if you want to make money out of something, why are you trying to use a "Free" license?
The point of the matter is, a "free" database license should not be oriented toward making money. I don't earn a cent from the GPL programs I write. If I wanted to, I could, I'd just use a different license. But I don't, and I want my database of your personal habits to be free as well.
The minute you try and work out how a company can still make money with this license, you defeat the purpose of it. As I said, you can offer access to the database for whatever price you want, but you must offer the entire database for a reasonable price too. RedHat makes their money by basically selling pretty boxes and support.
Stop trying to work out how you can make money out of database, and start working out how you can make it available for all.
I use to have a funny sig, but slash cut it off, and I forgot what the punchline was.
Ah! The Joy of it. We will all live in a Brave New OSS utopia as described by ESR. It's NOT communism, it's GMoney! Worthless, just like Communism.
With GMoney, there is a Free Lunch! Just don't expect too much from $0.
One day, in the near future, the OSS Communist Rules (ESR and Co) will find that the real world needs to generate real money to survive. Making everything free will destroy that which provides the free lunch you are now enjoying.
Public discussions are a valuable problem solving tool, as well as politically important. Is anybody keeping archives besides the commercial entities? (DejaNews, Remarq, etc.) I've seen a few private archives of individual groups, but nothing like the big guys. How big is a compressed daily usenet feed? (minus binaries)
Perhaps I'm not catching the drift here, but I don't think I'd like my credit report to be freely available and modifiable to everyone on the net....
I have no problem with a company databasifying public data and charging for their compilation. Don't like it? Buy a different compilation from a competitor, or get the raw data and databaseify it yourself.
On the other hand, I have a BIG problem with a company and a government agency cutting a sweetheart deal such that only that ONE company gets to databaseify and sell that agency's public records. (This has happened with both the US Patent Office and the Library of Congress card catalog, though I'm not sure if either exclusive deal is still in effect.)
Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
It is my opinion that databases in and of themselves should not be copyrightable. This is different from the copyright status of data residing within it. That is to say, slashdot comments are owned by their respective authors, but the collection of articles should belong to nobody. If a person can get permission from each comment author, why should reproduction ever be a violation of copyright? I feel that it shouldn't.
To illustrate, although not strictly a database, I'm reminded of a certain site with a collection of links to movie scripts. The owner complained that others had built similar collections. It's almost as if he wanted to have exclusive rights to a particular collection of links. I find this unacceptable.
But this so-called "real" money that you are talking about has no inherent value except that it is backed up by the good faith of the federal govt. I'll take faith in the open-source community over that of the federal government any day!!
The whole foundation of the GPL is based on the assumption that copyrights are inherently bad; we would be much better off not letting databases be copyrighted to begin with than having them copyrighted and then trying to pick up the mess with yet another license.
The http://www.useit.com/ site has the articles of J. Nielsen, the user interface haranguer.
He talks about micro charges for content access; fractions of a cent per page view, similar to this database issue.
There are other considerations for some types of databases, mostly privacy related.
I live at:
Andrew Dvorak
555 Maple Drive
Yourtown, USA
The above is copyrighted by me, Andrew Dvorak. Any use whatsoever constitutes a violation of copyright laws. Should you have any information like such in your database, you are using it illegally, as I own the rights to it.
Look at the IMDB as an example.
It's not "open" in the facet that it can't be repackaged or repurposed, but is "open" as far as using it to obtain a wide variety of well-organized and searchable information. Updates, servers, and bandwidth are paid for with mass exposure (advertising).
Databases are interesting things. I mainly work with radio/tv station db's. It has been determined that the average cost for obtaining a name/address/phone is roughly $7. Appending interesting information costs more money, as do yearly NCOA (National Change of Address) updates, so databases can be expensive, or I should say, used to be. The Internet has changed things significantly (as if you didn't know). Acquisition has dropped (for us) to about $.10 a name.
Large databases used to (15-20 yrs. ago) require millions of dollars of heavy iron; now a moderately equipped small company can do serious modeling/profiling and apply it effectively (BTW, this is another reason CS majors are pulling heavy $).
I don't really see how the OS model fits this, unless you're talking about DB tools. OS development isn't like DB development; collecting/organizing data is different from coding compilers and desktop environments.
centscents
+&x
If we apply the same logic to the Gimp, or for that matter any software program, any graphics created with the Gimp would be subject to the GPL. Or much worse, any WordPerfect document created with WP would be subject to Corel's particular license.
In no way do I agree that data in a database can be owned by the person who compiled it. If MY personal data is included in any database, I believe I'm entitled to a piece of the pie. If you didn't create that data, how can you possibly own it? THINGS ARE GETTING OUT OF HAND!
There may or may not be an answer but a licensing scheme such as this will make things worse.
The DGPL (database GPL) suggestion is orthogonal to, or counter to, the GPL -- I can't decide which.
First off, the GPL and LGPL do not prevent you from charging a distribution fee, limiting access, or associating advertising with downloads from your site. You can put advertisements on the same web page from which you distribute or mirror GPL'd code. You can limit access, or even charge for access.
However, the GPL also explicitly allows (and requires you to allow) people who download the code to set up their own sites to mirror the code, with or without access restrictions, payment, or ads. If I had to pay a penny per page to use your database but I could get it for free from Bob's House of GPL'd Databases, where do you think I'd go?
In short, the DGPL does not suggest the same solution, or even try to solve quite the same problem, as the GPL. To charge per-search fees or tie ads to searches would require that you make content (i.e., search results and the database itself) proprietary, which runs very much counter to the GPL. Many schemes for making content proprietary exist, including not putting any explicit copyright on it at all; the GPL is not such a scheme. Calling suggestions like enforced per-search fees or advertising tie-ins the "DGPL" is misleading.
The closest analogue to the GPL in the world of "content" is probably the Open Content license. Let's stick with that.
People should own data that is about them, no matter who collects it. Failing that, they should have unrestricted access and use of data that is about them. It's time to value human rights over corporate rights. This will not cause the collapse of civilization as we know it; it will mean we can all breathe a little easier.
More poets in here?
If it's haikus that you want
I have got plenty.
(Here are some...)
Morning smiles upon
Post-2K community
The gods let us live.
Redmond upon us
The bloatware makes me shiver
I fear Win2K.
Linux is not bad
Free if time has no value
Should be preinstalled.
Pheer the cracker kid
Chats on AOL all night
He is true 31337.
See the Redmond Beast
Its vapourware is worthless
One more promise.
C++ sucks ass
Although I need my paycheck
I wish Bjarne was dead.
To the editors: your English is as bad as your Perl. Please go back to grade school.
I see several forms of data ownership.
1) Data that you have created yourself (e.g. a MIDI file of original work)
2) Personal information - information that is inherently yours (SSN, name, bank records, medical records, court-sealed records)
3) Data that is collected that cost time / $$
4) Public information that is general knowledge or placed in the public domain
Newspapers currently charge for archive article searches and rights to use/republish original work (rule 1)
Private information should be kept private and confidential. This information should only be divulged to a third party with the written permission of the person or by a court order. (rule 2)
Dejanews does not own the articles it gathers from the newsgroups. They defray the cost of accessing it by advertising. (rule 3)
If a publisher produced a book of public domain programs, you could not copy the book; however, you could re-type the program(s) into your computer and use them any way you wanted. They are charging you for the book and the effort it took to produce, and not the content (programs). (rule 4)
And that is my 2 cents worth for now.
make Linux, not Microsoft. sin(beast) = -0.809016994374947424102293417182819
To me, a database is just as much a protectable entity as a book in that it is a particular collection of information in a particular fixed form. The data itself is no more copyrightable than are the words in a book. But the collection is unique. Can it be duplicated? Sure. But an out and out copy is a copy of someone else's work and it is up to them what limits they wish to impose on that. The flip side of open licensing is respect for people's licensing decisions, after all.
Of course this is a special case of my general desire for API/open-protocol access to all databases. Why should I have to use Amazon's web-based interface to buy a book from them?
Right now the problem is both human and technical. XML will address some of the technical issues. (I'm working on some others.) The political and commercial issues will be tricky.
The regular GPL already handles this, as a database compilation is already copyrighted. The problem is that releasing a database under the GPL still allows someone to download the entire database, modify it, and then use it privately (IANAL, but I think this is true). That makes it quite prohibitive to release many types of databases under this licence (essentially making it the same as the BSD licence). In any case, a DGPL would probably need to have more restrictions, not fewer as you suggest. I'd love to see databases released under the GPL; they would be perfect candidates, as they evolve so quickly and there's always more data to be added. Imagine a GPLed movie database with every movie and actor in it, that anyone could use as a backend to her website, but if someone adds entries to the database and then "distributes" that by incorporating it into her search scripts, the person would have to release the entire modified database. The only restriction I would want to add to the GPL is that one must release the database in raw format if she modifies it and then makes it available through a web site. IANAL, so maybe this is already true, but it would still be nice to spell that out.
ok then your [sic] infringing on my copyright! Could you as [sic] me next time before STEALING my comments for your own?
What we should be doing is databasing everything. We could render the law ineffective by databasing everything that we can find and putting it under a license that allows free use by any person or organization who has never sued another person or organization for using their database. It might be a good idea to do something similar with patents.
I'm in a very awkward position on this issue, and would like to figure out just where I stand!
I run a web site called hockeydb.com. On the site I archive historical hockey statistics. Not just NHL, minor leagues too -- you can look up any pro hockey player ever on the site. I hate to sound like I'm bragging, but there is nothing else like it that I'm aware of for any sport.
I compiled nearly all the data myself by searching out and purchasing many volumes of books with hockey statistics in them. It's not as easy as it sounds -- there's no central place to find this stuff.
I spent 5 years building the collection, and thousands and thousands of hours typing in the data, fixing mistakes, standardizing names, etc. I've developed a custom computer program to maintain the data which was a non-trivial cost.
One part of me would feel very bad if the data suddenly became open source. I spent so much time on it, why would it be fair if ESPN or some company just grabbed it and decided to sell it? It would surely make my data worthless if everyone had it.
[on a side note, I know there is no legal protection on it now, although such databases aren't quite 'open source' yet].
On the other hand, I'm dependent on several entities for current compiled statistical data. One of those entities is a company called Howe Sportsdata, another is the NHL via the Elias Sports Bureau.
Howe is contracted by the minor hockey leagues to compile their statistics. The teams fax their game sheets to Howe, and Howe adds up the numbers and publishes official stats. The leagues pay Howe a good sum of money to do this, but it's cheaper than if they did it themselves.
If databases become copyrightable, then Howe -- as the compiler of the data -- could claim copyright on all the numbers. Or perhaps the leagues could claim copyright.
[I've heard that the NBA is trying to claim copyright on their statistics so that they can license them instead of publishing them for free.]
It would be literally impossible to compile the information by hand because the data is only in Howe's (or the teams') possession. You couldn't even duplicate their effort because only the official scorer (there's only one per game) knows the true "facts".
So what is my position? I don't know -- it tears me up every time I think about it! I heavily lean towards no copyrights because it would sew up such data beyond belief and no one would benefit.
What is the true philosophy behind open source? Create something so that companies with many more resources than you can exploit your labor? Does the fact that something new was created somehow right that wrong? Is there a middle ground here?
Ralph Slate
http://www.hockeydb.com
Yes, it (building good group-moderated systems) is a Hard Problem. But right now we're not allowed to even try to solve it without starting over from scratch again.
Such a license would help the people over at the Human Genome Project, as they are bothered by companies who work out new patents based on their material. This is not fair. You know the stuff.
Was originally a collection of people on the net who said, damn, let's get together and index this stuff.
-- Ender, Duke_of_URL
Note: this only applies to copyright law. Patent law, on the other hand, is a completely different issue, although I believe it should follow the same rules to some extent. I believe patents on genetic codes that occur in nature should be unenforceable since they are simply fact; regardless of the effort required to decode it, the fact that my Y chromosome has the sequence hcctgaaggth should not be patentable.