Who Owns The Database?
dkm writes "The Boston Globe has an interesting article on legislation in Congress to make databases copyrightable." Copyright issues are so nice and sticky; but at least it's not patents. Yet.
Some food for thought...
--
The US constitution doesn't treat intellectual property like other forms of property. It gives the creators exclusive right to their works in order to provide incentive for creation.
Now supposedly the fact that data compilations can be copied freely removes all incentive for their creation.
Says who?
Why the creators of data compilations, of course!
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Actually, the Supreme Court has held, in Feist v. Rural Telephone Service, 499 U.S. 340 (1991), that collections of data that are not at all creative are not copyrightable. The West Publishing Group, the folks that publish most of the court decisions in the US, have been pushing for copyright-like protection for years. It looks like they've finally managed to pay off enough people to get it.
The real problem is that now West and their ilk will "own" the public data they collect. Because the data is not generally available through other sources, everyone will have to pay for supposedly public information.
>Read this to learn what the Supreme Court has said recently about copyrighting "facts"
Seth, you should tell a little more about it; this decision is huge. A unanimous Supreme Court here ruled that extending copyright protection to databases was not merely contrary to current law, but would violate the Constitution. While fat cats might buy changes to the law to help make more money (for example, the extension of copyright duration, which stretches the "limited time" phrase in the Constitution to near breaking point), this particular change would require changing the Constitution. Somehow, I don't think that will happen.
So, the only issue is how much will be spent on lawyers getting any changes to the law declared unconstitutional, an annoyance to be sure, but not nearly as bad as if it could become law.
Database collators can make money by not distributing the entire database en masse, or by doing it for people who really need it for a fee.
Ooh, a sarcasm detector. Oh, that's a real useful invention.
In the days of early copyright law, it made SENSE that databases shouldn't be copyrightable because, as was so elegantly said above, copyrights are to protect the creator of intellectual property. Software, art, literature are the products of the intellect. A collection of data is not.
However, today collections of data ARE a product of intellect. The value isn't in the data itself, but in the sheer volume and ease of manipulation OF that data. This is where the whole industry of data mining and data warehousing comes from. Databases are products, things, now as much as cars or computers are. They are as valuable as a or a book, so they should be copyrightable. It seems silly to me that a Hoover's Handbook of American Business is copyrightable where a database with the same data in it isn't.
But there's a caveat. They shouldn't be copyrightable to the point where someone cannot go back and redo the research to get the data themselves, but you oughtn't be able to make your living off of someone else's database without reimbursing them.
IM (highly unpopular) O
Note that this is all about "copyrighting" facts. Copyright currently protects the expression of an idea, the exact words you used to express it. This bill would allow the protection of basic facts of life. If, for example, you compiled a database of what the rainfall for Washington, D.C. was every day, and anybody else anywhere published the rainfall for D.C., even if they gathered the information independently, you could sue them - and win - because you had "protected" that information in your "hard-earned" database and they had "stolen" it.
Official sports scores are tallied by the official league scorekeepers. Unofficial ones are tallied by whoever is watching the game. Nobody would be able to legally report sports scores without permission from (and payment to) the league, even if they tallied it themselves, because these facts are part of a giant "Sports Score Database".
In other words: if this passes, you can prevent anyone from publishing FACTS about the history of the world, expressed in any manner, as long as you collect those facts into your database. Sports scores. Court decisions. Scientific data. Natural phenomena. Time of sunrise. High tide. Your medical history. Anything that can be expressed as a collection of data, you can legally be prevented from saying or publishing if someone with more money and more lawyers than you gets there first. The first person to publish the periodic chart of the elements gets to prevent everyone else from publishing this database of facts - no joke. What's hydrogen's atomic weight? Sorry, can't tell you unless you pay a fee to the database owner.
This is a radical, radical departure from previous law, which protected expressions rather than facts. It is being pushed by the biggest database owners in the world, who see fantastic profits behind it. If this passes, my god! Forget buying stock in Andover.net, run out and buy stock in Reed-Elsevier, world's largest scientific publisher. They will own science.
--
Michael Sims
jkorty wrote
The article mentioned one database company that gathers together Massachusetts court records and then charged fees for viewing. In the perfect, future world, each court would instead make the raw data of all its decisions available directly on the web. Researchers then mine this raw data to their heart's content. In this scenario there is no place for a database compilation company to insert added value.
Conceptually this argument makes sense. However, the implementation may easily be hijacked. The cost of retrofitting computer databases and the associated infrastructure (sysadmins, hardware, software) is going to impose a major cost on the public coffers. I can easily see private providers stepping in to offer "free" web sites in return for "exclusive access". One example is an enterprising soul offering discounted high school application hosting in return for ads and access to a captive audience of school kids and teachers. Another example was the recent attempted licensing of (Californian?) license numbers to mass marketers. Would you be willing to hold your legal system hostage to similar schemes? All we can hope is that some clued-up clerks are knowledgeable about OpenSource so that the implementation as well as content is open to public scrutiny. As they say, the devil is in the details of implementation.
I suspect I would not be the only person concerned about the conflict between public and private data collection. With federal databases, at least there are some legislative guidelines and open media scrutiny of public office. Only a vigilant watch of corporate practices can avoid similar abrogations of privacy rights in the private sphere. As other companies have shown, controlling the source of information, whether news or publicity, is akin to sitting on a gold mine, especially with the increasing popularity of spin-doctoring. If and when the legal system becomes subverted by vested interests, you might as well emigrate, for the law is (supposedly) the only protection the weak have from oppression by the majority (assuming a public gun-fight is overkill for making a protest).
LL
Supposing there were some equivalent to copyright that retained proprietary rights to software and/or data that expired, putting the material into the public domain, after (say) five years, this would provide a substantially better "regime" than the present situation where:
I'm not sure what the "excuse" would be for cutting the expiry times; in order to push this through legal channels, it would have to be shown that this provided some benefit over the present "expiry regime."
If you're not part of the solution, you're part of the precipitate.
Solution: infringe on everyone's patents and copyright *but* earn no profit from it. One will thus be rendered judgment proof.
Oya, the person who submitted this article seems to imply copyrights are somehow more palatable than patents. I can't think of a reason why since copyright monopolies last for the life of the author plus 70 years versus the 20 for a patent. Also, it is harder to obtain a patent. Perhaps he is just referring to the lunacy of the current patent administrators in their patenting of obvious ideas with much prior art. This is different than saying patents are more invidious than copyrights. In the software arena, the distinction between copyrights and patents nearly disappears. Where copyrights apply to specific works of authorship and patents apply to ideas registered with the USPTO in the specified manner, in software both works of authorship and the ideas are so easy to duplicate and proliferate that enforcing monopolies of the fruits of intellect becomes impracticable.
Prediction: the only workable way to regulate the internet will be to treat it as a separate legal jurisdiction, much like we do with international waters. It will be governed by treaties rather than domestic regulations. In this context an international standards body will determine patentability and copyrightability. This will lead to more equitable rules on the subject. Because the need for this method of regulation will emerge from a perceived impracticability in regulating the thing, we will see a minimalist approach to regulation (i.e., similar to Hong Kong's laissez faire policies).
I thought if something was copyrighted then any copying of it was an infringement on the copyright. I.e. the copyrighted book they were talking about. Someone put it on disk. Most books I've seen say something like "No part of this book may be reproduced in any form or by any means without permission" (whoops guess I just violated that:). So how could the court rule in favor of the disk publisher?
Really though if you don't want info to be redistributed then you better state that clearly. The realtors complaining that someone can take the info they posted in their window and put it on the web, well of course they can. You put it in the public view. Your competitors could walk up and write them down too. Typically if something is viewable by the public and not marked with a "redistribution is illegal" statement then it's fair game I think.
-cpd
The ability to claim ownership of information just because you organized it into a database is a dangerous thing. The danger is that information that used to be free will find itself in "owned" databases and so will stop being free. That is not a good thing.
Besides, the laws regarding it will necessarily be either very vague or quite arbitrary. Let's say I compiled a list of all cow manure suppliers in my area and put it on the web. This is now a database, worthy of protection. Can other people copy the whole database? Not under the proposed law. What about one address? two addresses? five addresses? What if another guy goes to yellow pages and compiles his own list? Will he be required to prove "clean-room" conditions? What if he compiles his own and then cross-checks it against mine?
Lawyers will be very happy.
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
While databases really couldn't be patented (not the data, that is), I'd rather see that than a copyright. Patents expire and then become available to the general public to extend the state of the art. Copyrights can last an awful long time and can be stickier in many different ways.
Of course databases have never been copyrightable before. Collections of data were just that. Your formatting, layout, or other presentation could be, but not the collection of data. Granted, many copyright holders used armies of lawyers to make people act as if their databases were copyrighted, just because who can afford to fight it. The ARRL has done this multiple times with their repeater directories (that's why you can't find it online), and I think it's been done with different sports scores with varying amounts of success.
Copyrights are to protect the creator of intellectual property. Software, art, literature are the products of the intellect. A collection of data is not.
Stallman wrote a feature for Linux Today on this subject over a month ago. As you might expect, he's strongly in favor of open databases.
I certainly understand why someone who compiled a database would feel ripped off if someone copied it for profit, or maintained an out of date mirror that caused harm. On the other hand you can see how great public good can come from free availability of certain databases. Perhaps the government should exercise eminent domain over the databases that need to be public?
I see us, as a society, heading for a situation where every facet of our lives will be dominated by intellectual property law. Want to think about "physics"? Sorry ma'am, we own that word. You'll have to pay us $5 every time you use it.
The problem our society is facing is that information has gotten radically easier to reproduce. It used to be that if you wanted to reproduce a "database" (then usually in book form) you would have to go to great effort and expense to typeset it, then print it, then distribute it.
None of these could be done casually, and it simply wasn't possible to easily undercut the original publishers. The problem is that computer technology has changed that. I can duplicate your whole database, world-wide, in a matter of minutes.
So what is the solution? Not more laws!!! I think that, ultimately, we are going to have to accept that IP is an obsolete concept -- indefensible in an electronic world. Everything is going to have to become open source. Emergency rooms need a database of antidotes? Great. Then they can pay someone to compile one on a contract basis.
You didn't contribute to the list this year? Sorry, you don't get access to it. I seriously doubt that, if the emergency rooms had to directly subsidize the creation of the list, they would be very eager to give it out to their stingy colleagues (especially in an industry as competitive as medicine). In the end, social pressures would encourage payment.
Let's not forget classical research either!!! The publication of open source databases is something that the Universities could do very effectively and would fit well with their classical role.
Bottom line... We don't need laws. We need a more enlightened attitude. Maybe it's time to campaign for the total abolition of IP law, worldwide. Otherwise, we will soon find that we have given soulless corporations the most basic of freedoms: freedom of information.
-- Slashdot sucks.
...compilations, otherwise known as databases, go largely unprotected by copyright laws that safeguard the interests of the authors and publishers of creative works.
The use of the word "protected" presents a bias in favor of the ownership of factual databases. Databases don't need "protection" if it is entirely legal to copy them.
[Databases] are generally gatherings of information created by someone else.
WRONG. Databases are collections of FACTS. A collection of copyrighted information is still owned by the owner of that copyrighted information.
The article also gives examples in which database compilers have been exploited, but fails to give any examples in which these companies have been the exploiters. One very good example could have been the recent case of West Publishing trying to claim ownership of all federal court opinions that it publishes.
one of the biggest problems I have with this whole database-as-protected-intellectual-property situation is that in the case of some databases, for instance, lexis-nexis, parts of the information in the database belong to you and me. these database companies compile information from publicly available sources and then want the information as a whole protected by law.
why should my civil court records become lexis-nexis's profitable proprietary information? I certainly didn't authorize anyone to sell my property or tax records. these databases that are huge collections of public data can't logically be made private.
I'm somewhat confused by the situation. While it's true that mere collections of information probably aren't what the legislators had in mind when copyright was established, they may still represent an essential investment to the compiler - not in storage costs, but in the effort needed to collect the data in the first place.
The Swedish Copyright Act has for quite some time contained a special kind of protection for collections of large numbers of information items (Article 49), similar to the protections given to audio or video recordings. It differs from normal copyright in a number of ways:
As with books, general provisions for fair use, private copying etc. apply, and as with books, nobody is prevented from extracting individual items of information from the collection, as long as you don't simply copy the entire collection.
These provisions predate the appearance of computerized databases, and were apparently intended for printed catalogues, directories and the like. However, I think they apply equally well to digital collections, and I'm not aware of any major legal disputes over this matter in Sweden.
Then we started hearing complaints from several other countries that databases weren't protected by copyright, that such protection had to be established, and that it must be international. Funny they didn't seem to notice that Sweden already had that kind of protection, but went ahead outlining that protection from scratch. Then we were essentially required to adopt whatever they came up with, so now we have two kinds of protection covering approximately the same thing, but with very different rules.
Now I hear that the USA still has no database protection at all - and I was under the impression that the USA was the place where these desperate cries for database protection originated. Was I wrong? How many different kinds of database protection will be imposed on smaller countries before the USA gets its act together and implements even one of those, wreaking havoc with existing legal concepts everywhere?
Don't expect the media to raise any issues at all with this law. Media companies will benefit BIG time from such a law. They'll be able to compile all of those polls they're always running into databases and then copyright them. They'll be the first ones to commercialize the use of copyrighted on-line databases.
Copyrights are there not to ensure 'fairness'. They are there to encourage creation of intellectual property. Check the Constitution, not that Media considers THAT relevant anymore.
I haven't seen a lack of data collection/database creation under the present system. This is legislation that probably protects only campaign contributors.
Speaking of campaign contributors, if someone puts together campaign contribution databases will I be allowed to point out that Senators X, Y and Z and Congressmen A, B, C, D, (not E or F) and G got fat contributions and then supported some legislation? A case could certainly be made that I used their database and violated their copyright here.
I also believe the example of someone's book that was photocopied and placed on the Web to be bogus. It doesn't rise to the level of specious because nobody who knows ANYTHING about copyrights would believe that this is a problem. Of course the original author of a book is protected against it being published on the Web without permission. That's exactly what copyright protects.
They are attempting to extend copyright to every fact in a collection of facts. Very scary.
I am a lawyer...
...
... Possession is 9/10ths of the law, and control presumes possession
IANAL, but I can still argue with one, cannot I?
My name is my own, and I have every right, through possession and grant, to restrict how it is used: Why do you think that companies who give out prizes require you to sign a release so they can use your name? If my name was not owned by me, where's the hang-up?
Such companies want releases mostly to avoid being sued under tort law. Tort law and property law are very different approaches as you probably know. Whether your name is your own (in the property sense) is rather doubtful. You cannot destroy it or change it without the consent of the government. You do not get to pick it (again, without govt consent). Your rights to restrict its use are rather limited, and they disappear if you are a public figure. The phone company includes it in white pages by default, unless you specifically instruct it (and pay it!) not to.
To reiterate, your control over your name's use is accomplished through tort law, not through property law.
Banking and medical records are, again, owned by me (and restrictable: See the FCRA and privacy conventions) as they were created by a direct result of my actions and my doing
That may depend on the state, but I don't believe you own your banking and medical records. Try instructing your bank or your physician to destroy them -- see if they comply. Try telling the bank not to use your records for e.g. marketing by its insurance arm -- the bank may agree, but out of politeness, not because they have a legal obligation to do so.
Besides, the bank records are not created by you. You make actions, which are then recorded by your bank. Granted, you are the cause, but it is the bank that actually creates the records. The records reside on the bank's computers and are its intellectual property. If you believe they have been misused, you will sue the bank under tort law and not under property law.
Also, my address is mine, simply because I can control its use.
Come on! Go back to your first year of law school and re-read the Property textbook. Just because you happen to control something does not mean you own it.
I may keep my address private, if I choose. "But, wait Mr. Lawyer, can't someone look up your address at least in the county's Hall of Records?" Yes, they can. However, that does not give them permission to use my address for their purposes.
And, pray tell me, why not? Or rather, why do they need your permission? If your name is Guppenblinken and I happen to collect the addresses of all Guppenblinkens in the world, I can perfectly well get your address, store it in my database, post it on the web, send you mail, etc. If you don't like it, you can sue me under torts, but for the n-th time, there are no property issues here. You cannot sue me for theft.
If what you say were true, then my address would be available for others to use for the same purposes I do, such as receipt of mail. If my banking records were not mine, then others could use them to place transactions against.
Nonsense. Receipt of mail is a function of the physical location of your house/apartment and the matching of that physical location to the Post Office database. The banking transactions involve actual assets in your account, which you definitely own, not *records* of past transactions.
I am sorry, you have to come up with better arguments.
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
While there has been some debate with scientific data (rather useless outside specialist fields), the case of commercial data is less obvious. There have been a few historical cases which have given people pause about the monopolisation of data. One specific example was the privatisation of some early LandSat satellite imagery which, according to one viewpoint, was immediately priced to the legislated maximum, which effectively stunted academic research into algorithms for processing satellite imagery and any follow-on applications. Other raw data by definition can only have value if shared, e.g. meteorological data spread across multiple countries. With increasing automatic data collection and computerisation, the potential for conflict between owners and users of databases will only increase.
One of the biggest issues is how to "price" the assembly and aggregation of disparate data. Even pure scientific data could have some commercial value (e.g. genetic codes) under the right circumstances. One solution may be to provide both the raw data and the processed value-added stuff and let the market judge whether it is cheaper to massage the raw data themselves or save time by purchasing the processed version.
Another approach is to create data rights limited by geographical, time, or functional scope. However, this in turn raises more problems in debating to what extent data can be altered before it is considered a unique "new" work (compare with music mixes or composition of existing recognisable art scenes). How far down the value chain is one allowed to claim a slice of the action (compare music score composers claiming a slice of movie soundtracks of their songs)? These are still unanswered questions.
A collection of innocuous facts (e.g. mouse-clicks) can be transformed into a perpetual watch on your web-browsing habits. Given enough time and persistence, any digital event can be tied to a personal profile. Who "owns" this data? A satellite can take pictures of people sun-bathing; some countries would be paranoid enough to define this as invasion of their sovereign air-space.
In short, the information age will create a whole new raft of problems which will require some legislation just to clarify any ambiguities. IMHO some time limits would be the most likely solution; even sensitive federal data can be declassified after a suitable cool-down period. But unfortunately I suspect that until people have seen how far the system can be abused, there will be no popular outcry for safeguards.
LL
The rationale used to justify copyright protection is that it's needed to provide incentive for authors, writers, etc.
But if we didn't have copyrighted databases, would we have a shortage of databases? If we wouldn't, then extending copyright protection is not economically justified.
What remains is simply our sense of fairness; we can all relate to the unpleasantness of somebody else selling our work. It may not bother a musician if somebody makes a copy of their CD for a friend, but it would definitely bother them if somebody else were pressing their own copies and selling them in music stores.
It might be a funny idea to force, say, my local phone company to drop my address information, which I trivially own, from _all_ their databases (phone book, billing databases, etc).
Welcome to the real world, pal. You do NOT own your address information. You do not own (in the intellectual property sense) your name, your address, your banking records, your medical records, your phone call logs, etc. etc. They are owned by whoever collected them. You have no say in how they are used (with exceptions provided by specific laws, e.g. video rental information).
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
I don't see why a Linux distribution wouldn't be considered a database. This could, indeed, have some nasty implications unless the law is written very narrowly (which would surprise me).
The big question here is how the copyright on the database relates to the copyright of an individual entry within the database. If the two are legally independent, then even if all the packages are covered by the GPL, a given distribution may be proprietary, requiring a separate license for each installation.
Legal? Yes, under current law.
BS. This is no more legal than would be doing the same thing with Stephen King's latest horror epic. At least they let us know up front what the factual content of the rest of the article is liable to be.
"Why would anyone spend $2 million to create a database if it's not going to be protected?" said David Mirchin, of SilverPlatter Information, a database publisher in Norton, Mass.
Gee, I don't know. Why do they do it now? The fact is, databases are protected. Taking a database someone else has compiled and republishing it is a violation of law. The courts have ruled on this several times as specifically regards the internet in the last several years. However, the information in the database is not necessarily protected. If said info is otherwise publicly available, then you cannot prevent others from using it.
What you see here is a push (not a new one; it's been going on for a while) to allow someone to create a database of freely available information, and make that information proprietary. The rationale behind this has been that it's the only way to ensure that the information hasn't been 'stolen' from the database, rather than gathered independently. Of course, it's not hard to see what the real motivation is; it's becoming very easy to gather large collections of freely available information. Many people see this as an opportunity to grab a free ride and make a lot of money.
I'm sure most everyone here can see the obvious problem with allowing the proprietization of this information. If you obtain it independently, how are you going to prove it? And if you create an independent database? Yes, you can document your sources, but unless you can afford to defend yourself in court, this information will be effectively off limits to you unless you pay whoever has bothered to gather it into one spot.
I keep hearing so much about how Americans are doing so well, how strong our economy is, how much everybody has. Why then, is greed becoming the defining trait of our culture? Are not Bill Gates' tens of billions enough? I kid you not, we as a culture and as a country are headed down a very ugly road; should we persist, we most certainly will get what we deserve.
The article mentioned one database company that gathers together Massachusetts court records and then charged fees for viewing. In the perfect, future world, each court would instead make the raw data of all its decisions available directly on the web. Researchers then mine this raw data to their heart's content. In this scenario there is no place for a database compilation company to insert added value.
Therefore, Congress should not pass any laws giving special protection to these dinosaurs.