Slashdot Mirror


Open Source License For Databases?

Myddrin asks: "Recently there has been a lot of discussion of databases, and who owns them. The US either is considering or has passed a law saying a database (and the info contained therein) is owned by the creating person/company. [I honestly can't remember.] At any rate, this got me thinking of the (possible) need for a Database GPL (DGPL). Basically the same as the LGPL, but adding that the database host (i.e. the owner of the server hosting the specific instance of the db) can put restrictions on access, allowing them to offset the cost of hosting the machine (administration, i'net connection, etc)." Any data in a database is content, just like information on a web page. Maybe an Open Content License might be a better idea? Thoughts?

"...Examples of acceptable restrictions would be:

  1. any program accessing this database must display the advert. provided,
  2. a cost of $.000000001 per record returned
  3. a nominal monthly subscription fee...
something like that. Very similar to the part of the (L)GPL that says you can charge a nominal fee for the materials of distribution. The idea is that several competing servers could be set up, with multiple competing open and closed source clients running against them.

Is there a license that allows this kind of thing, or should I be working on one?"
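To get a feel for the per-record fee in the example restrictions, here is a minimal Python sketch of the arithmetic. The fee of $.000000001 per record comes from the question; the function names are hypothetical, and the target amount is left as a parameter because the question gives no figure for the "nominal" subscription fee.

```python
from decimal import Decimal, ROUND_CEILING

# Per-record fee from the example restrictions in the question (dollars).
PER_RECORD_FEE = Decimal("0.000000001")


def query_cost(records_returned: int) -> Decimal:
    """Cost in dollars of a query returning the given number of records."""
    if records_returned < 0:
        raise ValueError("records_returned must be non-negative")
    return PER_RECORD_FEE * records_returned


def records_to_reach(amount: Decimal) -> int:
    """Records that must be returned before per-record fees total `amount` dollars."""
    needed = (amount / PER_RECORD_FEE).to_integral_value(rounding=ROUND_CEILING)
    return int(needed)
```

At that rate a client would have to pull a billion records to owe a single dollar, which suggests the per-record option only matters for bulk access.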

34 of 85 comments

  1. NSI's database... by EnForce · · Score: 3

    With all the crap we've seen on NSI's Whois database, I'd say this is a damn good idea - why shouldn't something created by the public (yes, all of our registrations created this database!) be owned by the public?

    1. Re:NSI's database... by jd · · Score: 2
      That runs into all sorts of arguments as to what exactly defines "public knowledge" or "public origin".

      Having said that, I agree with what you're saying. If the knowledge comes entirely from open, public sources, then there does seem to be something unethical about closing the compilation of that knowledge off and keeping it for commercial gain. It's about as sensible as AOL trademarking "You've Got Mail".

      P.S. ObOffTopic footnote: If you're into electronics, check out the following websites for a scary note: Ramset Electronics, 2600 and CyberSKIP. There's something definitely not OK going on.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  2. How about an open end-user license agreement? by VWswing · · Score: 3

    Or more like a restriction. Our personal information that is already floating around can not be resold but merely modified with changes? :)

    Ok.. but at least a legal way to prevent us from being in public or sellable databases. I'm so tired of getting calls from phone-spammers.

    --
    "And how can this be? For he is the ..."
  3. No way! by Erik+Fish · · Score: 2


    They can't have it both ways.

    If such a law is passed, this means that anybody who creates a database of song lyrics owns it!

    If copyright law interferes with this, then I'm going to copyright my personal information.

    1. Re:No way! by EngrBohn · · Score: 2

      The distinction to bear in mind is between the database itself and the data entered into the database.
      Christopher A. Bohn

      --
      cb
      Oooh! What does this button do!?
  4. First haiku! by Frank+Sullivan · · Score: 2

    Another license?
    Is it needed? Isn't that what
    copyright is for?

    ---
    120
    chars is barely sufficient

    --
    Hand me that airplane glue and I'll tell you another story.
    1. Re:First haiku! by EngrBohn · · Score: 2

      Slickness points for the haiku (that I won't attempt to match).

      But, regarding the content, a license is how you give people permission to use copyrighted material. That is, the copyright is the claim of ownership, and the license is the set of conditions under which you're willing to share the use of the material you own.


      Christopher A. Bohn
      --
      cb
      Oooh! What does this button do!?
    2. Re:First haiku! by Frank+Sullivan · · Score: 2

      Try writing haiku!
      Everyone is doing it!
      Even ESR!

      But seriously... the problem with sussing out a license for a database is that it depends on how the data is used. Open Source licenses work because the ways we use source code are pretty straightforward. Open Content licenses build off of them. But source and content have one thing in common... duplication causes no essential harm, and data integrity is not a huge issue.

      Databases, on the other hand, are often intended to centralize and synchronize information. Hence transactions, which exist to protect the integrity of the data. Moreover, databases often contain relations that require locks and triggers to maintain referential integrity. You may not WANT free copies of your database floating around, even if the information within the database should be free (speech or beer).

      So, barring lots of deep thought on the subject, i don't see a simple, general set of rules for "open" databases, because of the integrity issues, and because of the wide variety of ways in which the data may be used.
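The integrity point above can be illustrated with a small sketch, assuming SQLite through Python's standard `sqlite3` module. The tables and the helper name are hypothetical and not taken from the thread; the example only shows a transaction rolling back when a foreign-key reference is broken.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE comic (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    publisher_id INTEGER NOT NULL REFERENCES publisher(id)
);
""")


def add_publisher_with_comic(conn, publisher_id, name, title, comic_publisher_id):
    """Insert a publisher and a comic as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO publisher (id, name) VALUES (?, ?)",
                (publisher_id, name),
            )
            conn.execute(
                "INSERT INTO comic (title, publisher_id) VALUES (?, ?)",
                (title, comic_publisher_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A copy of the database that is edited without these constraints can drift into states the original would have rejected, which is one way to read the worry about free copies floating around.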
      ---
      120
      chars is barely sufficient

      --
      Hand me that airplane glue and I'll tell you another story.
    3. Re:First haiku! by Frank+Sullivan · · Score: 2

      Oops! You're right! My bad.
      What can i do about it?
      Repost corrections?
      ---
      120
      chars is barely sufficient

      --
      Hand me that airplane glue and I'll tell you another story.
    4. Re:First haiku! by Myddrin · · Score: 2

      I think that I may have stated my question poorly. Unfortunately I was on vacation when this was posted, so I missed much of the discussion.

      My idea is to post the database schema under something like the LGPL, allowing multiple sources to host information on, say, the value of 1980's comics. The structure of the DB would progress like any other open source project, but the content would be available from several sources, each with different content. All I'm talking about doing is setting up a standard db for a given function (dishing out the value of your X-MEN 247) that many people could write open/closed source clients for searching.... but the content would not be synchronized unless the "license" (the restrictions mentioned in my question) allowed for it.

      Is that any clearer?
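One way to picture the shared-schema idea is the following minimal Python sketch using the standard `sqlite3` module. Everything in it is an assumption made for illustration: the table and column names, the host names, and the values, which are placeholders rather than real price data.

```python
import sqlite3

# Hypothetical shared schema; table and column names are illustrative only.
SHARED_SCHEMA = """
CREATE TABLE comic_value (
    title TEXT NOT NULL,
    issue INTEGER NOT NULL,
    value_cents INTEGER NOT NULL,
    PRIMARY KEY (title, issue)
);
"""


def make_host(rows):
    """Create one independent host: same schema, its own content."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SHARED_SCHEMA)
    conn.executemany("INSERT INTO comic_value VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def lookup(hosts, title, issue):
    """Ask every host the same question; return {host_name: value_cents}."""
    answers = {}
    for name, conn in hosts.items():
        row = conn.execute(
            "SELECT value_cents FROM comic_value WHERE title = ? AND issue = ?",
            (title, issue),
        ).fetchone()
        if row is not None:
            answers[name] = row[0]
    return answers


# Placeholder values, not real price data.
hosts = {
    "host_a": make_host([("X-MEN", 247, 500)]),
    "host_b": make_host([("X-MEN", 247, 650), ("X-MEN", 248, 300)]),
}
```

Because every host exposes the same schema, one client query works against all of them, while each host keeps its own unsynchronized content.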

      --
      Myddrin
    5. Re:First haiku! by Frank+Sullivan · · Score: 2

      Yes, that's much clearer, thanks! I can see open-sourcing the *schema* for a database rather than the data itself.

      Too bad none of this will be moderated up.

      ---
      120
      chars is barely sufficient

      --
      Hand me that airplane glue and I'll tell you another story.
  5. Advertisements by EngrBohn · · Score: 2

    I cannot imagine the FSF would sanction a license (at least I'm assuming you would want DGPL to be sanctioned by the FSF, based on the suggested name) that would require advertisement. Although, in the web-context, I suppose advertisements are the closest thing to a common currency. I still think that'd be the real sticking point, though.
    Christopher A. Bohn

    --
    cb
    Oooh! What does this button do!?
    1. Re:Advertisements by bmetzler · · Score: 2
      I cannot imagine the FSF would sanction a license that would require advertisement.

      The submitter didn't mean that it would "require" an advertisement, he implied it would "allow" an advertisement.

      The point being, if you have a public database, you can't just allow people to use it; they must have the ability to profit from it. Otherwise there's no incentive to use it. The GPL allows you to profit from source code.

      -Brent
  6. Good idea! by jd · · Score: 2
    Copyright covers any "organised collection of data", so it does cover databases. Some equivalent of Copyleft for the specific case of databases would be great! (It shouldn't need a significant change, either, as it's all straight copyright law.)

    IMHO, it would be great to have a generic "copyleft" scheme, which covered everything, but for now, something for each of the significant special cases (eg: code, documentation, art, databases, etc.) is a good start.

    --
    It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  7. Why not just the GPL? by Overt+Coward · · Score: 2
    I mean, legal language aside, the desired license seems to contain the main precepts of the GPL:

    • The information in the database is free to be used, and even incorporated in a derived work, as long as that work is also covered by the same license.
    • The hosting service is not charging for the data, but rather for the service of providing a means to access that data.
    • The original owner(s) of any data, which requires that no one else's data was used to create the data, may release the data under any other license(s) they so desire.



    --

  8. New FSF license may be close by wfrp01 · · Score: 3

    The Free Software Foundation (http://www.gnu.org/) has been working up a license to cover documentation. Not exactly the same as what's being discussed here, but maybe close, if you think that information is information is information. Perhaps with some minor changes it would do the job, or a similar variant could be derived.

    This is a work in progress (correct me if I'm wrong). At least I don't yet see it on the Free Software Foundation's license list (http://www.gnu.org/philosophy/license-list.html)

    I'm sure the authors would have more appropriate input than myself. Just my two cents.

    --

    --Lawrence Lessig for Congress!
  9. Focus on Copying by Vagary · · Score: 2

    In my experience (as someone who has set up databases and interfaces for commercial ventures), the major concern of for-profit database owners is not that they won't be able to make money off the database (even if it's just ad revenues) but that someone will be able to grab all their information and resell it better than they can. I'd imagine that the major concern of not-for-profit database owners/creators is that someone will fragment the database through irregular mirroring.

    The concerns of for-profit database owners are not paramount to a DGPL, but the copying/mirroring of data should still be the focus. To that end, the DGPL should address both dynamic and static databases and give owners as much reason to use it as they have to use the LGPL.

  10. Data too, GIS and Europe vs. the US by fraxinus · · Score: 3

    A few comments:
    I would like to see licenses concentrating on the data (content) rather than the whole database (the collection of data) - that would let you modify it, resell it, etc. -- much like the US census data or USGS geographic datasets.

    This question is very interesting, especially for geographic data (for GIS -- Geographic Information Systems). The situation in the US is like a dream, where all the USGS data is distributed without any tough restrictions (a BSD-ish license for data). The datasets are very expensive to create and a valuable asset.

    In comparison, the situation in most of Europe (for example the UK or Sweden, where I'm from) is that the mapping agencies are recovering most of the costs associated with creating digital geographic datasets. The datasets are incredibly expensive!!! Thus the use of GIS is much more restricted (as well as development in the field) in this part of the world.

    Another interesting point is the license under which NASA distributes the new Landsat 7 digital imagery. The images are a lot cheaper than before (a few hundred $$$) and the license is 100% non-restricted (even here a BSD-ish license). In comparison, earlier Landsats and the current competitors are an order of magnitude more expensive, and in most cases they require you to license the USE of the data, not the 'ownership' of the data. That way you had to buy one license to use a satellite image for education/classes and another license to use the same image for an analysis... Landsat is run by the US government, so it is you taxpayers who are paying for this give-away (they are obviously not recovering all the costs of the operation).

    Nowadays there are people that have bought (and used) the new Landsat images and are making them available for download (for free!). Of course this is under great debate (imagine the competitors to Landsat).

    So... More talk about data and databases!

    --
    // Fraxinus
  11. Hmm, that might be an interesting idea by Hoonis · · Score: 2

    >If copyright law interferes with this, then I'm
    >going to copyright my personal information.

    Hmm, that's actually a fairly interesting idea. Could this be done to thwart having your phone number resold etc? Copyright your name, phone number, and address and then sue people who sell it for infringement? Hmm. It seems to me that the idea that personal information has value is already established; it seems logical that the person whose information it is should be considered the owner. If companies have to pay ME for my phone number/mailing address, I can set the price high so it's not worth the effort for them to spam me with advertisements.

    1. Re:Hmm, that might be an interesting idea by rocca · · Score: 2

      IANAL, but I'd have to agree.

      Your SSN/SIN numbers are owned by the government, your telephone number is owned by the telephone companies, your driver's license by the DMV, your address by the city, your name is probably pretty public domain and your birthdate is definitely not yours alone. Even your job history is likely owned by your employers and criminal records by the police. Shopping patterns are owned by the discount card issuers and/or credit card companies, and your email address by your ISP.

      The only thing that is yours, and only because there is government legislation stating so, is your healthcare information.

  12. Public Database License Would Be Good by Artagel · · Score: 4
    A database can be protected by copyright if there is sufficient originality in the "selection and arrangement" of the contents. As pointed out earlier, it is important to remember that the contents can be separately protected. Think of it this way: A book of quotations can be protected as a compilation. Each of the quotations within it may also be protected by copyright in the quoted work. There are many useful databases which cannot be protected by copyright, usually databases that are made up of facts, and those facts are comprehensive and have obvious arrangements. A white pages phone book includes all of the phone numbers and names, and arranges them alphabetically. So much for selection and arrangement.

    The problem is that it can be hard work to research and compile these facts even if the result has no originality. I think we believe that people should be able to obtain benefit from their work. Database protection schemes try to create a copyright-like right against the substantial extraction and reuse of facts from a database. Thus, someone who contributes to a publicly licensed database wants to be sure he can access the additions of others in the future in payment for his work (rather than the corporate-generate-cashflow model for benefit.)

    Licenses are important to accomplish that right to later access because they can work even where you don't have a 'right' to copyright. Thus, if I license a CD to you with all the phone numbers in the U.S., I can license it to you as long as you don't put it where multiple people can use it. After all, fair is fair, we have a contract, and I am just making sure I can sell my work to other people, and not have you, my customer, becoming my competitor just for having bought my product once.

    A public license on a database would really only be useful if databases DERIVED from the original had to be made available for copying. Consider a list of all the music CDs ever made. It has to be updated, since new product comes out all the time. Can someone go into the business of providing these databases by taking the old, updating it, and calling the new database proprietary? Not if you have a public license. (All of this assumes that shrinkwrap or clickwrap licenses are good. They aren't in many countries.)

    As long as the resultant database is available to be copied, in whole, then the charge for accessing the server, whether to take the whole thing at once, or one record at a time ought to just fall under a reasonable distribution charge. Heck, the record-by-record access might as well be charged at any rate the provider wants since they are providing interface as well as content. If someone wants to roll their own, let them download the database.

    I think a public database license would be a good thing because it will allow public databases to grow and be distributed in a fair way when database protection laws are passed.

  13. It's just not enough by H3lldr0p · · Score: 3

    I recently did a report for a tech-english class last semester. It ended up being about ownership and the Internet, most specifically who it is that owns the whole shebang. Not an easy project, and I did not end up finding what I thought I would find when I first started. The paper overall ended up being one on copyrights. So I'll say the same thing that I ended up saying in that paper.

    You cannot treat the digital world the same as the print world.

    It just cannot be done. Everybody that reads slashdot with any frequency knows the lunacy of walking down that path. So let me take that argument and apply it here.

    You cannot treat an online database the same as one you might have as a hardcopy database (read: proprietary, closed, or a Rolodex on a desk) in an office. You cannot charge for access to it in the same manner. You cannot oversee the users in the same manner. And most importantly, you cannot expect people to value the data that is stored therein the same.

    With that said, how can anybody expect to make a profit by putting such a beast online? I have two thoughts.

    #1: Do as the search engines do. Find some other way to profit. I have no idea what product Yahoo makes, but for some reason people invest in it, and somebody, somewhere is making money. It has been done once, and it can be done again.

    #2: Do it ebay style. Auction the info off. Highest bidder gets the ability to negotiate a use license. No cost to find out if it exists, just a cost to read it. The more people demand rare info, the higher the price goes up.

    Anybody else got a suggestion?

  14. isn't this a plagiarism issue? by small_dick · · Score: 3

    Plagiarism has a long history, and I saw several students get the boot from the University where I went to school for violating University guidelines.

    If you are going to do new work on a previously examined topic, you must cite your sources, have a variety of sources cited, and NOT provide a sense that the owners of the cited work have been plagiarized.

    For example, I can write a book about "Snoop Doggy Dogg", provide about 100 citations (books, webpages, mag. articles, TV/Radio programs), provide my condensed "personal take" on the rapper, and publish. That's legal; it's the foundation of all new work -- deriving from the old.

    But when I cross the line (doing a rehash of an existing SDD book), and call that work my own with no citations, or with a "sense" of plagiarism, I open myself up to legal trouble.

    I think the "fair use" rules, as they apply to books, will eventually dominate this issue. People using data from webpages WILL have to cite their sources, use a variety of sources, and verbatim copiers will be penalized/threatened, etc.

    What am I missing here? This just sounds like another failure of the legislative process to provide sane solutions to a fairly simple, well-known problem. Is this just a scheme to provide incompetent lawyers with phat salaries for years to come?

    I see no fundamental difference between pages on the web and pages in the library. They both convey information to the observer in virtually the same manner. The earliest animations were just flipping paper pages anyway.

    New Year's Rocked. Love you all :-)

    --


    Treatment, not tyranny. End the drug war and free our American POWs.
    See my user info for links.
  15. Open Content by redhog · · Score: 2

    There already is a license for open content. Check out www.opencontent.org.

    --
    --The knowledge that you are an idiot, is what distinguishes you from one.
  16. mp3's by Signal+11 · · Score: 2
    Well, I have a program - mp3db, that does database stuff. I'm going to just add a clause stating output of my program must be done under the GPL as well - ie: keeping it internal is OK, but if you release it - you do so to everyone at no cost.

    I hope RMS updates the GPL to deal with this issue more specifically soon....

  17. Open Content Licenses already exist! by greenrd · · Score: 2
    See e.g. the Open Directory license. That's been a very successful business model (pay volunteers nothing, give away data for free) - it's growing at an astounding rate and will soon surpass Yahoo!

  18. This law is bad. by Dastardly · · Score: 2

    I don't think the implications of this law are understood completely. Current copyright law, through various legal precedents, grants copyright protection to the format of a collection of data. It also only applies to the exact organization if that organization is not obvious.

    The classic example is a phonebook. A phonebook is a collection of data, i.e. names, phone numbers, and addresses, organized in alphabetical order. As it turns out, under current copyright law this has minimal protection. Alphabetical ordering is obvious, and the rest of the directory is information which by law is public domain and not protected by copyright.

    A law protecting databases and their content could easily extend to a copyright on information. Basically, a database should be covered just like a phone book. Any content in the database would be owned by the creator of that content, but any information would have to continue to be public domain.

    Basically, this means that the databases of internet search engines can be extracted and reorganized into a new database, simply because URL and page titles are information and therefore are not and should not be protected.

    Dastardly

    P.S. Arguably a page title could be considered the property of the creator of the original, but the URL is really public domain information and not protected by copyright.

  19. Now lets just think about this for a moment by Zaffle · · Score: 4

    This has a lot to do with who does own a database... If I go out, measure the rainfall over a period of a year at 10 different places, and then put that into a database, it's mine. I don't think anyone but mother nature can contest that (unless I put it in an Access database, then MS might contend ;)).

    But if I go and put all the information I know about everyone I know into a database, who does the database belong to? Can I go and sell the information? The 1993 Privacy Act in New Zealand says that if I am a company, and I collect information about ppl, one of the things I must do is allow ppl access to view/modify their record. (Within reason, ppl can't demand to modify their bank balance ;)). I also must state what I plan to do with the information, including whether I plan to sell it. Ianal, but I don't think it prohibits me from selling it to anyone I want.

    There's a good reason for this: our electoral rolls (list of ppl who are enrolled to vote, names, addresses, etc) are available for purchase (incidentally, in order to have my record unavailable, I have to have a "good" reason, eg I'm being stalked, and I have a restraining order, etc. I can't opt out of it just because I want to).

    This means that my database of your personal habits I noticed is mine. And I can do with it what I want. (Note: there is an option for various personal defamation laws here if I say false things).

    Now that that's settled, what DO I want to do with my database of your habits? Well, I believe in free speech, my programs are GPL, so I want to make it free.

    I will license my database under a "free" license. This license is NOT designed to allow ppl to make money off of my database, so the same rights must be transmitted to the user of the database. So, the license must allow a user to "copy" the database one record at a time if they like.

    Now, the big thing, cost. Simple, same as the GPL, a distribution fee. ie you can charge a reasonable fee for the distribution of the database in whole to the user.

    Ahh, but what about accessing records, eg a web database, or phone, whatever. That's fine, you can charge me whatever, that is outside the scope of the license, but what is in scope is that you MUST offer the entire database for a reasonable cost.

    "What!" you cry, "This is no good for me". Fine, then don't use the license, if you want to make money out of something, why are you trying to use a "Free" license?

    The point of the matter is, a "free" database license should not be oriented toward making money. I don't earn a cent from the GPL programs I write. If I wanted to, I could, I'd just use a different license. But I don't, and I want my database of your personal habits to be free as well.

    The minute you try and work out how a company can still make money with this license, you defeat the purpose of it. As I said, you can offer access to the database for whatever price you want, but you must offer the entire database for a reasonable price too. RedHat makes their money by basically selling pretty boxes and support.

    Stop trying to work out how you can make money out of a database, and start working out how you can make it available for all.
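The rule proposed in this comment (charge anything for record-by-record access, but always offer the whole database at a reasonable price) can be sketched as a simple check. This is only an illustration in Python: the class and function names are hypothetical, and the cap on a "reasonable" price is a parameter because the comment does not define one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offering:
    """How a host offers a database (hypothetical model)."""
    per_record_price: float           # any price; outside the license's scope
    full_dump_price: Optional[float]  # None means the full database is not offered


def complies(offering: Offering, reasonable_dump_cap: float) -> bool:
    """True if the whole database is offered at or below the agreed cap."""
    if offering.full_dump_price is None:
        return False
    return offering.full_dump_price <= reasonable_dump_cap
```

Note that `per_record_price` never enters the check, which mirrors the claim that record-level access fees are outside the scope of such a license.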

    --

    I use to have a funny sig, but slash cut it off, and I forgot what the punchline was.
    1. Re:Now lets just think about this for a moment by mindstrm · · Score: 2

      This seems straightforward to me, how you make money, though I realize it gets fuzzier in practice.

      I believe a company should not make money simply by exercising complete control over a set of information (ie: a database). The service they provide me is one of collecting and providing me with said information.

      20 years ago, if you had a 1GB database that I could pay to access online, I wouldn't have had a problem with it. There is no way I could store that kind of information myself anyway.. so you were, in effect, providing a data-warehousing service. The database is just one way to look at it.

      Nowadays, if you have a small database (something I can reasonably fit on my computer), why should I be paying tons of money for accessing individual records, when the whole thing is pretty small anyway?

      I guess what I'm saying is, even though they like to pretend it's the information that it's all about, the real service that has been provided in the past is one of data warehousing, and data sorting; doing what others did not have the resources to do.

  20. Partial disagreement by Ungrounded+Lightning · · Score: 2
    Having said that, I agree with what you're saying. If the knowledge comes entirely from open, public sources, then there does seem to be something unethical about closing the compilation of that knowledge off and keeping it for commercial gain.

    I have no problem with a company databasifying public data and charging for their compilation. Don't like it? Buy a different compilation from a competitor, or get the raw data and databaseify it yourself.

    On the other hand, I have a BIG problem with a company and a government agency cutting a sweetheart deal such that only that ONE company gets to databaseify and sell that agency's public records. (This has happened with both the US Patent Office and the Library of Congress card catalog, though I'm not sure if either exclusive deal is still in effect.)

    --
    Bantam Dominique roosters crow a four-note song. Once you've heard it as "Happy BIRTHday" you can't NOT hear it that way
  21. Maybe the Market has an answer by Wah · · Score: 2

    Look at the IMDB as an example.

    It's not "open" in the sense that it can't be repackaged or repurposed, but it is "open" as far as using it to obtain a wide variety of well-organized and searchable information. Updates, servers, and bandwidth are paid for with mass exposure (advertising).

    Databases are interesting things. I mainly work with radio/tv station db's. It has been determined that the average cost for obtaining a name/address/phone is roughly $7. Appending interesting information costs more money, as do yearly NCOA (National Change of Address) updates, so databases can be expensive, or I should say, used to be. The Internet has changed things significantly (as if you didn't know). Acquisition has dropped (for us) to about $.10 a name.

    Large databases used to (15-20 yrs. ago) require millions of dollars of heavy iron; now a moderately equipped small company can do serious modeling/profiling and apply it effectively (BTW, this is another reason CS majors are pulling heavy $).

    I don't really see how the OS model fits this. Unless you're talking about DB tools. OS development isn't like DB development; collecting/organizing data is different than coding compilers and desktop environments.

    centscents

    --
    +&x
    1. Re:Maybe the Market has an answer by dsplat · · Score: 2

      The IMDB has also protected future access of anyone who contributes by allowing a download of the complete raw data and programs to extract from it. They don't make it particularly obvious to find, but they don't hide it either. They restrict the ways that you can redistribute it, but they provide it complete for unlimited personal use. That has one of the desirable properties of open source: you are not dependent on their continued existence and goodwill for access to the data.

      --
      The net will not be what we demand, but what we make it. Build it well.
  22. YADHP (Yet Another Damned Haiku Post) by Kaufmann · · Score: 2


    More poets in here?
    If it's haikus that you want
    I have got plenty.

    (Here are some...)



    Morning smiles upon
    Post-2K community
    The gods let us live.



    Redmond upon us
    The bloatware makes me shiver
    I fear Win2K.



    Linux is not bad
    Free if time has no value
    Should be preinstalled.



    Pheer the cracker kid
    Chats on AOL all night
    He is true 31337.



    See the Redmond Beast
    Its vapourware is worthless
    One more promise.



    C++ sucks ass
    Although I need my paycheck
    I wish Bjarne was dead.

    --
    To the editors: your English is as bad as your Perl. Please go back to grade school.
  23. Re:What About Slashdot's Database? (API access) by Myddrin · · Score: 2

    Thank you! This is exactly what I am talking about! This is a much better explanation of what I am talking about. Thank you!!!!!!!

    --
    Myddrin