Slashdot Mirror


Ask Slashdot: How To Both Mirror and Protect Crowdsourced Data?

New submitter cellurl writes "I run wikispeedia, a database of speed limit signs. People approach us to mirror our data, but I am quite certain it will become a one-way street. So my question is: How can I give consumers peace of mind in using our data and not give up the ship? We want to be the clearing house for this information, at the same time following our charter of providing safety. Some thoughts that come to mind are creating a 'Service Level Agreement' which they will no doubt reject, or MySQL-clustering, or rsync. Any thoughts, (technically, logistically, legally) appreciated."

10 of 76 comments

  1. Be the best and stop trying to "own" data by BitZtream · · Score: 5, Insightful

    You'll only be THE clearing house if you are the best source. Second, it's public data, stop trying to own it, you can't, it's not yours to own in the first place.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    1. Re:Be the best and stop trying to "own" data by dyingtolive · · Score: 4, Informative

      Well, I mean, the alternative is that you insist that it IS an e-peen contest. If that's what you're going for, then by all means, build an API, license it out, but most importantly, PATENT THE MECHANISM YOU HAVE FOR COLLECTING DATA. Seriously. The more extraneous words you can add in, the better. If you need help on that, just let me know. I have a friend; this guy is amazing. He has this thing called a thesaurus. Neither me nor my MBA friends are entirely sure what exactly it is or does, but we know that when he uses it, it makes RoI improve 23% and IPOs, on average (cause we're professionals) improve by 62%, on average, by volume.



      Seriously though, to anyone reading this, I'm trashed, full of shit, banned from posting on the forums I normally frequent, and too uncoordinated to start an emulator. Do not mod this up. Do not encourage the OP.

      --
      Support the EFF and Creative Commons. The war is coming, and they're supporting you...
    2. Re:Be the best and stop trying to "own" data by Xacid · · Score: 3, Interesting

      Well, that's nice until a Facebook comes along to crush the MySpace. "Public data" isn't something to be owned. But a specific distribution method or implementation of it can be. Yellow Pages, anyone?

      If they're trying to make a living off this, there is the real-world factor of keeping this info somewhat secured and then following up with a business model of some sort. Just because it says non-profit doesn't mean everyone works for free.

  2. Plenty of other sites by cold+fjord · · Score: 3, Insightful

    There are plenty of publicly accessible sites that mirror data from trivial to critical. I would contact a few of them and see what agreements they have in place, if any.

    I would think you would want to make sure they note their data is a mirror, and that updates should be sent to your site. That might be handled by doc files for each file, or some type of about file in each directory. You probably want something like that if for no other reason so as to note metadata.
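
    As a rough sketch of the "about file" idea, assuming each mirrored directory carries a small JSON manifest: the function name and field names (`is_mirror`, `upstream`, `send_updates_to`, and so on) are made up for illustration and do not follow any established standard.

```python
import hashlib
import json


def build_mirror_manifest(files, upstream_url, snapshot_date):
    """Return a JSON 'about' document for one mirrored directory.

    files: mapping of file name -> file contents as bytes.
    upstream_url and snapshot_date are supplied by the caller; the field
    names below are illustrative assumptions, not a standard format.
    """
    entries = [
        {
            "name": name,
            "bytes": len(data),
            # A checksum lets users confirm the mirror matches upstream.
            "sha256": hashlib.sha256(data).hexdigest(),
        }
        for name, data in sorted(files.items())
    ]
    manifest = {
        "is_mirror": True,                # state plainly that this is a copy
        "upstream": upstream_url,         # where the authoritative data lives
        "send_updates_to": upstream_url,  # corrections go back to the source
        "snapshot_date": snapshot_date,
        "files": entries,
    }
    return json.dumps(manifest, indent=2, sort_keys=True)
```

    A mirror would write the returned string next to the data files, so anyone who finds the copy can also find the original and check it against the checksums.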

    I've seen quite a few sites that prefer that you go to a mirror to download actual data.

    --
    much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
  3. That's why APIs were invented by muphin · · Score: 4, Insightful

    Create an API and provide an interface through which your client base can access the data.
    There are a lot of places out there that do this, as it's considered intellectual property.
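
    A minimal sketch of what such an interface might look like, assuming a keyed "nearest sign" query. The sample records, the `demo-key` value, and the function names are all hypothetical; a real service would sit behind a web framework and query its own database.

```python
import json
import math

# Hypothetical sample records; a real service would query its database.
SIGNS = [
    {"lat": 40.0000, "lon": -75.0000, "mph": 35},
    {"lat": 40.0100, "lon": -75.0000, "mph": 55},
]
API_KEYS = {"demo-key"}  # hypothetical set of registered client keys


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def handle_nearest_sign(api_key, lat, lon):
    """Return (status_code, json_body) for a nearest-sign query."""
    if api_key not in API_KEYS:
        return 403, json.dumps({"error": "unknown API key"})
    nearest = min(SIGNS, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))
    body = {"mph": nearest["mph"], "lat": nearest["lat"], "lon": nearest["lon"]}
    return 200, json.dumps(body)
```

    Clients get answers to individual questions without ever receiving the whole data set, and the key gives the operator a handle for terms of use.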

    --
    It's not a typo if you understood the meaning!
  4. Be the best by giorgist · · Score: 3, Interesting

    Be the best
    Make all information free
    Choose a good licence
    Expect to be taken over one day by something better; when that comes along ... help them
    Make it easy for anybody to use your information

    It is counterintuitive, but the moment you put up protective barriers, you fall over. The moment you depend on an artificial barrier to protect your lead is the moment you will degrade the quality of your product. It happens every time with products and services that grow on openness and suddenly feel that the reason they are good is their own qualities rather than the openness.

    If you develop a product/service based on a closed environment, that is a different story. It makes business sense to improve your model based on a closed environment until a disruptive product/service comes along.

  5. Not much you can do. by king+neckbeard · · Score: 4, Insightful

    This is a compilation of public data, with the legwork being done by others. You've got no real legal option for protecting the data, at least in the US. You could perhaps try some technical means of controlling the data, but that would greatly reduce its utility. I would also consider it unethical to try to 'own' the results of work done by unpaid volunteers. If you wish to be the center of this data collection, then make it as useful as possible.

    --
    This is my signature. There are many like it, but this one is mine.
  6. Protect from what? by ysth · · Score: 3, Interesting

    You want to "Protect...Data", "not give up the ship", "follow...our charter of providing safety". But what is it that you don't want mirrors to do with the data? Less verbiage, more clarity, please.

    1. Re:Protect from what? by Anonymous Coward · · Score: 3, Insightful

      It's fairly obvious to anyone who took 3 seconds to figure out what they are asking.
      They don't want to give their data up only to lose their whole user base to a "mirror". There are several ways around this; probably the easiest is not to share the data.
      However, their data does appear to be of potentially great use, especially to anyone who wants to calculate an accurate arrival time when taking a trip. I would recommend keeping the actual data on your server, but providing an external API that allows outside apps to access your data and commit updates. I might even add a disclaimer that a portion of all profits made using your data must go to you (even if it is just ads). Also watch for data-mining bots that will "steal" your data by rapidly accessing everything on your server through your interfaces. This is your hard work; you do need protection.
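
      One common way to slow down bots that pull everything through an API is per-client rate limiting. The sketch below uses a token bucket, which is a swapped-in technique and not something the site is known to use; the class name, the per-key bookkeeping, and the example rates are assumptions.

```python
import time


class TokenBucket:
    """Allow short bursts but cap the sustained request rate per client."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second
        self.capacity = capacity   # maximum burst size
        self.clock = clock         # injectable so the limiter can be tested
        self.buckets = {}          # api_key -> (tokens, last_seen)

    def allow(self, api_key):
        """Return True if this client may make a request right now."""
        now = self.clock()
        # New clients start with a full bucket.
        tokens, last = self.buckets.get(api_key, (self.capacity, now))
        # Refill in proportion to elapsed time, never beyond capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[api_key] = (tokens - 1, now)
            return True
        self.buckets[api_key] = (tokens, now)
        return False


# Example: at most 2 requests in a burst, refilling 1 request per second.
limiter = TokenBucket(rate_per_sec=1.0, capacity=2)
```

      A normal app making occasional lookups never notices the limit, while a scraper walking the whole database gets refused once its burst allowance runs out.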

  7. Silly copyright notice by FaxeTheCat · · Score: 4, Insightful

    From the home page "the sign you capture is copyrighted with your name since you found it".

    How on earth can you copyright a speed sign, and even if you could, how can that copyright be relevant to anything?

    The location and speed limit of a speed sign are facts. How can that be copyrighted? How can it limit the rights of others who observe the sign to publish its location and speed limit?

    If anybody were entitled to copyright a speed sign, it would be the authorities that put it there and who actually own it. How can the location of other people's property be copyrightable? Looks like somebody took the concept one step too far...