Slashdot Mirror


Ask Slashdot: How To Both Mirror and Protect Crowdsourced Data?

New submitter cellurl writes "I run wikispeedia, a database of speed limit signs. People approach us to mirror our data, but I am quite certain it will become a one-way street. So my question is: How can I give consumers peace of mind in using our data and not give up the ship? We want to be the clearing house for this information, at the same time following our charter of providing safety. Some thoughts that come to mind are creating a 'Service Level Agreement' which they will no doubt reject, or MySQL-clustering, or rsync. Any thoughts, (technically, logistically, legally) appreciated."

76 comments

  1. Be the best and stop trying to "own" data by BitZtream · · Score: 5, Insightful

    You'll only be THE clearing house if you are the best source. Second, it's public data, stop trying to own it, you can't, it's not yours to own in the first place.

    --
    Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    1. Re:Be the best and stop trying to "own" data by dyingtolive · · Score: 2

      This is what I was thinking. It's not an ownership e-peen contest. It's letting people have their one-way streets, realizing it's not the end of the world. Creative Commons Share-Alike it if you will, but I'm not sure there's a better way to do it.

      --
      Support the EFF and Creative Commons. The war is coming, and they're supporting you...
    2. Re:Be the best and stop trying to "own" data by dyingtolive · · Score: 4, Informative

      Well, I mean, the alternative is that you insist that it IS an e-peen contest. If that's what you're going for, then by all means, build an API, license it out, but most importantly, PATENT THE MECHANISM YOU HAVE FOR COLLECTING DATA. Seriously. The more extraneous words you can add in, the better. If you need help on that, just let me know. I have a friend; this guy is amazing. He has this thing called a thesaurus. Neither me nor my MBA friends are entirely sure what exactly it is or does, but we know that when he uses it, it makes RoI improve 23% and IPOs, on average (cause we're professionals) improve by 62%, on average, by volume.



      Seriously though, to anyone reading this, I'm trashed, full of shit, banned from posting on the forums I normally frequent, and too uncoordinated to start an emulator. Do not mod this up. Do not encourage the OP.

      --
      Support the EFF and Creative Commons. The war is coming, and they're supporting you...
    3. Re:Be the best and stop trying to "own" data by Anonymous Coward · · Score: 2, Interesting

      Other sites slurp OpenStreetMap data all the time. No biggie, that's what it's for - if the traffic gets too much they *ask* you to take a mirror to reduce bandwidth costs. OSM has a "share with attribution" kinda licence.

      If you're really wiki-anything, you'll recognise that this is public information that you curate. Let 'em have it.

    4. Re:Be the best and stop trying to "own" data by bugnuts · · Score: 1

      It's a collection of public data, kind of like you know, a dictionary.
      Each individual picture is copyrighted. The collection has an editorial copyright much like encyclopedias.

      And he does actually own the collection. Do you really think databases of public data can't be "owned" (in the non haxorz way)? Better tell Google to stop wasting all that money on street view, which is merely taking pictures of public streets.

    5. Re:Be the best and stop trying to "own" data by Anonymous Coward · · Score: 0

      Do you really think databases of public data can't be "owned" (in the non haxorz way)?

      In the US, collections of facts are not subject to copyright law.

      Better tell Google to stop wasting all that money on street view, which is merely taking pictures of public streets.

      Perhaps the worst possible example you could give. As you say yourself:

      Each individual picture is copyrighted.

    6. Re:Be the best and stop trying to "own" data by queazocotal · · Score: 1

      Op needs a copyright lawyer, and may be out of luck.
      As others have said, openstreetmap is relevant.
      Most important is the existing licence of the database.
      If it does not contain a 'I can change the licence at whim' clause, you're fucked.
      You need approval from every submitter.
      At least for the photos, and for the data too in some countries.
      Read up on why openstreetmap chose the ODbL.

    7. Re:Be the best and stop trying to "own" data by Xacid · · Score: 3, Interesting

      Well that's nice until a facebook comes along to crush the myspace. "Public data" isn't something to be owned. But a specific distribution method or implementation of it can be. Yellowpages anyone?

      If they're trying to make a living off this, there is the real-world factor of keeping this info somewhat secured and then following up with a business model of some sort. Just because it says non-profit doesn't mean everyone works for free.

    8. Re:Be the best and stop trying to "own" data by BitZtream · · Score: 1

      Facebook replaced Myspace because it was a better Myspace.

      Everyone gets replaced eventually, you only lead while you actually have the best product or cheat.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
    9. Re:Be the best and stop trying to "own" data by Xacid · · Score: 1

      Of course, but why give someone else a headstart and piss away all of your efforts?

    10. Re:Be the best and stop trying to "own" data by Anonymous Coward · · Score: 0

      Yes, I wholeheartedly agree. Social darwinism is truth!

  2. Plenty of other sites by cold+fjord · · Score: 3, Insightful

    There are plenty of publicly accessible sites that mirror data from trivial to critical. I would contact a few of them and see what agreements they have in place, if any.

    I would think you would want to make sure they note their data is a mirror, and that updates should be sent to your site. That might be handled by doc files for each file, or some type of about file in each directory. You probably want something like that if for no other reason so as to note metadata.

    I've seen quite a few sites that prefer that you go to a mirror to download actual data.

    --
    much of left-wing thought is a kind of playing with fire by people who don't even know that fire is hot - George Orwell
  3. Find a partner by Kergan · · Score: 1

    Consider teaming up with a seasoned negotiator with good business sense, or hiring an attorney -- or both. If there is any value in your dataset, those who got in touch with you will not reject fees, SLA's, reciprocal updates, etc. It all depends on how much data you have, and how accurate it is.

    On a separate note: your site is dysfunctional on my tablet. I'm left wondering what it's about or how it's supposed to work.

    1. Re:Find a partner by Anonymous Coward · · Score: 0

      On a separate note: your site is dysfunctional on my tablet. I'm left wondering what it's about or how it's supposed to work.

      It also doesn't work on a normal computer. As far as I can tell there are like 4 datapoints available atm. Whoopdiedoo.

      It's amazing to see how people stress out about work being 'stolen' before any work is actually done. Sounds the same as musicians getting together to start a band and then spend weeks figuring out a band name + logo + website before they even wrote down one song...

    2. Re:Find a partner by JaredOfEuropa · · Score: 1

      It might be too late to worry about your work being stolen after you've already done the work and made it public. It's a good idea to sort this out before there's anything to actually steal.

      --
      If construction was anything like programming, an incorrectly fitted lock would bring down the entire building...
    3. Re:Find a partner by Plunky · · Score: 1

      As far as I can tell there are like 4 datapoints available atm. Whoopdiedoo.

      I thought that too since there is nothing showing in my town, but the wikipedia page says the project was started in 2005 and there should be a lot more than that..

      By early 2011 the Wikispeedia database contained 28 million speed limit entries

      Perhaps their database is slashdotted, or the website is just broken? For some reason the map API is slightly different from the normal google-map one..

  4. Thats why API's were invented by muphin · · Score: 4, Insightful

    Create an API and provide an interface through which your client base can work with the data.
    A lot of places out there do this, as the data is considered intellectual property.

    --
    It's not a typo if you understood the meaning!
  5. Examples by buttfuckinpimpnugget · · Score: 1

    Might not translate exactly, but look into how the openbsd project mirrors their stuff. There is the main site, tons of mirrors. Everything is hashed. Grab a mirror, if you don't trust it get the hashes from the main site and check the files. Not sure if it would scale to what you're doing. And what do you mean by 'giving up the ship' exactly?
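    The "get the hashes from the main site and check the files" step could look roughly like the sketch below. It assumes a hypothetical manifest of "hexdigest  filename" lines published by the main site; the file name and manifest layout are made up for illustration and are not OpenBSD's actual format.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse 'hexdigest  filename' lines (assumed layout) into {filename: hexdigest}."""
    manifest = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        digest, name = line.split(None, 1)
        manifest[name.strip()] = digest.lower()
    return manifest


def verify_download(name: str, data: bytes, manifest: dict[str, str]) -> bool:
    """True only if the file is listed in the manifest and its digest matches."""
    expected = manifest.get(name)
    if expected is None:
        return False  # the main site never published a hash for this file
    return hmac.compare_digest(sha256_hex(data), expected)
```

    The point of the design is that only the small manifest has to come from the main site; the bulk data can come from any mirror, trusted or not.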

  6. By license by tlambert · · Score: 1

    License the mirroring only in the event that:

    1. It's visibly acknowledged that you are the source site
    2. updates are either directly sent to you, or are sent to you by the other site within a time limit
    3. All content on your site, including that sent to you by another (mirror) site, be watermarked as belonging to your site. For pictures, this would be a visible watermark on the picture.

  7. Be the best by giorgist · · Score: 3, Interesting

    Be the best
    Make all information free
    Choose a good licence
    Expect to be taken over one day from something better, when that comes along ... help them
    Make it easy for anybody to use your information

    It is counterintuitive but the moment you put up protective barriers, you fall over. The moment you depend on an artificial barrier to protect your lead is the moment you will degrade the quality of your product. Happens every time on products and services that grow on openness and suddenly feel the reason they are good is more so because of their qualities than the openness.

    If you develop a product/service based on a closed environment, that is a different story. It makes business sense to improve your model based on a closed environment until a disruptive product/service comes along.

  8. Not much you can do. by king+neckbeard · · Score: 4, Insightful

    This is a compilation of public data, with the legwork being done by others. You've got no real legal option in protecting the data, at least in regards to the US. You could perhaps try some technical means of controlling the data, but that would greatly reduce the utility. I would also consider it unethical to try and 'own' the results of work done by other unpaid volunteers. If you wish to be the center of this data collection, then make it as useful as possible.

    --
    This is my signature. There are many like it, but this one is mine.
    1. Re:Not much you can do. by mattr · · Score: 0

      Tell that to Lexis-Nexis.

    2. Re:Not much you can do. by drinkypoo · · Score: 1

      That depends on the AUP of the site in question. Mine states (or used to, my legal module may be missing) that comments remain the property of the poster but that I'm granted an irrevocable right of reproduction for any and all purposes. If anyone were uploading original images or other potentially useful data I might want to protect that right.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    3. Re:Not much you can do. by BitZtream · · Score: 1

      You don't pay Lexis-Nexis for the data, you pay them to FIND the data you're looking for. They can claim they own it all day long, doesn't make it actually true.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  9. Protect from what? by ysth · · Score: 3, Interesting

    You want to "Protect...Data", "not give up the ship", "follow...our charter of providing safety". But what is it that you don't want mirrors to do with the data? Less verbiage, more clarity, please.

    1. Re:Protect from what? by Anonymous Coward · · Score: 3, Insightful

      It's fairly obvious to anyone who took 3 seconds to figure out what they are asking.
      They don't want to give their data up only to loose all their user-base to a "mirror". There are several ways around this, probably the easiest is not to share the data.
      However, their data does appear that it could potentially be of great use, especially to anyone who wants to calculate an accurate arrival time when taking a trip. I would recommend keeping the actual data on your server, but providing an external API that allows outside apps to access your data and commit updates. I might even add a disclaimer that a portion of all profits made using your data must go to you. (Even if it is just ads.) Also watch for data mining bots that will "steal" your data via rapidly accessing all data on your server through your interfaces. This is your hard work, you do need protection.
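      One common way to blunt bots that rapidly walk an API is a per-key token bucket. The sketch below is only an illustration of that technique; the class names, the idea of an API key, and the default limits are assumptions, not anything the site is known to use.

```python
class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start with a full bucket
        self.last = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last request, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """Keep one bucket per (hypothetical) API key."""

    def __init__(self, capacity: float = 5, refill_per_sec: float = 1.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets = {}

    def allow(self, api_key: str, now: float) -> bool:
        bucket = self.buckets.get(api_key)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_sec, now)
            self.buckets[api_key] = bucket
        return bucket.allow(now)
```

      In a real service `now` would come from `time.monotonic()`; it is passed in here so the behaviour is easy to test. A limiter like this slows bulk scraping but does not stop a patient scraper with many keys or proxies.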

    2. Re:Protect from what? by Anonymous Coward · · Score: 0

      He doesn't want one of the mirrors to take over the project and basically steal it out from under him, like what Gracenote did to CDDB.

    3. Re:Protect from what? by ysth · · Score: 1

      Err, no, that isn't obvious. Or at least it is in direct conflict with the request that the data be "mirrored", which implies to me a copy which also distributes to the public.

      I'd like to hear more from cellurl to resolve the conflict I see in the request.

    4. Re:Protect from what? by Anonymous Coward · · Score: 1

      They don't want to give their data up only to loose all their user-base to a "mirror".

      It's not their data at all, it's data that was entered by their users.

    5. Re:Protect from what? by Anonymous Coward · · Score: 0

      They don't want to give their data up only to loose all their user-base to a "mirror".

      That's "lose", not "loose." Thanks.

  10. Simple by Anonymous Coward · · Score: 0

    Daily MySQL dump to S3. Done. You're welcome.

    1. Re:Simple by wallyhall · · Score: 1
      [Not so] simple?

      I may be wrong, as the OP didn't mention budget!

      However looking at their site, I'm guessing they're desperate to keep costs to an absolute minimum - correct me if I'm wrong (please), I think the S3 would be potentially quite expensive?

      I *think* the OP is looking for crowd-source solutions, i.e. a way for people to run mirrors themselves whilst maintaining integrity and copyright(s).

      --
      I think therefore I am... a Linux geek.
  11. Blagh by Anonymous Coward · · Score: 0

    This project is terrible and has gotten only marginally better with time. The woman in the video sounds very attractive. She needs a sexy slashdotter to help her out.

  12. Private or Community Mirrors by mattr · · Score: 1

    If it's safety you want, I don't understand why you are trying to get other sites to freely back up your data.
    Get a real backup service and tell people how it's backed up, poof! safety.
    Or if you want to make a community resource you can do like sourceforge, ibiblio, etc, free mirrors that point back to your site.

  13. Don't force it, just make it easy by subreality · · Score: 2

    You don't want to mandate people give you data. That will just get you bad data. Instead, make it as easy as possible for them to do - APIs, easy web forms, any method you can think of that will make the barrier to entry as low as possible. Encourage them to use it, but relax and set your data Free and don't try to force it. It's like Wikis... Somehow it works out OK.

  14. Fully Homomorphic Encryption by Anonymous Coward · · Score: 2, Funny

    Fully Homomorphic Encryption. FHE. See http://en.wikipedia.org/wiki/Homomorphic_encryption#Fully_homomorphic_encryption

  15. Information wants to be free by Anonymous Coward · · Score: 0

    As we keep getting told, information wants to be free.
    Stop being a scummy copyright-enforcer and let people share the information.

    Simple.

    You don't lose anything if others mirror the data, instead you gain an even wider audience, it's all good.

  16. Silly copyright notice by FaxeTheCat · · Score: 4, Insightful

    From the home page "the sign you capture is copyrighted with your name since you found it".

    How on earth can you copyright a speed sign, and even if you could, how can that copyright be relevant to anything?

    The location and speed limit of a speed sign is a fact. How can that be copyrighted? How can it limit the rights of others who observe the sign to publish its location and speed limit?

    If anybody were entitled to copyright a speed sign, it would be the authorities that put it there and who actually own it. How can the location of other peoples property be copyrightable? Looks like somebody took the concept one step too far...

    1. Re:Silly copyright notice by bugnuts · · Score: 0

      Obviously it's the capture that's copyrighted. Certainly it's ambiguously stated, but did you really not understand it?

      And facts can be copyrighted. The sun rising over a meadow is a fact, but a picture or drawing or recorded description of it is copyrighted.

    2. Re:Silly copyright notice by FaxeTheCat · · Score: 1

      Obviously it's the capture that's copyrighted. Certainly it's ambiguously stated, but did you really not understand it?

      As it is a legally binding statement, it needs to be unambiguously stated. Also, if each and every fact is copyrighted by the "discoverer" that would place some severe limitations on the information, as each and every copyright holder would need to accept changes to the use of the data. This means that as an example replicating the data may require a license from all copyright holders. From the posting, it appears that this is not the case, so the copyright has no value whatsoever.

      And facts can be copyrighted. The sun rising over a meadow is a fact, but a picture or drawing or recorded description of it is copyrighted.

      There is a difference between a fact (the sun rising at a certain time at a certain place) and an artistic description of it (picture or written description). It is the artistic description that is copyrighted, not the fact. So your example is not relevant.

    3. Re:Silly copyright notice by philip.paradis · · Score: 1

      The statement itself isn't what's legally binding. Unless explicitly stated otherwise via assignment to the public domain, copyright protection for produced works (such as photographs) is automatic in the United States. As for the rest, you're simply being pedantic, and you got upset when you were called on it.

      --
      Write failed: Broken pipe
    4. Re:Silly copyright notice by FaxeTheCat · · Score: 1

      The statement itself isn't what's legally binding. Unless explicitly stated otherwise via assignment to the public domain, copyright protection for produced works (such as photographs) is automatic in the United States

      As the position and speed limit of a speed sign is not an artistic expression, your post actually supports what I wrote.

    5. Re:Silly copyright notice by Anonymous Coward · · Score: 0

      I've wasted 20 seconds of my life reading this thread. The image you post is copyrighted, by you, unless the website says differently. It has absolutely nothing to do with what is actually pictured. The fact that it is an image of light rays at a particular instant in time, serialized into a stream of bytes that is then interpreted by a computer monitor to reproduce said rays of light on demand, means it is copyrighted.

    6. Re:Silly copyright notice by FaxeTheCat · · Score: 1

      No pictures are posted to the site, so how is your comment relevant?

    7. Re:Silly copyright notice by philip.paradis · · Score: 1

      You're absolutely wrong. Please look up some actual caselaw before continuing to demonstrate your ignorance. I'd invest 15 minutes of my life doing this for you, since you're apparently incapable of doing it for yourself or you presumably would have already done so, but at this point it seems I'm wasting more of my life than is justified by even replying to your post. HAND.

      --
      Write failed: Broken pipe
    8. Re:Silly copyright notice by FaxeTheCat · · Score: 1

      As you seem to have superior knowledge:
      How can the position of a speed sign be copyrighted? How is the position of a speed sign a "produced work"? What is not a fact about it? So far nobody has claimed that a fact can be copyrighted, so is your claim that a fact can be copyrighted?

  17. First poster to use the word "cloud" by Rogerborg · · Score: 2

    Gets a thousand years of bad karma.

    --
    If you were blocking sigs, you wouldn't have to read this.
    1. Re:First poster to use the word "cloud" by bugnuts · · Score: 2

      So if you have a Chinese accent don't repost the title.

  18. Similar to phone directories by beinsvein · · Score: 2

    I've been working with phone directories for a few decades, where many companies are in basically the same position that you are - making a living from public information. Most data is collected from phone companies that dump their customer databases to the phone directory companies. This process and the associated tariffs are regulated by law. This data must be processed and cleaned up before it is passed on.

    Then there are data consumers - in the old days these were people reading the phone books. These days, data consumers are people browsing the web and all sorts of web apps that connect to the phone book through one of several APIs. Most telephone directory companies provide search APIs for their databases - usually not for free.

    Everything is a one-way street, of course. Information flows downstream, money goes upstream. No phone directory company that I know of will voluntarily mirror their database to anyone. Search APIs, yes. Mirrors, no. Phone directories are sometimes distributed to consumers and businesses on CD/DVD, but never without at least an attempt at scrambling and restricting its usage.

    You could probably make a business for a while selling an open, mirrored copy of your database. People will pay for subscriptions. The problem is, any one of your customers could choose to become your competitor at any time. The more successful you are, the more likely someone is to do that. Maybe you can protect yourself legally, but most people prefer to lock their door even in jurisdictions where trespassing is forbidden.

    Competition in your area would be nice for everyone else, your customers as well as your competitors, so as a member of "everyone else" I should say go for it. But you're no dummy. You got your company name posted on slashdot after all!

  19. time delay by roman_mir · · Score: 0

    You want to be the 'clearing house', so you want to be the main provider. If you are in fact that provider right now, then you can put a time delay on all new data that you gather, so that whoever mirrors it from you will get the data, say, 10 days after you collected it.

    Oh, and API.
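    A time delay like this is a simple filter on whatever feeds the mirrors. As a rough sketch, assuming each record carries a hypothetical `collected_at` timestamp (the field name and the 10-day default are illustrative only):

```python
from datetime import datetime, timedelta, timezone


def mirror_view(records, now, delay=timedelta(days=10)):
    """Return only the records old enough to be released to mirrors."""
    cutoff = now - delay
    return [r for r in records if r["collected_at"] <= cutoff]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = [
        {"id": 1, "collected_at": now - timedelta(days=30)},
        {"id": 2, "collected_at": now - timedelta(days=1)},
    ]
    # Only the 30-day-old record is visible to mirrors.
    print([r["id"] for r in mirror_view(sample, now)])
```

    The main site keeps serving everything; only the mirror export path goes through `mirror_view`.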

  20. Darkcloud. by Anonymous Coward · · Score: 0

    Distributed and redundant p2p web server and storage application.

    Everyone is the site and your data! Or at least several dozen hashed pieces of it.

    Information goes in and never gets deleted. Or at least not without the master key, which can be entered from any running copy of the application.
    Use all the tricks of spammers, botmasters, trojan writers, and virus experts.

    With one huge difference. It is opt in.

  21. World's most boring website? by tehcyder · · Score: 0

    What is the point? If you're too blind to read fucking traffic signs, how about not driving?

    --
    To have a right to do a thing is not at all the same as to be right in doing it
    1. Re:World's most boring website? by Anonymous Coward · · Score: 0

      Didn't visit the site, but I've long considered enhancements to navigation systems that know when you're speeding and not guessing based on the class of the roadway. Check out the 5 lights on a modern NASCAR race car.

    2. Re:World's most boring website? by BitZtream · · Score: 1

      Extra alerts at the right times are useful.

      I've seen signs disappear because someone ran over them.

      I've seen signs disappear because kids stole them.

      I've seen trees grow around signs and obstruct the view of the signs.

      These are DoT issues that should be resolved ASAP, but until then it might be useful to know that the 45mph limit dropped to 25 suddenly due to being very near a school for the blind and oh, by the way, the untrimmed bushes grew over the sign. No locals bother to report it because they know the speed limit. My out of town behind on the other hand has no clue because the sign is behind a tree and I don't know I need to report it!

      This isn't a site for telling you where you can avoid speed traps or something, this is a site simply republishing public data in an electronic format.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  22. Affero GPL by Yvanhoe · · Score: 1

    Just like the GPL but also closes the loophole that allows you to use an open source tool in SaaS without giving back. I would investigate this licence. Also, map makers usually put distinctive deliberate mistakes in their maps to prove when data has been copied.
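    The deliberate-mistake trick can be adapted to a database by seeding each mirror's copy with a few fake entries that only that mirror received. The sketch below is one hypothetical way to derive them from a secret key; the record shape `(lat, lon, mph)` and all names are assumptions. For a safety-related dataset the fake entries would have to sit somewhere they cannot mislead a driver, which this sketch does not attempt to guarantee.

```python
import hashlib
import hmac


def canary_records(secret: bytes, mirror_id: str, count: int = 3):
    """Derive fake (lat, lon, mph) entries that are unique to one mirror."""
    records = []
    for i in range(count):
        digest = hmac.new(secret, f"{mirror_id}:{i}".encode(), hashlib.sha256).digest()
        # Map digest bytes onto a latitude, a longitude and a speed limit.
        lat = round(-90 + int.from_bytes(digest[0:4], "big") / 2**32 * 180, 5)
        lon = round(-180 + int.from_bytes(digest[4:8], "big") / 2**32 * 360, 5)
        mph = 5 * (1 + digest[8] % 15)  # a multiple of 5 between 5 and 75
        records.append((lat, lon, mph))
    return records


def find_leak_source(secret: bytes, mirror_ids, suspect_records):
    """Return the mirrors whose full canary set appears in a suspect dataset."""
    suspect = set(suspect_records)
    return [m for m in mirror_ids if set(canary_records(secret, m)) <= suspect]
```

    Because the canaries are derived from the secret, nothing has to be stored per mirror; they can be recomputed whenever a suspicious copy turns up.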

    --
    The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
    1. Re:Affero GPL by BitZtream · · Score: 1

      So basically what you want to do is cause someone else to start a competing site so they can use the data in their own way without being bound by your silly selfishness?

      Putting up barriers to getting the data is a good way to become irrelevant.

      YOU WANT COMMERCIAL USE. Commercial users (some, not all) will commit changes back to the main site just so they don't have to maintain their own distinct database for that purpose. Ask OSM.

      If you disallow others, they'll just start their own collection systems. Sure, you'll have more data at the start, but if everyone finds someone else's database easier to use, they'll use it instead and it'll grow past your copy-hindered version eventually. Of course, then you don't get to be the site everyone goes to anymore.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  23. Assist and Monitor your mirrors by Anonymous Coward · · Score: 0

    Since you stated that the mirrors would not sign an SLA, but you are the one that is concerned about the customer experience,

    (a) make it easy for them to stay up to date with providing a proper API to retrieve updates

    (b) monitor the usage of that API and mail your contacts if they need assistance

    Finally you might want to publish the current state of all mirrors in public. You are trying to open your data, so do not rely on contractual enforcement, but transparency.
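    Points (a) and (b) and the public status page can share one mechanism: number every change, let mirrors ask for "everything after N", and remember the last number each mirror was served. A minimal in-memory sketch, with all names hypothetical:

```python
class ChangeFeed:
    """Numbered change log that also tracks how far behind each mirror is."""

    def __init__(self):
        self._changes = []     # list of (seq, record), seq starts at 1
        self._mirror_seq = {}  # mirror name -> highest seq it has been served

    def publish(self, record):
        """Append a change and return its sequence number."""
        seq = len(self._changes) + 1
        self._changes.append((seq, record))
        return seq

    def updates_since(self, mirror, since):
        """Return every change with seq > since and note what the mirror now has."""
        batch = [(s, r) for s, r in self._changes if s > since]
        head = len(self._changes)
        self._mirror_seq[mirror] = batch[-1][0] if batch else min(since, head)
        return batch

    def status(self):
        """Map each known mirror to the number of changes it is behind."""
        head = len(self._changes)
        return {m: head - s for m, s in self._mirror_seq.items()}
```

    The output of `status()` is what could be published: a mirror that is thousands of changes behind is visible to everyone, with no contract needed.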

  24. "Your browser sucks, upgrade it." by Tim+Ward · · Score: 1

    You could fix that first.

  25. My advice? "Give up" by Anonymous Coward · · Score: 0

    Sorry, this probably isn't what you want to hear, but it sounds like you're wasting your time. Your desire to protect the data simply doesn't make any sense; the best way to protect data is to make it more open. Why are you trying to place restrictions on it? I would forget the project entirely and add maxspeed data to OpenStreetMap. The project is much larger than yours, and has much more impressive maxspeed results.

  26. Interesting, but why? by bwalzer · · Score: 2
    Why would anyone want to know the actual location of speed signs? Normally people want to know the speed of a particular road at a particular place. We already have a fairly popular version of that in a Wiki form:

    OpenStreetMap

    1. Re:Interesting, but why? by PlusFiveTroll · · Score: 1

      There are a number of particular reasons. One I can think of off the top of my head, at least in the state I live in, is that speed limit signs have to be displayed within the rules of state law. If the sign is hidden by trees, too low to the ground, etc., you can get the ticket dismissed in court. Also it can be used in defense when a cop tickets you incorrectly, 60 in a 55, when it actually is a 60.

    2. Re:Interesting, but why? by DerekLyons · · Score: 1

      That was my thought too... What the heck is the point of this exercise? Especially since you're supposed to be paying attention to the road anyhow.

    3. Re:Interesting, but why? by Adm.Wiggin · · Score: 1

      I wanted exactly this so that I could build some kind of app to let me know roughly what the speed limit of the road I am currently driving is, or at the very least, the speed limit sign I most recently passed. I found "Wikispeedia" as the only one really doing it with any degree of success, but the website is absolute garbage, the demos rarely work, and the API is a bleeding joke. There's not a whole lot to "protect" here. Increase quality (both website and data), then worry about whether or not your data is worth keeping safe from other people.

  27. You're doing it wrong. by Anonymous Coward · · Score: 0

    If you're worried mirroring MIGHT be a one way street, you don't understand the concept.

    Mirroring = multiple people hosting "reflections" of the same master data. It's used primarily for large data or files that consume a lot of bandwidth. You have mirrors to reduce the bottleneck of a single host to download from, and the network latency of being further away from the master.

    Mirroring != a collaboration mechanism or a crowdsourcing concept where anyone can contribute to any repo they want. Think (for example) of all the RedHat mirrors out there. They're not SUPPOSED to contain anything not in the master. This would be a PROBLEM with a mirror. People go to the mirror because they trust it's the same data they'd get from the master. And the master won't pick up stuff from the mirrors if they DO get something new in there. If someone wants to contribute, they contribute to the master. The master updates, and those updates flow out to the mirrors via replication so they're all in sync.

    It sounds (from T very terse FA) like you're expecting "mirroring" to allow other people to collect similar data and share it with you? If so, mirroring is a bad choice. So help us out - why do you want this?

  28. torrent-based file system by Anonymous Coward · · Score: 0

    i've always liked the idea of a torrent-based file-system. tfs, if you will, to run on linux's virtual file system. a radical solution that would require major programming, but it would be pretty cool, imo.

  29. Ugh... by Anonymous Coward · · Score: 0

    First, you make an obvious play on someone else's good name (Wikipedia). Then you want to somehow protect public data that you have collected. Sorry man. Not sure what you're trying to accomplish but if it's not getting laughed at and possibly sued you're doing it all wrong.

  30. you should.. by Anonymous Coward · · Score: 0

    Go on Shark Tank.

  31. Speed limit signs? by Anonymous Coward · · Score: 0

    A real contender for the most worthless website on the planet. Ergo, it will catch on and make millions in advertising.

  32. Set it free - P2P data dumps, CPAN for data by Anonymous Coward · · Score: 0

    I've thought about similar problems for years, and you simply cannot restrict the flow of information in the noosphere - there's no practical and ethical way to do it. If someone wants to, they can scrape your data from a myriad of proxies (very easily scripted), and then dump it on BitTorrent. In trying to prevent this, you'd only be hurting the convenience of your site, meaning someone else can build a more convenient site that organizes the same type of data and attract your contributors. By making it easy for people to export your data, you'll make your users more conscious of the value that you're openly providing, and thus more likely to donate. You need to let your babies grow up and move out, and be confident that if you've raised them right they'll appreciate it...

    In other words - release the data under a license that requests attribution, and ideally no other restrictions. You can always use Google / Archive.org / etc to timestamp and prove where the data originated. You can do that and still get all the credit and donations, because anyone trying to plagiarize credit will be easily exposed as fraud. We need to educate users to appreciate those who add value to their free software / data, and ostracize anyone who abuses that openness (like ad-filled Wikipedia mirrors that offer no added value). Search engines are gradually getting better at penalizing moocher sites, and rewarding the sites where the content originally appeared.

    As for the means of distributing data - I don't recommend allowing the general public access to database replication. It's much more effective to do an SQL / JSON / CSV data dump, compress it (PPMd usually offers better and faster text compression than xz), and distribute it over BitTorrent with HTTP fallback. You can do a full data dump every X days, and a partial update every Y hours, depending on your data size. (This also has the advantage of boosting your data proliferation / backup-fu.)
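
    A rough sketch of the full-dump / partial-update idea, under some loud assumptions: the table name `signs`, its columns, and the integer `updated_at` timestamp are all made up for illustration, SQLite stands in for whatever database the site really uses, and xz (via Python's standard `lzma` module) stands in for PPMd because it ships with the standard library. The BitTorrent / HTTP distribution step is left out.

```python
import csv
import io
import lzma
import sqlite3

COLUMNS = ["id", "lat", "lon", "mph", "updated_at"]  # hypothetical schema

def dump_rows(conn, since=None):
    """Return an xz-compressed CSV dump of the hypothetical `signs` table.

    since=None produces a full dump; otherwise only rows whose `updated_at`
    is strictly greater than `since` are included (a partial update).
    """
    query = "SELECT id, lat, lon, mph, updated_at FROM signs"
    params = ()
    if since is not None:
        query += " WHERE updated_at > ?"
        params = (since,)
    query += " ORDER BY id"
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(conn.execute(query, params))
    return lzma.compress(buf.getvalue().encode("utf-8"))

def load_rows(blob):
    """Decompress a dump and return its data rows (header dropped) as string lists."""
    text = lzma.decompress(blob).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))[1:]

# Tiny in-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE signs (id INTEGER PRIMARY KEY, lat REAL, lon REAL, "
    "mph INTEGER, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO signs VALUES (?, ?, ?, ?, ?)",
    [(1, 40.0, -75.0, 25, 100), (2, 40.1, -75.1, 55, 200), (3, 40.2, -75.2, 65, 300)],
)
full = dump_rows(conn)                 # the "every X days" dump
partial = dump_rows(conn, since=150)   # the "every Y hours" update
```

    Note that a timestamp filter like this only captures inserts and edits; deletions would need a separate tombstone list or similar.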

    What I would really like to see is something like pkgsrc / CPAN for data - an automatic way to install and update publicly available data sets into one's local database, with application of diff/patch files in sequence, automatic conversion between RDBMS systems (IMHO, PostgreSQL > MySQL), configurable filtering, BitTorrent downloading with configurable generosity of seeding time / bandwidth, etc. This would lead to a smarter and more reliable Internet, and make it impossible to censor sites like WikiLeaks or The Pirate Bay by making them so much easier to preemptively mirror.
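
    The "apply diff/patch files in sequence" part could look something like the sketch below. The patch format (a dict with `seq`, `upsert` and `delete`) is invented here purely for illustration; no existing tool is implied.

```python
def apply_patches(data, version, patches):
    """Apply patches in ascending `seq` order, starting right after `version`.

    Each patch is a dict: {"seq": int, "upsert": {key: row}, "delete": [key]}.
    Already-applied patches are skipped; a gap in the sequence raises an error
    so a partially updated copy is not mistaken for a current one.
    Returns a new (data, version) pair and leaves the input dict untouched.
    """
    data = dict(data)
    for patch in sorted(patches, key=lambda p: p["seq"]):
        if patch["seq"] <= version:
            continue  # already applied locally
        if patch["seq"] != version + 1:
            raise ValueError(f"missing patch {version + 1}")
        for key in patch.get("delete", []):
            data.pop(key, None)
        data.update(patch.get("upsert", {}))
        version = patch["seq"]
    return data, version

# Hypothetical local copy at version 0, with patches arriving out of order.
local = {1: 25, 2: 55}
patches = [
    {"seq": 2, "delete": [1]},
    {"seq": 1, "upsert": {3: 65}},
]
updated, new_version = apply_patches(local, 0, patches)
```

    Refusing to skip over a missing sequence number is a design choice: it trades convenience for the guarantee that every local copy at version N holds the same data.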

    One interesting analogy is how this principle would apply to Slashdot comments. For a very long time now, I've been just drooling over the idea of scraping all Slashdot comments since time immemorial, parsing it to SQL, and putting a data dump on BitTorrent. There's plenty of room in them terabyte storage arrays, but 40 million comments averaging say 5KB per comment (with RDBMS overhead and indexes) is only 200GB - or about 8GB in a compressed SQL dump. This way I can tag all my "AC" writings and make sure they're preserved, since Googlebot only sees the highest-ranking comments (and, well, my copyfree advocacy isn't very popular), and make all my past writings searchable. But, well, I'm lazy... Hope someone else beats me to it... ;-)
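
    The size estimate above checks out as simple arithmetic, taking the comment count and per-comment size as the rough guesses they are and using decimal units (1 KB = 1,000 bytes):

```python
COMMENTS = 40_000_000        # estimated number of comments
BYTES_PER_COMMENT = 5_000    # 5 KB each, including RDBMS overhead and indexes

raw_bytes = COMMENTS * BYTES_PER_COMMENT
raw_gb = raw_bytes / 1e9             # uncompressed size in GB
implied_ratio = raw_gb / 8           # compression ratio implied by an 8 GB dump
```

    So the 8GB figure assumes roughly 25:1 compression relative to the in-database size.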

    --libman
