Controversy Over San Francisco Public Transportation Data
paimin writes "A struggle is breaking out in San Francisco over whether the developer of a publicly-funded installation of real-time tracking for the San Francisco Municipal Transit Agency has a right to control the use of data from the system. The situation is not totally clear, but this sure seems like an attempt to use patent threats to hijack public data. The city paid for the system, and the developer claims he lost money on the deal, so now he's shutting down applications like Routesy and Munitime that use data from the system unless they license the 'copyrighted' data from him."
It's called 'renegotiation'.
Others call it blackmail.
Whatever. The dude's playing a dangerous game.
The question becomes, "So, OK, you have paid to develop this data, but why? It is, after all, public data."
This gets into the contracts and the "data rights" agreements. For example, there are a few different contracts that can be set up even when a government pays a company to develop an application.
No Data Rights: The customer (government) buys the application and can use it as is. The customer gets no detailed information, source code or redistribution rights, just an end product.
Trade-off: The developer charges less for development as they believe they will be able to sell it elsewhere or further develop it as the sole source.
Limited Data Rights: The customer buys the application and has full access to the detailed information, source code, etc. However, it cannot be redistributed for a number of years (say, 5). After that period, the customer has full data rights.
Trade-off: The developer charges slightly more for development, as they will not have a monopoly on the product after a few years.
Full Data Rights: The customer has full access to everything necessary to duplicate and modify the product immediately.
Trade-off: The developer charges more, as they cannot guarantee that they will make any further money off of the product.
It's like professional photographers. It's a picture of you, but if you want a copy, you're going to have to pay for it. If you want the negatives, you're going to have to pay for those as well.
There are further variations that combine these, but this gives you an idea of the three basic types that get modified for an actual contract. From the article, it sounds like NBIS is trying to claim that SF doesn't have the data rights to redistribute the information beyond a specific set of applications/methods. To figure out what the truth is, we would need to read the contracts.
Look at the outfits that monetize NOAA's data: that's public information as well. NOAA was "publishing" this information in a very complicated binary format, and these outfits were making a ton of money converting it for other uses. I remember reading here on Slashdot a couple of years ago that the government was thinking of making weather data available in XML or some other standard format, and that a couple of these outfits went after them in court to try to prevent it (thereby preserving their distribution lock). I don't know what the eventual outcome of that was.
They lost, and they lost rather completely.
Here's a starting point for exploring some of this data. There are probably more places where this data is available from the NWS in very open formats, and I believe more is to come.
http://www.weather.gov/rss/
If the data could be copyrighted, ownership would go to the creator of the data. That would be the city of SF, not the programmer. They created the data with the software they contracted him to produce for them, then they ran their software on their hardware, watching their mass transit movements and recording the results on their computers. The programmer could not own the data because he could not create it. He has no mass transit system with which to do so.
In any case, it is highly unlikely anyone could copyright the data. Copyright requires at least minimal creativity. Data produced automatically requires no creativity. In addition, works produced by the U.S. federal government cannot be copyrighted. That rule does not automatically extend to a city, but the same reasoning (the work was produced by the public for their own good via their chosen representatives) argues against locking this data up.
The programmer's actions are likely to be considered blackmail by the court (unless he backs down very quickly). These days, if the actions threaten public safety, they might even be considered terrorism. Under these charges, even if he backs down, the damage is done and he might well be looking at many years in prison. The SF DA could file such charges to scare him, as they often do with other charges. But terrorism charges tend to go all the way through once the process is started. To prevent others from trying this stunt, they may well do just this. And I hope they do.
The contract may have given him the right to use the data. There's no doubt in my mind that it did not give him sole use, much less state that he also had sole control over its use. There's no way the SF city attorneys would have allowed that in a contract.
To me the author of the article is deliberately confusing public timetables with transmissions showing the position and expected arrival times of a bus.
If the position and expected arrival time are calculated on the fly, that's more of a service than just pure publicly available data, especially if the condition of this service being provided is that the data is confidential or restricted to licensees. The provided data is processed in real time using their equipment and code. It's one thing to say "at 2:12pm the bus is 5 miles from transponder 2A, 1/5 mile from 4B and 8 miles from 1E", which is pure statistics (albeit collected from private equipment), but to say "it's just left the anystreet stop and will arrive at noname plaza in 6 minutes in the current traffic conditions" could be seen as editorialising. If you're able to pull as much of this information as you want, whenever you want, it then goes beyond fair use too.
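To make the distinction concrete, here is a rough Python sketch of "raw fix" versus "derived prediction". Everything in it is made up for illustration: the names RawFix and predicted_minutes_to_stop, the bus ID, and the constant-speed model are my assumptions, not anything taken from the actual tracking system, which presumably uses something far more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawFix:
    """A raw observation (hypothetical shape): where a bus was at one moment."""
    bus_id: str
    seconds_since_midnight: int
    metres_to_next_stop: float


def predicted_minutes_to_stop(fix: RawFix, assumed_speed_mps: float) -> float:
    """A derived estimate: it depends on a modelling choice (a constant speed)."""
    if assumed_speed_mps <= 0:
        raise ValueError("assumed_speed_mps must be positive")
    return fix.metres_to_next_stop / assumed_speed_mps / 60.0


# The fact: at 2:12pm (51120 seconds after midnight) the bus was 1800 m away.
fix = RawFix(bus_id="example-bus", seconds_since_midnight=51120,
             metres_to_next_stop=1800.0)

# The prediction: only exists once someone picks a speed (or a traffic model).
eta = predicted_minutes_to_stop(fix, assumed_speed_mps=5.0)
print(f"Bus {fix.bus_id}: {fix.metres_to_next_stop:.0f} m away, "
      f"estimated arrival in {eta:.0f} min")
```

The RawFix is the same no matter who records it, while the estimate changes with whatever model you plug in. That's the part I'd call a service, though whether that is enough to make it copyrightable is a separate question.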
An extreme version of what the author is saying could be this: the fact that Michael Jackson died is a public fact, but a 400-word article going into the detail of how he died is copyrighted and subject to fair use restrictions. The interesting argument that applies here is, if that same news report was machine generated based on a few facts fed into it and the rest padded out through AI, could you copyright that?
If I pay to collect the data & generate a database, that doesn't mean that I can be forced to give the data away. But also, I can't stop anybody else from collecting the data & making their own database. If you don't want to buy it from me, go forth & make your own database.
That's an interesting argument, and it's logical from where you're coming from.
But the copyright law comes at it from a different direction.
If you go to a lot of effort to collect data, that's commendable. In copyright law, that's called "sweat of the brow."
But in copyright law, you can't copyright data that you've collected just by sweat of the brow. It also takes some kind of creativity or innovation or judgment.
That's what the Supreme Court decided in Feist. Phone numbers can't be copyrighted. http://en.wikipedia.org/wiki/Feist_Publications_v._Rural_Telephone_Service