Controversy Over San Francisco Public Transportation Data
paimin writes "A struggle is breaking out in San Francisco over whether the developer of a publicly-funded installation of real-time tracking for the San Francisco Municipal Transit Agency has a right to control the use of data from the system. The situation is not totally clear, but this sure seems like an attempt to use patent threats to hijack public data. The city paid for the system, and the developer claims he lost money on the deal, so now he's shutting down applications like Routesy and Munitime that use data from the system unless they license the 'copyrighted' data from him."
Just because you lose money doesn't mean you can change the rules after the fact. I guess this guy shouldn't have bid so low to get that contract.
My Photography - http://ian-x.com
The Deathlings (comic) - http://thedeathlings.com
The developer, at least in the linked articles, does not claim that it has lost money on the system. It simply claims to own the data and that it has licensed the exclusive rights (from another private company) to develop with the data. The question becomes, "So, OK, you have paid to develop this data, but why? It is, after all, public data."
It may be bus arrival times in San Francisco today, but this whole notion of data being exclusive property isn't new and isn't going away. And if Bilski stands and ends up partially undermining software patents, then I would hazard a guess that more companies are going to try monetizing the data aggregates and outputs. Even without Bilski, as software becomes more of a commodity market, data and data aggregates will become the value market.
This isn't a new concept. The public pays for scientific research at an institution of higher learning also funded by tax dollars, yet sometimes the only way you could get a copy of the results is to pay for an expensive subscription to a scientific journal, which claims copyright on the published data.
This case probably isn't a good example and the developer trying to be the data gatekeeper is going to lose, but it's only the beginning. There will be more.
That's our life, the big wheel of shit. - The Fat Man, Blue Tango Salvage
I also seem to recall a few occasions of similar stuff where workers' stuff was claimed by their employers. Those also tended to go in favour of the employer, usually especially so because it was stated in whatever contract
There are, however, limits on those kinds of shenanigans. I worked as a developer back in the eighties for an outfit whose employment contract not only entitled them to ownership of any software or intellectual property that I developed while on company time (obviously I had no problem with that) but ANYTHING I did outside of work, even if in a completely unrelated field, for a period of FIVE YEARS after I left their employment. Naturally I refused to sign that little bastard until they fixed it to my (and my attorney's) satisfaction. Even so, I have the feeling there aren't many courts that would have upheld that contract, but I felt it was best to have the worst portions excised.
The place was run by chimpanzees anyway, with a couple of orangutans in the head office. Yeah, it was a game company, and as employers go they made Electronic Arts look good.
The higher the technology, the sharper that two-edged sword.
If the data could be copyrighted, ownership would go to the creator of the data. That would be the city of SF, not the programmer. They created the data with the software they contracted him to produce for them, then they ran their software on their hardware, watching their mass transit movements and recording the results on their computers. The programmer could not own the data because he could not create it. He has no mass transit system with which to do so.
In any case, it is highly unlikely anyone could copyright the data. Copyright requires at least minimal creativity. Data produced automatically requires no creativity. In addition, works produced by the government (i.e., by the public for their own good via their chosen representatives) cannot be copyrighted.
The programmer's actions are likely to be considered by the court (unless he backs down very quickly) blackmail. These days, if the actions threaten public safety, they might even be considered terrorism. Under these charges, even if he backs down the damage is done and he might well be looking at many years in prison. The SF DA could file such charges to scare him as they often do with other charges. But terrorism charges tend to go all the way through once the process is started. To prevent others from trying this stunt, they may well do just this. And I hope they do.
The contract may have given him the right to use the data. There's no doubt in my mind that it did not give him sole use, much less state that he also had sole control over its use. There's no way the SF city attorneys would have allowed that in a contract.
"I may be synthetic, but I'm not stupid." -- Bishop 341-B
Are you sure about that? Maps are databases.
Not in copyright lingo they aren't. Interestingly, maps were one of the original works mentioned in the earliest federal copyright act. "Databases" has a very different meaning; US copyright law has been loath to grant protection to facts themselves. IIRC, the EU does have copyright protection (or some kind of protection) for databases.
"Anyone who [rips a CD] is probably engaging in copyright infringement." - David O. Carson
Even so, I have the feeling there aren't many courts that would have upheld that contract, but I felt it was best to have the worst portions excised.
Better a few billable hours up front than hundreds of billable hours in a court case later.
Is it sad that I am more likely to recognize you and your posts by your sig than your name or UID?
To me the author of the article is deliberately confusing public timetables with transmissions showing the position and expected arrival times of a bus.
If the position and expected arrival time is calculated on the fly, that's more of a service than just pure publicly available data. The condition of this service being provided may be that the data is confidential or restricted to licensees. The provided data is processed in real time using their equipment and code. It's one thing to say "at 2:12pm the bus is 5 miles from transponder 2A, 1/5 mile from 4B and 8 miles from 1E", which is pure statistics (albeit collected from private equipment), but to say "it's just left the anystreet stop and will arrive at noname plaza in 6 minutes in the current traffic conditions" could be seen as editorialising. If you're able to get as much of this information as you want, whenever you want, it then goes beyond fair use too.
An extreme argument of what the author is saying could be this: The fact that Michael Jackson died is public fact, a 400 word article going into the detail of how he died is copyrighted and subject to fair use restrictions. The interesting argument that applies here is, if that same news report was machine generated based on a few facts fed into it and the rest padded out through AI, could you copyright that?
IANAL, but I believe it also depends on how the facts are organized. If it's a simple list in date, time, or alphabetical order, then they generally aren't copyrightable. Being able to copyright that would prevent someone else from collecting the same facts independently and publishing them in an obvious order. If there's some creative addition to the data structure or organization then the database may be copyrightable, while the facts themselves are still not copyrightable.
I.e. someone could publish a database with some creative structure and relationships and you can't copy that and republish with the same structure, but if you just take the facts and organize them some other way you can do that.
Maps are the presentation of data, not the data itself.
If I pay to collect the data & generate a database, that doesn't mean that I can be forced to give the data away. But also, I can't stop anybody else from collecting the data & making their own database. If you don't want to buy it from me, go forth & make your own database.
That's an interesting argument, and it's logical from where you're coming from.
But the copyright law comes at it from a different direction.
If you go to a lot of effort to collect data, that's commendable. In copyright law, that's called "sweat of the brow."
But in copyright law, you can't copyright data that you've collected just by sweat of the brow. It also takes some kind of creativity or innovation or judgment.
That's what the Supreme Court decided in Feist. Phone numbers can't be copyrighted. http://en.wikipedia.org/wiki/Feist_Publications_v._Rural_Telephone_Service
I met the inventor of NextBus some years ago at the Hacker's Conference. What they get from the bus is position, speed, and a few bits of data like the destination sign setting, "doors open" and "wheelchair lift deployed". After much crunching on this data, info like "Next bus at this stop: 6 minutes" comes out. Over time, as more data comes in, the predictions get better. It's a good machine learning problem, because you have actuals; you can tell when the bus eventually gets to the stop, so you have hard data from which to validate the prediction algorithm. You don't even need a map.
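To make the "you have actuals" point concrete, here is a minimal sketch of that feedback loop in Python. It is not NextBus's algorithm, which isn't described here beyond the outline above. The `Report` fields, the `ArrivalPredictor` class, the distance/speed estimate and the smoothing rule are all made up for illustration: a naive estimate gets scaled by a correction factor, and the factor is nudged every time a bus actually reaches the stop.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """One hypothetical vehicle report (field names are assumptions)."""
    distance_m: float  # metres remaining to the stop
    speed_mps: float   # current speed in metres per second


class ArrivalPredictor:
    """Naive distance/speed estimate, scaled by a learned correction factor."""

    MIN_SPEED = 0.5  # floor on speed so a stopped bus doesn't divide by zero

    def __init__(self, learning_rate: float = 0.2) -> None:
        self.learning_rate = learning_rate
        self.correction = 1.0  # multiplier applied to the naive estimate

    def _naive_seconds(self, report: Report) -> float:
        return report.distance_m / max(report.speed_mps, self.MIN_SPEED)

    def predict_seconds(self, report: Report) -> float:
        """Predicted seconds until the bus reaches the stop."""
        return self.correction * self._naive_seconds(report)

    def observe_actual(self, report: Report, actual_seconds: float) -> None:
        """Use a known arrival time to move the correction toward actual/naive."""
        naive = self._naive_seconds(report)
        if naive <= 0:
            return  # bus already at the stop; nothing to learn from
        ratio = actual_seconds / naive
        self.correction += self.learning_rate * (ratio - self.correction)


predictor = ArrivalPredictor()
report = Report(distance_m=1200.0, speed_mps=5.0)
before = predictor.predict_seconds(report)
predictor.observe_actual(report, actual_seconds=360.0)  # the bus ran late
after = predictor.predict_seconds(report)
print(f"before: {before:.0f}s, after one actual: {after:.0f}s")
```

Each late arrival pushes later estimates up and each early one pulls them down, which is the sense in which the predictions improve as more data comes in. A real system would presumably keep separate corrections per route, stop and time of day.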
The early business plan for NextBus had a little dedicated receiver they were going to sell to consumers. The idea was that you have one at home, and it tells you the number of minutes until the next bus gets to the stop near your house, so you know when to leave the house. That was before the World Wide Web; once the Web came along, the receiver wasn't necessary.
Originally, Muni management hated the system, because it was too honest about their bad service. But after much political effort, eventually it was deployed on a few lines, where it was very popular. Then it was put in everywhere.
Muni probably owns the raw data, and NextBus probably owns the predictions. I'm not sure on that, though.
But maps are "facts themselves." Unless, of course, you want to argue that there's "artistry" in deciding what kind of information to put on the map -- but then you could argue that there's exactly the same kind of artistry in deciding what to include in a database!
In other words, if the Copyright Act of 1790 made a distinction between maps and databases, then it was wrong. Either both should be copyrightable, or (better yet) neither should be!
"[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz
You're absolutely right. The boundaries in copyright law are pretty fuzzy when it comes to things like this.
Maps were almost certainly included in the original act purely to appease mapmakers, not for any principled theory of what copyright should or should not apply to. Retrospectively, however, people justify maps by saying that they do involve artistry, and it's not totally unreasonable. You could even argue that more artistry is involved in making a map than in taking a photograph.
The most well-known case dealing with whether a collection of facts can be copyrighted is Feist. But the issue is complex, because while facts themselves aren't copyrightable, enough selection of facts can produce a result that is. What is "enough" is usually determined by a court case (or a settlement).
"Anyone who [rips a CD] is probably engaging in copyright infringement." - David O. Carson
After I moved to SF, I sold my car. No need for it here.
Unless, of course, you ever want to leave SF for any reason. I know there's some decent public transportation to surrounding areas, but it's far from comprehensive, especially after the sun goes down.
Speaking of public transportation, it sure would be nice to get BART up here (I live in Santa Rosa). Fuck you Marin County, fuck you very much.
Knowledge != Intelligence
Ahh, but in this case the data that is being collected is bus location information. The information that everyone wants access to is the bus route timing predictions. Those are created via some complex code that looks at the location information and other parameters and information to produce a prediction of when the bus is to arrive. This information may well be copyrighted as it is not "in plain view" but is generated via some creativity.
Now, if someone else were to generate their own predictions from the actual raw information, that is one thing. But just redistributing the predictions generated by NextBus sounds like taking proprietary information and distributing it without the consent of those who created it.