Source Code & Copyright
cunamara writes "Patently-O has posted a discussion of Aharonian v. Gonzales. Aharonian is trying to build a database of source code as a repository of prior art. The interesting thing is this part of the decision: "Conversely, if plaintiff independently creates software that is functionally identical to other software, he does not infringe any copyright on the other software's source code, even if his independently created source code is nearly identical to the copyrighted source code." Interesting. But how does one defend "nearly identical" independently created source code from a copyright infringement lawsuit?" I'm actually not as interested in the copyright side of things as I am in the notion of using something like that as prior art for software patents. The argument that source code is uncopyrightable could, with some extensions, be applied to almost all, say, fiction stories, since no one's written a truly new story in like five thousand years.
I'm not a big fan of copyright claims over "near-identical" source code. It's almost like copyrighting mathematical equations. The compiler creates a framework designed to achieve predictable results, and whatever results are achieved within that framework aren't the invention of genius but the application of an engineering language. It's clearly wrong to rip off chunks of people's programming and sell it as your own, but if there's proof of a linear progression of programming which achieves a similar function using a similar process within the programming framework, there's no reason the other's work should be thrown out, or licensed against the 'prior artists'. Intellectual property is going to be such a freaking headache if shit like this is allowed to continue.
the argument that source code is uncopyrightable, with some extensions could be applied to almost all, say, fiction stories since no one's written a truly new story in like five thousand years.
The idea is not what is protected under copyright; it is the work itself which is protected. Just because the idea implemented in a story (or computer program, for that matter) has been done before, that does not mean that someone's actual book, movie or videogame is somehow immune from copyright.
Patents, on the other hand... Well, let's not get started on patents...
For US citizens it is important to get organised. FFII has a USA mailing list. Perhaps it might serve as a breeding ground for a US campaign equivalent to the EU campaign effort. Americans are perfect communicators in the field of software patents but lack anti-swpat organisation.
Currently the rest of the world suffers from the American inability to get anti-software-patent interests organised.
Dismissed. The case is now on appeal.
The idea that something may not infringe copyright in spite of the fact that it is nearly identical, is a bit of a stretch. It is true sometimes. For instance, if there is a standard way of doing things then bits of code will be identical. On the other hand, for those bits of code that may be copyrighted, the statement sounds nonsensical. Remember, not all code can be copyrighted. Much/most/all the code SCO claimed was in violation of its (disputed) copyrights is not copyrightable.
For example, there isn't much variation in the ways to code a doubly linked list. If a project in Java needs one, you need to write it yourself, because it isn't in java.util.* yet. With a standard coding style in that language, I've seen quite a few nearly identical implementations turned in for an assignment.
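To illustrate how little room there is for variation, here is a minimal sketch of a doubly linked list. It is written in C rather than Java, to match the other code in this discussion. The names `Node`, `List`, `list_push_back` and `list_remove` are hypothetical, picked for illustration only; most textbook versions would differ from this mainly in naming and error handling.

```c
#include <assert.h>
#include <stdlib.h>

/* One element of the list, linked in both directions. */
typedef struct Node {
    int value;
    struct Node *prev;
    struct Node *next;
} Node;

/* The list itself only tracks both ends and a count. */
typedef struct List {
    Node *head;
    Node *tail;
    size_t size;
} List;

/* Append a value at the tail; returns the new node, or NULL if out of memory. */
Node *list_push_back(List *list, int value) {
    assert(list != NULL);
    Node *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    node->value = value;
    node->next = NULL;
    node->prev = list->tail;
    if (list->tail != NULL) list->tail->next = node;
    else list->head = node;          /* list was empty */
    list->tail = node;
    list->size++;
    return node;
}

/* Unlink a node that belongs to the list, then free it. */
void list_remove(List *list, Node *node) {
    assert(list != NULL && node != NULL);
    if (node->prev != NULL) node->prev->next = node->next;
    else list->head = node->next;    /* node was the head */
    if (node->next != NULL) node->next->prev = node->prev;
    else list->tail = node->prev;    /* node was the tail */
    free(node);
    list->size--;
}
```

Once the two pointer fields are fixed, the four link updates in each function are more or less forced, which is why independent implementations tend to look alike.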
It's about time to stop suing over one snippet of code in a project - there are only so many ways to do the basic tasks. It's how you use the individual lego blocks to build something that counts - if you copy the whole design and claim it as your own, then you deserve to be sued, not for using five white ones to build a wall, as everyone does that.
Google Books seems like an ideal solution to this problem. Of course, I'd talk to Google about it first. Your source code repository would be transformed into book form with the source code as large excerpts and the revision control system providing your chapter introductions. This would force the repository to be something organized and not just a mish-mash of inserted code. Their About page says that they'll show you a couple of pages. I would ask them to restrict the search to showing only the section introduction and 15 lines surrounding the code in question. Google could then wrap an API around it to make it easy to search programmatically.
Then, there's the issue of licensing. This would be, I think, the first legitimate use of the GPL (not the LGPL) for a published document. Google promises to protect the work as a dark search until valid copyrights expire. If you put a hypertext link into each section where the code can be properly licensed (i.e. downloaded), then it works as a prior art repository and as a code reuse archive.
The problem with software patents is not copyright, it's trade secrets. The source code is never released, so no database of prior art can cover any closed source software. The more innovative the algorithms, the more likely they are to be strongly protected as trade secrets, and the less useful a prior art database would be.
Not only that, the source code isn't always a good description of an algorithm, which is why every project I've ever worked on has lots of comments and documentation delivered with it.
So I don't see what building a database of prior art actually achieves! How is it different from the GNU libraries? They're partial coverage of software available in source code form too.
If every bit of code was copyrightable, even a "Hello World" program would be a copyright infringement if it were copied out of a book and posted to the web. In this context, it is easy to see that not everything is eligible for copyright.
Every bit of originally created source code is copyrightable...although in many cases code is copied from a public, common, source, like "Hello World".
For infringement to take place, you need to demonstrate that copying took place, that is, that the accused copier had access to the original and used it. Even if the source code is nearly identical, it does not mean there was infringement. You need to establish the copier had access to, and used, the original to create his copy.
I'm not sure a repository is useful for copyright issues. Those are proving minor, anyhow. For patent issues it would be very powerful, but there is another problem. The USPTO doesn't check outside the application and patent database. That is, if something HAS prior art, but that prior art is not patented or included in the application, then the patent examiner will grant the patent anyway in ignorance. The burden then falls on the holder of the prior art to establish that it is prior art. Which means hiring lawyers, litigating a case, etc. It is a PITA. And this is one of the principal ways the system is borked. Patent examiners have no means by which they can access prior art that is not in the system.
Two decades ago when doing stupid things with neural nets was fashionable in computer science, I built a neural net C compiler. Odd thing is it worked on small programs so I expanded it.
...
Its parser would take code of the form foo=foo+bar; and reduce it to foo+=bar; or other minimal C, then translate it to var1+=var2; and hand that off to the NN compiler. I then ran every bit of C code I could find through it. It's interesting that there were only about 160 (if I remember right) common statements that appeared more than once, and most of them were followed by a very limited subset of other statements.
If you reduced a program another step into:
common_line1;
common_line23;
common_line7;
It ended up that many bits of code were exactly the same in many programs or had very small differences.
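The renaming step described above (foo+=bar; becoming var1+=var2;) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the original parser: the function name `normalize_statement` and the fixed limits are hypothetical, it does not perform the foo=foo+bar to foo+=bar reduction, and it treats every identifier as a variable, so keywords and function names are renamed too.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_IDENTS 32      /* hypothetical limit on distinct identifiers */
#define MAX_IDENT_LEN 64   /* hypothetical limit on identifier length */

/* Rewrite each distinct identifier in src as var1, var2, ... in order of
 * first appearance, copying everything else unchanged.
 * Returns 0 on success, -1 if a fixed-size limit would be exceeded. */
int normalize_statement(const char *src, char *dst, size_t dst_size) {
    char seen[MAX_IDENTS][MAX_IDENT_LEN];
    int seen_count = 0;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0) return -1;

    while (*src != '\0') {
        if (isalpha((unsigned char)*src) || *src == '_') {
            /* Collect one identifier. */
            char name[MAX_IDENT_LEN];
            size_t len = 0;
            while (isalnum((unsigned char)*src) || *src == '_') {
                if (len + 1 >= sizeof name) return -1;
                name[len++] = *src++;
            }
            name[len] = '\0';

            /* Reuse its number if seen before, else assign the next one. */
            int index = -1;
            for (int i = 0; i < seen_count; i++) {
                if (strcmp(seen[i], name) == 0) { index = i; break; }
            }
            if (index < 0) {
                if (seen_count == MAX_IDENTS) return -1;
                strcpy(seen[seen_count], name);
                index = seen_count++;
            }

            int written = snprintf(dst + out, dst_size - out, "var%d", index + 1);
            if (written < 0 || (size_t)written >= dst_size - out) return -1;
            out += (size_t)written;
        } else if (isdigit((unsigned char)*src)) {
            /* Copy a numeric literal whole so that 0x1F is not split. */
            while (isalnum((unsigned char)*src) || *src == '.') {
                if (out + 1 >= dst_size) return -1;
                dst[out++] = *src++;
            }
        } else {
            /* Operators, punctuation and whitespace pass through. */
            if (out + 1 >= dst_size) return -1;
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return 0;
}
```

With this, `foo+=bar;` and `total+=count;` both come out as `var1+=var2;`, which is the kind of collapse that would make unrelated programs look identical line by line.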
The most interesting stat was that most C used fewer than about 100 common statements, but the guys at Bell Labs added about 40 (of which I think Joe Ossanna was responsible for 30 or so) and the BSD guys added about 10. The IOCCC entries didn't change the results, but I don't think the compiler ever got any of them right, even after a cb and an extra reduction step, which says something about their code.
The argument that source code is uncopyrightable, with some extensions could be applied to almost all, say, fiction stories since no one's written a truly new story in like five thousand years.
The difference is that programming languages are usually pretty logical, and to achieve an aim there's usually an obvious and correct way of solving a problem. For example, if I asked a collection of programmers to write a function to sum the elements of an array, it would inevitably look like the following (for C at least).
int sumArray(int array[], int elements) {
    int i, t = 0;
    for (i = 0; i < elements; i++) t += array[i];
    return t;
}
There would be variations but everyone would essentially write the same code.
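As a rough illustration of what such a variation might look like, here is the index-based function next to a pointer-walking version. `sumArrayPtr` is a hypothetical name made up for this sketch, and the early return for a non-positive count is an added assumption; the surface differs but the algorithm does not.

```c
#include <assert.h>
#include <stddef.h>

/* Index-based version, as posted above. */
int sumArray(int array[], int elements) {
    int i, t = 0;
    for (i = 0; i < elements; i++) t += array[i];
    return t;
}

/* Pointer-walking variation: same loop, different spelling. */
int sumArrayPtr(const int *array, int elements) {
    int t = 0;
    if (elements <= 0) return 0;     /* nothing to add */
    assert(array != NULL);
    for (const int *p = array, *end = array + elements; p != end; p++) {
        t += *p;
    }
    return t;
}
```

Both functions visit the elements in the same order and accumulate into one integer, so they return the same result for any valid input.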
When writing literature, writers are restricted by the language too, but natural languages are extremely flexible: the same concepts can be written about and result in a completely different book.
It's the same for any art. The Queen of England has had hundreds of portraits painted and yet they are all very different despite the use of similar materials. Yes, the basic subject is the same, but you cannot say the paintings are the same. Coding is more like photography.
At its extreme, source code is a mathematical description of an algorithm. It's either right or wrong. I can't see how you can copyright it any more than you can copyright 2 + 2 = 4.
What they wrote ended up having large bursts of code that was identical to the IBM PC BIOS. Sometimes there is only one good way of doing something.
Well, this is what I remembered reading years ago. It was an unusual exercise because the actual amount of code was small, so the potential legal cost per byte was very high. If there is someone out there who actually was part of this project, maybe they can post their experiences, and say whether I have got it vaguely right.
Copyright protects the expression of an idea, not the idea itself. It is the expression of the idea which creates value for the copyrighted work. Anyone can write a 4-bar blues progression in a-Major, just don't rip off B.B. King's lyrics or melody while you're doing it. We become richer, intellectually, as a society when creators are forced to think beyond what's already been done, to create their own expression of common cultural ideas, not by letting a bunch of hacks monkey around with things which they would otherwise not be able to create on their own.
I haven't heard of any copyright case that involved a prior art defense.
Normally it's related to patents.
IOW: Person A wrote Program A to do Task A. Person B wrote Program B to do Task A. If the task is the same, chances are very good that the programs will be quite similar.
Now, from point of view of copyright law there are two absolutely different programmes - implementations of probably the same algorithm to solve the Task A. (Competition is good, isn't it?)
But when patents get involved, the picture becomes murkier. If Person A holds a patent on the algorithm of Program A (and since patents by definition "transcend it all" and disregard copyrights), the implementation of Program B, whilst having no relation to Program A or to Person A, is in a legal bind. (Here prior art starts playing a role.)
Copyright protects a person's work. A patent protects a person's idea.
Two people might have come to the same idea (the first to come is entitled to the protection). But how could two people independently make the same work (e.g. book, picture, poem, etc.)? Is it lunacy or what?
Specifically, when applied to software, prior art makes no sense whatsoever. Modern obfuscation tools allow people to mask the original code. Was it stolen or written from scratch? One would never guess. (Obfuscators are normally applied to commercial Java programmes to make reverse engineering harder.)
P.S. In my experience, when two commercial programs have the same piece of code, it usually means that it was lifted from BSD. I have yet to encounter a single example of one software company stealing something from another. The average quality of commercial code is quite low; it's not worth stealing. And when you see clean, well-made code, rest assured: the people behind the code are connected to Open Source. Open Source has to have higher quality, just as in normal life you would try *NOT* to show anybody your dirty undies.
and thank you Hemos for displaying your ignorance on the front page.
This is exactly the crucial difference between copyrights and patents.
A copyright restricts you only from copying the work in question. There is absolutely no restriction on coming up with the same work independently, and using it. Thus like George Harrison's suit mentioned in the sibling post, many copyright suits depend on showing that someone did / didn't have access to the work in question.
A patent on the other hand gives the holder the exclusive right to an invention or idea. Like the other guy who invented the telephone independently of Bell, you will have absolutely no rights to your own invention if it has been previously patented, for the life of the patent anyway.
A defence of independent discovery works for copyright infringements, not for patents. This has always been the case, so I'm not sure why it's news today.