IP Theft in the Linux Kernel
"They just took my code and filed off the copyright" said Søren. "This is clearest with the two header files hptraid.h and pdcraid.h. Compare these with FreeBSD's ata-raid.h, and just look at the similarities." And it's true that these two header files certainly look like a chopped up copy of the FreeBSD header, after a quick search-and-replace. "The reading of the RAID config from the disks is their own code, but is clearly "inspired" from our code," said Søren, "but that's encouraged by the license. It's the verbatim use of the other code without retaining the copyright that's the problem."
ata-raid.h, and the other files, are copyright Søren and released under the three-clause BSD license, which includes the restriction "Redistributions of source code must retain the above copyright notice". So using these files, or significant portions of them, in your own code without retaining the copyright information, as has happened here, is prohibited.
You may be thinking, "This is only a couple of header files, what's the big deal?" As Søren says, "The problem here is that the structures in the headers is the whole story. That info tells how you read the proprietary struct off the disks, and was reverse engineered and documented by me after a lot of effort." Søren's intellectual property is tied up in those files.
Right now, Søren is in discussions with the authors of the Linux ATA drivers (employed by Red Hat) to ensure that his copyright notice is restored to these and other files, and that this situation does not recur. It is hoped that an amicable solution can be reached.
This is crazy. Linux developers need to give props where props are due.
E.
-
This Post has been brought to you by the letter "E".
-Legion
Of course, I wouldn't propose that we allow violations of open source licenses to continue unchecked, just that the opportunity for good faith resolutions be allowed before crying "Boycott!".
This should really be addressed as a wider issue in the Linux community. While we all place great importance on the 'open-source' movement, we also need to ensure that Linux polices its own code-base and keeps itself in compliance with the GPL, and other license-of-the-week trends.
We must try to validate our work in the eyes of the corporate (and IP-trigger-happy) environment that we are trying to penetrate if we want to get accepted as a viable option.
Hmmm, where will we find this kind of unattributed code violation next? I sure as hell don't want to have Microsoft breathing down my neck because someone recycled proprietary code and invited the bull into the china shop.
food for thought
(caffeine for action)
"If I wanted your input on my pet project, I'd stick my hand up your ass and use you like a sock-puppet." - Muse
Developers give all kinds of reasons for developing free software -- noble spirit, peer respect, etc. -- but one of the big ones is all the shit you don't have to deal with.
Case in point: there is every reason to think that this author's name will be included with his code in the next release of the Linux kernel source. Think how vastly different this situation would be if this were about theft of proprietary code. Here, nobody's company is at stake, and nobody stands to lose by doing the right thing -- so there are no stupid lawsuits and no hard feelings. At least, I hope it plays out this way ... but the odds are with it.
Forget all this paranoia about the venomous GPL. Proprietary code has a really, really high cost of ownership; at a certain point, it's just not worth it. Free is just so ... easy. Yay!
Bravo to Soren: he wants credit for the hard work he did. I 100% agree that it should have been given, and that it is deplorable that it wasn't.
I would like to point out though that there is a strong argument that it was precisely that hard work rather than intellectual property that was stolen. Bear with me, and no knee-jerk mods please:
(1) A structure is just that: a structure. If there is intellectual property there it is in the original designer of the structure.
If this was a structure in nature (such as the human genome or what have you) then there are plenty of people who disagree with it being anyone's IP at all. Unfortunately, in the wisdom of capitalist democracy some people think that they *own* all of our tomatoes.
But this isn't nature, and someone did plan and write these structures and deserves credit. And Soren deserves plenty too for figuring it out and giving it to the world.
(2) You could say that his comments are IP, and that's a pretty strong argument. So perhaps there is more than just good old hard work here. However, it's possible these are just titles of the data structure elements, and titles aren't exactly covered by the same IP standards as other IP.
Oh well. I don't want to take away from the important work, and certainly nothing from Soren's credit. Just some food for thought.
The BSD people did the same thing with the bttv driver. As long as you don't copy verbatim, as NVIDIA did it (they even left the comments in) and claim that this is all your property, you can't really say anything. Some things have to be coded in a certain way, especially drivers. You can't do it differently if you want to access the hardware.
I haven't seen the code segments they are talking about, so I don't know how far the copying goes, but if it doesn't go beyond what is required by the hardware you can't complain too much. If they learned how to access the hardware from the BSD code, they should mention it somewhere, though. Not that people always do that.
Marcus
***Quis custodiet ipsos custodes***
When did Slashdot suddenly become "The Place" to
complain about license and copyright violations?
"Oh my god, it's a license violation! Get
Slashdot on the phone IMMEDIATELY!"
Surprised I haven't seen a "do , or we'll post
about you on slashdot" yet.
In this case, I agree with the author of the
code about getting proper credit for his work
since it was reused - but all of these GPL/
license/embedded linux stories lately are
getting tiring.
BRING BACK THE QUICKIES!
IP is important. Copyright is important. Licensing is important. Unfortunately, defenders of all these things are often cast in a bad light because of a perceived association with other groups who misuse these tools.
Just my 2c
RFC2119
This is a widely recognized fact in the computer book business, where it's known that no matter what license or restrictions you try to apply to your source code that's included with the book, people will treat it as if it's public domain.
The open/free zealots love to speculate in public about how much of their code has been stolen by MS and other big, bad companies, but how many more incidents like this do you think you would uncover if you ran an enormous pattern-matching check against all the open/free projects? My guess is you'd find quite a few.
The kernel with the offending code was released today. It was noticed today. Wait for the response before bundling all your (well founded) anger and firing it at the Linux crowds. I mean seriously, give 'em a chance to respond to the problem before condemning them for it. I suspect this was an honest mistake by everyone except the guy who tried to slip it in.
I for one hope they pull the kernel down now and rework it without the offending code, or don't put it back up until Søren is satisfied with the result.
Ctimes2
My cube. My friend. My solace. My prison.
It's probably been stated, but...
With a name spelled like that, it is quite possible that Søren lives/works outside of the US, and nobody gives two farts about the DMCA where he is.
Also, I believe that the DMCA only outlawed the breaking of protection or encryption or some such.
Jesus was all right but his disciples were thick and ordinary. -John Lennon
Not that it is material to this argument, but how the hell would you know?
Even if you are/were an MS employee, am I to believe you have read all of the source code that is/was Windows and can authoritatively say that it is all properly attributed?
Please!
As you state, involving MS in this is LAME as it has nothing at all to do with them, but so is your defense of MS (especially since it doesn't apply). You probably should have submitted your second paragraph alone.
...yellow number five, yellow number five, yellow number five...
The structures do look similar, and if the Linux headers were copied then I hope they smack the guy responsible and reinstate the copyright notice. If the files were cut-n-paste copied it should be possible to nail this down, and copying something this cut-and-dried is stupid enough to merit a serious LARTing.
OTOH, if you give two programmers the same specs for a data structure and they have to follow the same coding and indentation style, you're likely to get two very similar structures, right down to the names in obvious cases, even if they don't copy each other's work. The fields themselves have to be specific types in a specific order because that's the way it's laid out on disk, the coding style's pretty much fixed by the Linux kernel coding standards, and things like dummy_1, dummy_2 for filler fields are pretty standard (those're what I'd pick without seeing any other code, for example). So how much variation in the structures is actually possible?
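To make that concrete, here is a minimal sketch in C. The format is made up for illustration: struct example_raid_conf, its field names, and its byte offsets are all hypothetical and are not the real HighPoint or Promise layout. It assumes a C11 compiler and a mainstream ABI where these fixed-width fields need no padding.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical on-disk config block, invented for this example.
 * Once the byte offsets are known, the declaration is almost forced:
 * fixed-width types, in on-disk order, with filler for unknown bytes.
 */
struct example_raid_conf {
    uint8_t  filler_1[32];   /* bytes  0..31: unknown */
    uint32_t magic;          /* bytes 32..35 */
    uint32_t dummy_1;        /* bytes 36..39: unknown */
    uint8_t  disk_number;    /* byte  40 */
    uint8_t  raid_level;     /* byte  41 */
    uint16_t dummy_2;        /* bytes 42..43: unknown */
    uint32_t total_sectors;  /* bytes 44..47 */
};

/* Compile-time checks that the struct matches the assumed layout. */
_Static_assert(offsetof(struct example_raid_conf, magic) == 32, "magic offset");
_Static_assert(offsetof(struct example_raid_conf, disk_number) == 40, "disk_number offset");
_Static_assert(offsetof(struct example_raid_conf, total_sectors) == 44, "total_sectors offset");
_Static_assert(sizeof(struct example_raid_conf) == 48, "block size");
```

Anyone working from the same offsets would end up with the same types in the same order; only the names and comments are left to choose.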
For a real-world example, look at any two independent implementations of the CRC32 algorithm. They're probably identical in everything but some variable names and indentation, because there's only one really fast way of writing that algorithm and everybody uses it automatically. Nigh-identical code, no copying done or required to get it.
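For reference, this is roughly what that common form looks like: a sketch of the table-driven CRC-32 with the reflected polynomial 0xEDB88320, as used by zlib and Ethernet. The names crc32_init and crc32_buf are my own for this example and are not taken from any particular project.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Build the 256-entry lookup table; call once before crc32_buf(). */
static void crc32_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[i] = c;
    }
}

/* Compute the CRC-32 of len bytes at buf, one table lookup per byte. */
static uint32_t crc32_buf(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
        crc = crc_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;
}
```

After crc32_init(), crc32_buf("123456789", 9) gives 0xCBF43926, the standard check value for this CRC. Nearly every independent implementation has this same table-building loop and this same inner loop.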
"I'm speechless. THis sort of thing shouldn't happen. Give the guy his due credit. Now let's move on."
If it really *had* been done in Windows, and someone found out, I bet people here would be screaming for blood, waving the evil empire flag, and talking about how only an MS employee would do such a thing.
I think the main difference here is that we actually have confidence that this problem will be fixed, which is a confidence that we would not have if Microsoft had been the perpetrator. If Microsoft had done it, we'd be out for blood because we'd HAVE to be out for blood in order to get a result. We'd have to be screaming to the heavens to get any form of popular media possible to listen to us, in order to convince Microsoft to do the right thing. Conversely, we trust Linux developers, and we're confident that they'll do the right thing in the end, so we really have no reason to be out for blood.
"It was plagiarism -- essentially they took some of Soren's parts (which were free for the taking), filed off the serial numbers, then stamped their own on."
I can't agree with you here. Søren's code was not free for the taking. If it was in the public domain, it would be free for the taking. But this code was not in the public domain, it was distributed under the BSD license. It's only free to use if you abide by the terms of the license!
Imagine the uproar if I went around using GPL'd code in my proprietary applications, as if it was "free for the taking"!
Ok, I have had it. From this point on, I'm going to moderate these "Slashdot hypocrisy" posts as redundant. It has been said a million times, and it's true, but it's not news to anyone. If you have actual insightful comments, make them.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
I just said that we don't know the full story.
Any number of things could have happened that led the developer to ultimately violate the BSD license without being aware of it.
Ruling out the possibility is completely naive. Somehow I don't think stealing BSD code to include into Linux is all that foolproof of a devious plan -- leading me to believe that it's much more likely an accident. What possible motive could he have had?
Do you really think the developer said to himself "It is clearly worth risking my reputation by violating the easy-to-comply-with BSD license for my own personal gain of giving code away for free!"?
So yes, 10:1 that this was an accident. I'm not ruling out the possibility of malice, just that it's a lot less plausible.
Suppose Bob writes an open source program. Then along comes John and examines Bob's program, and learns crucial things from it. Such as how the frobulator encoder works. John then writes his own program which has a frobulator encoder, whose concepts are influenced heavily from what he learned by studying Bob's work.
At what point is John stealing Bob's work?
This is a loaded question. (Just like: when does life begin, at conception, at birth, or somewhere in between?) Except our question here isn't quite as emotionally charged. (Well, maybe it is for us.)
Back in 1979, I would help other students with their programs. Sometimes after making sure they understood the algorithm, and were writing the code, we would end up with what basically amounts to my design. Should I just make sure that I use different variable names? Should I introduce frivolous structural changes to the program so the instructor doesn't think someone is cheating? (Of course, I became so notorious with my instructors that this problem never came up -- they knew me well enough.) And the other student did end up actually accomplishing the learning.
Returning to my above example. Should John make sure to rename the members of the structure? Alter it stylistically? After all, Bob did the hard gruntwork. In some sense Bob should get credit. What if Bob doesn't want to license or give any permission? Can Bob withhold the know how of how the frobulator encoder works -- especially if it is embedded within open source?
Clearly, the ideal thing would be for John to contact Bob. But this takes time and effort. If John had simply renamed identifiers and altered the style, would an issue ever be raised on Slashdot in the future? (Even if Bob someday examined John's code and noticed the similarity of concepts, if not actual cut&paste lines?)
And as I first stated, I haven't examined the sources, and this may be a very clear case of cut&paste without any credit given. These questions are intended to be hypothetical. Any resemblance to actual persons or events is purely coincidental and unintentional.
Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
Don't get me wrong, if they did copy the code and remove the copyright, that's a bad thing(tm).
But Microsoft doing the same thing would be worse.
Taking some open source code and releasing it as open source, forgetting about the credits, is not exactly the same as taking open source code proprietary and not even bothering to mention where it was taken from.
This message is provided under the terms outlined at http://www.bero.org/terms.html
This is not what copyright laws protect. Copyright laws protect "works of authorship", i.e. some kind of individual creation. Facts, such as how some data is organized on disk, and even algorithms are not protected, at least not by copyright law (hence the whole patent issue). (See e.g. Copyright FAQ - What is copyrightable?)
If someone else created a header file from the same information, it would probably look extremely similar. This is a good indication that the header file is not a "work of authorship".
On the other hand, if the author used something - be it code or only information - from Søren, it would at least be fair to give proper credit.
Claus
Au contraire. Compare the following two snippets of code, taken arbitrarily from one of the other raid header files in the kernel:
struct m {
    int a;
    int b;
    kdev_t c;
    int d;
    /*
     * State bits:
     */
    int e;
    int f;
    int g;
    int h;
};
And:
struct mirror_info {
    int number;
    int raid_disk;
    kdev_t dev;
    int head_position;
    /*
     * State bits:
     */
    int operational;
    int write_only;
    int spare;
    int used_slot;
};
Those are the same exact structure, no? Exact same data types and everything. I even left in the comments. Now, which of those would you rather have to program with? A structure is *not* just a structure; different source codes for the same structure can be of radically different usefulness. There's definitely intellectual property there.
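Here is a rough sketch of that claim in compilable form. Everything with "example" in its name is my own stand-in: example_kdev_t replaces the kernel's kdev_t so the snippet builds outside the kernel, and the two can_read_from helpers are hypothetical code written only to show what using each declaration looks like. It assumes a C11 compiler.

```c
#include <stddef.h>

typedef unsigned short example_kdev_t; /* stand-in for kdev_t; an assumption */

struct m {
    int a, b;
    example_kdev_t c;
    int d;
    int e, f, g, h;            /* state bits */
};

struct mirror_info_example {
    int number, raid_disk;
    example_kdev_t dev;
    int head_position;
    int operational, write_only, spare, used_slot;  /* state bits */
};

/* To the compiler the two declarations are interchangeable. */
_Static_assert(sizeof(struct m) == sizeof(struct mirror_info_example),
               "same size");
_Static_assert(offsetof(struct m, e) ==
               offsetof(struct mirror_info_example, operational),
               "same offset for the first state bit");

/* Hypothetical helper: readable without looking anything up. */
static int can_read_from(const struct mirror_info_example *mi)
{
    return mi->operational && !mi->write_only && !mi->spare;
}

/* The same logic against the terse declaration. */
static int can_read_from_terse(const struct m *x)
{
    return x->e && !x->f && !x->g;
}
```

Both helpers compile to the same thing, but only one of them can be reviewed without the struct definition open in another window.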
> In fact, has anyone heard anything from the responsible linux developers about this? It seems they've already been tried and convicted being evil, stealing code and "stripping off copyright". Although the latter might me legally true, I doubt this was their intention.
I don't know about you, but if these were responsible Linux developers, they would have left the header file intact and appended their comments to the file pertaining to the GPL as well. If you strip out information that is requested/demanded not to be removed, then the developers/code stealers are responsible for their actions and must be dealt with accordingly. Re-add the headers or face copyright infringement.
*Headline News* censorship shuts down the Internet! More at 6PM!
This is clearly the fault of just one PROGRAMMER.
Does your boss see all the code you write, and if s/he did would s/he recognize BSD ATA code? Mine sure wouldn't.
Actually it is exactly the same.
Both violate copyright in exactly the same way. You just happen to "like" one way better than another.
Maybe it's me, but something seems kinda strange with this situation. What he seems to be saying is "Reverse engineering is ok, but don't reverse engineer my code."
He didn't say that reverse engineering is okay only if he does it; he's saying that you can't use his reverse engineering, verbatim, without giving him credit for it. In fact, there wouldn't be any issue here at all if the ATA code monkeys in this case reverse engineered it themselves. They didn't do that, apparently.
Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. (Einstein)
Playing fast and loose with contributions (or in this case plagiarized code) is something that Open Source adherents do in the name of speed and "getting something done". The FSF is far more diligent in documenting contributions and making sure that copyright is signed over to prevent things like this from happening.
It seems likely to be that header file structure definitions are a functional description of how a piece of hardware works. And if that's the case, that information is no more copyrightable than the telephone book. And if it's not copyrightable, it's perfectly legal to remove the credits and license and redistribute however you want. Not right, mind you, but legal.
Looks to me like he's screaming about copyright infringement and/or license violations without understanding the limited scope of copyright.
314-15-9265
On the one hand, I feel that Søren deserves the credit for his work. Reverse-engineering is NOT trivial, and has been made much more difficult under recent US legislation (e.g. the DMCA).
On the other hand, it's hard to vilify the Red Hat coders for doing something the license permits, however unethical.
This is one reason I strongly support the GPL. It leaves no grey areas of relicensing and "IP theft".
However, Søren prefers the BSD license. This leaves two choices - debug the license, or accept that things like this can (and probably will) happen, precisely for the reasons I sympathise with him. It IS hard work, and copying is much easier than creating.
My last thought on this is that the notion of "IP theft", with "Open Source" code, should be impossible. We are either all working for a unified goal, or we aren't. If we aren't, then the very notion of "Open Source" is ripped to shreds, and the Bazaar is crushed by the weight of the Cathedral.
On the other hand, if we ARE working to the same ends, then there is no Intellectual Property, and thus there can be no theft of it. If we have abolished private, proprietary notions, in favour of an open, shared environ, then there is no property to steal. There is only a shared resource to access.
And this, dear readers, is the crux of all such arguments. Until people are consistent within their own minds on the issue of ownership, and until some sort of consensus is reached, you can expect the perils of IP to get worse, not better.
It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
After a few grim moments of contemplating actually buying and installing Visual C++, it occurred to me that these things are probably defined somewhere in the mingw stuff. Sure enough, I found them all in various headers within the mingw package. I copied all these (and a bunch of other little win32 kludges) into a win32stuff.h file that I started including in the various .cpp files.
So did I cross the line? I copied a few dozen lines from various header files in the mingw package (I didn't mention in the file that I got them from the mingw project, but I probably should before I release the port to anyone). Did the mingw guys copy this stuff from somewhere in all the stuff included by #include <windows.h>?
Ok, I'll admit that a big struct that represents the on-disk format of something that was reverse engineered is a bit more substantial than a bunch of constants... but calling it "IP Theft" seems to be leaping to some strong conclusions. Even if both programmers did their reverse engineering independently, aside from using different names, there aren't a lot of different ways the struct could look. Even if the Linux developer did look at the BSD header file to learn the data formats, how different could one expect his code to possibly be? If it's an algorithm with some creative implementation, I can see the accusation, but making it over a header file that simply documents simple facts seems a bit much. Sure, it can be hard work to get those facts by reverse engineering, but still, what was "stolen" is simple facts (not really protected by copyright, in my limited understanding of copyright law... IANAL).
And finally, if Søren really does hope "an amicable solution can be reached", why's he turning this into a bunch of bad PR for Linux and Red Hat? It sounds to me like a case of getting mad and posting flames instead of cooling off for a day and thinking it through more carefully.
As far as my porting work for Nullsoft's really cool (SuperPiMP) installer, I hit a big block of very win32 specific code, CEXEBuild::do_add_file at the end of script.cpp. Unlike many of the other bits that I ifdef'd out, this is the one that actually puts the files into the install image, so I can't just chop it off. I will need to completely rewrite this using unix/posix APIs, probably using C library regex patterns instead of whatever wildcard matching win32's FindFirstFile does. I'll probably get back to porting NSIS in a week or two... I might even try rebooting and running it in windows a few times! And, I'm not going to lose any sleep over copying a few dozen constants out of someone else's header files.
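For the wildcard part, one option that avoids regexes entirely is POSIX fnmatch(), which does shell-style matching with * and ?. This is a swapped-in suggestion and only a minimal sketch: matches_wildcard is a hypothetical helper name, and it does not reproduce every FindFirstFile quirk (for one, fnmatch() with no flags is case-sensitive, while Win32 matching is normally case-insensitive).

```c
#include <fnmatch.h>

/* Returns 1 if name matches the shell-style wildcard pattern, 0 otherwise. */
static int matches_wildcard(const char *pattern, const char *name)
{
    return fnmatch(pattern, name, 0) == 0;
}
```

A directory walk with opendir()/readdir() could then call this on each entry name, which is about as close as POSIX gets to a FindFirstFile/FindNextFile loop.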
PJRC: Electronic Projects, 8051 Microcontroller Tools
I don't know about Windows 2000, but I've got RTM Windows XP here. On the CD in the root directory is a README file. Here's some of it...
Generally, such names are viewed as not being creative, and hence creating compatible software is possible. I very much hope your view won't start getting adopted because it would endanger almost all open source efforts.
Hope this situation gets fixed. And I hope this was merely an accident. If it wasn't, then we're facing a problem.
The community cannot possibly get any respect from the world if its members do not respect plain simple licences. We need to obey other people's licensing habits.
"And finally, if Søren really does hope "an amicable solution can be reached", why's he turning this into a bunch of bad PR for linux and redhat ?? It sounds to me like a case of getting mad and posting flames instead of cooling off for a day and thinking it through more carefully...."
And of course /. gets to run an "article" that generates hundreds of inflamed posts.
Nothin' like mass posts to keep the advertising rates up...
t_t_b
I'm on PJ's "enemies" list! Are you?
Read it again - this does not qualify as "work of authorship". It's merely a collation of facts or maybe discovery. There is no copyright on things that you discover. You can't collate a list of songs and call that an original work as much as you can collate a list of phone numbers or cataloging how a particular interface works.
There is no doubt to Soren's claim that he did lots of work, but it's not enough to get a license, and neither does it qualify as IP... he deserves credit for the work of getting it, but bully to actually claiming "original" work in putting together the interface description.
Note that he *did not* invent the interface, he catalogued its behaviour. Whoever invented the interface should get the credit for the IP, the licensing, etc.
This is almost certainly a simple error. The files in the example were obviously chosen for shock value.
They're just a couple of data structures. You have to give the elements meaningful names, and there aren't many choices.
I'm sure the attribution will be fixed. Now, let's not infight, and get on with developing two decent Open Source OSes.
Really? Hrm... seems to be a taint on the "we find and fix all flaws in record time" belief that Open Source advocates hold so dear... maybe there are things that can't be done by the community.
Yes, I know record time isn't the same as instantly.
Writers imply. Readers infer.
What you just referred to deals with software, not source copying. However, if you wanna apply that to this:
2 days ago Linux Kernel 2.4.10 was released. Already this problem has been caught and is being dealt with. I would say that that's pretty dang quick, wouldn't you? Imagine that a closed-source company released some binary files that had code in them that they had "stolen". How long do you think it'd be before that problem was revealed? Definitely not 2 days.
You can mod your friends, you can mod your nose, but you can't mod your friend's nose.
Andre,
I think it is unfair to punish BSD for mistakes
of one pimple-faced undernourished infantile retard
coming by the nick of Nik. He is not representative
of BSD as a whole. I worked with a number of BSD
people and all of them were pretty decent.
-- Pete
As painful as it may be to some people, the truth is that BSD really is dying. This Søren guy realizes it, leading to his pathological rage. It is somewhat understandable, sort of like he bet all his money on black and the wheel came up red. Nevertheless, Søren's public temper tantrum is disgraceful. Søren's duplicity in this manner is a blot on the BSD community at large. Although the BSD community may be in decline with respect to marketshare and user base, it must not be allowed to decline in civility.
BSD is not dying. The KAME IPv6 stack, as integrated into the BSD OSes, is a prime example: it is the reference for how all other OSes will implement IPv6. BSD is already (and always has been) as dead as it is going to get: note the sarcasm.
--- Nothing clever here: move along now...