Slashdot Mirror


IP Theft in the Linux Kernel

Søren Schmidt was browsing through the 2.4.10 linux kernel source when he saw something that looked a bit familiar. Too familiar in fact. Søren is the principal developer of FreeBSD's ATA drivers, including FreeBSD's support for ATA RAID cards, and as he looked through the linux/drivers/ide/ files the sense of deja vu was overwhelming. Read on for more.

"They just took my code and filed off the copyright" said Søren. "This is clearest with the two header files hptraid.h and pdcraid.h. Compare these with FreeBSD's ata-raid.h, and just look at the similarities." And it's true that these two header files certainly look like a chopped up copy of the FreeBSD header, after a quick search-and-replace. "The reading of the RAID config from the disks is their own code, but is clearly "inspired" from our code," said Søren, "but that's encouraged by the license. It's the verbatim use of the other code without retaining the copyright that's the problem."

ata-raid.h, and the other files, are copyright Søren, and released under the three-clause BSD license, which includes the restriction "Redistributions of source code must retain the above copyright notice". So using these files, or significant portions of them, in your own code, without retaining the copyright information, as has happened here, is prohibited.

You may be thinking "This is only a couple of header files, what's the big deal?". As Søren says "The problem here is that the structures in the headers is the whole story. That info tells how you read the proprietary struct off the disks, and was reverse engineered and documented by me after a lot of effort." Søren's intellectual property is tied up in those files.
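To make concrete why a header of structure definitions can carry "the whole story", here is a minimal sketch of how such a definition drives the code that reads a configuration block off a disk. Everything in it is hypothetical: the field names, sizes, byte order, and the magic value are invented for illustration and do not describe the real layouts in ata-raid.h, hptraid.h, or pdcraid.h.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL on-disk layout: the fields and the magic value are
 * invented for this sketch and match no real RAID controller. */
#define DEMO_MAGIC 0x44454d4fu /* ASCII "DEMO", an invented marker */

struct demo_raid_conf {
    uint32_t magic;          /* bytes 0-3, little-endian */
    uint8_t  raid_level;     /* byte 4 */
    uint8_t  disk_count;     /* byte 5 */
    uint16_t stripe_sectors; /* bytes 6-7, little-endian */
};

/* Decode a little-endian 16-bit value from raw bytes. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Decode a little-endian 32-bit value from raw bytes. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill *out from a raw sector buffer of at least 8 bytes.
 * Returns 0 on success, -1 if the magic marker does not match. */
int demo_parse_conf(const uint8_t *sector, struct demo_raid_conf *out)
{
    out->magic = le32(sector);
    if (out->magic != DEMO_MAGIC)
        return -1;
    out->raid_level = sector[4];
    out->disk_count = sector[5];
    out->stripe_sectors = le16(sector + 6);
    return 0;
}
```

Once the offsets and meanings are known, the parsing code is nearly mechanical; the hard part is discovering which byte means what.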

Right now, Søren is in discussions with the authors of the Linux ATA drivers (employed by RedHat) to ensure that his copyright notice is returned to these and other files, and to ensure that this situation does not recur. And it is hoped that an amicable solution can be reached.


  1. Er... by Legion303 · · Score: 4, Insightful
    Can someone explain to me *why* a developer would strip off copyright info? It's not like there are licensing fees; the guy just wants his code to be recognized and attributed. It doesn't make much sense to me...could it have been an honest mistake or a coincidence? (I'm not a programmer, so I haven't looked at the two files in question, which would mean nothing to me anyway.)

    -Legion

    1. Re:Er... by JWhitlock · · Score: 5, Insightful
      Can someone explain to me *why* a developer would strip off copyright info? It's not like there are licensing fees; the guy just wants his code to be recognized and attributed. It doesn't make much sense to me...could it have been an honest mistake or a coincidence? (I'm not a programmer, so I haven't looked at the two files in question, which would mean nothing to me anyway.)

      I think it was more of a matter of lazy programming than evil intentions. The header files define structures, a few constants, etc. They encode a bit of knowledge, such as data formats and the meaning of that data, but some people wouldn't consider it code. More of an interface description. Of course, if it was a document describing an interface, then most people would automatically consider the copyright to hold...

      It's a bit like other forms of online "theft". Some folks think that if they download the HTML for a popular site, remove all the text and images, and use the layout on their own site, then it's not theft, because the copyrightable parts (images, text) were removed, and only the framework retained. But, like an HTML framework, headers are the work of the programmer, and any desired copyright should be respected.

      Again, I'm in the "simple mistake, fix it, move on" camp, and would like to add that Red Hat and the rest should add a line to their policy about reusing "open source" code, to retain copyrights.

      If Microsoft did it, I'd expect them to do the same, but Microsoft would probably do it to force the issue, make the EFF take them to trial to define the limits of open source, the BSD license, and the GPL license. That's the difference - this will be taken care of by peers, while Microsoft conflicts almost always involve lawyers. It's the difference between getting rear-ended by an honest citizen (with or without the insurance companies getting involved), vs. an asshole celebrity who thinks the little people should take their licks and not annoy the "important people" with trivial matters like car bills and possible medical expenses.

  2. Sets a good example by jekk · · Score: 5, Insightful
    Please folks, remember this the next time /. posts some story about a violation of the GPL license. Give them a chance, after it's been pointed out, to resolve things peacefully.

    Of course, I wouldn't propose that we allow violations of open source licenses to continue unchecked, just that the opportunity for good faith resolutions be allowed before crying "Boycott!".

  3. Another argument for free software? by melquiades · · Score: 5, Insightful

    Developers give all kinds of reasons for developing free software -- noble spirit, peer respect, etc. -- but one of the big ones is all the shit you don't have to deal with.

    Case in point: there is every reason to think that this author's name will be included with his code in the next release of the Linux kernel source. Think how vastly different this situation would be if this were about theft of proprietary code. Here, nobody's company is at stake, and nobody stands to lose by doing the right thing -- so there are no stupid lawsuits and no hard feelings. At least, I hope it plays out this way ... but the odds are with it.

    Forget all this paranoia about the venomous GPL. Proprietary code has a really, really high cost of ownership; at a certain point, it's just not worth it. Free is just so ... easy. Yay!

  4. Credit must be for the right reasons by z7209 · · Score: 4, Insightful

    Bravo to Soren: he wants credit for the hard work he did. I 100% agree that it should have been given, and it is deplorable that it wasn't.

    I would like to point out though that there is a strong argument that it was precisely that hard work rather than intellectual property that was stolen. Bear with me, and no knee-jerk mods please:

    (1) A structure is just that: a structure. If there is intellectual property there it is in the original designer of the structure.

    If this was a structure in nature (such as the human genome or what have you) then there are plenty of people who disagree with it being anyone's IP at all. Unfortunately, in the wisdom of capitalist democracy some people think that they *own* all of our tomatoes.

    But this isn't nature, and someone did plan and write these structures and deserves credit. And Soren deserves plenty too for figuring it out and giving it to the world.

    (2) You could say that his comments are IP, and that's a pretty strong argument. So perhaps there is more than just good old hard work here. However, it's possible these are just titles of the data structure elements, and titles aren't exactly covered by the same IP standards as other IP.

    Oh well. I don't want to take away from the important work, and certainly nothing from Soren's credit. Just some food for thought.

  5. Hope it was an oversight... by Helmholtz · · Score: 4, Insightful
    I really hope this was simply a stupid oversight. I do think that too often people simply take licenses and plagiarism very lightly. Often high school papers read like a poorly chopped and pasted encyclopedia, and rarely is anything done to curtail this.

    IP is important. Copyright is important. Licensing is important. Unfortunately, defenders of all these things are often cast in a bad light because of a perceived association with other groups who misuse these tools.

    Just my 2c

    --
    RFC2119
  6. Re:And yet... by DarkZero · · Score: 4, Insightful
    There's only a few comments in here right now, but the sentiment seems to be:

    "I'm speechless. This sort of thing shouldn't happen. Give the guy his due credit. Now let's move on."

    If it really *had* been done in Windows, and someone found out, I bet people here would be screaming for blood, waving the evil empire flag, and talking about how only an MS employee would do such a thing.

    I think the main difference here is that we actually have confidence that this problem will be fixed, which is a confidence that we would not have if Microsoft had been the perpetrator. If Microsoft had done it, we'd be out for blood because we'd HAVE to be out for blood in order to get a result. We'd have to be screaming to the heavens to get any form of popular media possible to listen to us, in order to convince Microsoft to do the right thing. Conversely, we trust Linux developers, and we're confident that they'll do the right thing in the end, so we really have no reason to be out for blood.

  7. Re:Jumping to conclusions.. by defile · · Score: 4, Insightful

    I just said that we don't know the full story.

    Any number of things could have happened that led the developer to ultimately violate the BSD license without being aware of it.

    Ruling out the possibility is completely naive. Somehow I don't think stealing BSD code to include into Linux is all that foolproof of a devious plan -- leading me to believe that it's much more likely an accident. What possible motive could he have had?

    Do you really think the developer said to himself "It is clearly worth risking my reputation by violating the easy-to-comply-with BSD license for my own personal gain of giving code away for free!"?

    So yes, 10:1 that this was an accident. I'm not ruling out the possibility of malice, just that it's a lot less plausible.

  8. Hypothetical question by ReelOddeeo · · Score: 4, Insightful
    I haven't looked at the two sources in question, so I can't comment about how "close" they appear to be to each other.

    Suppose Bob writes an open source program. Then along comes John and examines Bob's program, and learns crucial things from it. Such as how the frobulator encoder works. John then writes his own program which has a frobulator encoder, whose concepts are influenced heavily from what he learned by studying Bob's work.

    At what point is John stealing Bob's work?

    • When he studies Bob's source? (Thus carrying away intellectual property in his head! Worse, maybe even violating copyright from inside his brain.)
    • When he uses Bob's concepts? Especially if Bob worked hard to come up with some novel approach. Or if a significant part of Bob's effort was laying out the structure in a particular way?
    • If he uses the same identifiers, or identifier structure as Bob did? (What if John types in his own original code?)
    • If he simply cut&paste's a few lines from Bob's code. (How many? 1 line, 5 lines, 5000 lines?)

    This is a loaded question. (Just like: When does life begin, at conception or birth, or somewhere in between?) Except our question here isn't quite as emotionally charged. (Well, maybe it is for us.)

    Back in 1979, I would help other students with their programs. Sometimes after making sure they understood the algorithm, and were writing the code, we would end up with what basically amounts to my design. Should I just make sure that I use different variable names? Should I introduce frivolous structural changes to the program so the instructor doesn't think someone is cheating? (Of course, I became so notorious with my instructors that this problem never came up -- they knew me well enough.) And the other student did end up actually accomplishing the learning.

    Returning to my above example. Should John make sure to rename the members of the structure? Alter it stylistically? After all, Bob did the hard gruntwork. In some sense Bob should get credit. What if Bob doesn't want to license or give any permission? Can Bob withhold the know how of how the frobulator encoder works -- especially if it is embedded within open source?

    Clearly, the ideal thing would be for John to contact Bob. But this takes time and effort. If John had simply renamed identifiers and altered the style, would an issue ever be raised on Slashdot in the future? (Even if Bob someday examined John's code and noticed the similarity of concepts, if not actual cut&paste lines?)

    And as I first stated, I haven't examined the sources, and this may be a very clear case of cut&paste without any credit given. These questions are intended to be hypothetical. Any resemblance to actual persons or events is purely coincidental and unintentional.
    --

    Those who would give up liberty in exchange for security and DRM should switch to Microsoft Palladium!
  9. Structures *do* have IP! by mperrin · · Score: 4, Insightful
    (1) A structure is just that: a structure.

    Au contraire. Compare the following two snippets of code, taken arbitrarily from one of the other raid header files in the kernel:



    struct m {
        int a;
        int b;
        kdev_t c;
        int d;

        /*
         * State bits:
         */
        int e;
        int f;
        int g;

        int h;
    };

    And:


    struct mirror_info {
        int number;
        int raid_disk;
        kdev_t dev;
        int head_position;

        /*
         * State bits:
         */
        int operational;
        int write_only;
        int spare;

        int used_slot;
    };



    Those are the same exact structure, no? Exact same data types and everything. I even left in the comments. Now, which of those would you rather have to program with? A structure is *not* just a structure; different source codes for the same structure can be of radically different usefulness. There's definitely intellectual property there.
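    The claim that the two definitions above are the same structure to the compiler can be checked mechanically. The following is a minimal sketch, not kernel code: kdev_t is a kernel-internal type, so an unsigned short stands in for it here as an assumption, and layouts_match is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for the kernel's kdev_t so this builds in user space. */
typedef unsigned short kdev_t;

struct m {
    int a;
    int b;
    kdev_t c;
    int d;
    int e;
    int f;
    int g;
    int h;
};

struct mirror_info {
    int number;
    int raid_disk;
    kdev_t dev;
    int head_position;
    int operational;
    int write_only;
    int spare;
    int used_slot;
};

/* Returns 1 if both structs have the same size and every member sits at
 * the same offset, 0 otherwise. */
int layouts_match(void)
{
    return sizeof(struct m) == sizeof(struct mirror_info)
        && offsetof(struct m, a) == offsetof(struct mirror_info, number)
        && offsetof(struct m, b) == offsetof(struct mirror_info, raid_disk)
        && offsetof(struct m, c) == offsetof(struct mirror_info, dev)
        && offsetof(struct m, d) == offsetof(struct mirror_info, head_position)
        && offsetof(struct m, e) == offsetof(struct mirror_info, operational)
        && offsetof(struct m, f) == offsetof(struct mirror_info, write_only)
        && offsetof(struct m, g) == offsetof(struct mirror_info, spare)
        && offsetof(struct m, h) == offsetof(struct mirror_info, used_slot);
}
```

    The check passes because member names never reach the binary layout. Whatever value the second version has lies entirely in the naming, which is the part a human wrote for other humans.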

  10. Re:And yet... by lupercalia · · Score: 5, Insightful

    This is clearly the fault of just one PROGRAMMER.

    Does your boss see all the code you write, and if s/he did would s/he recognize BSD ATA code? Mine sure wouldn't.

  11. Not creative work == not copyrightable by Eric+Seppanen · · Score: 4, Insightful
    While it sure would be nice, and right, to give credit, I'm not convinced that it's legally necessary.

    It seems likely to me that header file structure definitions are a functional description of how a piece of hardware works. And if that's the case, that information is no more copyrightable than the telephone book. And if it's not copyrightable, it's perfectly legal to remove the credits and license and redistribute however you want. Not right, mind you, but legal.

    Looks to me like he's screaming about copyright infringement and/or license violations without understanding the limited scope of copyright.

    --
    314-15-9265
  12. Porting code, copying req'd header info... by pjrc · · Score: 4, Insightful
    A couple days ago, I started some work to port Nullsoft's NSIS Win32 Installer Builder to a native linux app (that builds win32 installers). After converting several HANDLEs into FILE*s and just ifdef'ing out a few difficult bits that I don't care about, I ran into all sorts of constants that get defined somewhere in the giant mess that is #include <windows.h>. Lots of things like MB_OKCANCEL, MB_YESNOCANCEL, SW_SHOWMAXIMIZED, IDCANCEL, HOTKEYF_ALT, FILE_ATTRIBUTE_ARCHIVE, etc.

    After a few grim moments of contemplating actually buying and installing Visual C++, it occurred to me that these things are probably defined somewhere in the mingw stuff. Sure enough, I found them all in various headers within the mingw package. I copied all these (and a bunch of other little win32 kludges) into a win32stuff.h file that I started including in the various .cpp files.

    So did I cross the line? I copied a few dozen lines from various header files in the mingw package (I didn't mention in the file that I got them from the mingw project, but I probably should before I release the port to anyone). Did the mingw guys copy this stuff from somewhere in all the stuff included by #include <windows.h> ??
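    For a sense of scale, a file like the win32stuff.h described above might look roughly like the sketch below. The numeric values are the ones documented for the public Win32 API; the include guard name and the wording of the attribution comment are assumptions for illustration, not taken from the actual port.

```c
/*
 * win32stuff.h (hypothetical sketch)
 *
 * Win32 constants needed to build outside Windows.
 * Assumed attribution note: definitions taken from the mingw
 * project's Win32 API headers; see that package for its terms.
 */
#ifndef WIN32STUFF_H
#define WIN32STUFF_H

/* MessageBox button sets */
#define MB_OKCANCEL            0x00000001
#define MB_YESNOCANCEL         0x00000003

/* Dialog return value */
#define IDCANCEL               2

/* ShowWindow command */
#define SW_SHOWMAXIMIZED       3

/* Hotkey modifier flag */
#define HOTKEYF_ALT            0x04

/* File attribute flag */
#define FILE_ATTRIBUTE_ARCHIVE 0x00000020

#endif /* WIN32STUFF_H */
```

    Each line is a fact about an API rather than a creative choice, but the two-line note at the top costs nothing and settles the attribution question.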

    Ok, I'll admit that a big struct that represents the on-disk format of something that was reverse engineered is a bit more substantial than a bunch of constants... but calling it "IP Theft" seems to be leaping to some strong conclusions. Even if both programmers did their reverse engineering independently, aside from using different names, there aren't a lot of different ways the struct could look. Even if the linux developer did look at the BSD header file to learn the data formats, how different could one expect his code to possibly be ?? If it's an algorithm with some creative implementation, I can see the accusation, but making it over a header file that simply documents simple facts seems a bit much. Sure, it can be hard work to get those facts by reverse engineering, but still, the "IP Theft" is simple facts (not really protected by copyright, in my limited understanding of copyright law... IANAL).

    And finally, if Søren really does hope "an amicable solution can be reached", why's he turning this into a bunch of bad PR for linux and redhat ?? It sounds to me like a case of getting mad and posting flames instead of cooling off for a day and thinking it through more carefully.

    As far as my porting work for Nullsoft's really cool (SuperPiMP) installer, I hit a big block of very win32 specific code, CEXEBuild::do_add_file at the end of script.cpp. Unlike many of the other bits that I ifdef'd out, this is the one that actually puts the files into the install image, so I can't just chop it off. I will need to completely rewrite this using unix/posix APIs, probably using C library regex patterns instead of whatever wildcard matching win32's FindFirstFile does. I'll probably get back to porting NSIS in a week or two... I might even try rebooting and running it in windows a few times! And, I'm not going to lose any sleep over copying a few dozen constants out of someone else's header files.

  13. Re:And yet... by Jabes · · Score: 5, Insightful
    I wonder where Microsoft (or anyone else distributing binary BSD-licensed software) does this. At least I didn't find it in Windows 2000's documentation (both online and offline). I have only the OEM version so my only manual is a quick start guide, but still the notice should be somewhere if Microsoft doesn't break the license.

    I don't know about Windows 2000, but I've got RTM Windows XP here. On the CD in the root directory is a README file. Here's some of it...

    Acknowledgements

    Portions of this product are based in part on the work of Mark H. Colburn and sponsored by the USENIX Association. Copyright © 1989 Mark H. Colburn. All rights reserved.

    This product includes software developed by the University of California, Berkeley and its contributors.

    Portions of this product are based in part on the work of the Regents of the University of California, Berkeley and its contributors. Because Microsoft has included the Regents of the University of California, Berkeley, software in this product, Microsoft is required to include the following text that accompanied such software:

    Copyright © 1985, 1988 Regents of the University of California. All rights reserved.

    Redistribution and use in source and binary forms are permitted provided that the above copyright notice and this paragraph are duplicated in all such forms and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed by the University of California, Berkeley. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

    Portions of this product are based in part on the work of Greg Roelofs. Because Microsoft has included the Greg Roelofs software in this product, Microsoft is required to include the following text that accompanied such software:

    Copyright © 1998-1999 Greg Roelofs. All rights reserved.

    This software is provided "as is," without warranty of any kind, express or implied. In no event shall the author or contributors be held liable for any damages arising in any way from the use of this software.

    Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

    Redistributions of source code must retain the above copyright notice, disclaimer, and this list of conditions.

    Redistributions in binary form must reproduce the above copyright notice, disclaimer, and this list of conditions in the documentation and/or other materials provided with the distribution. All advertising materials mentioning features or use of this software must display the following acknowledgment:

    This product includes software developed by Greg Roelofs and contributors for the book, PNG: The Definitive Guide, published by O'Reilly and Associates.

    Portions of this software are based in part on the work of Hewlett-Packard Company. Because Microsoft has included the Hewlett-Packard Company software in this product, Microsoft is required to include the following text that accompanied such software:

    Copyright © 1994 Hewlett-Packard Company

    Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation. Hewlett-Packard Company makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.

    Portions of this software are based in part on the work of the University of Southern California. Because Microsoft has included the University of Southern California software in this product, Microsoft is required to include the following text that accompanied such software:

    Copyright © 1996 by the University of Southern California. All rights reserved.

    Permission to use, copy, modify, and distribute this software and its documentation in source and binary forms for any purpose and without fee is hereby granted, provided that both the above copyright notice and this permission notice appear in all copies - and that any documentation, advertising materials, and other materials related to such distribution and use acknowledge that the software was developed in part by the University of Southern California, Information Sciences Institute. The name of the University may not be used to endorse or promote products derived from this software without specific prior written permission.

    THE UNIVERSITY OF SOUTHERN CALIFORNIA makes no representations about the suitability of this software for any purpose. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

    Portions of this software are based in part on the work of Luigi Rizzo. Because Microsoft has included the Luigi Rizzo software in this product, Microsoft is required to include the following text that accompanied such software:

    © 1997-98 Luigi Rizzo (luigi@iet.unipi.it)

    Portions derived from code by Phil Karn (karn@ka9q.ampr.org), Robert Morelos-Zaragoza (robert@spectra.eng.hawaii.edu) and Hari Thirumoorthy (harit@spectra.eng.hawaii.edu), Aug 1995

    Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

    Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

    THIS SOFTWARE IS PROVIDED BY THE AUTHORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

    Portions of this software are based in part on the work of Massachusetts Institute of Technology. Because Microsoft has included the Massachusetts Institute of Technology software in this product, Microsoft is required to include the following text that accompanied such software:

    Copyright © 1989,1990 by the Massachusetts Institute of Technology. All Rights Reserved.

    WITHIN THAT CONSTRAINT, permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of M.I.T. not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. M.I.T. makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.

    Under U.S. law, this software may not be exported outside the US without license from the U.S. Commerce department.

    Portions of this software are based in part on the work of Regents of The University of Michigan. Because Microsoft has included the Regents of The University of Michigan software in this product, Microsoft is required to include the following text that accompanied such software:

    Copyright © 1995,1996 Regents of The University of Michigan. All Rights Reserved.

    Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appears in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of The University of Michigan not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission. This software is supplied as is without expressed or implied warranties of any kind.