Slashdot Mirror


Claimed Proof That UNIX Code Was Copied Into Linux

walterbyrd writes "SCO's ex-CEO's brother, a lawyer named Kevin McBride, has finally revealed some of the UNIX code that SCO claimed was copied into Linux. Scroll down to the comments where it reads: 'SCO submitted a very material amount of literal copying from UNIX to Linux in the SCO v. IBM case. For example, see the following excerpts from SCO's evidence submission in Dec. 2005 in the SCO v. IBM case:' There are a number of links to PDF files containing UNIX code that SCO claimed was copied into Linux (until they lost the battle by losing ownership of UNIX)." Many of the snippets I looked at are pretty generic. Others, like this one (PDF), would require an extremely liberal view of the term "copy and paste."

22 of 578 comments

  1. More details and downloadable archive by tomhudson · · Score: 5, Informative
    More details, and a downloadable archive here - because there's no telling how long those files will remain on McBride's blog.

    Also, we find out more about streams, and how SCOsource was bogus.

  2. Re:More details and downloadable archive by tomhudson · · Score: 5, Informative
    For those not logged in who don't see the download url in my sig

    "In a blog post dated July 10th, 2010, Kevin McBride has leaked almost 50 of the code comparisons that were submitted in evidence in SCO vs Novell. You can download the archive.

    Read on to view individual files if you don't want to download the whole thing.

    Linux STREAMS

    We also learned that the whole STREAMS fuss was not about linux, but about a product distributed by gcom, a provider of legacy solutions.

    Their Linux STREAMS (LiS) product provides a couple of loadable drivers that would intercept calls to the old streams api and convert them. In other words, far from the allegations that the linux kernel contained code that infringed streams, it's evident from the need of an add-on loadable module that the linux kernel does not contain any STREAMS code.

    Of particular note, and probably a source of much consternation to SCO and their proponents, is that LiS itself doesn't implement streams either, just does protocol translation. So neither linux nor LiS contains infringing code.

    The whole end-user $699 license was a scam

    In my view, contract violations by IBM would not result in liabilities by other Linux users.

    So according to Kevin McBride, one of the lawyers who worked on the case, there was no reason for end users to take out a license. It's logical to conclude that SCOsource was a protection scam. So what happened? To me, it looks like SCO lawyer-shopped until they found attorneys who were willing to go along with the scheme for a price - everyone has their price, and in this case, it was $30,000,000.00.

    The Appeal of SCO's loss to Novell - Novell will probably win.

    Will Novell win the current SCO appeal? Probably. Will Novell donate the UNIX copyrights to the Linux community if it wins the current appeal? Probably - although Novell's Linux activities have been difficult to predict in recent years.

    So it's pretty much as we suspected all along.

  3. libelf!?! by Dahamma · · Score: 5, Informative

    I actually find it ironic that libelf was picked as an example of infringement. I can tell you first hand that the (more standard) UNIX/Solaris libelf is NOT compatible with the Linux/libc libelf. And I can also tell you that after pointing this out to Ulrich Drepper he really didn't give a shit... (I think his approximate words were "It's been like that for a while, too late, I won't change it").

    Their only mistake was actually naming it "libelf"... since it is most definitely NOT the same library...

  4. Re:What's so liberal about it? by tsalmark · · Score: 4, Informative

    You're not a programmer, are you? The header files are pretty much just a bunch of definitions. There is no programming to speak of in either of those files. From reading the POSIX standards you will end up with the same code but with your own comments. The header files are a lot like the ingredients section of a recipe. So it's like looking at two recipes for omelets then complaining that both have eggs listed in the ingredients section.

  5. Re:What's so liberal about it? by mewyn · · Score: 5, Informative

    It's a header file for a standardized interface. All this stuff needs to be the same for any *NIX-like operating system to be *NIX-like, otherwise, you're making an incompatible operating system. To make source-compatible operating systems you need to have common interfaces, and those interfaces lie in the header files. Saying that this is copyright infringement is like saying that they patented a hole in the wall as a way of getting in and out of a room.

  6. Re:What's so liberal about it? by Anonymous Coward · · Score: 5, Informative

    In case you are not trolling, virtually all of the allegedly copied code is boilerplate stuff defining types and structs or function interfaces. These have to be the same for Linux to be POSIX compatible. The little actual code there is isn't similar at all. Copyright can't keep you from writing a function that acts like another; that is what software patents are for. There has to be actual copying, and for such tiny functions it would be pretty hard to demonstrate that (!s) was copied from (s==NULL).
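
    To make the (!s) versus (s==NULL) point concrete, here is a minimal C sketch of two routines that behave identically but were written in different styles. The function names are hypothetical and do not come from the SCO exhibits; they only illustrate how little room a tiny function leaves for distinctive expression.

```c
#include <assert.h>
#include <stddef.h>

/* Style A: tests the pointer with the ! operator and indexes the string. */
size_t safe_len_a(const char *s)
{
    size_t n = 0;
    if (!s)
        return 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Style B: compares against NULL explicitly and walks a second pointer. */
size_t safe_len_b(const char *s)
{
    const char *p = s;
    if (s == NULL)
        return 0;
    while (*p)
        p++;
    return (size_t)(p - s);
}

/* Both styles must agree on every input, including a null pointer. */
void check_equivalent(const char *s)
{
    assert(safe_len_a(s) == safe_len_b(s));
}
```

    Calling check_equivalent("elf") or check_equivalent(NULL) passes silently: the two bodies differ in spelling but not in behavior.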

    I still think the jury, knowing nothing about computers, would have ruled against Linux, but the claims were ridiculous.

  7. It's only a header file by DrJimbo · · Score: 4, Informative
    From Sega v. Accolade:

    Computer programs pose unique problems for the application of the "idea/expression distinction" that determines the extent of copyright protection. To the extent that there are many possible ways of accomplishing a given task or fulfilling a particular market demand, the programmer's choice of program structure and design may be highly creative and idiosyncratic. However, computer programs are, in essence, utilitarian articles -- articles that accomplish tasks. As such, they contain many logical, structural, and visual display elements that are dictated by external factors such as compatibility requirements and industry demands... In some circumstances, even the exact set of commands used by the programmer is deemed functional rather than creative for the purposes of copyright. When specific instructions, even though previously copyrighted, are the only and essential means of accomplishing a given task, their later use by another will not amount to infringement.

    It is nearly impossible to win a copyright suit over a header file. The only chance you would have would be if it was a straight copy-and-paste which this was clearly not. The reason for this is that there is just not much room for creative expression in header files. Likewise, you can't copyright a word or a short sentence.

    There are a very limited number of ways to declare functions. If someone was allowed to copyright certain function declarations then they would have control over a large segment of the software industry. Likewise, if someone was allowed to copyright particular words, they would have control over a segment of the publishing industry.

    --
    We don't see the world as it is, we see it as we are.
    -- Anais Nin
  8. Re:What's so liberal about it? by SpazmodeusG · · Score: 4, Informative

    No, read the POSIX interface standard (or in this case specifically the ELF executable standard).
    You have to give your functions certain names to be compliant with the specification. The code shown is interface code, the implementation is somewhere else. Interface code simply names the functions, parameters and variables. As the functions must have certain names and parameters to fit the standard you will get the exact same line that declares a function. Any C programmer could recreate that same block of code with just a list of function names and parameters that must be declared.

    e.g. If you have to have a global function called elf_version with return of unsigned int and parameter of the version you'll get the line
    extern unsigned int elf_version( unsigned int __version );

    We see that same line of code in both files as they both implement the same specification. I'm sure there's a ton of other UNIXes out there that have the same line of code.
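
    The interface/implementation split described above can be sketched in C. The declaration is the one quoted in the comment; the function body is a stand-in written only for illustration and is an assumption, not the logic of any real libelf, and elf_version_demo is a hypothetical caller.

```c
#include <assert.h>

/* Interface: the name, return type and parameter type are fixed by the
 * specification, so every compatible header ends up with this same line. */
extern unsigned int elf_version(unsigned int version);

/* Implementation: free to differ between libraries. This body is a toy
 * stand-in (an assumption), not taken from any real libelf. */
static unsigned int current_version = 1;

unsigned int elf_version(unsigned int version)
{
    unsigned int previous = current_version;
    if (version != 0)          /* 0 is treated here as "query only" */
        current_version = version;
    return previous;
}

/* A caller compiles against the declaration alone and never sees the body. */
void elf_version_demo(void)
{
    unsigned int before = elf_version(0);
    assert(elf_version(0) == before);   /* querying does not change state */
}
```

    Two vendors could replace the body with entirely different code and callers would not notice, yet the first declaration would still match character for character.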

  9. Disbarment? Jail time? by DrJimbo · · Score: 5, Informative

    This code was the last big unknown in this long sorry saga. Even if SCO owned the copyrights (and hadn't distributed it under the GPL, and hadn't signed the UnitedLinux agreement, etc.), it is now crystal clear that SCO's Microsoft-funded anti-Linux campaign was based on a stack of frivolous lawsuits.

    I think Darl's brother is scrambling to cover his backside so that when the disbarments and criminal charges come down, he has a chance to escape.

    Groklaw (of course) has IBM's response to SCO's claims that these paltry examples are worth BILLIONS of dollars in copyright damages. None of the code they offered is protectable under copyright law. Some of it is BSD code that everyone is free to use however they want (if they include the copyright notice). A lot of it is header files that were not copy-and-pasted which are nearly impossible to protect under copyright law. Then they have some snippets of generic code. Given the size of the source code for Linux, it would be astounding if there weren't some similar snippets. The idea that this is proof that Linux violated any Unix copyrights is totally absurd. The idea that these generic snippets are what made Linux enterprise-ready is beyond insane.

    The recent SCO v. Novell case decided that SCO never even owned the copyrights it was suing about. And then instead of the millions of lines of code they claimed were infringing, they presented this meager collection of totally unprotectable snippets. I sure hope SCO's lawyers get severely punished for perpetrating this fraud on the court for the past seven years.

    --
    We don't see the world as it is, we see it as we are.
    -- Anais Nin
  10. Re:variable names and data structures. by johnmoe · · Score: 4, Informative
  11. Re:oops I meant 331 not 251 by tomhudson · · Score: 4, Informative

    re 331: It's from BSD

    Man Pages
    Manual Reference Pages - ELF (5)

    NAME
    elf - format of ELF executable binary files

    CONTENTS

    Synopsis
    Description
    See Also
    History
    Authors

    SYNOPSIS

    #include <elf.h>

    DESCRIPTION

    The header file <elf.h> defines the format of ELF executable binary files. Amongst these files are normal executable files, relocatable object files, core files and shared libraries.

    An executable file using the ELF file format consists of an ELF header, followed by a program header table or a section header table, or both. The ELF header is always at offset zero of the file. The program header table and the section header table's offset in the file are defined in the ELF header. The two tables describe the rest of the particularities of the file.

    Applications which wish to process ELF binary files for their native architecture only should include <elf.h> in their source code. These applications should need to refer to all the types and structures by their generic names "Elf_xxx" and to the macros by "ELF_xxx". Applications written this way can be compiled on any architecture, regardless whether the host is 32-bit or 64-bit.

    Should an application need to process ELF files of an unknown architecture then the application needs to include both <sys/elf32.h> and <sys/elf64.h> instead of <elf.h>. Furthermore, all types and structures need to be identified by either "Elf32_xxx" or "Elf64_xxx". The macros need to be identified by "ELF32_xxx" or "ELF64_xxx".

    Whatever the system's architecture is, it will always include <sys/elf_common.h> as well as <sys/elf_generic.h>.

    These header files describe the above mentioned headers as C structures and also include structures for dynamic sections, relocation sections and symbol tables.

    ...

    [snippage]

    ...

    HISTORY

    The ELF header files made their appearance in FreeBSD 2.2.6. ELF in itself first appeared in AT&T System V. The ELF format is an adopted standard.

    This is the problem with SCO's case - OldSCO/Caldera only could have gotten what Novell originally had to give, if Novell HAD assigned copyrights to OldSCO. A lot of the stuff was from BSD.
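
    The man page excerpt quoted above notes that the ELF header always sits at offset zero. As a rough illustration of what a published format pins down, the C sketch below checks a buffer for the four identification bytes that open every ELF file (0x7f, 'E', 'L', 'F'). The helper name looks_like_elf is hypothetical; only the magic bytes come from the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf begins with the ELF magic bytes,
 * 0 otherwise. buf is expected to hold the first len bytes of a file. */
int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(buf != NULL || len == 0);   /* a null buffer must be empty */
    if (len < sizeof magic)
        return 0;
    return memcmp(buf, magic, sizeof magic) == 0;
}
```

    Any independent implementation of such a check has to compare against the same four bytes, because the format leaves no alternative.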

  12. Re:What's so liberal about it? by SpazmodeusG · · Score: 4, Informative

    It's not even that. It's plain old rewriting a library to remain compatible.
    Here's an example of some end-user programs that use those very enumerations. The ELF_Type enumeration is used on page 37 in an end user application and ELF_T_WORD value is assigned to it on page 45.
    http://elftoolchain.sourceforge.net/for-review/libelf-by-example-20100112.pdf

    There's no coincidence involved. If you write applications that use the ELF_Type enumeration and you decided to write a new elf library to support that app you'd end up having the same enumeration names to maintain compatibility.

    Copyright allows you to recreate something that's compatible as long as it isn't copied directly.

  13. Re:Shocking by physicsdot · · Score: 4, Informative

    How dare they copy/paste those blank lines!

    Just in case you thought you were kidding: http://www.mcbride-law.com/wp-content/uploads/2010/07/Tab-422.pdf

    Line 22 is blank, and is indicated as being copied.

  14. To Bogosity and beyond! by DrJimbo · · Score: 4, Informative
    The courts have established that in order to determine software copyright infringement (for non-literal copying, which is what we have here, though filtration is required even for literal copying), one must perform what is called the Abstraction, Filtration, Comparison Test. In court documents related to the code in question, SCO admitted they did not perform this test on this code. They claimed that that was IBM's job. The article linked to above explains the test:

    1. break down the plaintiff’s program into its constituent structural parts (“abstraction”);

    2. examine each part for incorporated “ideas,” elements taken from the public domain, methods of operation, processes or procedures, or otherwise unprotected material (“filtration”); and

    3. compare the remaining kernel of creative expression, if any, to the work alleged to infringe at each level of abstraction (“comparison”).

    They further explain:

    The scenes à faire doctrine is often applied in software cases because it is frequently impossible to write a program in a particular computing environment without employing certain standard programming techniques and design elements. This is because certain functions, data elements, and the order of operation of a program can be dictated by such things as the type of computer on which the program will run, the programming language used, the operating system environment, governmental requirements, industry demands and standards, and widely accepted programming practices.

    I suspect the reason SCO didn't filter this code is because if they did, there would be nothing at all left to present to the court as their fig leaf to avoid being charged with perpetrating a fraud on the court.

    --
    We don't see the world as it is, we see it as we are.
    -- Anais Nin
  15. Re:Disbarment? Jail time? by UnknowingFool · · Score: 4, Informative

    In Gates v. Bando the Tenth Circuit established the abstraction-filtration-comparison test that would become the standard in software copyright infringement. Specifically in the filtration step, all elements which are not protected by copyright must be removed from consideration. In this case, most of the code falls under scenes à faire: "expressions that are standard, stock, or common to a particular topic or that necessarily follow from a common theme or setting ... these external factors may include: hardware standards and mechanical specifications." Most of the code was simply declarations needed for compatibility and cannot be copyrighted.

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
  16. Re:More details and downloadable archive by sg_oneill · · Score: 4, Informative

    And here's the magic. Linus learned his style by closely reading Andrew Tanenbaum's books, and reading the Minix code. Which of course is what you're supposed to do with Minix. So have most OS coders who had their education back then.

    The end result of course is that everyone's code ends up looking like Tanenbaum's, which is not a bad thing, the guy is up there with the gods in terms of importance to O/S theory.

    --
    Excuse the Unicode crap in my posts. That's an apostrophe, and slashdot is busted.
  17. Re:More details and downloadable archive by Anne+Thwacks · · Score: 5, Informative
    Because I, J, K, L, M, or variable names beginning with them, are integers in Fortran. Otherwise variables are floating point, and floating point loop variables are a bad idea, especially in Fortran.
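
    Why floating point loop variables are a bad idea can be sketched in a few lines. The example is in C rather than Fortran, and the function names are hypothetical. It assumes IEEE 754 doubles, where 0.1 has no exact binary representation, so repeated addition typically lands just short of 1.0 and the loop tends to run one more time than intended.

```c
#include <assert.h>

/* Integer counter: runs exactly `steps` times. */
int count_int_steps(int steps)
{
    int n = 0;
    assert(steps >= 0);
    for (int i = 0; i < steps; i++)
        n++;
    return n;
}

/* Floating-point counter: adds 0.1 until x reaches 1.0. The intent is ten
 * iterations, but accumulated rounding error decides the real count. */
int count_double_steps(void)
{
    int n = 0;
    for (double x = 0.0; x < 1.0; x += 0.1)
        n++;
    return n;
}
```

    count_int_steps(10) always returns 10; what count_double_steps() returns depends on how the rounding errors fall, which is exactly the problem.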

    Foobar is a WW2 acronym for F*%&'d up beyond all recognition, typically referring to the military situation (see "The Longest Day").

    Now get off my lawn.

    --
    Sent from my ASR33 using ASCII
  18. Missed verdict by The+Cornishman · · Score: 4, Informative

    You only missed a verdict if you haven't looked up for seven years! Recently a jury in Utah confirmed what a judge found in a bench trial: Caldera (later SCO Group) did not get, and was not entitled to get, the UNIX copyrights in the 1995 deal they did with Novell. Unless you think that the jury was unreasonable in that finding (and guess what, SCOG and its lawyers do), SCOG does not 'own' UNIX in any useful sense.

  19. Re:More details and downloadable archive by gmack · · Score: 4, Informative

    These are not small words like "kissing" that are under dispute, this is not about reusing some very common routines that everyone uses, that's just silly. Rather it's about companies wanting to maintain compatibility with legacy versions of UNIX and doing so by referring directly to the legacy UNIX at best, and plagiarizing their code at worst.

    Except it's not about that at all. It's about implementing standard interfaces that are defined by POSIX. POSIX defines things down to the variable type so it's natural that the resulting header files will look similar. In fact, some of the differences I'm seeing in these files are from SCO not implementing POSIX properly, e.g. a return type of int where it should be size_t.

    You also need to keep in mind that SCO's predecessor (AT&T) was itself caught copying code from Berkeley.

    There was exactly one case where there was copying shown between SCO and Linux. In that case the code was from Berkeley (licensed open source) copied into SCO and copied into Linux by SGI as one of their internal filesystem driver headers. The code was determined to be non-infringing due to its history but deleted because it was old and reimplemented in a better way elsewhere.

    From working with the Linux kernel maintainers I know they take copyright very seriously and investigate even the possibility that code was copied. You owe them a huge apology for that uninformed set of accusations.

  20. Re:First post by joss · · Score: 5, Informative

    > The similarities in code flow, layout, variable names, filenames, etc. are conclusive.

    I did look at the code on the linked site and what I saw looked entirely like a clean room implementation to me. The similarities were superficial and much less impressive than similarities I have seen in code that I know was written independently (because I wrote it). Two programmers working on the same problem can easily come up with strikingly similar looking solutions which a non-programmer (or an inexperienced programmer) would never believe was independent. I was astounded at how pathetic the supposed similarities were.

    --
    http://rareformnewmedia.com/
  21. Re:More details and downloadable archive by Xtifr · · Score: 4, Informative

    The truth is that code was reused (if not copied, exactly, in the same way you don't submit a copied essay which you've taken from a classmate) from a UNIX derivative, which is now (somewhat disputably) owned by SCO.

    The truth is that SCO does not own the copyrights to UNIX code, as ruled by a judge, a jury and a second judge in Utah.

    Beyond that, SCO already turned over all their evidence to IBM several years ago, where it was analyzed by an expert named Dr. Brian W. Kernighan (if the name doesn't ring a bell, you're not qualified to be commenting on this topic), and he examined the comparisons and came to the conclusion that no illegal copying had taken place. Note, that's not "no copying", but "no illegal copying". UNIX is based on a number of sources. Most famously, they stole (and I use that word advisedly, since they removed copyright notices, which was illegal) from BSD. They also contributed parts of UNIX to public standards, including the ELF standard which is one of the examples shown here.

    The problem is that you seem to think there's a single copyright to UNIX. There isn't. The three biggest stakeholders are Novell (inherited from AT&T), the University of Cal. regents (all the BSD code in UNIX), and Sun, but IBM and SGI also own largish chunks. The people you're accusing of being "lazy and careless" actually own parts of the code you're claiming they illegally copied. Do you know who actually owns the parts in question? I don't, but no evidence of illegal copying has been shown in court!

    Releasing this thoroughly debunked information at this late date can only be a desperate attempt to spread FUD. Don't fall for it. I'm sorry that you once had your code plagiarized, but this case is nothing like yours.

  22. Copying, and copyright notices.. by mengel · · Score: 4, Informative

    The code was first copied, correctly.

    The copyright notices in the comments, etc were then replaced with AT&T ones, replacing the Berkeley ones (also replacing the earlier AT&T ones, btw.) I can vouch for this personally, having worked on the "vi" source code both at Purdue (original BSD 4.3 code) and at AT&T (System Vr4 code) -- all of the BSD copyrights, as well as the (bad) poetry, had been removed from the comments in the vi sources.

    The folks from the UCB law school took advantage of this in the counter suit, since the AT&T folks, having changed the copyright notices in the troff sources, ended up doing the same in the printed manuals. So while AT&T was suing about vague things like including code derived from code derived from code they wrote, UC Berkeley countersued about printed, published, paper manuals, where AT&T was clearly publishing them without the UCB copyright and license info. Clear, obvious, game-set-match, paper copyright violations.

    So rather than have to find and "Destroy all Copies" of SystemVr4 manuals (including those published in turn by licensees like HP, IBM, etc.) AT&T agreed to drop their initial suit and make the countersuit go away.

    --
    - "History shows again and again how nature points out the folly of men" -- Blue Oyster Cult, 'Godzilla'