The Open-Source Detector
McDutchie writes "With open-source related lawsuits on the rise, a market is developing for automated tools that detect the presence of open-source code within larger application development environments. Palamida Inc. stepped in with IP Amplifier 3.0, essentially a search tool and a database that consists of more than 38 million of the most commonly used open-source files. Something Google-inspired called CodeRank is claimed to match code against the database. Hmm... maybe someone should run it on this, or even this." Of course, some open source code is perfectly welcome in commercial software, even if that software's code is not itself open; it's no secret or surprise that Microsoft, for instance, has taken advantage in some products of BSD-licensed code.
Because the BSD license explicitly allows them to do this.
appears to be the whole point of this tool anyway.
This tool is meant for commercial software companies to use, to ensure that they are not mistakenly using GPL code in their programs. It is not for open source developers to find misuses of their own code.
You have confused Open Source with GPL. There is nothing wrong with using Open Source in applications as long as the license permits it.
Why should Microsoft be singled out for it? Especially when we had people taking GPL'ed code and selling it as closed source...
These comments are my own and do not necessarily reflect the views or opinions of my employer or colleagues...
Could this tool be used in reverse?
For example, one could write a bug-filled line of code, perhaps something with a buffer overflow. This could then be matched against open-source projects to find the ones that contain the same buffer overflow. Of course, this could also be used to find vulnerabilities and so on.
> Of course, some open source code is perfectly welcome in commercial software, even if that software's code is not itself open; it's no secret or surprise that Microsoft, for instance, has taken advantage in some products of BSD-licensed code.
This example (socket code) often pops up, and is often used in GPL advocacy.
Note however that the TCP/IP work was done under a DARPA grant, paid for by the US government, so it is not only legal, but even morally right for Microsoft to use this code.
Um, last time I checked, this is a quite reasonable approach. You can paraphrase your book report in school, you can paraphrase your predecessor's speech, you can take photographs from famous vistas, and you can write your own closed code inspired by Open Source algorithms.
Source code is protected by copyright-- that is, literal or near-literal copies containing the essence of expression. Open Source code doesn't require that reverse engineering must be done in a clinical clean-room black-box methodology. That's kinda the POINT of Open Source: show people how it's done.
Usually the key to things is not the actual implementation used, but the algorithm behind it.
That's fine. Algorithms cannot/should not be copyrighted or patented.
"Mistakenly using GPL code"? How can anyone use GPL code on accident? You downloaded a tarball, you extracted it, you opened it in a text editor, you copied and pasted the code. And then you tell your boss that you did that "on accident"?
Can anyone explain this to me?
Heh. Soon someone will write a 'Gpl encrypter' that does this automatically. Whee, a new version of encryption wars!
...it's really a sad day for America when we require a goddamn ACT OF CONGRESS to make our DVD players work properly. ~
Palamida charges $50,000 to $250,000 for an annual subscription to IP Amplifier. Cost depends upon the size of the customer's development environment.
That seems rather steep. Are they doing something really complicated or is this something that a well-maintained (open-source?) project could do? Of course they are storing a major amount of information (i.e. all of sourceforge/freshmeat).
This might in fact be a feature that sourceforge might want to implement (for a fee): doing a search in their database.
On the other hand, it might make more sense to check against proprietary source, data and images. They are, by their nature, harder to find.
Also: when outsourcing parts of a project, wouldn't a contract have to state explicitly conditions such as not stealing/borrowing code from elsewhere? It would be a minimum requirement that the licensing of any (sub-)code would have to fit the overall product.
see a Text Widget
"This tool can't possibly ensure that some binary wasn't made by someone who looked at the open source version, and just reimplemented the same ideas."
I wouldn't be so sure about that. Reputable colleges and universities do exactly that sort of check in CS courses - there are any number of tools designed to check for cheating, and they are not fooled by anything so trivial as changing variable names or swapping a couple statements. They are pretty good at catching cheaters, too.
You are correct in that it can't check "some [random] binary", but this tool was made to run against source.
I'm trying to remember where I'm not allowed to reimplement other people's ideas to begin with, though.
-Erwos
Plausible conjecture should not be misrepresented as proof positive.
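For anyone wondering how a checker can see through renamed variables, here is a minimal sketch of one common approach: replace every identifier with a placeholder, then compare overlapping runs of tokens. This is not Palamida's CodeRank or any particular course tool; the keyword list, the function names and the window size k=5 are all assumptions made for illustration, and the toy tokenizer does not strip comments or string literals.

```python
import re

# Hypothetical keyword list for a C-like language; a real tool would use a full lexer.
KEYWORDS = {"int", "char", "void", "for", "while", "if", "else", "return"}
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize(source):
    """Tokenize source text, hiding identifier names and numeric values."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)      # keywords carry structure, keep them
        elif tok[0].isalpha() or tok[0] == "_":
            tokens.append("ID")     # any other name becomes a placeholder
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)      # punctuation and operators
    return tokens


def shingles(tokens, k=5):
    """Return the set of all k-token windows in the token list."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a, b, k=5):
    """Jaccard similarity of the normalized k-token windows of two sources."""
    sa, sb = shingles(normalize(a), k), shingles(normalize(b), k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

Two functions that differ only in their variable names normalize to the same token stream and score as identical. Because the windows are compared as a set, swapping a couple of statements only disturbs the windows that span the swap, so the score drops a little rather than to zero.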
The whole advantage of open source is you are not tied to the whims of the original developer.
This seems to be a resurrection of an old attack strategy: pretend that open source is such a burdensome, onerous license that you have to hunt open source code down like a virus.
It's not something to be encouraged!
The whole concept of code seems to scream "Some will be the same". Very basic things will look very similar between several things and with the current "justice" system and ignorance of most people this is going to screw OSS.
I just think it's pathetic that we live in an era where people trying to do something nice get stabbed in the back for it.
I like muppets.
> This tool can't possibly ensure that some binary wasn't made by someone who looked at the open source version, and just reimplemented the same ideas.
What the fuck are you talking about ?
The GPL is based on copyright. You can't copy/paste the code.
Re-implementing the algos is fine, and always has been.
It is 100% FUD to pretend that code becomes tainted because you looked at GPL source. Don't spread this. Microsoft would LOVE people to believe that. It would end up like this in interviews:
- Did you contribute to an open-source project?
- Well, I once fixed a bug in Mozilla
- Sorry, our lawyers said we can't hire you
- Why?
- You would contaminate our IP
Repeat after me. GPL is COPYRIGHT. There is no IP involved. There has NEVER been.
This sounds more like auditing software. It looks like this tool would allow you to scan an existing codebase to check for the existence of open-source code nuggets. Considering the licensing minefields that exist today, it's probably a good thing for a release manager to do before a "release to production". This is especially so because a lot of developers routinely copy-paste code from the net and usually don't read the license accompanying the code.
IMHO, this is quite an innovative tool, and would save a release or a project manager a lot of headaches in terms of legal compliance.
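The simplest form of such a pre-release audit can be sketched in a few lines: hash every file in the tree and look the digests up in an index of known open-source files. This is only an illustration of the idea, not how IP Amplifier works; the `known` index and its (project, license) tuples are hypothetical, and an exact-hash lookup only catches files copied byte for byte.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_tree(root, known):
    """List files under root whose digest appears in the known index.

    known is a hypothetical mapping: digest -> (project, license).
    Returns a list of (path, project, license) tuples.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            info = known.get(file_digest(path))
            if info is not None:
                hits.append((path, *info))
    return hits
```

A release manager could run `audit_tree("src", known)` as a build step and fail the build on any hit whose license is not on an approved list. Catching pasted fragments rather than whole files needs fuzzier matching than this.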
Now, it's wonderful that they help people get the most out of OSS software, but I don't like the fact that they are making outsourcing easier. This is not so much a problem where I live, but in the USA, as I understand it, many people are losing their jobs in the tech industry thanks to companies trying to save a fair bit by outsourcing to cheaper areas.
Again, my second problem is their strong patent support here. It just makes me, as someone who uses and contributes to OSS, uneasy (just my opinion and how I feel, not a statement of fact).
On to the legal section. Their business model is basically that of enforcing IP rights. Sure, that may help us find companies abusing GPL code, but it also swings both ways and can open up a whole host of patent cases against GPL software.
Fair enough, this can be useful in this day and age, allowing you to pay them to make sure you're not infringing on any patents. But this just doesn't work for 90% of the OSS projects out there; I am betting it costs a fair whack. Most people using this on OSS are, IMHO, going to be looking to enforce a patent case à la SCO. The potential minefield here is not fun.
Now that is a lot better; I can strongly respect what they are doing here. Still, I don't like that they keep harping on about IP compliance.
I am probably just being paranoid an
The only things certain in war are Propaganda and Death. You can never be sure which is which though
How can a perfectly acceptable use of BSD code (BSD code in non-OSS projects) be abuse ?
The BSD goal is good code, not open code.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
Except decrypting the code before running it takes a significant portion of CPU time, effectively making the "open source alternatives" much faster. Hiding, obscuring, obfuscating, all that creates a lot of overhead...
And of course it can be done by examining the memory dump instead of executable file. It must be decrypted to run.
Anagram("United States of America") == "Dine out, taste a Mac, fries"
Maybe you farmed it out to Elbonia, and got back a thinly veiled rip of some Free Software code.
You don't get the point of the whole thing at all. This is not for searching open source code that you could use.
;-)
This is so that you can detect OS code in your own source code. Presumably if you're managing a commercial software company you'd want to know if your developers have simply been copying code from some OS project.
It can do binaries too if you actually read the thing.
Now if you'll excuse me, I have some code I need to obfuscate
I worked at a ruthless company. Part of the culture was to get results as fast as possible and completely ignore things like licenses, rules and laws, if it helped to make money.
We certainly would have violated the GPL in a second, given that one couldn't really prove damage to the other party (aging idealist hippies with beards who were naive enough to give away software with a silly "license").
The ripoff of commercial software was driving me nuts though -- it seemed quite wrong, esp. given that we were raking in the dough and were not paying just because we could easily avoid it through technical measures.
However, part of the "culture" was that we were so busy that we were sloppy about the misdeeds. We wouldn't have had time to cover our tracks.
Such tools would have caught us, so I'm guessing such tools will lead to finding many similar violators.
http://www.thebricktestament.com/the_law/when_to_
This tool can't possibly ensure that some binary wasn't made by someone who looked at the open source version, and just reimplemented the same ideas.
Good. So long as all they are doing is gathering ideas, there is nothing wrong with that. It's like me reading Harry Potter and then writing a book about wizards. Of course I should be allowed to.
Next you'll be telling us that someone could just look at an application working and then write their own implementation incorporating some of the same ideas. Should they be stopped from that as well? Oh wait, they can be. That's what software patents are often used for.
-- MartinG To mail me: echo kewyjlcxyzvjfxbqwh | tr bcefhjklqvwxyz
Actually that's a bit wrong: the nature of the BSD license allows people to do what the hell they want with it, so in essence you can't abuse the BSD license.
This is why some people love the BSD license, as they see it as total freedom, and I have much respect for it myself.
I just prefer the GPL way, as we get back any changes and that's guaranteed by the license (if the software is released; I believe it's OK not to feed the changes back if it's an internal tool only).
The only things certain in war are Propaganda and Death. You can never be sure which is which though
...seriously, have you looked at how well people respect copyright? Do you expect employees to cease being human when they walk in the door? All it takes is one worker to "download a tarball, extract it, open it in a text editor, copy and paste the code", then tell his boss the task is done.
Kjella
Live today, because you never know what tomorrow brings
As far as I understand it, the GPL has a clause saying that any patents that cover the code being distributed must be licensed for everyone's free use. That's not the case with Microsoft's shared source.
The GPL is less free than BSD because it does not grant the licensee as many freedoms.
No, the GPL is more free because it does not permit anyone to take away anyone else's freedom. Say I write some GPL code. You are free to use it, modify it, sell it if you want, but you may not tell any later user or developer that they can't enjoy the same freedoms you have enjoyed.
Scenario 1: Person A writes some GPL code. Person B uses it and modifies it, and releases the code. Everyone else is free to use that code as they wish, as long as they don't try to restrict anyone else's rights.
Scenario 2: Person A writes some BSD-licensed code. Person B uses it, modifies it and starts selling it as a shrink-wrapped product. All his users are restricted by EULAs. They can't have the source code, they can't legally share the program, and they're stuck if B discontinues the product.
In which scenario do you think the licensees have more freedom? It's free as in liberty, not free as in 'free ride'.
#define struct union
Note however that the TCP/IP work was done under a DARPA grant, paid for by the US government, so it is not only legal, but even morally right for Microsoft to use this code.
Not only that but whenever I've been present when someone has asked the people who wrote the code if it's OK for Microsoft to use it, they didn't say "we can't stop them", they said "we want them to use it".
I don't see how you can possibly come up with a more ethical or moral justification for it than that.
The reason I said "regardless of whether you think it is good or bad" was to ignore discussions such as this.
It is very simple: the BSD license is more free, because it grants more freedoms.
Yes, to take this to its logical extreme means that anarchy is maximum freedom. No, this would not be a good thing; but by trying to argue that the GPL is more free (when you should have said that it is better for the user of Person A's software) you have already accepted that unlimited freedom isn't such a good thing anyway.
You downloaded a tarball, you extracted it, you opened it in a text editor, you copied and pasted the code. And then you tell your boss that you did that "on accident"? Can anyone explain this to me?
Muscle memory?
This tool can help you make sure you change the stolen implementation just enough that the tool won't detect the similarities, giving you an approval stamp without too much work :)
Sneak teach kids Algebra using a game
It's not as hard as you make out to use GPL code by accident, especially library code. Consider the plight of a poor developer, faced with unmeetable deadlines and a fire-breathing boss with a P45 waiting (I've been there, it happens).
He needs to implement a specific piece of functionality and fast. He searches the web and finds some 'sample' code and thinks "just the job".
Copy.. paste..
You now have GPL code in your application, copied and pasted direct. Why? Malicious and callous hatred of free software? No, an accident. Carelessness. A quick fix in a tight spot.
It happens. I've seen it.
Further, not everything that takes time is wasteful. Copyright is intended to protect the expression of ideas, not the underlying ideas. Thus, you don't protect the idea of love or even the words I love you, but you can protect the expression of love and the words I love you in the context of lyrics to a song possibly with a musical score.
They can demand you open-source any application that contains GPL'd code.
No, they can't. Stop spreading this myth.
I've had enough abrasive sigs. Kittens are cute and fuzzy.
"For the submitter to assume that Microsoft has GPL code is nothing short of trolling. Internally, Microsoft has a strict policy against GPL code."
The submitter's article did not state that the submitter assumed that there was GPL'd code in MS products.
"On the other hand, what I would like to know is how many OSS projects reverse engineer Microsoft products to implement functionality"
Why do you believe that any laws or the EULA were broken by people implementing any functionality in GPL'd software? If there were laws broken, do you not believe that Microsoft would have the people who broke the laws or the EULA in court?
"Did anyone notice that the Firefox popup blocked notification changed to look like the IE 6 SP2 blocker?"
Did you notice that MS Windows looks a lot like a windowing system that Xerox invented, or that MS Windows looks like the windowing system used on the Apple Lisa and the Apple Macintosh, all of which predate MS Windows? Did you notice that Excel looks like VisiCalc and Lotus 1-2-3? Do you feel that it was wrong for MS to have copied the look and feel (and possibly even the name) of products invented by Xerox, Apple, and VisiCalc?
OH NOES TEH DLL ARE ENCRYPTED!!1one
The code must be decrypted at some point in order to be run. If what you said was true, we would have uncrackable copy protection.
Your scheme is a variant of DRM, and like all DRM schemes it is fundamentally flawed, because the person you are trying to keep the data from is the exact same person that you are making the data available to.
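To make that concrete, here is a toy sketch (XOR stands in for real encryption, and the function names are made up for this example): whatever wraps the code has to produce the plaintext before it can execute it, so anyone who can run the program can also capture that plaintext.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy 'cipher': XOR each byte with a repeating key (its own inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def run_protected(blob: bytes, key: bytes):
    """Decrypt and execute 'protected' Python source.

    Returns the recovered plaintext alongside the resulting namespace,
    to show that the clear code necessarily exists in memory at run time.
    """
    plaintext = xor_bytes(blob, key)            # clear code now sits in memory
    namespace = {}
    exec(plaintext.decode("utf-8"), namespace)  # it must be clear to be run
    return plaintext, namespace
```

The key has to ship with the program too, or it could not decrypt itself, which is the same problem one level down.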
> No, the GPL is more free because it does not permit anyone to take away anyone else's freedom.
Being able to take away somebody's freedom is a freedom in itself. The BSD licence provides this freedom. The GPL does not. Therefore, the BSD license provides a freedom the GPL does not, meaning it is more free.
For one of our second year programming assignments, our lecturer posted a bunch of example code that she used during lecture.
It was sockets in C. The code was very poorly written, it actually contained a couple of GOTO statements. One of the files contained a typo in the commenting, so I figured... Let's google it!
And wouldn't you know it, several hundred results.
I'm not sure what I was angry at: Our lecturer not giving any indication that she didn't write the code, or not citing her sources, or giving us such crappy code to start with...
But needless to say, I was angry. :D Still am! *shakes fist*
So, to tie this to the topic, nothing works better than searching for typos! :D Google does a decent job for those who don't have access to a fancy OSS database.
- shazow
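The same trick works offline if you have a pile of source trees to check against. A rough sketch, assuming you just want a case-insensitive search for one distinctive phrase (a misspelled comment, say) in C files; the function name and the default extensions are made up for this example:

```python
import os


def find_phrase(root, phrase, extensions=(".c", ".h")):
    """Yield (path, line_number, line) for each line under root containing phrase.

    The match is case-insensitive. extensions must be a tuple of suffixes.
    """
    needle = phrase.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going over oddly encoded files
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        yield path, number, line.rstrip("\n")
```

Typos and odd comment wording make good needles because, unlike the code around them, nobody has a reason to reproduce them independently.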
Frankly, that's why I never really understood the point of copyright.
The point of copyright is to let people derive commercial rewards from the expression of ideas; copyright does not protect the ideas themselves.
(I apply this word here to code as well as other textual material) is alright, even though fundamentally it's the same thing, only more time-consuming;
No, it's not "fundamentally the same thing". There have been thousands of Mary-with-baby pictures. It's the expression--the actual painting--that is the work. If you create a new painting yourself, it contains the same ideas, but the work is, as you observe, in the actual creation of the painting. That's what copyright is supposed to do.
Patents are designed for protecting ideas themselves; patents are deliberately harder to get and more limited.
The reason you are tainted from looking at shared source is two-headed. First, the license itself uses contract law to prevent you from utilizing the knowledge. Second, everything there is covered by software patents.
Copyright does not require a cleanroom implementation. Patents do. Open source code is not patented.