JBoss Queries Apache Geronimo Code Similarity
Kanagawa writes "This morning, Jim Jagielski, Exec. V.P. and Secretary of the Apache Software Foundation, announced on the geronimo-dev mailing list that 'the ASF received a letter from JBoss's lawyers regarding... the similarity of code between [J2EE implementation] Geronimo and JBoss.' The letter is available in PDF. According to the letter, similarities were noticed back in July, and haven't been fixed."
...see this post to TheServerSide. A lot of these look like common design patterns and standard Java/J2EE naming conventions.
You can also see Jim Jagielski's response to some questions here about this issue. Sounds pretty reasonable.
The Army reading list
"Good programmers copy, great programmers steal!!"
meh.. I got nothing.
I'm not into this case, but at first glance it seems to me that Geronimo really is just what JBoss is, right? So what's the point in remaking it? JBoss is already free (LGPL)!
Call out the lawyers!
I mean, who couldn't see this coming, after the issues this summer?
At least SCO had some verbatim (albeit legitimate) copying that they could show. This stuff isn't even exact, and in most cases it appears methods of operation have changed, variable names and defines have changed.
I call bullshit.
HBI's Law: Frequency of calling others Nazis is directly correlated with the likelihood of the accuser being Communist.
Neither Geronimo nor JBoss is based on the J2EE RI.
The first exhibit seems to be originally derived from:
/examples/customLevel/XLevel.java
http://cvs.apache.org/viewcvs.cgi/jakarta-log4j
which is apache licensed in the first place.
I'm no expert coder, but these don't look the same to me. There are similarities, but one would presume they are doing similar things.
One of the functions is to convert an integer to a level. How many different ways could you actually do this? Another function converts a string.
If you assign a class of 30 people to write functions that change variable types, all 30 will come up with different code, but the code is likely to look very similar, especially if you're encouraging them to use proper function/variable naming and comments.
Kudos to JBoss for posting the code, but I don't see much here to be suing over.
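To make the point about converting an integer or a string to a level concrete, here is a minimal sketch of what such a class tends to look like. It is not the JBoss, Geronimo, or log4j source; the class name `TraceLevel`, its constants, and the numeric values are all made up for illustration. Given the same requirements, there is not much room for the body to vary.

```java
import java.util.Locale;

// Hypothetical, self-contained sketch. NOT the JBoss, Geronimo, or log4j code.
public final class TraceLevel {
    // Illustrative numeric values (assumptions, not taken from any project).
    public static final int TRACE_INT = 5000;
    public static final int DEBUG_INT = 10000;

    public static final TraceLevel TRACE = new TraceLevel(TRACE_INT, "TRACE");
    public static final TraceLevel DEBUG = new TraceLevel(DEBUG_INT, "DEBUG");

    private final int value;
    private final String name;

    private TraceLevel(int value, String name) {
        this.value = value;
        this.name = name;
    }

    public int toInt() {
        return value;
    }

    @Override
    public String toString() {
        return name;
    }

    // Convert an integer to a level, falling back to the given default.
    public static TraceLevel toLevel(int value, TraceLevel defaultLevel) {
        switch (value) {
            case TRACE_INT:
                return TRACE;
            case DEBUG_INT:
                return DEBUG;
            default:
                return defaultLevel;
        }
    }

    // Convert a string to a level (case-insensitive), falling back to the default.
    public static TraceLevel toLevel(String name, TraceLevel defaultLevel) {
        if (name == null) {
            return defaultLevel;
        }
        String upper = name.trim().toUpperCase(Locale.ROOT);
        if (upper.equals("TRACE")) {
            return TRACE;
        }
        if (upper.equals("DEBUG")) {
            return DEBUG;
        }
        return defaultLevel;
    }
}
```

Two people writing this independently would likely end up with the same method names, the same switch, and the same fallback parameter, simply because the task leaves few other choices.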
Which one is SCO and which is IBM? I'm a little slow on this stuff.
-Tim Louden
Geronimo has requested that all developers confirm that either a) they didn't just submit JBoss code or b) they had the right as the original creators of the JBoss code section to also submit it to Geronimo.
No FUD. No hyperbole in extremis. No crazed threats. Oh, wait: No SCO. Of course. What a breath of fresh air.
Damn, must make more GTK/Gnome clocks.
Bye!!!!
Check out the source code on page 8. Since when is the copyright symbol allowable in Java syntax?
In the example on page 8 of the letter you can see they are BOTH attempting to copyright the freaking SWITCH construct!
--
This post (c) 2003, Knights who say Ni, LTD.
A letter like that is really all everyone has been asking for from SCO. JBoss is doing this to protect their code. It makes you wonder why SCO hasn't done the same already. Unless of course their actions are not at all about protecting their source code and patents. Reminds me of that Bible story where King Solomon must decide a dispute over a newborn child. JBoss reacted in the interests of protecting their code, SCO has not. From this simple story, we see whose intentions are what they claim to be, and whose are not.
/.ers were on the money.
On a side note, SCO's recent behavior has made it clear as to who the puppetmaster of this debacle really is:
Here are two quotes from the Computer Business Review:
"SCO would probably provide customers with financial incentives and discounts to migrate to SCO Unix, other vendors' Unix, and what he referred to as 'other proprietary operating systems' but probably Windows."
"'We are offering a migration path to other operating systems that have a stronger IP basis than Linux,' the spokesperson said. Incentives will be offered 'in the coming months.'"
If that move doesn't reveal the puppetmaster, nothing will.
I sincerely doubted Microsoft's involvement for a while, this time though, the paranoid
It does sound like there were a few particular instances where a class' design and the set of methods in the class were directly patterned on the JBoss design - not necessarily copied line-by-line, but the solution to a fundamental part of the J2EE specification "problem" was ripped from JBoss and modified to suit the code needs of Geronimo. Whether this is a violation of copyright or not is a tough question. Copyright doesn't protect a design pattern, a solution to a problem, the logical organization of a set of objects, or an algorithm. Proving that somebody actually violated copyright in this case seems rather hard to me - though perhaps a bit of credit to the JBoss folks for their thoughts and design work is in order.
I've had programming classes where the teacher would specifically spell out EXACTLY how your code should look, such as full nouns for variable names (no abbreviations), and very specific capitalization schemes. Documentation was specified as well.
If you look at most of the code excerpts, they're for basic things like string and integer conversions. Given a classroom full of people, and very specific instructions on what code should look like, you're not going to get much variation.
One would need to look at the rest of the code as well to see if the excerpts from each side are consistent with the rest of the codebase. Does one use "CELL_PADDING" everywhere, but in this snippet they use "CELLPAD"?
See, to do a joke, you do not just need to think about it. People do not read your thoughts. So you have to give a clue away. Like if I say: "Today the weather is nice", no one will have a clue I'm joking and it is raining outside, because they don't even know where I am. So it is not funny.
However, if I say "Today the weather is nice, looks like I could go windsurfing on the highway", then they know I am joking. Even if the joke is awful.
Write boring code, not shiny code!
C'mon! CamelCase names in Java follow some simple rules, there's even a documented way of how you're supposed to do it. As for CELLPADDING, since that's how it's named in HTML, it wouldn't surprise me to see it done identically in another place. Better go sue Netscape too.
To see if the code is actually similar you'd have to look at algorithms and innovations. Looking at interfaces and their names isn't going to tell you anything at all.
The first example in the letter is
org.jboss.logging.XLevel vs. org.apache.geronimo.core.log.XLevel
Both seem to be copied from log4j's examples.customLevel.XLevel
However, there are much more substantial allegations made here
At what level, though, do you say that source was copied? Obviously the code isn't a 100% match, and for each problem a coder faces there is a shortest-distance, most efficient solution, so who's to say that two developers wouldn't reach similar conclusions? Some of the exhibits are based around logging, which is a very common task, and I'd figure that a large portion of projects tackle it in the same fashion. I fail to see how you could claim that someone had copied the solution if it was the best answer and other people could arrive at the same conclusion.
If it was a line-for-line copy then I could see it being different, but IMHO there are sufficient differences between the two portions of code. Personally I think if JBoss doesn't have better things to do with its time and money it should slash the cost of its ridiculously expensive (and pathetic) documentation and spend some time improving it instead!
How could this be avoided? Both are implemented against the same guidelines, using the same suggested/implied patterns. I guess it's just a matter of who did it first at this point. Java's syntax does not allow for (thankfully) a million different ways of expressing the same idea (at the language level anyway). Given the pervasiveness of design patterns, it's not unlikely that large pieces of architecture will be functionally and syntactically similar. And given that both are open source software, what are the chances that one developer happened to peek at the other's code for a little insight? Chances are pretty good. Once you see a solution or pattern/class design that works nicely, it's hard not to follow the idea.
Good try, but no, really. First of all, CELLPADDING only appears in the jBoss part of the diff, not in Geronimo. Secondly, that's how you are supposed to specify the padding for cells in an HTML table. So, if Geronimo had decided to use an HTML table in their javadoc with cell padding, they would have had to use CELLPADDING. But all that is irrelevant since they didn't.
No mystery there. ThreadNDCConverter is capitalized *exactly* according to very established java code conventions. See for example http://java.sun.com/docs/codeconv/html/CodeConventions.doc8.html
Maybe this is just an artefact of the way these program samples were generated, but it's pretty obvious that the author's name in the 'author' comment at the foot of the left-hand column on page seven (of the pdf of the original complaint letter) is in a completely different font to that of the rest of the code on that page: check out for instance the 'g' character.
Umm... aren't you supposed to sue for gobs of money before you show the infringements? Don't they know how our legal system works?!
I had never heard of Geronimo before, so I did the lemming thing and clicked on the link in the article and got the message in the subject. Now I'm not sure about you, but is it telling me that I should revisit their website after I feel relieved by urinating?
Matt
The JBoss code and the Apache code both appear to be copied from an example that was originally created by Apache. Exhibit A and B are both logging classes, both use Log4J (Apache's logging utility) and can be expected to be similar. Exhibit C looks almost identical, but not entirely. The similarities are so trivial, Apache is bound to make a few quick changes and be done with this thing before it starts. What silliness.
Did anyone not see this coming? And if you didn't here's why you should have:
Marc Fleury's original response to Apache Geronimo
As our customers know, we are a business, a serious one and we seriously believe in and defend "professional open source". That includes legal protection of IP. Make no mistakes, JBoss will AGGRESIVELY defend its copyright and LGPL license.
And from the Elba website
Think of Elba as a latticework for Geronimo--and as a shield to buffer the Geronimo codebase and CVS repository from any LGPL code. As Geronimo is built, its code will replace the code from Elba, bit by bit until there's nothing left in Elba at all. At that time, Elba will cease to exist and only Geronimo will remain; we'll have a big party and you're all invited.
So if Geronimo is being developed as outlined at the Elba website then they'd have to have the exact same method signatures....
My Hello World is 512 bytes. But it's also a valid Fat12 boot sector, Fat12 file reader, and Pmode routine.
Which brings up an important question: can code be re-licensed by people other than the authors when the original license was less restrictive?
Example: Alice in Wonderland is in the public domain. Peter Zelchenko made an ebook out of it with nice typography and claims copyright on the derived work. Can I cut the text & paste it into a document of mine?
Example: the Almquist Shell (ash) seems to have been a contribution to some form of BSD Unix. It's also in busybox with a GPL at the top and a Berkeley license at the bottom.
What if Kenneth Almquist doesn't like the GPL and doesn't want his code to be distributed that way? The BSD license pretty much says he's already given up the right to say anything, but using ash in a closed source project now gives me a funny feeling:
1) I'm worried that someone will claim ash is GPL and I must release the source. The later license doesn't affect earlier versions.
2) I have a copy of busybox source in my account. I've only looked at the docs & looked at the sources enough to figure out where they originally came from, but if there are bug fixes in the GPL'd code, they'd better not be in my ash, at least in the same form.
One more twist: the ash I have is licensed under the "Almquist Public License" which is BSD-like. The copyright message in the busybox version suggests that K.A. contributed it to Berkeley and the license for that *is* the BSD license.
If I want a later version than my 1989 one, I run the risk of hitting the part of the timeline where GPL contributions began.
I respectfully disagree.
a) I don't think anyone would mix up CELLPADDING with PADCELL. What should PADCELL be or mean?
b) It's standard java coding style rules NOT TO USE a "_" in a constant.
Everybody using "cell padding" as a name for a constant which is used like an enum would write CELLPADDING. Everybody.
The capitalization rules are also well defined. So if you have a class "ThreadNDCConverter", a company sticking to the original coding style rules will name that class or interface ThreadNdcConverter, while my company OTOH will name it ThreadNDCConverter, as our rule is to capitalize all abbreviations, like FTP, RMI and such.
angel'o'sphere
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
Looking at the code as a programmer, some things stand out:
- The "copying" JBoss claims doesn't fit. There are differences in braces, keywords and other things that wouldn't be accounted for by automatic reformatting of code. I can't see a programmer who's copying code directly going back in and doing that kind of editing. I'd expect braces to be maintained, for example, yet in several places they aren't.
- The similar names are obvious names for types, variables and functions. Given the same spec to start from, without having seen the JBoss code at all, I'd pick the same names.
- The places they cite as having code-structure similarity are very simple. Frankly, it looks to me like there's only one sane way to write that code.
It can't hurt to do a check, but I suspect JBoss is seeing copying where there's just only one obvious way to do something and most programmers, working independently, will make basically the same set of choices for that code.
"CELLPADDING" is a term in HTML, for example:
<TABLE CELLPADDING='3'><TR><TD>data</TD></TR></TABLE>
So any two people both familiar with this could very easily pick that same variable name, just as any two unix C programmers could both use "grep" for a searching function, or decide to name something that destroys threads based on a name "killall". It was already a convention before they used it.
Don't label something "offtopic" unless you know the topic well enough to tell what's on topic.
Two open source projects quibbling over licences instead of producing software, and the project with the less restrictive licence needing to "re-invent the wheel."
What is the reason in "redesigning" an open source project under a different license? Is JBoss so poorly written that it can't be the base of another LGPL project? Is the Apache license so much better for open source projects that it needs to be done?
In the immortal words of Rodney King, "Can't we all just get along?" There should be no issue here.
(except maybe that "Free, as in freedom" doesn't mean what it should)
I am living proof of the Peter Principle
I would .. but in my version I spelt "somebody" correctly.
Still, following the principle of karmic balance I've misspelt a word or tow in this relpy.
Given the similarities between this and the SCO/Linux claims, is it possible that this could be a "ploy" by the JBoss people to establish a public precedent for the GPL?
I mean, if the public see Apache and JBoss figuring it out, could the outcry against SCO and their detrimental case against Linux be enough to quash it?
After reading the letter, and looking through the exhibits, it is evident that this particular Apache project has a systemic problem. In reading many of the preceding posts it would appear that many people equate this letter with the action taken by SCO, and are thus opposed to granting it any merit. On the contrary, this is far from the opaque stance taken by the aforementioned SCO. The JBoss Group has shown specific instances of infringement, whereas SCO has not.
What many do not seem to understand is that this specific instance is exactly the kind of enforcement of open source licenses we should be encouraging! If we are to take them at face value, the JBoss Group is merely trying to maintain the integrity of the intellectual property rights of its contributors. I see no reason to demonize them for that! Furthermore, it should also be noted they have not instigated or blatantly threatened legal action; this is also to their credit.
- First they ignore you, then they laugh at you, then ???, then profit.
Simone Bordet, David Blevins, David Jencks, Dain Sundstrom, Greg Wilkins, Bruce Snyder, Jan Bartel, Jeremy Boynes, James Strachan, Jules Golsnell, Richard Monson-Haefel and Jason Dillon.
Almost ALL of the Geronimo developers with commit rights have also worked on the same JBoss code base. That's too many developers in common to provide the fresh perspective necessary to create a non-derivative clone.
Found this on TSS. Looks pretty crazy that a Geronimo developer admits in a CVS comment that it is derived code.
"As an open source developer I choose to submit my code under LGPL because it ensures me that this code will remain open source, yet the license is flexible enough to allow for embedding. When I first became aware of Geronimo, I took a look through the codebase just for kicks and was deeply concerned that some of my code was derived from or distributed under the ASL license.
As an example, below is a comment from the JBoss CVS from Dain Sundstrom. Dain contributed EntityInvocationRegistry to the JBoss project back in March of 2003. He clearly states in his commit message to the JBoss CVS that this file is a derivative of certain files that I wrote "This functionality was merged from ....". This file was moved to and renamed to the Geronimo project. As I believe in the integrity of the LGPL, I was greatly disturbed by this.
Date : 2003/3/23 4:28:42
Author : 'dsundstrom'
State : 'Exp'
Lines : +0 0
Description
Tracks the entities and contexts associated with a transaction. This functionality was merged from GlobalTxMap, TxEntityMap, and EntitySynchronizationInterceptor.
http://cvs.apache.org/viewcvs.cgi/incubator-geronimo/modules/core/src/java/org/apache/geronimo/ejb/SynchronizationRegistry.java?rev=1.1&content-type=text/vnd.viewcvs-markup
http://cvs.sourceforge.net/viewcvs.py/*checkout*/jboss/jboss/src/main/org/jboss/ejb/entity/Attic/EntityInvocationRegistry.java?content-type=text%2Fplain&rev=1.1
Add to this is comments on the Geronimo mail list stating that they are taking JBoss code concerned me even more. Here's a comment from David Blevins:
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=geronimo-dev@incubator.apache.org&msgId=998128
"We're taking the Elba/OpenEJB JAAS code, merging it together ...."
And Elba == JBoss 3.2.
So I spent an hour or two looking through the Geronimo codebase back in August of this year....Here are some of my findings.....
Go to theserverside.com to see more.
Dude. That's not a "diff". That's the bits of those files that are the SAME. All the "***" means areas that DID NOT MATCH.
In other words, the few instances where the code appears to be copied are a couple of methods having to do with Logs. Those methods (at least the similar parts) also seem to be little more than wrapping a call to an apache library function.
So... The wrappers probably use the same parameter names as the apache function they are calling. So they should be pretty similar. The method names are [something]Log, following the normal conventions of adding "Log" as a suffix to "something" when you're making the method that logs stuff.
Further, the "copied" bits of similarity have enough differences in them to render it completely moronic. The entire thing is basically a template. The few chances the authors have to alter the template (variable names, and the (brief) comments) were different, but given the limited scope of those bits, they were still similar.
If that's all they have, this is just silly.
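For anyone unsure how a side-by-side listing of "only the matching bits" could be produced, here is a rough sketch. It is only a guess at the general technique (a longest-common-subsequence over lines), not the tool actually used to build the exhibits; the class name `CommonLines` and the choice of "***" as the gap marker in the output are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: list the lines two files share, in order,
// with "***" standing in for every stretch that did NOT match.
public final class CommonLines {
    private CommonLines() {}

    public static List<String> common(List<String> a, List<String> b) {
        int n = a.size();
        int m = b.size();

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (a.get(i).trim().equals(b.get(j).trim())) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }

        // Walk the table, emitting matched lines and one "***" per unmatched run.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        boolean gap = false;
        while (i < n && j < m) {
            if (a.get(i).trim().equals(b.get(j).trim())) {
                if (gap) {
                    out.add("***");
                    gap = false;
                }
                out.add(a.get(i).trim());
                i++;
                j++;
            } else {
                gap = true;
                if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                    i++;
                } else {
                    j++;
                }
            }
        }
        // Anything left over on either side is one more unmatched run.
        if (gap || i < n || j < m) {
            out.add("***");
        }
        return out;
    }
}
```

Note that leading and trailing whitespace is ignored when comparing, so a listing like this will show boilerplate (imports, braces, one-line wrappers) as "matching" no matter who wrote it.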
Posted By: Jim Jagielski on November 10, 2003 @ 03:49 PM in response to Message #101148.
Just a short note: It is, and has always been, the stated baseline of Geronimo that it not contain any (L)GPL code, whether JBoss derived (in legally specified copyright sense) or not. It's not for any political reasons (and I'm glad to see that this is not degrading into such a forum) but simply because of the letter and spirit of the Apache License. It should also be noted that Geronimo itself is a "project in incubation" within the ASF. It is not (yet) a formal, official ASF project (or subproject under one of the other top level ASF projects). If there is any (L)GPL code within Geronimo, or code that is derived from (L)GPL code (in the legal sense), it will be stripped and replaced. That's just the way it is and it's the way the ASF has always operated.
Also, it should be noted that some exhibits referred to are no longer applicable. For example, Geronimo's Invocation class was entirely rewritten from what was noted in the letter. In other cases, the similarities are due to the fact that they are simple (and trivial) extensions. With XLevel, org.apache.log4j.Level is itself extended, which imposes and provides some of the common structure and names. It has also been noted that for PatternParser, the similarities come from the fact that both code bases implement "nested diagnostic contexts" as described by Neil Harrison in "Patterns for Logging Diagnostic Messages", which can be found in the book "Pattern Languages of Program Design 3", published in 1997 by Addison-Wesley (ISBN: 0201310112). Apache Log4J implements this class in org.apache.log4j.NDC. This class describes how it is to be used, including the use of a "distinctive stamp."
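For readers who have not met the "nested diagnostic context" pattern mentioned above, the idea can be sketched roughly as a per-thread stack of stamps that gets attached to each log message. The following is an illustrative sketch only, not the org.apache.log4j.NDC source and not code from JBoss or Geronimo; the class name `SimpleNdc` and its method set are assumptions chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a nested diagnostic context: each thread keeps its
// own stack of "distinctive stamps" that a logger could prepend to messages.
public final class SimpleNdc {
    private static final ThreadLocal<Deque<String>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    private SimpleNdc() {}

    // Enter a nested context, e.g. when a request starts being handled.
    public static void push(String stamp) {
        STACK.get().addLast(stamp);
    }

    // Leave the innermost context; returns "" if the stack is already empty.
    public static String pop() {
        Deque<String> stack = STACK.get();
        return stack.isEmpty() ? "" : stack.removeLast();
    }

    // The full context for the current thread, outermost stamp first.
    public static String get() {
        return String.join(" ", STACK.get());
    }

    // Drop the current thread's context entirely (useful with thread pools).
    public static void remove() {
        STACK.remove();
    }
}
```

Any two implementations of this pattern will share push/pop/get style operations and a thread-local stack, which is the kind of structural similarity the pattern itself imposes.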
Running with Linux for over 20 years!
Copyright law offers no protection against the copying of ideas, style and design (this is why we have patent and design protection). There's nothing to stop one person reading a copyright protected work, extracting basic ideas, style, themes, etc. and using that in their own work.
So, there's nothing that can stop one programmer from looking at one set of code, and then walking away and producing an independent version of similar design, but different expression. This could mean that there are similar functions and mechanisms, but looking at the detail it would be obvious that they might be similar, but are not exact copies.
There's a difference here from the commercial strategy of clean-room software development. In the clean-room approach, what you're getting around is not just copyright, but issues of commercial confidentiality and so on. Confidentiality is not a problem with open source software.