Domain: apache.org
Stories and comments across the archive that link to apache.org.
Comments · 2,937
-
Re:Companies blocking Gmail?
-
Re:Psst. Copyright doesn't work like that!
Not sure what religion has to do with this economics question.
Then you didn't read my post. I was pointing out the flaw in your argument, where you said that Imaginary Property was something that we made up, and so it did not need to be "believed in" in order to be considered real. In case I didn't make myself clear:
I disagree.
As for your questions...
Do you have an alternate proposal for incentivizing content creators?
If not, do you believe that software/books/movies/songs/etc. will be created by individuals for no compensation? Or by companies for no compensation?
When I was growing up, artists and musicians did what they did for the love of the music/art. Metallica went on MTV and said they would never make a music video because they "weren't in it for the money." Now they've alienated their fanbase by suing them for giving away their music, and they're pissed off about the lack of money in the equation. Never mind that they went on TV in front of millions of people and lied. Seems to me that the artists need to figure out which way they're trying to be, and stick to it once they've chosen. This, of course, is just anecdotal evidence and meandering ranting, and is also totally off topic. Get off my lawn!
Back to the issue at hand.
Yes, I have an "alternate proposal" for "incentivizing content creators". I like to call it "cut out the middlemen who do no work, maximize your profits without raking your customers over the coals, and stop pinching everyone's wallets while whining about how you don't make any money because you only get 1% of the profit of each CD sold". It involves content creators posting their content for the world to enjoy, and asking for (non-specific amounts of) money.
(As an aside, I'm typically insulted when people tell me I have to donate a specific amount of money. If I want to give you 50 cents for a single song, I should be able to. Don't tell me I have to give you $5 and take the whole disk!)
If this "economic model" seems surprising to you, then it would seem I need to point out to you that several big names are already doing it, as you can see with a little googling on the subject. To get you started, allow me to offer the following as potential search terms:
"Nine Inch Nails", "Radiohead", "donation", "free music"
It may interest you to learn that the results from these attempts have been successful (or at least, that's what I read). You may also be interested in Jonathan Coulton, who seems to be making quite a decent living by giving his stuff away for free.
Despite answering in the affirmative (and with proof!) and thus excusing myself from the "bonus questions", I will continue in this monologue, answering your second and third questions, too. Please note that the emphasis is mine.
The answers are simple and undeniable. Yes, I believe that software, books, movies, songs, etc. are being created by individuals for no direct financial compensation. Yes, I believe that software, books, movies, songs, etc. are being created by companies for no direct financial compensation. As proof, I offer up open source software (Linux, for example; The Apache Software Foundation, for another), free books (check out the Baen Free Library, or Project Gutenberg), free movies, free music, and the beginnings of an economic model that depends on having products and services that have more than just a financial value to the consumers and producers... which raises the questio
-
Re:Why would we care?
Thanks for the flame, A. Coward. I hope you can contact me under your normal account; I'll send you the mailing lists that get my special attention, and on which I base my (bad) judgment.
You say that the entire AXIS2 libraries are maintained by INDIAN coders, now I don't want to be a smartass but looking at the team I see some pretty non-Indian sounding names out there. (It is probably pointless to say that I am avoiding Java as much as I can, and didn't notice this effort. It might also be true that I'm avoiding Apache as much as I can, but that is offtopic here)
Referring to the 'funny' characters: that was referring to footnotes and 'names' in the from field. Now you mentioned a project an Indian university works on; do you also have a big SourceForge project that is run by, let's say, 'a Chinese university'? Just to add them to my radar...
-
Re:low performance java
You are missing the point. Most, if not all, people who work here are domain experts, not programmers. I will gladly take a 30% performance hit but still allow the experts to write something useful in a relatively clean and simple language that will scale nicely. Plus, the number of scientific libraries for Java is insane. And if you are interested in performance, check out Hadoop: http://hadoop.apache.org/core/, it is written in Java, mind you....
-
Re:Java?
Well, not so sure, check Hadoop out: http://hadoop.apache.org/core/
Yep, it is written in Java...
-
Re:GPL Requirements
Yes, really. Struts isn't licensed under the GPL. It is, as are all Apache products, licensed under the Apache license. Not all Open Source software is GPL software.
-
Content-negotiation problem, not geolocation.
It's a localization and language problem, not a geolocation problem. Where you are or where they think you are has nothing to do with the problem. Quebec is officially bilingual English and French, so while you are correct in that the services like Yahoo! are wrong to serve you French ads, it's because they are ignoring your browser's language preference settings.
Many other countries and regions have more than one official language. It's pathetic to see the slow, steady evaporation of technical knowledge in the market. Ten years ago, anyone and everyone working with WWW services knew how to deal with user-specified language preferences and, where more than one language was required, used HTTP content negotiation. It's very easy in Apache to support this HTTP function. For Lighttpd you need a Lua script, but that too is easy. Yahoo and Google have their own homegrown HTTP servers, so you have to file a bug report directly with them.
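For anyone who hasn't seen it, the language half of content negotiation boils down to reading the browser's Accept-Language header and picking the best match among the languages you actually serve. A rough sketch in Python follows; it is not how Apache or any other server implements it, and the function names, the default language, and the simple primary-tag fallback are assumptions made for illustration.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            prefs.append((-quality, position, tag.strip().lower()))
    # Sort by descending quality, then by original position in the header.
    return [tag for _, _, tag in sorted(prefs)]


def choose_language(header, available, default="en"):
    """Pick the served language that best matches the user's preferences."""
    served = {lang.lower(): lang for lang in available}
    for tag in parse_accept_language(header):
        if tag == "*":
            return default
        if tag in served:
            return served[tag]
        primary = tag.split("-")[0]  # fall back from "fr-ca" to "fr"
        if primary in served:
            return served[primary]
    return default


if __name__ == "__main__":
    # A browser set to prefer English gets English regardless of location.
    print(choose_language("en-CA,en;q=0.9,fr;q=0.5", ["fr", "en"]))
```

Nothing in it looks at the client's IP address, which is the whole point.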
-
Re:Language Compatibility vs. Class Libraries
Last time I checked, Xalan wasn't part of the Java standard library. You'll have to take this one up with Apache.
And if you think that's bad, take a POJO and run it through Apache Axis2's Java2WSDL, then run that WSDL through WSDL2Java to generate the client. The generated client code is huge, somewhere in the 2-3k line range.
-
Re:Your server was coded by a hamster
Well, that may be, but I found his real name anyway:
Ken Coar (no relation to this Ken.) -
Google style file system!!
You can build a Google-style filesystem with them that would perform very fast. Check out this open source solution: http://hadoop.apache.org/core/ Good luck.
-
BigTable, HBase and SimpleDB are the future
I recently blogged on this, but essentially, as long as your average PHP developer thinks of MySQL as a glorified flat file system to place their serialized PHP objects, an always-available, pay-as-you-go distributed database is going to revolutionize application development in the coming years. For those that want to keep control of their data, HBase is coming along quite nicely.
-
Re:Cue the "M$" bashing shrills
Here's the example I've mentioned: http://httpd.apache.org/docs/2.0/mod/mod_auth_digest.html
The interesting part is under the "Working with MS Internet Explorer" header.
I've seen the source code for that module and it's a bug that uses the wrong URI when generating a response. Not completely broken (and only a few servers use digest authentication), but since it's dealing with MD5 hashes the thing wouldn't work at all if Apache didn't introduce the hack. -
Hillary also uses Apache
As of now, her official site, which uses Microsoft-IIS/6.0, redirects to this page. According to the server's response,
Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.7a mod_apreq2-20051231/2.6.0 mod_perl/2.0.3 Perl/v5.8.8
it's also an Apache site. -
Re:Functional programming
Yes, I agree that the software is more interesting than the hardware. After all, it is their software that makes their data center organization possible.
Concepts from functional programming have really helped out Google, but at the same time they introduce limitations (at least when considering the MapReduce/GFS framework).
I think the larger problem with parallel programming for multiple processors/cores really comes with finding a conceptual model for expressing the computation. Functional programming (in the sense of MapReduce) is one way, but can't be used to parallelize every problem.
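To make the conceptual model concrete, here is a toy, single-process word count written in the map/shuffle/reduce shape. It is only a sketch: it uses neither Google's MapReduce nor the Hadoop API, and the function names are made up for illustration. The thing to notice is that each map call and each reduce call is independent of the others, which is what lets a real framework spread them across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run the three steps in order, in a single process."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(map_reduce(["the quick brown fox", "the lazy dog"]))
```

A problem whose steps need to see each other's intermediate state does not fit this shape, which is the limitation mentioned above.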
Google is able to scale their processing because they have rather singular needs. Most of their computations likely fall into the "embarrassingly parallel" category. This even shows in their file system, GFS, which is optimized for the computations done at Google and not necessarily the general case.
I'm hoping that the work that Google and others have done in this area will result in more parallel frameworks. Perhaps my favorite thing about the way that Google operates is that they use knowledge of computer science and software engineering to build their own, world class, solutions instead of buying a lot of off the shelf systems.
And yes, I agree that Google's engEdu videos are great. They should probably have more exposure than they currently do.
Another good resource for those interested in MapReduce is the open source implementation of the same concept:
Hadoop http://hadoop.apache.org/ -
Re:why
Consider an in-memory database. OK. Instead, you'd like at most only partitions of the data where massive working-sets reside on each partition and do inter-data operations. Got it. Can't find a link, but I'm thinking specifically the hashing mechanism. Given a key, I can find which node should be caching that key. Thus for certain problems that do not nicely break down into small messages, you are indeed limited to single-memory-space hardware. I'm not sure I've seen such a problem. For example, the CPU cache alone is an example of what happens when you break a problem down into smaller chunks.
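The "given a key, find the node" idea can be sketched in a few lines of Python. This is an illustration only, not the mechanism of any particular cache: the node names are hypothetical, and plain modulo hashing is used for brevity even though it remaps most keys when the node list changes (consistent hashing is the usual answer to that).

```python
import hashlib


def node_for_key(key, nodes):
    """Map a key to one node so every client picks the same node for it."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around
    # the node list.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    nodes = ["cache-a", "cache-b", "cache-c"]  # hypothetical node names
    for key in ("user:42", "session:abc", "page:/index.html"):
        print(key, "->", node_for_key(key, nodes))
```

Because the mapping depends only on the key and the node list, no lookup table or coordinator is needed to find the data.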
I can see where a single memory space might do better, though. A simultaneous 700-thread application is NOT hard to write in Java at all. Once you know how, I suppose. Consider that most programmers who use threads find ways to deadlock on one or two cores.
The reason I'm drawn to message-passing systems is that pretty much any higher-level abstraction is a Good Thing, as far as threads are concerned. I've come to believe that threads are as harmful as GOTOs. Sure, we'll use them under the hood, but we really need something more structured on top of them.
Also: Message-passing and shared memory are not mutually exclusive. If the message is being passed between, say, two Erlang "processes" on the same machine, I see no reason the contents of that message need to be copied, even if those "processes" are in different OS threads. -
Re:Just 200 bugs?
OS X uses quite a bit of OSS stuff. There's a good chance that a good portion of these bugs aren't theirs.
http://httpd.apache.org/security/vulnerabilities_20.html
I see 3 vulnerabilities in Apache 2 right there.
My Leopard install is showing "OpenSSL 0.9.7l 28 Sep 2006" while my Debian machine is showing "OpenSSL 0.9.8g 19 Oct 2007". I imagine there might be a few bugs there, and it's late enough that it wouldn't have been released close enough to be included in 10.5.0.
Let's see in /usr/(s)bin: zip, gunzip, tar, efax, cron, ip6config, postfix, cups. No chance they had any bugs. They're good open source software.
Responding to you and the guy below, the reason that these bugs are 'so big' is that Apple isn't sending out a bunch of .diff files as updates. If they're upgrading Apache 2 they have to recompile as a universal binary and send out that entire file. -
high failures for me
$ ab -n 5000 -c 5 http://beta.slashdot.org/
This is ApacheBench, Version 2.0.40-dev apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking beta.slashdot.org (be patient)
Completed 500 requests
Completed 1000 requests
Completed 1500 requests
Completed 2000 requests
Completed 2500 requests
Completed 3000 requests
Completed 3500 requests
Completed 4000 requests
Completed 4500 requests
Finished 5000 requests
Server Software: Apache/1.3.41
Server Hostname: beta.slashdot.org
Server Port: 80
Document Path: /
Document Length: 71365 bytes
Concurrency Level: 5
Time taken for tests: 1685.538664 seconds
Complete requests: 5000
Failed requests: 3715
(Connect: 0, Length: 3715, Exceptions: 0)
Write errors: 0
Total transferred: 357671196 bytes
HTML transferred: 355887895 bytes
Requests per second: 2.97 [#/sec] (mean)
Time per request: 1685.539 [ms] (mean)
Time per request: 337.108 [ms] (mean, across all concurrent requests)
Transfer rate: 207.23 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 111 231 523.0 137 9163
Processing: 857 1452 518.7 1347 7750
Waiting: 133 213 177.1 184 6685
Total: 1008 1684 737.6 1504 12390
Percentage of the requests served within a certain time (ms)
50% 1504
66% 1567
75% 1619
80% 1659
90% 1930
95% 3568
98% 4470
99% 4654
100% 12390 (longest request)
ed@ed-desktop:~$ -
Re:Abnormally?
-
Re:Abnormally?
ab -n 100000 -c 10 http://beta.slashdot.org/
This is ApacheBench, Version 2.0.40-dev apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking beta.slashdot.org (be patient)
Test aborted after 10 failures
apr_socket_connect(): Operation already in progress (37)
Total of 7 requests complete -
Re:Java based DNS server?
ApacheDS too, and it's not too terrible. http://directory.apache.org/ Kerberos, DHCP, DNS and user information all storing their information in a multi-master LDAP database out of the box. I think it could be a pretty exciting project once it matures.
-
Nothing's perfect...
As you've found, an automated system can be tuned, but you'll always have false positives/negatives.
I like the way spamassassin works - it can provide a rating for each message, which provides a mechanism for users to set the bar to their own preference, instead of having a single setting for the entire organization.
I'm not talking about using individual configurations for spamassassin, it's not realistic to expect most users to be able to deal with all the gory detail of spam filters.
Rather, spamassassin can set a header to indicate its confidence that a message is spam:
X-Spam-Level: ****
It adds an asterisk for each "point" of spam score. Users should be able to create an email filter which picks off suspected spam and puts it into a separate folder based on a header like that. Maybe drop all 10+ messages centrally, and let users tweak a local filter to their liking, depending on whether they prefer false positives or negatives.
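As a sketch of what such a filter does, here is a small Python version that counts the asterisks in X-Spam-Level and sorts a message into a bucket. The function names and the folder threshold of 5 are assumptions for illustration; the 10+ drop level is the one suggested above. In practice this logic would live in the mail client's or delivery agent's own rule language rather than in Python.

```python
from email import message_from_string


def spam_score(raw_message):
    """Count the asterisks in the X-Spam-Level header (0 if it is absent)."""
    message = message_from_string(raw_message)
    return message.get("X-Spam-Level", "").strip().count("*")


def classify(raw_message, folder_threshold=5, drop_threshold=10):
    """Decide what to do with a message based on its spam level.

    drop_threshold stands in for the central, organization-wide cutoff;
    folder_threshold is the per-user knob.
    """
    score = spam_score(raw_message)
    if score >= drop_threshold:
        return "drop"
    if score >= folder_threshold:
        return "spam-folder"
    return "inbox"


if __name__ == "__main__":
    raw = "From: someone@example.com\nX-Spam-Level: ****\nSubject: hi\n\nbody\n"
    print(spam_score(raw), classify(raw), classify(raw, folder_threshold=3))
```

A user who hates false positives raises folder_threshold; one who hates spam lowers it.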
I use spamassassin as an example only because that's what I use. There are no doubt others which can provide something similar which users could filter on. -
SpamAssassin
-
Re:I'm surprised they didn't do it sooner
Actually they do:
http://activemq.apache.org/ -
Re:processes
Apache2 uses processes or threads, and I believe processes are still the default (not sure though, I'm sure somebody will correct me).
There are some gotchas when using Apache threads - read here: http://httpd.apache.org/docs/2.0/developer/thread_safety.html -
Re:Microsoft's Official View of the Situation
One reason I use Embedded Perl and never bothered with learning PHP. It actually requires effort to not at least escape untrusted input.
-
Re:Weird disjoint
Erm... Apache is not, in fact, GPLed. And never has been. It's licensed under the Apache License.
-
2-Factor Authentication
RSA SecurID is pretty good, if a bit pricey. Or look at Apache TripleSec; it looks pretty good, though it's still a bit young.
-
They have only themselves to blame...
Specifically, the closed-source software vendors.
Consider: No matter how much marketing you have, it is ultimately up to the end user of a product to decide if they've gotten the value they expected to get. If said user finds that the closed-source product they paid (possibly) big bucks for isn't worth the media it was recorded on, they're going to cut their losses and try something else.
Alternatively, there are many small businesses that simply can't afford the kinds of prices that closed-source vendors often charge. I know this for a fact, because I'm one of those tiny businesses! If not for FreeBSD, Apache, and Postfix, to say nothing of the surplus hardware market, I would never have been able to get my Internet presence off the ground.
It's not just Freeware, either. How many of us have found low-cost Shareware products to be incredibly useful for the stuff we do, when comparable commercial products would have nearly required a second mortgage? Hex Workshop is, I think, a great example.
If that $60 billion figure is accurate, the commercial software vendors have no one but themselves to blame. Oh, there are some good values Out There, yes, but I think they've been largely drowned out by the flood of questionable products that are turned out with far more marketing than quality engineering.
Happy tweaking. -
Re:And a Pony!
I think it's currently designed to blow in the wind in a decorative manner. (They wanted to be able to make Tomcat reload all applications by touching a file. There's a URL you can call to reload any app any time you like, but they wanted a script. NOW, mummy!) Tomorrow I fully expect to see a request for the blades of the propeller to be made of red, yellow and blue sparkly stuff.
-
Re:Why?
Google certainly won't be licensing out BigTable anytime soon. And certainly not for small-scale uses.
Maybe not really Google, but you still have Hypertable and Hadoop.
-
Re:Wow, that's a big fat ASS^H^HPI
Have a chip on your shoulder much? Most of what you're saying is simply incorrect. e.g. Java does not have half-a-dozen containers. Yes, the switch from the STL-inspired Vector to the more Java-ish ArrayList was annoying. Same with Hashtable to HashMap. But beyond that, all those different containers you think you see are actually interfaces for wiring up complex functionality. Either that or completely different data structures with different performance characteristics. (Remember your CompSci courses?) The Java Collections package (which seems to be the only thing in Java you're remotely familiar with) provides enough functionality to write a complete database engine. Which, as a matter of fact, has been done quite a few times. (Sorry, ran out of words to link. Doh! Still more. Ah, to hell with it.)
The rest of the Java API is also not bloat. There are libraries for printing, cryptography, sound, graphics, DOM, file I/O, text parsing, text formatting, text display, mathematics, directory interfaces (e.g. LDAP), distributed object systems, reflection, security, SQL database interface, logging, cross-platform preferences, regular expressions, ZIP/GZip support, accessibility, networking, the compiler, scripting engines, etc., etc., etc. Very little of the core API is redundant, with most of the (few!) redundancies being a result of the early days of Java before they moved away from the C++ style objects.
Nearly all of the post-1.0 APIs were done correctly the first time. Which means that the core Java API is actually quite slim for the amount of functionality it provides. And even then, there is a HUGE number of official expansion APIs for mail, multimedia codecs, network request/response handlers (e.g. servlets), 3D graphics, 3D sound, text-to-speech, speech recognition, telephony, SOAP, REST, USB, Bluetooth, scientific units, cross-platform desktop integration, Instant Messaging, P2P, and quite a bit more. And that's just the official JSR-approved expansions! The OSS and (bleh) commercial worlds are full of unofficial libraries to deal with nearly any problem you can come up with.
If you want bloat, stop looking at Java. Try compiling a few Linux apps sometime and tell me how many redundant libraries you come across. If you know what they all do (which is a miracle in and of itself), compiling just ONE of those programs is enough to make a person blush with embarrassment. Not to mention that when a platform IS solidified (e.g. GNOME), it suffers from versionitis. (i.e. The constant need to upgrade your version of the libraries because this latest program no longer targets the version you just compiled. Or even worse, it requires a specific minor release, thus requiring you to have multiple minor releases of the library compiled and installed.) I won't even go into Microsoft's practice of inventing a new API for the same technology over, and over, and over again. (ODBC, DAO, ADO, JET, anyone?)
Now I happen to think that a lot of the choice that Linux offers is good. But don't point fingers at other platforms when there are more than enough examples of far worse situations close to home. -
Re:Microsoft Bribe?
since over 50% of hosts are running apache, i suspect your stats might be a bit off.
That would be Apache that also runs on Windows?
-
Re:FastCGI != Apache Module
a "real world app" ran 98% as fast under FastCGI as under mod_perl
I know this is getting off-topic, but the reason people usually choose mod_perl over (?:plain|fast)CGI is its feature set. Those Apache handlers are really nice for transparent or demanding applications.
-
Re:As of now
I apologize, the link for sample instructions in the previous post was wrong.
-
Re:As of now
Unless servers normally don't compress their responses
... hm. I should reread the Apache documentation.
Apache does not compress by default. You have to enable mod_deflate and set up the DEFLATE output filter first. The sample instructions are a bit simplistic, but they should work.
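To get a feel for what that buys you, here is a rough Python sketch that gzips two payloads of the same size: repetitive HTML-like text, and random bytes standing in for data that is already compressed (JPEG images behave much the same way). This is not Apache code and says nothing about mod_deflate's own settings; the payloads are made up for illustration.

```python
import gzip
import random


def compression_ratio(payload):
    """Return compressed size divided by original size (lower is better)."""
    return len(gzip.compress(payload)) / len(payload)


# Repetitive markup, roughly what a comment page looks like to a compressor.
html_like = b"<li class='comment'>Lorem ipsum dolor sit amet</li>\n" * 500

# Random bytes as a stand-in for already-compressed data such as a JPEG.
rng = random.Random(0)
image_like = bytes(rng.getrandbits(8) for _ in range(len(html_like)))

if __name__ == "__main__":
    print("text ratio:  %.3f" % compression_ratio(html_like))
    print("image ratio: %.3f" % compression_ratio(image_like))
```

The text shrinks dramatically while the random payload does not shrink at all, which is why compressing only text types is the usual setup.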
As someone else noted in one of the sibling replies, gzipping images isn't going to get you anywhere... the Opera proxies actually downgrade the image quality (and size? I'm not sure, I've never used Opera Mini) to improve speed. -
Apache project.
-
Re:No search feature
I think this is a great idea, but from the brief glance at the site that I took, it would appear that it has absolutely no search feature at all. LexisNexis and the other sites have sophisticated search features. 1.8 million records stored in 1000 pdfs is more or less worthless IMO.
I expect someone will use something like Nutch to index and make this searchable pretty soon. -
Hadoop Distributed File System
You could put a Hadoop Distributed File System (HDFS) on them. HDFS allows you to use the storage as a single file system that is stable and reliable. We have multiple 2000 node clusters with petabytes of user data on them. Because the blocks are each replicated to 3 hosts, if a node goes down, your data on that node is not lost.
-
Re:distributed file systems
Hadoop is an open source DFS that is Java based, so it runs on Windows. It is pretty fault-tolerant so it might work in a workplace environment. We run it in a computer lab where the machines are constantly up and down, and it works pretty well. It also has MapReduce, which lets you distribute IO tasks. http://hadoop.apache.org/core/ Jacob
-
hdfs
Hadoop's HDFS is the only thing that comes to my mind, but I don't know if it would be of any use at all.
http://wiki.apache.org/hadoop/DFS
another project is wuala ( http://wua.la/ ), but that's not for internal use... -
Re:A project for Google? - whoops here it is
Google hasn't released anything other than papers on GFS and their implementation of MapReduce. At this point, though, I'm not sure it matters, since we have Hadoop, which (being mainly Java, C, and a little bash) runs perfectly fine on all of the major operating systems, including Windows.
-
Re:"How will you use XML in years to come?"
JSON is inflicting Javascript on everyone.
No, it really doesn't, but if "JavaScript" in the name bothers you, you might feel better with YAML.
No, it wouldn't because JSON is bare bones data. It's simply nested hash tables, arrays and strings. XML does much more than that. XML can represent a lot of information in a simple, easy-to-understand format. JSON strips it out for speed & efficiency. Which sort of gets into the point I did want to make but was too impatient to explain: JSON is good where JSON is best, and XML is good where XML is best. I dislike the one-uber-alles arguments because it's ignoring other situations and their needs.
There are other programming languages out there.
And there are JSON and/or YAML libraries for quite a lot of them. So what?
Would you like to live in a world of S-expressions? The LISP people would point out there are libraries to read/write S-expressions, so why use JSON? The answer of course is that we want more than simply nesting lists of strings. We want our markup languages to fit our requirements, not the other way around. And saying "JSON for Everything", which the original poster did was... silly.
My problems with JSON are:
- No schema: XML Schema not only makes it easier to unit test, but it can be fed into tools that can do useful things like automatic creation of Java classes and code to read/write. Does JSON have anything like that? Of course not, because it would defeat JSON's purpose: easy Javascript data transmission.
- Expressibility: With XML, I can create a model that fits my logical model of the data where I use attributes to augment the data in the child elements. Doing that in JSON is a kludge with a hash-table to represent an element which can't be easily converted into a graph for easy understanding.
- Diversity: I use GML in my day job. A lot. I can easily set up an object conversion rule with Jakarta Digester that I can painlessly drop into future projects without modification. That's the power of namespaces. I can build an XML document using tags from a dozen different schema, and then feed it to another application that only looks for the tags it cares about.
- XPath. 'Nuff said. Ok, one thing: this should have replaced SAX/DOM years ago.
JSON is great for AJAX where XML is clunky and a little bit slower (my own speed tests haven't shown a huge hit, but it is significant). XML is great for document-type data like formatted documents or electronic data interchange between heavy-weight processes. My point was that the original poster's "JSON for everything" stance was narrow-minded, and that XML answers a very specific need. There are tonnes of mark-up languages out there, and I think XML is a great machine-based language. I hate it when humans have to write XML to configure something though. That really ticks me off. But that's the point: there should not be one mark-up language to rule them all. A mark-up language for every purpose.
-
Desktop Kernel Instability?
That's some kind of contradiction along the lines of "military intelligence." I kid.
Slightly off topic:
Vista desktop + openldap win32 binaries + apache and bind = GNU Windows Server?
openldap on win32: http://www.openldap.org/lists/openldap-software/200705/msg00152.html
apache2: http://httpd.apache.org/download.cgi
kerberos5: http://web.mit.edu/Kerberos/kfw-3.2/kfw-3.2.2.html
Granted, the average win32 admin will hit a wall because Microsoft does not design their product, documents and services for an admin smart enough to DIY.
Openldap/kerberos5/apache2 opens many, many more security/identity/authentication possibilities than Microsoft's active directory. -
What about the OTHER open source contributions?
It is a bit disappointing and surprising to see that the author spent more time talking about Silverlight and moderation and other things when he could have written about the *other* open source things that Yahoo! either supports directly or where patches have been contributed. Where is the discussion about Hadoop, Pig, ZooKeeper, YUI, PHP, ...? Do these not impact Linux users from his point of view?
-
Re:If you don't filter, you get blocked.
If an ISP doesn't filter their outgoing email to make sure that its own users aren't spamming, they WILL get blocked. I'm on a super-secret anti-spam mailing list which I can't tell you about, and everybody there cheerfully admits to blocking their own users' outgoing spam. It's necessary.
dude, spamassassin-users isn't that secret. :) -
Re:Microsoft has given everyone a bad name.
-
FYI: Not knowing ...+ a good guide ...?
For the User/Developer, among the best are
... "Open".
Apache FOP: http://freshmeat.net/projects/fop/
Apache FOP: http://xmlgraphics.apache.org/fop/download.html
NetBeans: http://download.netbeans.org/netbeans/6.0/final/
Alfresco: http://www.alfresco.com/
Good Guide: http://www.vrcommunications.com/PDFs/ditaotug141-03122007-pdf.pdf
Title DITA Open Toolkit User Guide: Fourth edition, December 17, 2007. Based on release 1.4.1 of DITA Open Toolkit. All files copyright 2006-2007 by VR Communications, Inc., unless otherwise indicated. Licensing Edition, release, copyright and usage of this document and related materials is regulated by a Common Public License (CPL) granted by OASIS (Organization for the Advancement of Structured Information Standards), http://www.oasis-open.org/ . DITA Open Toolkit is an open-source, reference implementation of the OASIS DITA standard (currently DITA 1.1).
JAVA: http://www.java2s.com/Open-Source/Java/CatalogJava.htm
Open Office: http://www.2008-official.com/openoffice/