Domain: apache.org
Stories and comments across the archive that link to apache.org.
Comments · 2,937
-
Those that provide an alternative to closed source
The big winners (to me) are those projects who provide a viable or better alternative to available closed source software and those that you'd put into a business and trust to "just work". To find them you need to test, test and test some more. My winners, those that spring to mind immediately as being trusted not to embarrass me, are
- mOnOwall - firewalling
- IPCop - firewalling
- Metadot - CMS
- Apache - web server
- Bind - Name Server
- asterisk - telephony/voip
- Sendmail - cussed but stable MTA
- SpamAssassin - spam filtering
- MIME-Defang - email content filtering/manipulation
- ClamAV - Virus filtering
- Freebsd - the best OS since sliced bread (IMHO)
- Centos - Not too shabby an OS either
- ...
-
SourceForge isn't the only show in town...
While projects like Azureus, Gimp, and countless others have originated or flourished in some form on SourceForge, far more telling cases for the power of open source are the repositories like the Apache Software Foundation. While most Apache projects are based on Java, the foundation's impact on the open source community (and software in general) can't be overstated.
That aside, it's really hard to classify open source projects from the perspective of applications. Many of the projects are utilities, satisfying a narrow solution like reporting, connectivity, adaptation, etc. These types of releases will never see broad interest from end users, but will instead find their niche within the development community.
But, if the real focus is what open source has to offer the end user, SourceForge provides ranking for each of the projects. The higher the rank, the more activity the project has enjoyed, the more downloads that have been made. It's a fairly reliable indicator on the success of the project.
-jjk -
Re:Um
Page rank was/is hands above its peers at introduction, but it's not pagerank that makes google spectacular. It's about making pagerank fast. Google successfully improved web search, and then made it scalable both in number of users and in pages indexed.
Open source is good at providing new ideas, but high performance doesn't have a huge market for users. Many of Google's most important optimizations require a special configuration and hardware, like thousands of computers laden with RAM, power etc. As it happens, Wikipedia oftentimes cannot handle its users' demands. Their search feature is sometimes handed off to Google. Let's also not forget the amount of data required to be stored. Sure, you can download tons of websites, but you can't discard everything; you'll need the url, relevant links, and context, excerpts to show users, etc.
Of course, all this pessimism hasn't stopped people from trying. If you discover a list of sites running Nutch, you'll notice most are niche sites, either offering to search a single website, or subtopic. The only general web search deployment I found was fast but crappy. -
Re:Um
Yeah, the shine's definitely gone off Google, eh? At the rate Google (and Yahoo) are swallowing up other sites there's going to be some major monopolising going on.
I think searching the web is one of the few bastions where closed source still rules, and it surprises me that no-one's really made an open source search engine. I'm aware that there are things like Nutch and ht:dig out there but their scope is completely different (site-wide searching primarily).
So - why don't we have an open source search engine? Pagerank is fairly easy to implement, and would serve as a good starting point for improvement. Writing apps to rank and sort web pages strikes me as the type of problem that a lot of smart people would find a lot of fun.
I know that it requires a crap load of infrastructure, but if Wikipedia can handle it, an open source search project should be able to as well. Besides, you can index one hell of a lot of pages with the standard few GB of bandwidth a month on cheap-ish hosting plans.
So - why not? -
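To give a feel for the "fairly easy to implement" claim above, here is a minimal sketch of textbook PageRank by power iteration. It is not Google's production algorithm; the function name, the 0.85 damping factor, the iteration count and the even redistribution of rank from pages with no outgoing links are all assumptions picked for illustration. The hard parts the replies mention (crawling, storage, speed at scale) are not touched.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not tuned values.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 3))
```

On a three-page toy graph this runs instantly; the open question in the thread is what happens at billions of pages.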
Re:We use Jira
I've used JIRA at two different employers and it's a very nice product. The interface is powerful and at the same time simple enough (looking at you bugzilla). It works with different databases, but it also has a standalone install which is a breeze to install. Comes with nice integration to cvs ( & subversion too I think) and (paying?) customers get the source code too so you can integrate however you like.
The only downside to Jira is that its price tag (for business users) has risen steadily, but at least they've given free licenses to open source projects like Apache Software Foundation, Codehaus and JBoss.
I've also used Mercury's TestDirector, but it seems like a glorified excel-sheet when compared to JIRA. TD is more suitable for reporting bugs, and it doesn't support the software development process like JIRA does. Jira has projects, components, issue links, releases, change notes, workflows, security levels, reports and so on.
We also have an inhouse built issue tracking system. It works to some extent, but its GUI doesn't really scale to handling large number of tickets. And since it's not developed actively it will probably stay as it is for some time.
-
Innovation isn't always recognized
unless it is coming out of a big software house and is associated with a big media and marketing event.
Don't you know there exist many smaller groups that are building out new ideas?
Here are two examples of largely unrecognized and unknown software innovation:
1) the Ofbiz web applications framework and
2) the SmartVariables network-shared-memory framework
Both of these outfits are too small for any big media blitz campaign, yet they still have released outstanding, innovative open-source software. I see all too often execs claiming that there is nothing new. But these folks are not looking in the right places.. -
SpamAssassin still works
In spite of the rise in spam, you can still keep everything but the stray message or two a day hitting your inbox if you configure SpamAssassin well. Get a guide like McDonald's SpamAssassin and follow the steps for the usual configuration based on examining headers and referring to Razor. Then, take a massive collection of all sorts of spam, from text pump 'n' dump to image spam, and feed it into sa-learn, SpamAssassin's Bayesian training system. A good setup with extensive Bayesian training will cut out almost everything. And it's not too hard. If you can install a Linux distro, you can configure SpamAssassin.
However, this is obviously only to filter spam coming into your own box. When I am travelling, I try to force myself to leave my laptop behind in order to truly relax, but that means that I have to use my e-mail provider's web interface. And when I see that my Inbox has 500 messages after just 36 hours, then I start to understand the grumbling that SMTP is broken and we need a drastically reformed protocol.
-
Re:MS Office Compatability?
Don't they all use the same Open Office conversion code underneath to read .doc files?
You would be AssUMe-ing too much. For the purposes of Ajax13, I'm fairly certain (based on their serverside messages) that they're using Jakarta POI to read Microsoft Documents. Thinkfree's development actually predates the general availability of OpenOffice and, I believe, uses their own in-house API. (Though I may be incorrect about that last part.) Google uses... whatever Google uses. I don't think the information on their backend is really available.
Long story short, there are more APIs out there than just OpenOffice. -
"Caching" not the answer
As I replied for the previous Netscape RSS DTD article http://slashdot.org/comments.pl?sid=216818&cid=17603480, caching DTDs from the network is not the answer if there is the possibility they will not be there in the future:
The proper thing to do is for your application to use an XML catalog for resolving entities/URIs and bundle the DTD files with the application. There is a good article at http://xml.apache.org/commons/components/resolver/resolver-article.html that helped me out. In addition, if you are using Eclipse with the web tools platform, you can customize the catalog so it resolves DTDs and entities locally. See http://wiki.eclipse.org/index.php/Using_the_XML_Catalog. -
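The catalog idea can be sketched in a few lines: map the public and system identifiers a document declares to DTD copies bundled with the application, and only fall back to the network address when nothing matches. The sketch below uses Python's standard `xml.sax.handler.EntityResolver` interface rather than the Apache resolver the article describes. The class name, the catalog dictionary and the local `dtds/` paths are hypothetical, and the Netscape RSS 0.91 identifiers are given from memory as an illustration.

```python
from xml.sax.handler import EntityResolver

# Hypothetical catalog: identifier -> DTD file shipped with the application.
# Both the public ID and the system ID (URL) point at the same bundled copy.
LOCAL_CATALOG = {
    "-//Netscape Communications//DTD RSS 0.91//EN": "dtds/rss-0.91.dtd",
    "http://my.netscape.com/publish/formats/rss-0.91.dtd": "dtds/rss-0.91.dtd",
}


class CatalogResolver(EntityResolver):
    """Resolve external entities from a local catalog before the network."""

    def __init__(self, catalog):
        self.catalog = catalog

    def resolveEntity(self, publicId, systemId):
        # Prefer the public identifier, then the system identifier.
        for key in (publicId, systemId):
            if key in self.catalog:
                return self.catalog[key]
        # Not cataloged: hand back the original system identifier unchanged.
        return systemId


if __name__ == "__main__":
    resolver = CatalogResolver(LOCAL_CATALOG)
    print(resolver.resolveEntity(
        None, "http://my.netscape.com/publish/formats/rss-0.91.dtd"))
```

A resolver like this would be attached with `setEntityResolver()` on a parser from `xml.sax.make_parser()`; whether the parser actually consults it depends on its external-entity settings, so treat this as a sketch of the lookup rather than a drop-in fix.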
Joost is big on open source
Joost is based on Mozilla's XUL Runner framework .
Dirk-Willem van Gulik from Apache the Foundation is the CTO .
Some of the Open source tech used
Apache, Cocoon, Dojo, Jena, Mozilla, RDF, SVG, XML, XUL
http://cruisecontrol.sourceforge.net/
http://ant.apache.org/
http://wicket.sourceforge.net/
http://lucene.apache.org/ -
Re:Bull
There is no need to host the DTDs on an actual server. I usually copy all the DTDs I need into a subdirectory of my application's installation path. See my original post at http://slashdot.org/comments.pl?sid=216818&cid=17603580, or a great article on entity and DTD resolving at http://xml.apache.org/commons/components/resolver/resolver-article.html. -
Re:The point...
Resolving DTDs and entities in XML parsing does work like CLASSPATHs in Java. Applications need to properly set up an XML catalog which tells the parser to look in a local store before the Internet for certain URIs. Please see my earlier post at http://slashdot.org/comments.pl?sid=216818&threshold=0&commentsort=0&mode=nested&cid=17603480. Or jump straight to Norman Walsh's informative paper at http://xml.apache.org/commons/components/resolver/resolver-article.html. -
Re:Why would this break RSS readers?
You are right. I wish I had seen this article earlier so that I could have posted sooner, and so that others could have seen the "solution"!
Ever since I started developing on a laptop during my commute, I discovered that XML-based programs like J2EE servers would simply stop working. I experienced the same thing at work where, by default, your desktop applications (namely Eclipse) do not have access to the internet, and the servers will never have access to the "Internet".
The proper thing to do is for your application to use an XML catalog for resolving entities/URIs. There is a good article at http://xml.apache.org/commons/components/resolver/resolver-article.html that helped me out. In addition, if you are using Eclipse with the web tools platform, you can customize the catalog so it resolves DTDs and entities locally. See http://wiki.eclipse.org/index.php/Using_the_XML_Catalog. -
Having long abandoned PHP
It's reading about issues like this that make me love Hibernate, Struts and Tomcat. At least at work. ;-) It's all about the sensible security defaults and maintainability, neither of which are particularly common in PHP development. Seriously. All you PHP fanboys, I don't know if you're just scared of the learning curve or what, but the jump to J2EE is totally worth it for any serious application. -
Re:Scare Tactics
Would IBM pour as many engineering resources into the Linux kernel if it was not GPL?
Yes -
Marketshare != Better Target
I've seen a lot of comments suggest that Windows is easier to target because it has a larger marketshare.
This is a BS argument. Here is one example of a program with larger marketshare but fewer cracks, both attempts and percentage successes:
Apache
IIS
Just because it's a bigger target doesn't mean it's a better target. Windows is a good target because it's big AND because it has a shit-ton of security flaws. You need to be a security expert to properly safeguard Windows, and most people don't have enough security expertise.
Weylin -
Re:SORBS doesn't block mail
> What I would like to see is a lovely set of SpamAssassin rules that
> knows about SORBS and knows about all the major ISP's and adjusts
> scores appropriately. I tried Googling for such a thing myself and
> didn't come up with any. Pointers appreciated.
Here's a pointer: dnswl.org may get included in one of the upcoming
SpamAssassin releases. Currently, there is a rule in the "Rule Sandbox"
http://svn.apache.org/viewvc/spamassassin/rules/trunk/sandbox/felicity/70_dnswl.cf?view=markup
dnswl.org lists "good" mailservers at four trust levels (none, low, med,
hi). All levels can be used to eg bypass greylisting (because all listed
addresses are supposed to be real mailservers) and outright blocking caused
by RBLs, and all can be used in a scoring mechanism (eg -0.1, -1, -10,
-100 points in SpamAssassin).
Yes, there is a certain risk that a spam may slip through (especially in the
"none" and "low" categories), but in many cases a missed spam is a lower
risk than lost legitimate mails.
More information (including on how to get your own server listed) can
be found at http://www.dnswl.org/
Disclaimer: I'm involved with the project. Btw., we welcome support in
the form of well-maintained whitelist data and DNS mirrors :) -
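The scoring idea described above can be illustrated as follows: each whitelist trust level contributes a negative score, which is added to whatever the other rules produced before comparing against the spam threshold. This is a sketch, not the actual 70_dnswl.cf rule. The score values are the "eg" figures from the post, the function names are made up, and the 5.0 threshold is assumed to match SpamAssassin's usual default.

```python
# Example score adjustments per dnswl.org trust level, taken from the
# "eg" values in the post above (not from any shipped rule file).
DNSWL_SCORES = {"none": -0.1, "low": -1.0, "med": -10.0, "hi": -100.0}


def adjusted_score(base_score, trust_level=None):
    """Add the whitelist adjustment (if any) to the score from other rules."""
    return base_score + DNSWL_SCORES.get(trust_level, 0.0)


def is_spam(base_score, trust_level=None, threshold=5.0):
    """Flag as spam when the adjusted score reaches the threshold.

    threshold=5.0 is an assumption for this sketch.
    """
    return adjusted_score(base_score, trust_level) >= threshold


if __name__ == "__main__":
    for level in (None, "none", "low", "med", "hi"):
        print(level, adjusted_score(6.0, level), is_spam(6.0, level))
```

With these numbers a borderline message from a "none" or "low" server can still be tagged, while a "med" or "hi" listing effectively overrides the other rules, which is the trade-off the post describes.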
Re:Ubuntu is pretty good stuff.
It took 3 minutes to get an apache 3.x series server with mod_perl up.
I have to try this Ubuntu, which can run versions of Apache that don't exist. -
shut down?
Why shut down your home system? Why not have it available as a server to make your life easier? I agree with other posters about using "offline" mode of Thunderbird and like clients.
In case you're thinking that you have a particularly repressive ISP...
My ISP blocks ports 80 and 25 - particularly irritating, if you ask me. My ISP's TOS, if read to the letter, would mean that multiple browser windows or tabbed browsing are inappropriate because it's more than one session over the broadband pipe.
I agree that it would be ideal if I could use every port I want, block the ones I want to firewall - but I'm too cheap to pay for that kind of access.
So I work around it. I use dyndns [dyndns.com] to create a pointer to my dynamic IP address. My ISP does not block https or ssh ports, so I leverage those to get what I want.
I use cron, fetchmail [berlios.de],
procmail [procmail.org],
spamassassin [apache.org], and
postfix [postfix.org] to bring mail from my ISP to my local system.
I use uw-imapd [washington.edu] to share my mail with other computers on my home network
I use ssh and pine, or apache+php+MySQL+https (self-signed cert) with roundcube [roundcube.net] to get remote access to my IMAP server.
I use WinSCP [winscp.net] to get access to my files at home when I'm at work. My data is *MINE* and I easily back it up (nightly and offsite quarterly - snapshot backups coming soon thanks to rsnapshot [rsnapshot.org], perl and rsync)
Every tool that I use is free of charge and as free as the GPL and apache licenses are free (zealots can feel free to argue with someone else about the relative freedom of the GPL, thanks.)
I certainly could pay for more open TOS with an ISP - I could even host my applications at an ISP. I'm cheap, and this solution works well enough for me.
Hope you find a solution that works for you!
Respectfully,
Anomaly -
Simple Defense
Since date and time information isn't included in TCP/IP packets, this kind of attack won't work for all services. Assuming that the "hidden servers" in question are HTTP servers, there is a rather simple workaround: simply disable sending the "Date" header. This can probably be accomplished with mod_headers in Apache, but I've never tried using it myself. Oddly enough, the server would still be standards compliant. Obviously, servers that leak the current time by some other means would still be vulnerable.
A simpler, less precise attack of this nature would simply be to continuously ping the suspected server via both Tor and the public internet. If they (reproducibly) fail at the same time (and we could launch a denial-of-service attack to make it fail), they're probably the same machine. Attacks of this nature might even be able to confirm if a hidden server is on the same network as another computer.... But any of these attacks require someone to suspect you of running the server in the first place—and if they do, you probably have bigger problems to worry about.
The bottom line is, as Tor's manual clearly indicates, having a hidden server machine accessible from both Tor and the internet is a bad thing. Operators of hidden services should use a dedicated machine and block all incoming traffic (on all TCP and UDP ports) that is not via Tor.
-
There are workarounds
My ISP blocks ports 80 and 25 - particularly irritating, if you ask me. My ISP's TOS, if read to the letter, would mean that multiple browser windows or tabbed browsing are inappropriate because it's more than one session over the broadband pipe.
I agree that it would be ideal if I could use every port I want, block the ones I want to firewall - but I'm too cheap to pay for that kind of access.
So I work around it. I use dyndns to create a pointer to my dynamic IP address. My ISP does not block https or ssh ports, so I leverage those to get what I want.
I use cron, fetchmail,
procmail,
spamassassin, and
postfix to bring mail from my ISP to my local system.
I use uw-imapd to share my mail with other computers on my home network
I use ssh and pine, or apache+php+MySQL+https (self-signed cert) with roundcube to get remote access to my IMAP server.
I use WinSCP to get access to my files at home when I'm at work. My data is *MINE* and I easily back it up (nightly and offsite quarterly - snapshot backups coming soon thanks to rsnapshot, perl and rsync)
Every tool that I use is free of charge and as free as the GPL and apache licenses are free (zealots can feel free to argue with someone else about the relative freedom of the GPL, thanks.)
I certainly could pay for more open TOS with an ISP - I could even host my applications at an ISP. I'm cheap, and this solution works well enough for me.
Respectfully,
Anomaly -
Amendment IV, United States Constitution
They do so need a warrant. See: Amendment IV, United States Constitution
"The right of the people to be secure in their persons, houses, papers, AND EFFECTS, against unreasonable searches and seizures, shall not be violated, and no Warrants shall issue, but upon probable cause, supported by Oath or affirmation, and particularly describing the place to be searched, and the persons or things to be seized."
In any case, they still DO need a warrant to search that 3rd party server. The warrant would simply have to describe the place to be searched, and specify the things to be seized, in accord with the amendment.
There are lots of analogies: P.O. Box, Voice Mail, Tapped phone lines, Gym locker, direct ip-ip chat (with no brokering middleman server, except routers). Each one of them has a slightly different feel, but in each case it seems clear that the RIGHT thing to do is respect the person's privacy. That the email sits on a server with a delay does not seem relevant (any more than the latent speed of light transmission time when the sound is IN the phone lines)
However, until the authorities have been duly punished for violating the man's right to privacy, it would behoove those who WANT their rights protected to run their own mail servers (either in foreign, non-extraditing countries or in their own homes.) :-)
http://james.apache.org/
If electronic communications had existed at the time of the framing of the constitution, I really doubt they would have left gaps for the government to abuse our privacy by means of raiding electronic mailboxes.
PS -- It wouldn't hurt to use pgp encrypted mail ...uh... sure.
"a-l-w-a-y-s---d-r-i-n-k---y-o-u-r---o-v-a-l-t-i-n-e" :-D -
Even Java6 considers JavaScript important
I never touched JavaScript myself (well, to open and close a window, which is only an example I grabbed from a website) yet have been annoyed at people who didn't even manage to separate Java from JavaScript. When it comes to web programming I've only used Java thus far through jsp or full servlets. Still, JavaScript is absolutely an important issue. It allows for server-side scripting without the hassle of a full container (in my case java container; tomcat or glassfish).
Still, what company knows best about JavaScript than Sun themselves? Recently the latest version of Java (Java SE 6) has been released and guess what one of its key features is? An API to support scripting languages. Right now the so called scripting engine only supports JavaScript, read about it here. For those really interested, here is the API documentation (javadoc).
I know I'm biased but heck; if people still don't realize the possibilities of JavaScript I'm pretty convinced that the combination of Java and JavaScript will enhance some of it. Java SE6 is pretty extensive, and with the addition of JavaScript even more flexible. -
Re:Wager
There's an easier way. You can hand mod_access netblocks and more. This method will avoid eating cycles with mod_rewrite. If you can put it in your conf instead of .htaccess, you'll save even more time/processing. Just put it in for your doc root. Now I gotta look up IPs for these clowns... damn copyright ambulance chasers... arin.net here I come! From my httpd.conf:
<Directory "/var/www/htdocs/">
# BRYN'S DENIALS
# allresearch.com
deny from 209.73.228.160/28
# branddimensions.com user-agent: BDFetch
deny from 204.92.59.0/24
# cyveillance.com
deny from 63.148.99.224/27
deny from 65.118.41.192/27
# www.markwatch.com user-agent: markwatch
deny from 204.62.224.0/22
deny from 204.62.228.0/23
deny from 206.190.160.0/19
# nameprotect.com user-agent: NPBot
deny from 12.40.85.0/24
deny from 12.148.196.128/25
deny from 12.148.209.192/26
deny from 12.175.0.32/28
# rocketinfo.com
deny from 209.167.132.224/28
# END BRYN'S DENIALS
</Directory> -
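For anyone wondering what a `deny from` netblock match amounts to, the check can be sketched with Python's standard `ipaddress` module. This only illustrates the CIDR matching; it is not how mod_access is implemented, the function name is made up, and only a handful of the netblocks from the config above are copied in.

```python
import ipaddress

# A few of the netblocks from the httpd.conf excerpt above.
DENIED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "63.148.99.224/27",   # cyveillance.com
        "65.118.41.192/27",   # cyveillance.com
        "12.40.85.0/24",      # nameprotect.com
        "204.62.224.0/22",    # www.markwatch.com
    )
]


def is_denied(client_ip, networks=DENIED_NETWORKS):
    """Return True if client_ip falls inside any denied netblock."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    for ip in ("63.148.99.230", "63.148.99.200", "204.62.227.1"):
        print(ip, "denied" if is_denied(ip) else "allowed")
```

A /27 covers 32 addresses, so 63.148.99.224/27 spans .224 through .255; an address such as 63.148.99.200 sits just outside it and is let through.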
Re:Question from a .NET developer trying to go OSS
It depends on your task. If you are building small to medium-sized web-applications, I would recommend Seaside. For larger projects, there are things like GNUstepWeb and Struts. If you want something slow that doesn't scale well, but is 100% buzzword-compliant, then there's Ruby on Rails. If you want to re-use existing ASP.NET code then you could try Mono.
For many needs, Apache is not a good choice. I personally prefer Lighttpd, which is lighter, faster, and easier to configure. It has nice FastCGI integration, so you can use it with most frameworks.
As for databases, I still haven't found a good reason to use MySQL. If you need a real database, I'd go with PostgreSQL, which is more standards compliant than MySQL, and faster for complex queries. If you want something slightly more structured than a flat file, then try SQLite, which is simple, lightweight, and faster than MySQL for simple queries.
-
Java made easier
A big problem with regards to PHP vs. Java is that it's awfully harder to get Java going in a server-side fashion on your box. Where you only need a simple Apache module for PHP, for Java you'd need a product like the Tomcat server to get the best out of it. Which, for a common user, means more trouble and overhead. Now you've got 2 software products to configure, which can make your life a lot harder.
When looking at the Java Enterprise Edition (EE5) you'll notice that it comes with its own application server called Glassfish; it's even fully open sourced. However, even though it's a lot easier to set up and tune Glassfish when compared to Tomcat (a nice spiffy web interface in which you can do everything vs. a limited administration interface and lots of manual editing of config files), it's still making things too complex for common use. You still need at least 2 ports opened up (one for webserver, one for application server) or figure out a way how to start forwarding requests.
No more. A very good alternative for all this complexity is the Sun Java Webserver 7. It's not officially released yet, this is the 3rd release candidate, but despite that it's very usable. This is basically a combination of both an extensive webserver which can easily compete with the likes of Apache and a Java container (or "application server") fully embedded into the system. So you only need to worry about a single software product to set up both your web and application server needs.
There is a little thing to keep in mind: when it comes to Java technology (EE5) then you'll notice that the Java webserver 7 is a little behind in some regards. The support for JSP, servlets, etc. doesn't keep up with the latest versions but supports standards (JSP, servlets, JSF) which are one release or so behind. But that doesn't mean its functionality is any less than Tomcat or the Sun application server.
If you're now considering Java but dreading the prospect of maintaining 2 software products I'd definitely check this out. It runs on Windows, Linux and even Solaris (duh, as if that wasn't to be expected from Sun ;-)). -
Tapestry Integration?
Does anybody know how to use this toolkit along with Tapestry? We have a couple of web apps of considerable size now, done in Tapestry, and are in the process of adding some AJAX functionality. This looks like a great alternative but the web developer told me he hasn't been able to integrate these two frameworks.
-
Why?
It's not difficult to put your jar files in the proper place. If you are having Tomcat or J2EE deployment issues, (especially classpath issues) I would suggest automating the process with something like Maven. With one of these tools, you simply declare in the configuration file which jars are needed for compilation and which are needed for runtime, and Maven makes sure that everything winds up in its proper place.
I mean, really. Java classpath issues are so 1999. -
Interesting, but...
I see the following problems with the article.
1. Other than MySQL, it doesn't specify the software in use (it implies Apache Tomcat, but that is not explicitly stated), except...
2. Microsoft Web Application Stress Tool. Pardon me if I refuse to put any faith into tools by Redmond. Particularly since, if Tomcat was in use, MWAST is being used instead of Apache's own ab tool.
3. Why wasn't Java 1.5 tested? By definition, Java 1.4 means that you're testing vs. EJB 2.x instead of EJB 3.x. I don't know what changes have been made between the two, as I haven't learned EJB, but I'm assuming there have been some changes between the two, for better or for worse.
4. What's causing the OutOfMemory errors? If a pair of servers are falling over at 16 simultaneous requests for a 301 row dataset, there's a major problem.
Just some thoughts. -
What are you talking about?
There are plenty of good open source CMS systems in Java.
At the bank I work at we use OpenCMS.
Magnolia Community Edition is probably better.
Apache Lenya is another CMS written by a well known group but I can't vouch for it
JBoss Nukes is poorly documented but written by JBoss so should be good. -
Apache has Lenya
You might want to check out Lenya, which is based on the Apache Cocoon project. I don't know how mature and full-featured it is though. More generally speaking, Apache has a lot of Java-based projects that can be used toward building a CMS, so if you did want to write your own, you could do worse than start on top of some Apache framework.
-
Re:Poor Java Support with Webhosts
Here's my guide:
1) Go to http://tomcat.apache.org/download-55.cgi and select your distribution. I assume you are running Windows, so download binary installer.
2) Run installer and click 'Next' until finished.
3) There's no step 3).
You can get Tomcat up and running in minutes and writing JSP pages is very similar to PHP. -
Apache Lenya
I have not had a chance to use it past the online demo*, but you might want to check out Apache Lenya
* A contract webmonkey proposed switching to Lenya halfway through a project. As much as I like F/OSS, I decided I'd rather have his existing VB/ASP mess working "on time" (only months late) rather than a nifty Lenya setup ready sometime after I would be fired for still not having delivered the new website...
-
Re:Picture spam
"Maybe it would be possible to OCR every image as it comes through"
-
Re:I've always liked the IDEA of OpenID
Sorry, I didn't notice there was no download link there... I couldn't find a real project page, but the source is here: https://svn.apache.org/repos/asf/incubator/heraldry/idp/pip/trunk/ -
video editing in Linux
I moved to Linux in 1994 as my primary desktop and server OS. About three years ago I decided that I wanted to produce some video content. Video editing was theoretically possible in Linux - I hooked up my camcorder to my Linux box and did some editing, but the tools were primitive and configuration was unusually difficult.
Eventually I looked at OS X and iLife. I decided to jump to a Mac. What a great move!
I found that Linux made it possible to do some things, but OS X made it simple to do them.
Fast forward a few years. I now have a few macs at home - their licensing policy makes it affordable to have several machines and a five user license for the OS and tools. My family loves the power and usability of the Mac.
Recently my linux server at home began acting a bit flaky. I did some analysis and determined that hardware replacement was needed. After checking prices for CPU/motherboard/RAM (and potentially hard disk) I figured out that I'd need a few hundred bucks to replace the CentOS box with a new one. After thinking about whether to drop a few hundred bucks or not on this server, it occurred to me that I might be able to move all of the services hosted on linux to OS X.
I found that samba,
hotwayd,
dansguardian,
uw-imapd,
fetchmail,
procmail,
spamassassin,
rsync,
rsnapshot,
apache2,
MySQL4,
PHP,
perl,
java, and
squid were all available for OS X.
Most of these are "in the box" with OS X. The only ones that I need to compile from source are uw-imapd and squid! Of course I need the bundled developer tools to get a compiler, and the Apple/BSD startup mechanism and the netinfo weirdness require some tweaks - but since when did Linux *not* require any tweaking?
What this means to me is that after more than a decade of running Linux at home (and work) I am *this* close to shutting down Linux for good at home.
Hope your experience is similar.
Regards,
Anomaly
PS - I share your recent comments about the loss of a pet. :( -
Re:not the whole internet!
+5 Funny?
It'll be funny when you try putting an IP in a browser to access a site using name-based virtual hosts.
Enjoy.
;-) -
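The reason a bare IP fails here is that name-based virtual hosting picks the site from the Host header the browser sends, and with an IP in the address bar that header carries the IP instead of a site name. A rough sketch of the lookup follows. The hostnames, document roots and function name are invented, and falling back to a default site for unknown names is an assumption about how the server is configured.

```python
# Hypothetical name-based virtual hosts sharing one IP address.
VHOSTS = {
    "www.example.com": "/var/www/example",
    "blog.example.org": "/var/www/blog",
}
DEFAULT_DOCROOT = "/var/www/default"  # served when no name matches


def pick_docroot(host_header):
    """Choose a document root from the request's Host header.

    Simplified: strips an optional :port and ignores IPv6 literals.
    """
    host = (host_header or "").split(":")[0].lower()
    return VHOSTS.get(host, DEFAULT_DOCROOT)


if __name__ == "__main__":
    print(pick_docroot("www.example.com"))   # the site you wanted
    print(pick_docroot("203.0.113.7"))       # bare IP: default site only
```

Every site on the box shares the one address, so the IP alone cannot say which of them was meant.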
Seconded; Greylisting is of limited use
Greylisting is no longer completely effective.
Congratulations; you are now a finalist in our "Understatement of the Month" contest.
The Penny Stock botnet very definitely gets past greylisting. It's available as an opt-in service here at my job; I recommend it as the first step these days in addressing user Spam complaints. I get a list of what hit the greylist filter once per day; I can deal with that. We also have a secondary central Spam filter (SpamAssassin?) using some standard definitions, updated weekly, that can catch most of the rest. I have mine set so that anything that gets more than 8 points is moved to my Spam folder.
Around early October, I noticed that I was getting sizable amounts of Spam again. So, I started reading headers. Most of the crap coming through was random text excerpts (a mix of Gutenberg and various web-accessible mail archives), one to three word subject lines, GIF inserts with penny stock pushes, and at most 2 points from the central spam detector. Within a week, I was getting user complaints -- and since I try to keep my users both scared and happy, this was a bad sign. So, I pushed the question to the mail list for local support people, asking if anyone else had noticed and come up with a solution. I then walked away from my desk to help someone; big mistake. I had a dozen "Yes, No clue, HELP!!!" responses in twice as many minutes -- and most of the IT crowd doesn't check their Email very regularly.
After sending out a request to limit further responses to helpful suggestions, and sorting through the responses that came in by the end of the day, I didn't have squat. One guy thought Thunderbird's spam filter helped, another swore it didn't. One guy suggested The Fuzzy OCR Plug-in be added to SpamAssassin (which I forwarded to the relevant IT Powers). Another guy suggested a commercial hardware product might be needed; ditto. One guy had resorted to a whitelist (that I was luckily on).
My final solution was to check my email archives for gif attachments, whitelist those who had sent them, and move anything else with a .gif included to a new category of spam-folder. I get an average of ten messages per day, and check that folder once per week. I've had one false positive since (dumb HTML stationery user), and warned the sender that I expected my new practice to become more widespread.
The problem is, these bad guys are NOT stupid; they're learning, and adapting. Switching from GIF to JPG attachments is the next obvious step. The botnets are growing in sophistication, although not yet to Warhol-worm grade. And the only measures I can think of are at best grey-hat hacking; some are just plain old-west style black hat.
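The whitelist-plus-.gif rule described above could be sketched roughly like this in Python. This is only an illustration, not the filter actually in use: the whitelist entry and the folder names are invented, and a real setup would more likely live in the mail client's or server's own filter rules.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Hypothetical whitelist: people who have legitimately sent GIFs before.
WHITELIST = {"friend@example.com"}


def has_gif(msg):
    """Return True if any MIME part looks like a GIF by type or filename."""
    for part in msg.walk():
        if part.get_content_type() == "image/gif":
            return True
        filename = part.get_filename() or ""
        if filename.lower().endswith(".gif"):
            return True
    return False


def choose_folder(msg):
    """Whitelisted senders go to the inbox; other mail with a GIF is set aside."""
    sender = parseaddr(str(msg.get("From", "")))[1].lower()
    if sender in WHITELIST:
        return "inbox"
    if has_gif(msg):
        return "gif-quarantine"  # assumed name for the separate spam folder
    return "inbox"
```

As noted, a switch from GIF to JPG would walk straight past a rule like this, so the type check would need widening at that point.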
-
My own
I looked at a couple of the popular ones, installed Awffull and played with it for a bit. But it wasn't immediately obvious to me that any of the common ones supported aggregating stats across domains / hosts. E.g., I have 10 virtual servers on this Apache box, give me a sorted list of hits per domain/host. Probably one or more of the popular open-source stats packages *does* do this, but I didn't feel like spending hours examining different ones and installing them. Since my needs were very basic I just wrote something of my own.
Since all my domains are ultimately served by a Java webapp running on JBoss (I redirect from Apache to JBoss with mod_jk) I just wrote a servlet filter to write hits to a postgresql database. That's it, one table with the hostname, date-time, user-agent, and a handful of other things I care about. Now, getting the info I need is as simple as a quick sql query with pgadmin III. Although I'm looking at using the Eclipse BIRT stuff for looking at the data, as my next project. -
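To give an idea of the "one table, one query" approach, here is a small stand-in sketch. It is not the servlet filter itself: Python's built-in sqlite3 is used in place of postgresql so the example runs on its own, and the table name, column names, and sample rows are all assumptions.

```python
import sqlite3


def hits_per_host(conn):
    """Return (hostname, hit count) pairs, busiest host first."""
    return conn.execute(
        "SELECT hostname, COUNT(*) AS hits "
        "FROM hits GROUP BY hostname "
        "ORDER BY hits DESC, hostname"
    ).fetchall()


# In-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (hostname TEXT, hit_time TEXT, user_agent TEXT)")

# Each row is what the filter would record for one request (made-up data).
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [
        ("a.example.com", "2006-11-01 10:00:00", "Mozilla/5.0"),
        ("b.example.com", "2006-11-01 10:00:05", "Mozilla/5.0"),
        ("a.example.com", "2006-11-01 10:01:00", "Googlebot/2.1"),
    ],
)

for hostname, hits in hits_per_host(conn):
    print(hostname, hits)
```

The same GROUP BY query should carry over to postgresql more or less unchanged, which is the appeal of logging to a table instead of parsing log files.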
Re:Reward for Open Source?
I've often wondered this myself. What is the reward for developing open source software? If companies can come in and use open source components in their own creation in a way that they make money without violating licenses, but at the same time aren't obligated to give anything back to the community, where's the motivation for new developers to go open source? Not everybody operates with an altruistic "I'm giving back to the community" motivation.
The company I work for contributes a lot to the Apache Cocoon community, and so do several other companies. The basic idea here is that that code is useful to us, and the more we contribute, the more the community will be heading in a direction useful to us. We want Cocoon to get better, so we'll have a better framework to work with.
We've released our CMS under an Apache license, and although this is one of our main money makers, we now also make money training others in using it and developing for it. And everything they add, again adds to the value we can sell.
Of course anyone can take our code and use it to sell websites, but will they really build better sites for less money than we do? So far, we're still the experts.
-
Re:OK. Let's pack up and go home
Almost all of the money made by open source has been made by exploiting open source. Yes most of the internet runs on OSS. But how many of the billions if not trillions of dollars has made it back to the pockets of the developers of the big parts like Apache? I would guess not much since even Apache has a 'donations' link on their site.
Apache Software Foundation (apache.org) has a donation link on their site because they are a non-profit corporation. So by definition they don't make money. That does not mean they don't get money and resources; it just means that they use it all on improving the product.
That said, the companies listed (and many others) have indeed contributed to as well as profited from open source software. IBM spends billions every year on Linux alone. And where do you think all that code comes from? The magic code monkeys? People that work for these companies are either paid directly to work on open source software or allowed to do so because of permissive policies that derive directly from the fact that those companies are making money from the profit of their labour.
Meanwhile all of this work is shared and the wheel does not have to be reinvented. IBM benefits from the code contributed by Sun as well as Chucky down the street. And it works the other way too. And all of them are making money... I mean even Chucky gets a job or can do consulting work because he's been working on this stuff all that time. Like when AOL hired all the Mozilla people. Or RMS's consulting, which probably has not made him particularly rich, though he is not exactly starving to death.
There are a lot of ways to make money from open source. Some of the easiest ways involve working with or for companies, but there are others. Still, to focus too much on the aspect of direct monetary gain is to miss the greatest benefits of free software / open source. The best thing about the software is when you actually get to USE the software. Sure, you can contribute code if you want to, and you can customize it for your needs, but ultimately you derive gain from the fact that you can use the software freely, unencumbered by onerous licenses and likely free as in beer as well. That means that whether you need software for your business or for personal use you have easy access to it and you don't really have to do anything to get it other than go get it.
Maybe your business is making money from free software (lots of people and companies do). Maybe you are doing something else but you use free software to accomplish those ends (way more companies are doing that). Maybe you just use it to learn, or because you feel like it. But no matter what you end up saving time, money, and other resources because you are benefitting from the community, and thus you profit from the use of Open Source / Free Software.
-
Nutch
Why not Nutch?
http://lucene.apache.org/nutch/ -
JavaServer Pages?
They are still widely in use, but if you are up-to-date in Java web application technologies, you are probably aware that JSP is dead. This is not a troll. JSP is rapidly being pushed out by alternatives like Facelets (which is used to define JavaServer Faces views), Tapestry, and Wicket. All of these are XML, disallow any logic in the view (thus encouraging proper MVC), and do not require a mountain of boilerplate code to extend. Why anyone would use JSP these days is totally beyond my understanding. Confusing and hard to maintain, JSP is rapidly diminishing and releasing a new library targeting it is like announcing some great new technology for Windows 95.
-
Re:In Ur Face, Novell
I haven't tried JSF, but I'm the sole maintainer of a Struts application. It works reliably, but changes are a pain in the behind.
I have zero experience with the Stripes web framework, but this comparison between Stripes and Struts has an excellent illustration of the klunkiness of Struts: http://stripes.mc4j.org/confluence/display/stripes/Stripes+vs.+Struts
"One of my prime frustrations with Struts is the fact that just to implement a single page/form, I have to write or edit so many files. And I have to keep them in sync, or else things start going horribly wrong. With Struts I have to write my JSP, my Action, my Form, a form-bean stanza in the struts-config.xml, an action stanza in the struts-config.xml, and if I'm going to do it the Struts way, a bunch of forward stanzas. And let's not go into the fact that since this is all stored in one xml file I'm continually facing merge conflicts with my team mates. Yes, there are annotations for Struts, but they are just literal translations of what's in the XML file, and they don't feel natural to me."
You may wish to learn Struts anyway, because it's so common. But if you're building a new Java web app from scratch and no one on your team is used to Struts, I'd investigate alternatives. The Struts project page even lists a few under the 'Similar Projects' heading. Now, extensibility, stability, and other buzzwords matter just as much as ease of initial configuration. So don't use speed of initial development as your sole criterion. -
image based spam
I have two strategies against image based spam, for people using spamassassin (and for answering previous posts - damn this /. breakage):
- add this codesnip to /etc/spamassassin/local.cf
    mimeheader MIME_IMAGE Content-Type =~ /image\/(?:gif|jpeg|png)/
    describe MIME_IMAGE Image in Mime
    score MIME_IMAGE 1.0
  feel free to pump up the score (and don't forget to restart spamd if you use it)
- since the above was not enough, I started using FuzzyOCR, and it works great (the number of image spam went from 10/day to 0/ever); so I am planning to package it for Debian; but the web page hints that there may be some security problem, so I am investigating.
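To see which Content-Type values the pattern in that rule would hit, here is a quick check in Python using the same regular expression. This only mimics the pattern match; it says nothing about how spamassassin applies the rule or adds up scores, and the helper name is made up.

```python
import re

# Same pattern as in the MIME_IMAGE rule above (case-sensitive, like the rule as written).
IMAGE_TYPE = re.compile(r"image/(?:gif|jpeg|png)")


def matches_rule(content_type):
    """Return True if a Content-Type header value would match the pattern."""
    return IMAGE_TYPE.search(content_type) is not None
```

Note that the rule fires on any gif, jpeg or png part, legitimate or not, which is why the score is kept low.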
-
Re:Reverse OCR
At work we use spam assassin with a gpl OCR plugin, however, it's getting foiled by intentional added noise in the images. I propose we come up with a way to detect these non-character elements (noise) in the associated spam images instead of just trying to OCR the text. The noise I've seen seems to be like it should be easily detectable.
I use a plugin called FuzzyOcr, and it handles animation and noise very well. Unfortunately the OCR itself isn't great, so it reads a lot of gibberish. FuzzyOcr compensates for this by being very liberal with its string matching (hence the name). The nice thing is, it correctly identifies the vast majority of the image-based spam I receive. Unfortunately, it's very easy for it to produce false positives. So far I haven't had this problem, but you might, especially if people often send you screen shots.
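The "liberal string matching" idea can be illustrated with a few lines of Python. This is not how FuzzyOcr itself is implemented; it just uses the standard difflib module to show how garbled OCR output can still be matched to a word list. The word list and the 0.75 threshold are arbitrary choices for the example.

```python
from difflib import SequenceMatcher

# Hypothetical list of words typical of penny stock spam.
SPAM_WORDS = ["stock", "investor", "price"]


def fuzzy_hits(ocr_text, words=SPAM_WORDS, threshold=0.75):
    """Return the spam words that roughly match some token in the OCR output."""
    hits = set()
    for token in ocr_text.lower().split():
        for word in words:
            # ratio() is 1.0 for identical strings and falls as they diverge.
            if SequenceMatcher(None, token, word).ratio() >= threshold:
                hits.add(word)
    return sorted(hits)
```

Lowering the threshold catches more mangled text but also more innocent words, which is the false positive trade-off mentioned above.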