Squid, FreeBSD Rock the House at Caching Bake-Off
Blue Lang writes: "Saw on the squid mailing list today that the results of the second polygraph Web-cache benchmarks are in, and squid on FreeBSD captured a few top marks, as well as performing exceptionally well overall. Interesting reading, especially as a comparison of free and open systems versus some very well-architected proprietary solutions."
I've been introduced to Unix through Linux. I must say that the Unix environment just simply kicks ass!
:-)
Out of sheer curiousity I tried out freeBSD. Their kernel is incredible. I know that the bench marks aren't there to show it but their "claims" are true.
Their TCP/IP stack is better, loads can be handled with ease even on a extremely low-end systems and their memory management is out of this world. I was impressed at how fast my shitty unix boxes went.
Now I know that linux heads like myself would become defensive but linux has made big improvements and a lot of issues are being addressed with the next 2.4 kernel. Their "claims" will be seriously tested soon.
I have decided to go back to linux because I prefer it. There's more software and it makes a better desktop for me. Plus it is stable enough, user friendly enough, fast enough and damn good!
However, freeBSD is a great unix OS and the only way to find out is to try any BSD yourself. Even a linux head like me can defend freeBSD.
Keep up the good work to all BSD contributers
"If a show of teeth is not enough, bite
I just want to take this oppertunity to say Squid totally rocks. I put a squid server on a rescued 486/66 with 24MB of RAM. By rescued I mean that when the processor was removed from an old donated Compaq Prolinea server, it flew out of my hand and landed on concrete - then got stepped on while I was trying to find it and every pin got flattened (oops! found it!), and I had to straighten each pin with a butterknife to shove it in the Squid box! Honest! And that's only the processor story! Anyhow, you get the point - we're talking about really crappy and abused hardware I'm working with here.
:)
We have roughly 100 machines on our network, and Internet access was coming to a standstill - especially when everyone in the computer lab was on the Internet. Imagine a 128Kb/s fractional T1 with 25 *active* users all trying to look at mega-image-rich content, plus some other users on campus accessing the Internet at the same time (can you say sub-300 baud and ping times measured in whole-second increments?). I was having to pre-load web sites before a class came into the computer lab because just loading the first page could take roughly five minutes on a good day.
Then I configured and installed a Squid server on a rejuvinated Compaq Deskpro running Linux 2.2 that was donated with the above said specs. I was a little sketchy to implement it across the entire campus at first because I had always heard that proxy servers were a Bad Thing. So I silently pointed browsers to the Squid machine in a few classrooms to see if I would hear anything from anyone. I got calls from people that very day. They were asking me how I had finally coaxed our school district into buying us such a fast connection!
As it goes, the more classrooms I pointed to the proxy server, the faster things got (as the cache was growing and the hit rate was increasing), and the more happy teachers I had. In a school situation, many sites are visited multiple times by different students and classrooms. In the computer lab, every computer often visits the same site as a class. So having a caching-proxy server helps a great deal! I really believe that every school with less than a T1 should have one.
As for statistics, I have an average 'hit' rate of well over 80% because of the multiple viewings of sites. Initially I had 2GB set aside for caching purposes (on an IDE Samsung 2.1GB drive), and I found that as it reached its capacity the server just got way too slow. So first I brought it down to 1.5GB, and now I have it at 1GB (I may even take it to 750MB). It has been running pretty fast at 1GB - by far compared to not having a caching-proxy server at all, but I do see the performance start to degrade at about 750MB with my particular hardware.
Sure, faster server hardware would be *great* and is probably necessary to handle our unusually heavy load due to all of the graphics content on the visited sites, but right now that just isn't an option because we live on donations. My point is that even though we are running Squid on such a crappy box, it has worked wonders on our network. Internet access seems very fast now, whereas before it was almost unbareable. And most importantly people are happy and making use of the technology we have to its fullest extent, where as before they may not have been able to do this. I must admit though that I am writing grants in hopes of getting a faster/newer box because ours is getting tired and I worry about what will happen when the hardware finally kicks the bucket.
For a school in our situation, Squid is great because it even helps when you're using it on otherwise possibly worthless hardware, and the price is just right.
Anyways, I'd like to thank all who have donated their time on the Squid project, you've done great work and you're helping people more than you realize!
--SONET
http://www.hbcsd.k12.ca.us/peterson/technology
Any fool can criticize, condemn and complain and most fools do. --Benjamin Franklin
I've been overseeing a caching (really, website acceleration project) for my company, Excite@Home E-Business Services, over the last three months now. I can personally say that the three I've had experience with, Novell's ICS caches (which comprised ten of the twenty entrants), Network Appliance's NetCache, and Squid (on Solaris, in our case) all rock. Squid 2.3-stable1 was a dream to compile, install, and configure. ICS has a few user interface quirks with their Java administration tool that I don't like, but except for Cisco's cache (Oh My Gosh do you really want to spend $150,000 on a CACHE???) ICS-based systems captured the many top honors in this roundup. Network Appliance's NetCache is also a nice choice, and as the only vendor with streaming media caching/splitting support, they are receiving a lot of attention recently.
It's really important to note that IRCache has no desire to point to any "winner" in this bakeoff, but instead to have real non-partisan numbers to point to when evaluating cache performance. Squid captured top honors in cache hit ratios, but nothing else (AFAICT), showing that those "expensive, proprietary systems" also can be very well-tuned operating systems that eliminate traditional OS overhead for these numbers.
One of the frequently overlooked uses of cache is as a web site accelerator, instead of the standard forward proxy. Using a few simple access control lists and a policy on a router, reverse-proxy caches managed to reduce the instantaneous load on our web servers by up to 94%. We serve about 3.5 million hits a day. A "reverse proxy" is an EXCELLENT use of a proxy cache, and after these technology evaluations I've been involved with in past weeks I'd recommend it to anybody considering running a high-traffic website. This allows your Apache servers to function more as the "cgi engine" of your site, and lets the static images, text, banners, etc. be delivered from a box that can handle 100,000 simultaneous connections. Very cool.
While I'm not allowed to post a "review" of any one of these units, because of various agreements for the evaluation boxes we tested, I can clearly state that Squid, NetCache, and ICS-based systems can and will vastly reduce infrastructure scalability costs for businesses when deployed in a reverse-proxy configuration. Our earlier estimates guessed we'd need to expand our web farm three times to handle our estimated load by the end of the year. Now we can reliably predict that our farm can serve 10 times the amount of hits we're running now by using a cache as an accelerator. VERY cool stuff.
Be sure and check out the system configurations in the bakeoff review. It's very illustrative that the boxes tested have VERY specific audiences. Don't be fooled by the "fastest hit response time" or "most throughput" -- you can spend $6,000 or $150,000 for any setup, depending on your needs.
Noticeably absent from the review was Inktomi, for the second year in a row. I'm hearing FUD from vendors that their performance isn't up to snuff-- any truth to these rumors?
Matthew P. Barnson
I learn what I think when I read what I write