Ask Slashdot: Art, Linux and the Slashdot Effect?
patSPLAT submitted this artful submission: "I'm asking Slashdot: What kind of box does Linux need to handle the Slashdot Effect? I'm an artist, and I'm working on a sculpture which will be self-documenting with a running server/webcam. Since the server will be a part of the piece, I don't want to spend more than I need. I do want it to be able to handle a heavy load if my piece is well received. I'm planning on getting a 10/100 Ethernet, but I'm wondering about processor and memory. Could I get away with an older Pentium? Would a Celeron running in console mode do the trick? 64MB? 128? What do you think I could get away with? The website on the piece would be no larger than 5 megabytes, and webcam would obviously require some resources. I'm not sure how much the webcam would take yet, so give me the minimum and I'll go up a step to account for the webcam."
'Ah come on all he is just trying to help out.' with a 3 word post? Not _that_ informative, not _that_ helpful, imho... more like a joke, or troll :) Cheers Stor
I wonder if you're one of the losers who flood my semi-public address with resumes?
NoNoNo, you need a good grep.
Out of curiosity, why do you say that MaxKeepAlive is broken?
Whatever you do, DON'T base your decisions on those lame studies of serving static pages UNLESS you are serving static pages. If you use DHTML and CGI and/or mod_perl scripts, you'll want memory, memory, and more memory. For example, the Think Geek site had 128MB and it went down FAST. Also, don't underestimate the value of optimization.
7 hits a second? you must be using some very poorly written CGI scripts. Ever hear of mod_perl? or C?
If someone would be kind enough to post an mrtg graph of how many concurrent connections and how much bandwidth is being used, we could define the parameters a bit better. Personally I find that apache cannot handle more than 250 concurrent connections well even on an SGI 0200 with a gig of ram. Remember to turn off reverse DNS and turn off the keep alives. (MaxKeepalive has never worked)
*yawn*
Yet again we see proof that slashdot is merely a cosy group for Linux-worshipers, and any non-Linux-worshiping comments are derided as flamebait regardless of the validity of the opinion.
Grow up, moderators!
This is off topic but, oh well. Why did you have your email address written with XXXX in it? Is that to prevent it from being grabbed properly by a robot that might scan webpages for email addresses? Just a guess. It just struck me as odd and got me thinking. --Brett
thats all
Why was this moderated down?
FreeBSD? Maybe, it's because of the fact they are sending-out static (ftp versus a web server running perl or PHP connected to a db) content, and they have a very nice I/O system? Multiple Mylex SCSI to SCSI RAID systems should make any system zoom. With my Pentium 90 with 128 Mbytes of RAM and a Mylex DAC960, I average sending-out just over 10 Mbps of content per day (internal CAD file transfers and almost continuous backup of 102 Gbyte filesystem to a mirrored machine). Thursday, the machine sent-out over 125 Gbytes of data with a load average of less than 1.0. AFAIK, the most cdrom.com has sent-out in one day is 1,390 Gbytes of data. I can get within a factor of 10 of the performance of cdrom.com with a machine I bought for $200 (+ $450 RAID controller + $?? drives), and with zero work or tuning. In my experience, Linux runs at near hardware speeds, and you can't get any faster than that.
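For a rough sense of scale, the daily volumes quoted above can be turned into average line rates. This is only a back-of-the-envelope sketch: it assumes decimal units (1 Gbyte = 10^9 bytes) and traffic spread evenly over 24 hours, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_mbps(gbytes_per_day):
    """Average rate in megabits per second for a given daily volume."""
    bits = gbytes_per_day * 1e9 * 8
    return bits / SECONDS_PER_DAY / 1e6


# The poster's busy Thursday versus the cdrom.com record quoted above.
print(round(average_mbps(125), 1))
print(round(average_mbps(1390), 1))
# Ratio between the two daily volumes.
print(round(1390 / 125, 1))
```

Under those assumptions 125 Gbytes in a day works out to a bit under 12 Mbps sustained, and the ratio to the cdrom.com figure comes out slightly above 10.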
perhaps you didn't understand the question. he wanted a minimalistic system running linux...smoke less crack...
geeze ... so yer tellin me that a system with 128 megs of ram can't even spare 28 megs to a ramdisk because it needs the remaining 100 megs as disk cache? tchya .. whatever.
Other than that, you're pretty much at the will of the nerds.
Just one more step closer to fascism.
Yes, FreeBSD is very good at handling heavily loaded sites. This also should be considered an option.
Get an Abit BP6 with two C500's. With my dual C333's overclocked to 500MHz with 256 meg of ram a kernel compile takes 2:30. Definitely the most bang for your buck. The BP6 has four IDE channels. Two Ultra ATA/66 and two normal. Get some Alpha heatsinks for those celerons and you'll be set.
And who is afraid of learning? Except the ignorant moderators.
slash : Please institute a policy where three moderators have to agree before point deductions are made.
There is nothing wrong with learning proper english and a strong vocabulary.
This is one of my favorites, as whenever
someone uses "duplicity" without understanding
what the word means...
check pricewatch.com for the latest prices. They hover at around $130. Ram is the real killer. In the past two months the price has just about tripled! be prepared to spend nearly $200 for 128 meg.
There's no need to be (that) sarcastic!
For true proxying you don't even need the SSL libs, you can do it without decrypting the session. See for example micro_proxy, which handles http and https in 260 lines of code.
And (back to previous topic) if you want a really tiny http server, see micro_httpd, only 150 lines of code.
-Jef
The ability to serve pages doesn't magically fall, unmetered, out of anyone's ass. That's the point of this article, eh? If you can't stand the "cost" of having your cartoon porn on line, you need to start e-mailing it out to people upon request.
At least the old school pirates with their "I wouldn't buy it anyway" had a weak point. You're taking bandwidth away from people who play by the rules, making everyone else pay for your little game.
Piss off.
Why is anything moderated up or down? If Rob didn't give away these plums more or less on a whim, he's a shining beacon among web-based discussion-group developers. Could we at least sort posts by number of capital letters or percentage of run-on sentences?
i love that a comment recommending M$'s premier OS and WebServer is considered a troll
Until very recently, ftp.cdrom.com used a PPro 200 with 512MB of ram running FreeBSD. They set several world records for throughput with that configuration. They recently upgraded to a Xeon server, but they are still world champs when it comes to throughput in a 24 hour period. They say it is all because of FreeBSD.
My own website on the Internet runs NT4, SP5, and has only 128 MB, with a Pentium 150. It also handles mail and SQL.
Having seen people take all sorts of approaches to this and having to do it myself quite often, the thing I've learned over the years is that most people blow a lot of money on extremely beefy (processor-wise) servers and end up with a lot more server than they need. If you're just planning on serving a plain old webpage, even with a lot of hits, a Pentium 133 is plenty. If you're doing a lot of dynamic content, get a dual Celeron box; they're cheap and plenty powerful. What lots of people overlook is that you need more RAM than CPU most of the time. If you want a server that you probably won't have to upgrade for a while, get a dual Celeron 333/400 with 256MB of RAM and throw FreeBSD or Linux on it. And don't forget about hard drives. IDE drives nowadays are pretty decent, so a Western Digital IDE should be fine, but back it up. Nothing sucks worse than having a really cool server with all this neat stuff on it, having your hard drive blow up, and having to start over from scratch.
just a note even more to the side
that 14.4 of yours might have been where the bottleneck was...
You comment on overkill, and then you suggest using raid?
Do you know how much raid costs? check pricewatch.com for the best prices. A decent raid system will cost you about 10x the same space on scsi and 20x the same space on ide. His overkill 4k$ system would be cheaper than a decent raid.
I'll state it plain and simple: This is NOT a situation for raid. It is a situation for the information to be ram cached on a 150$ system, given his max bandwidth and the static content.
- Rei
First of all, he gave plenty of data. Did you not read the post? He gave his total bandwidth, 5 megs of static content, and webcam images, which are presumably static. No database at all was mentioned. Guess what that means? There is none!
Second, slashdot as far as I am aware is not dynamic. From what I read on the last optimizing page, they use a script to update it whenever new information is added, not at every request like true dynamic content.
- Rei
I just installed Linux over the weekend on my P166 and was looking for some good resources on the OS and came here. I have to say I didn't expect a good laugh but you guys gave it to me. Thanks for the belly holding on Manic Monday. From Grip it and Rip it (with two hands)
As an artist you should not waste your time keeping your server safe. My suggestion is: use a server as little as a pentium 166 with 64M RAM running OpenBSD 2.5
Question: What are the issues regarding spreading your content to these free sites? I mean, what is to keep me from placing a bunch of images on a GeoCities site and then linking to them from my site? I suppose I could even set up a cron job to check that they were still there and hit the occasional html page to make it look like the site was getting used for something other than just hosting the images of my low-bandwidth site.
Ah come on all, he is just trying to help out. If the guy with the art is clueless, an NT box and IIS would be a very easy setup and he could build the site super easy. I believe that is what the Troll was trying to get across. Now since 99.9% of us out here think samba and Kerberos setup is a snap, what would we fear in Apache and Redhat? Give the troll a break.
A T1 (or DS1) can do something like 1.544 megabits per sec. When you divide that by 8 you get 193KB/sec. Any 10Mb netcard can do 193KB/sec, so figuring 193KB/sec per T1 is a good way to think about it, and it tells you when you need a 100Mb network. Also remember router-to-netcard talk, which uses up about 100KB/sec at any given time on a loaded/half-loaded T1. So a pentium 133 can handle a T1 easy. Oh how i love linux and T1s. What i want to know is what does /. have and what type of bandwidth? How much bandwidth does slashdot use per month / per day?
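The divide-by-8 arithmetic above is easy to check, and it extends naturally to a pages-per-second estimate. A minimal sketch: the 20,000-byte page size is purely an assumption for illustration, and protocol overhead is ignored.

```python
T1_BITS_PER_SEC = 1_544_000  # nominal T1 (DS1) line rate


def bytes_per_sec(bits_per_sec):
    """Raw line rate in bytes per second (8 bits per byte, no overhead)."""
    return bits_per_sec / 8


def pages_per_sec(link_bits_per_sec, page_bytes):
    """Upper bound on pages served per second if the link is the only limit."""
    return bytes_per_sec(link_bits_per_sec) / page_bytes


print(bytes_per_sec(T1_BITS_PER_SEC))          # 193000.0
print(pages_per_sec(T1_BITS_PER_SEC, 20_000))  # hypothetical 20,000-byte page
```

With those assumed numbers a saturated T1 tops out somewhere under ten pages a second, which is why the link rather than the CPU tends to be the limit.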
This FreeBSD: The Power To Serve is just that, marketing FUD coming out of the BSD camp since they cannot leverage any momentum from Linux of late. Yeah they'll continuously repeat the same bullshit about Yahoo.com and cdrom.com. FreeBSD is used because that's what they are/were comfortable with when they came about and because Linux wasn't at the time mature enough. The fact is, today, Linux can stand up on its own. Proof == deja.com, slashdot.org, freshmeat.net, redhat.com. Ample bandwidth is and always will be the problem, not the OS.
K-6's don't support SMP so you're stuck with Intel (for the moment). Go to http://www.pricewatch.com and look for the Abit BP6 motherboards. They used to be around $130 and support SMP with the socket 370 Celerons as the previous poster suggested.
Sounds like this person could easily use a P100, 64 or 128 meg of ram and Linux or NT4. If he's trying to do it for as cheap as possible then Linux is obviously the choice here, although having someone else host his site would likely be a better choice.
Maybe you should try it before mindlessly bashing MS.
In response to the comment (up a couple levels) about being kind to readers I'd suggest using Opera to test the pages, it's about the lowest common denominator in popular browsers these days.
If you think you're going to be serving a *large* amount of traffic, you can always Akamaize it. :) www.akamai.com
not only is it cold, the air is thin from being in a sealed facility. FM200 specifications dictate that the room be relatively air tight. every time I have to go to our colo facility my nose hurts
"OS's are like masturbation, everybody has their own way of doing things." A wonderful quote for the books, ladies and gentlemen.
I've seen a lot of comments about people saying the Linux can't handle the slashdot effect. Well folks, Linux seems to be doing pretty darn well handling slashdot and all of its dynamic perl content.
Erhm, actually I'd prefer _not_ to use a webserver that includes an easily exploited remote buffer overflow that provides total remote access, without using anything but port 80.
- Jay Ts
http://jayts.cx
Another valid opinion shot down by the all knowing moderators. In case you haven't noticed this is "News for nerds" not "Linux zealots". Just because you have a different opinion doesn't make it better than his.
Notice I said he should use 500mhz celerons. Nowhere did I say he should overclock them.
Last I checked Linux did not have a unified buffercache, so many pages would be in memory twice anyway.
I don't know, that is over 600,000 hits a day. I think that this is acceptable.
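Assuming this is a reply to the "7 hits a second" figure quoted elsewhere in the thread, the conversion is just seconds-per-day arithmetic. A tiny sketch with an illustrative function name:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def hits_per_day(hits_per_second):
    """Daily total if the per-second rate were sustained around the clock."""
    return hits_per_second * SECONDS_PER_DAY


print(hits_per_day(7))  # 604800
```

Real traffic is bursty, so a sustained average like this says little about the peak rate a box has to survive.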
What has OS got to do with anything? The discussion is about bandwidth and server side scripting. FreeBSD isn't going to increase your bandwidth. FreeBSD isn't going to make your perl cgi scripts run any faster. At present Linux and FreeBSD on the same hardware have very similar performance characteristics. At the time that Yahoo was originally designed FreeBSD had better networking code. But that was 5 years ago and 2 Linux network versions ago. FreeBSD is not better or worse than Linux. Each has very minor strengths and weaknesses over the other. Any companies that still use FreeBSD do so because of decisions made many years ago. Although it would be trivial to port code to or from FreeBSD and Linux.
Hmm... didn't this guy specify that he wanted to run the services using ONE relatively minimalistic machine? I don't think he's got the coin to buy an 8-way xeon box with 1GB of RAM...
Thank you so much ... you have brought me completeness ... a form of l33tness yet.
You are too kind ... I think I hear your mother calling eh' ....
CC
"Pray arm me further by your reply" Winston Churchill
Hey now, I'm not a dirty hippy, and I have a job, I probably make more money than you do. And frankly, people pay top dollar for ugly ass arty crap.
Heck, a mexican guy canned his, dated it, and sold it at auction. Made millions.
From a marketing perspective, this is just the kind of arty crap people gobble up with a spoon. At least if you put the right spin on it.
Success in art has a lot more to do with conspiracy than it does with talent. Take for instance Vincent VanGogh - So talented it made him smell bad, but he never got to be part of the whole art media game during his own life span, so he sold his paintings mostly to relatives and siblings, and died dirt poor.
Yeah, talent's great, I'm all for it, but orangutans have made things with rotten fruit that sold for more money than I've made in my life. And that guy who stands behind jet engines and tosses paint in the air, sheesh . . .
This is just like television, only you can see much further.
Do people really have so little humor as to not get this? It's a JOKE for crying out loud. It IS
a troll but along the lines of a poke in the ribs from a buddy. Just laugh.
In Soviet Russia, hot grits put YOU down THEIR pants.
The big question is: Do they have rooms for rent? Now THAT would be a nice place to live :-)
Methinks someone needs to go look up 'artful' in the dictionary. It doesn't mean what you seem to think it means... :)
Here's an idea: why not have a posting filter which counts the uppercase and lowercase and will reject based on that ratio? Should get rid of the lamer script kiddies and warez doodz out there, not to mention the sort of scum-sucking, bottom-feeding lowlife I'm replying to.
dave
Hey don't be a fscking dick to the guy. He asked a legitimate question and you slam his "little project". What the sam-hill can YOU do that's cool?
Not really. Those web hosting facilities are usually too damn cold, there are security cameras everywhere, plus the maddening sound of thousands of server fans and disk drives. You really don't want to live there. (I had to spend a night debugging a box there)
Zigbee Central: A Zigbee weblog
I didn't mean to be mean. Just blunt... this guy should test his project out and measure his needs accordingly.
Tschüß,
Tyler
www.savvysasquatch.com ----i have a webcam too
Bye,
TYLER
Since some ISPs and routers don't support a MTU > 1500, setting it higher is silly. In addition, if you drop packets, there's a bigger penalty to pay.
An older pentium should handle all your needs unless you are running some dynamic content in addition to the webcam. User registration, cookies, I don't know what you would do with dynamic content as an artist, but if you use it, it is something that would increase the load on your server, requiring more processor and memory.
1. System tuning is significantly more important than CPU speed for most webservers.
2. Memory is also generally more important than CPU.
*begin potentially inflammatory section*
3. You might be better served by ditching apache and going with a faster server like (my personal favorite) Zeus or fhttpd.
Actually, the cache is keyed on physical address, so the way the physical RAM is laid out *can* affect the performance of applications. When two heavily-used pages map to the same cache page because of the physical address they happened to get, performance will decrease.
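To make the collision concrete, here is a small sketch of how a physically indexed cache assigns pages to cache "pages" (often called page colors). The cache geometry is an assumption picked for illustration (512 KB, direct-mapped, 4 KB pages), not a description of any particular CPU, and `page_color` is a made-up helper name.

```python
def page_color(phys_addr, cache_bytes=512 * 1024, ways=1, page_bytes=4096):
    """Which page-sized slice of the cache a physical page lands in."""
    colors = cache_bytes // (ways * page_bytes)  # number of distinct slices
    return (phys_addr // page_bytes) % colors


# Two pages exactly one cache-size apart share a color, so they compete
# for the same cache lines; neighbouring pages do not.
print(page_color(0x000000), page_color(0x080000))  # same color
print(page_color(0x000000), page_color(0x001000))  # different colors
```

With these assumed parameters there are 128 colors, so any two hot pages whose physical page numbers differ by a multiple of 128 would keep evicting each other.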
Did you mean 768 kilobit? I'd be amazed if you could somehow haul 768 megabits over DSL, since that's like OC48.
I've heard (though not benchmarked) that a ramdisk will swap out, allowing other system processes (or cache) to utilize memory, and that access through a swapfile is more efficient than access through a cache. I'd be interested in hearing of|seeing benchmark results or comparisons of ramdisk performance vs. cache hits on otherwise identically configured boxes -- same memory, same OS, same load. The discussion was general (other Unices, NT, etc.), not specific to Linux, so YMMV applies.
Any takers?
What part of "gestalt" don't you understand?
...(a server that I wrote) should be nice for situations where resources are limited but load is high. There is a webcam module for it, however I still need to add support for cameras other than the parallel Quickcam -- the braindead design of their parallel interface (interrupt pin reused for data) wastes too much processor time on polling.
Contrary to the popular belief, there indeed is no God.
sure it can - serving static html pages. *but*, it sounds like he wants to serve streaming video and will need to run some serious codecs to compress it. newer pentium methinks.
If the load gets even higher, split your server. Get your images from another machine, possibly with a special web server optimized for static data ("phttpd"). If you are using an SQL backend, put the database on a dedicated machine.
And always remember: In servers, memory is more important than I/O. I/O is more important than CPU.
--
I think Slashdot actually ran on a Multia for awhile.
Share data. Share code. Share ideas. Share the wealth.
http://stockfilter.org
This is untrue, unless you are using CGI scripts for dynamic content. Webservers that fork for each connection went out of mainstream use about 5 years ago. Apache uses a pre-forking model, meaning it has a "herd" of separate processes each of which handles one connection at a time. Multi-threaded servers are more efficient, but the pre-forking model is optimum for stability.
While there are other webservers known faster for static content (Zeus comes to mind), I don't know where Apache stands for dynamic content. It probably depends what language you are using: mod_php4 is supposed to be very fast, and of course if you write a custom module in C you can make it as fast as it needs to be.
It should be noted that Zope can be used with several servers; Bruce Perens in fact uses it with Apache. It is not considered to be especially quick, it is the rich functionality and flexibility that people choose it for.
Philip Greenspun is rapturous in his praise of AOLServer; but then, he thinks we should be using dynamic content for everything, and I've never heard of anyone else who actually uses it (apart from AOL, obviously). Bear in mind that AOLServer is as tied to TCL as Zope is to Python.
fish and pipes
Everybody is telling you (correctly) that bandwidth matters a lot.
Many people are telling you that dynamic content matters a lot too. This is less valid for your application. Slashdot has lots of dynamic content. Everybody can customize his settings, and needs a different page.
If you have a piece of artwork, which is photographed by a webcam, you could go get a snapshot for every "client". Don't do this. It will bog down tremendously. Just have a program make a picture every 10 - 60 seconds. Then you have almost completely static content, and you can serve LOTS of pages using a fairly conservative setup.
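The "take a picture on a timer" idea might look roughly like the sketch below. It assumes the camera is wrapped in some callable that returns the encoded image bytes (`grab_frame` here is hypothetical, as are the function names and the 30-second default), and it writes to a temporary file first so the web server never hands out a half-written image.

```python
import os
import tempfile
import time


def publish_snapshot(image_bytes, dest_path):
    """Write the image to a temp file, then atomically swap it into place."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(image_bytes)
    os.chmod(tmp_path, 0o644)        # mkstemp creates the file owner-only
    os.replace(tmp_path, dest_path)  # rename within one directory is atomic


def run(grab_frame, dest_path, interval_sec=30, max_frames=None):
    """Refresh dest_path on a fixed schedule, independent of visitor count."""
    count = 0
    while max_frames is None or count < max_frames:
        publish_snapshot(grab_frame(), dest_path)
        count += 1
        if max_frames is None or count < max_frames:
            time.sleep(interval_sec)
```

Every visitor then fetches the same static file, so the camera does the same amount of work whether ten people or ten thousand are watching.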
Roger.
sure, but my point is that the OS will generally do a better job of deciding which data to keep in memory. if you are doing lots of CGI and exec'ing some interpreter often, do you really want to drive the interpreter's pages out of main memory so that some large gif file that hardly anyone visits anyway gets to stay there?
Nice explanation though.
memory is quite a big thing on webservers; I'd say put 128MB at least. go for 256MB if you need lots of scripting, and avoid CGI (use mod_perl or mod_php instead).
Yeah, and I got moderated down too. Serves me right for being interested in more than the technical aspects of the sculpture.
This is just like television, only you can see much further.
I did read his question and still think that he did not give enough data to give more than an educated guess on what kind of hardware he needs. Also, on the topic of slashdot, if you are a registered user logged in or using cookies, the server creates a page on the fly just for you. "My" slashdot says "this page was created for hanno" on the top. So it must be truly dynamic. There may be optimization for the majority of non-registered readers where personalization is not important.
------------------
You may like my a cappella music
An old 386 on an AOL modem dialup should be sufficient. Make sure your copy of Windows NT has the latest service pack installed. Write your dynamic content in a lightweight language like Visual Basic or GW-BASIC.
get a pair.com account, put your web pages on it,
and just update your webcam image when it needs
updating.
of course, then you get to pay for all that band-
width.. not a happy day when THAT bill comes in.
also, if most of your content is static, use squid
or some other transparent caching proxy.
i browse at -1 because they're funnier than you are.
--
What if your entire disk takes up 50MB of space, and your OS uses only 50MB of RAM? Why not set up a 50MB RAM Disk, and still have 28MB of RAM to spare? That way everything is always in RAM, and you never have to worry about it? Heck, throw in another old, cheap 32MB DIMM, and you'll go up to a spare 60MB of RAM!
I think I could fit a minimal linux-based Web server running in 50MB... That is, if all you want is a web server, and don't care about anything else. Which is what it sounds like is required.
Another non-functioning site was "uncertainty.microsoft.com."
The purpose of that site was not known.
Don't waste your breath. This person obviously is very unsophisticated and doesn't have any appreciation for anything that provokes thought. But for the record, VanGogh sold exactly one painting during his lifetime, and it was to his brother who felt sorry for him.
I think slashdot builds each page via a series of perl scripts, some of which execute via a daemon (cron job, whatever) such as the comments (which would also explain the delay on posting/viewing comments), others executed at request time such as the "this page was...". So slashdot would be neither static nor dynamic per se... more of a quasi-dynamic site. I don't think anything could withstand processing all those comments for all of slashdot's traffic every request... it'd be just brutal (and inhumane :)
I based my recommendations on how much traffic I thought a piece of art might attract, which is probably quite a bit at first since it's a new idea, but not a lot of repeat visits, I'd guess. I mean, you don't read your horoscope in da vinci's armpit.
Either way, I'd like the piece in my living room.
Not a bad idea! ;-) A new internet startup called Akamai does something very similar. The company has ~900 servers around the world, caching their clients' static web content. For example, jcrew.com generates their HTML with image URLs pointing to (say) http://a1240.g.akamaitech.net/7/1240/969/ffe0a8c1322031/jcrew.com/images/sep99/e2home/clearance.jpg. This shifts the load from jcrew.com's server to the Akamai servers nearest the user.
I wonder if someone could write a freebie script that does the same thing using free web space like GeoCities or Xoom. The script could automatically create accounts on those sites and shed the load for serving static content to these free servers. >:-) I bet GeoCities would soon figure out a way to block this behaviour, though.. but it still might be fun.
cpeterso
Just about any box with 128+MB of ram and just about any unix w/apache and a sane config can survive the /. effect. The real limitation is bandwidth. a T1 is not sufficient. I'm assuming you will have lots of graphics (an art site), and you would pass 3G in less than 12 hours. Also, if bandwidth is limiting transfers then each transfer takes longer and you run up to max httpd's sooner. In fact a 486 w/32MB ram and an ide drive would probably do fine on an OC3, but on a T1 you can forget it.
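Both halves of that argument can be put into numbers. The sketch below assumes "3G" means 3 x 10^9 bytes, uses nominal line rates, and applies Little's law (average requests in flight = arrival rate x time per request) to show why slow transfers tie up httpd processes. The request rate and durations in the example are made-up illustration values.

```python
T1_BPS = 1_544_000     # nominal T1 rate, bits per second
OC3_BPS = 155_520_000  # nominal OC-3 rate, bits per second


def hours_to_transfer(total_bytes, link_bits_per_sec):
    """Hours needed to push total_bytes through a fully saturated link."""
    return total_bytes * 8 / link_bits_per_sec / 3600


def busy_httpds(requests_per_sec, seconds_per_request):
    """Little's law: average number of server processes tied up at once."""
    return requests_per_sec * seconds_per_request


print(round(hours_to_transfer(3e9, T1_BPS), 1))   # saturated T1
print(round(hours_to_transfer(3e9, OC3_BPS), 2))  # saturated OC3
print(busy_httpds(5, 0.5), busy_httpds(5, 20))    # fast vs. starved transfers
```

At the same request rate, stretching each transfer from half a second to twenty seconds multiplies the number of occupied httpd processes by forty, which is how a thin pipe ends up exhausting the process limit.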
Do you have any URLs for these kinds of dual Processor MBs?
-- memoid
Hi... Link me to the best space to get one of these apparently fly boards, SVP...
-- memoid
But it's always fun to get to look at other people's colocated equipment. Usually there's a lot of nifty stuff there that most of us would never even think of getting a chance to see... :)
-Chris
who cares, buy an extra 4-5 gigs of transfers for pennies on the dollar compared against expensing a machine/t1/bandwidth/telco charges..
I'm a bit out-of-date on this, but using a server accelerator like Squid can help in a big way. Since Apache processes can be a bit big with things like mod_perl and mod_ssl running, it's best to delegate the long-distance network transfer phase of the hit to another program.
In other words, you only use apache for local hits taken by a local proxy. The proxy then transfers the data to the client. This way, the amount of time apache spends working on a hit is lowered, giving more clients/sec to apache.
Anyone else know more about this?
Well.. if there are things being done in the background, cgi's and the what, you guarantee that your data stays in memory.
-
ping -f 255.255.255.255 # if only
If you are doing some sorta server-side dhtml (someone reply and tell me what mod_perl, cgi and php, thunderstone and ColdFusion all classify as), double the memory, make it a PII 266 (whatever the slowest models are). If there are a lot of these programs, double the ram, increase the disk cache.
Now, for tcpip stuff, set your MRU/MTU high. Since you are handling large chunks of data, using less than the MAX won't pay. Think of it. Would you rather send 10k 2 byte packets with extra overhead per packet, or 2 5k packets?
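The per-packet overhead point can be sketched numerically. This assumes 40 bytes of headers per packet (a 20-byte IPv4 header plus a 20-byte TCP header, no options) and ignores link-layer framing, ACKs and retransmits; `wire_cost` is an illustrative name.

```python
import math

HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def wire_cost(payload_bytes, mtu):
    """Return (packet_count, total_bytes_on_wire) for one transfer."""
    per_packet = mtu - HEADER_BYTES  # payload that fits in each packet
    packets = math.ceil(payload_bytes / per_packet)
    return packets, payload_bytes + packets * HEADER_BYTES


# 10,000 bytes of content at three different MTUs.
for mtu in (42, 576, 1500):
    print(mtu, wire_cost(10_000, mtu))
```

The gain from 576 to 1500 is a few percent of wire bytes, while the 2-bytes-per-packet extreme multiplies traffic roughly twentyfold, so the effect is real but flattens out quickly at normal MTU sizes.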
-
ping -f 255.255.255.255 # if only
Practice this long enough, you don't need to actually think about it.
Off topic, but its something people should do...
-
ping -f 255.255.255.255 # if only
I suspect he meant that a RAM disk was irrelevant, not that it couldn't be spared. Since the site is only ~5MB, it would all be able to be cached in RAM by default.
whats the url?
-- your knees hurt, don't they?
how is this done?
-- your knees hurt, don't they?
and grammar ;)
-- your knees hurt, don't they?
On the subject of Linux servers, I have had no end of trouble trying to set up a networked box.
I have an AMD K62-300 running through dual-ISDN (the router uses dhcp to assign addresses, set to permanent, of course!). and a Redhat 6 CD.
What I want to do is set up Your Average Server (i.e. smtp, pop3, ftp, http).
I read through my copies of nag and sag, and read almost all the HOWTOs that pertain to networking.
I gave up on sendmail after a week. Was that a configuration file or line noise? There was another alternative that was suggested in the HOWTOs, but it would never run. The best I got from the mail subsystem was sending mail out (sometimes), but any mail sent in was happily received by whatever program and subsequently disappeared into the ether. ftpd doesn't seem to allow permissions based on user (i.e. allow upload but not delete for user X).
httpd (apache) appears to work out of the box, but ALL network access to the box is flaky (sometimes it is lightning fast, sometimes it is pathetically slow, getting to 300 cps, even from an adjacent machine! - I am using a recent PCI Realtek 10baseT card)
The configuration options for most Linux programs appear to be arbitrary, almost deliberately cryptic, and stored in the most unlikely of places.
Isn't there some way to configure a linux box that doesn't take more than a day? a week? (I gave up after 2).
Heh. I sleep next to a server bank. At first the fans and the hard drive activity made it hard to sleep (considering none of the machines are in cases), but now I find it oddly comforting.
-- Virtual Windows Project
Will 3gb a month do the trick when you're planning to have your webcam-images slashdotted? I doubt it.
0x or or snor perron?!
control the instances of your web server to re something reasonable
That doesn't well solve the problem. Sure, your box still gives you a command prompt, but web surfers out there are still not able to get to your site because there are no available httpds.
I believe the issue we're dealing with is how to allow a heck of a lot of people come in and use the site, not simply get stopped at the door.
- Scott
------
Scott Stevenson
Tree House Ideas
A problem I've run into more than once (on RedHat) is that you can set Apache's MaxClients directive as high as you like, but you're eventually going to hit a kernel limit, regardless of your hardware.
In /usr/src/linux/include/linux/tasks.h, you'll find this:
#define NR_TASKS 512 /* On x86 Max 4092, or 4090 w/APM configured. */
For the life of me, I cannot figure out why this is set at 512. Recompiling the kernel can be a really aggravating experience for those who come from a background of not having to recompile kernels. So this is just another thing that makes Linux unnecessarily difficult. What would be ideal is that the installer prompts you for this number, and creates a kernel based on your requirements.
- Scott
------
Scott Stevenson
Tree House Ideas
Most of the comments below are very informative and accurate, but if all you have is a 2B ISDN or fractional T-1 line, your connection to the net-at-large will choke long before the box itself is no longer capable of serving out pages.
This can be useful. If you don't want your machine dying from overload, purposely putting a bandwidth throttle on it protects the machine, while denying some people access when things are busy. This is bad for an e-commerce site, but for your purposes, that may not be an issue.
If you are trying to be up all the time, regardless of load, you will have to have a pipe to the internet that can match what your server can put out. If you've only got an ISDN line, forget about a dual Alpha setup: you'll never get close to slashdotting the box.
Ceci n'est pas une sig.
I don't know what kind of idiot OS would cache a ramdisk, unless a total idiot coder wrote the caching software.
#6495ED - cornflower blue
i believe linux.com has an article about serving up webpages and heavily used filesystems using RAMDISKs..you might want to look at that. Also Abit BP6 does dual celerons for $300 or so (motherboard + 2 socket 370 celerons)..should handle anything you can throw at it.
Something else to be considered is whether the WebCam has linux support. Last time I looked (which, admittedly was a while ago), only the QuickCam had decent linux drivers and I'm not too impressed with the quality it delivers. The best low-cost cam I've seen is the ViCam that's also OEM'd by 3Com. Unfortunately, there's no linux support.
I have seen references to an open standard set of drivers, but I've long since lost the link and don't know the progress... Anyone?
How do you know the piece isn't being funded by the Guggenheim or something? It may or may not be something huge - sorta insulting (to the poster as well as artists in general) to assume that just because something is "art" it will be unpopular. I'd love to do tech work for an artist and make a contribution towards creativity someday.....instead of working for corporate america where I just make a contribution to the paychecks of clueless marketers and managers....
Apparently David Filo has been asked this question quite a few times, and a few people asked him at ApacheCon last year. He says he went with FreeBSD because it was free and stable, and at the time Linux was not (stable). At this point in time I haven't seen Linux stand up to the serious loads I've put it up against. I've talked with people at Hotmail who tried it as a lark, they said the Linux boxes failed within 2 minutes of being up.
I tried this at MSN-Linkexchange as well and got similar results. Linux just didn't seem to stand up to the pounding of big site webtraffic.
On a side note, www.pootpoot.com has been /.'ed a few times and has not fallen to any ill effects. It only runs into issues of it not being on a very high bandwidth link (ADSL).
I find it hard to sleep without the reassuring hum of a computer in the room. Perhaps I should record said noise and loop it on tape if I'm away :)
(Scary, huh?)
Rob Malda introduced the metamoderation...when you see injustices like this (and it obviously is) go over there and beat up some moderators.
:)
I'd rather do almost anything than set up a webserver (at least for my own use) on an MS box. That said, you're quite correct.
--
There is no premature anti-fascism. -Ernest Hemingway
What makes you think that I haven't used Apache for win32? In fact, I have. I've also used Notes/Domino (since Domino 1.0) and IIS. I have a MCSD and a Principal CLP on my wall to boot.
None of these would be my preference for a webserver because of the weaknesses of the underlying operating system. It's easy to crash Domino (less easy to crash IIS or Apache, of course) but the bad thing is that many application crashes will leave the system in an unstable state. I've never had that happen with Linux or BSD.
It's easy to whine about people 'mindlessly' bashing MS, but there is such a thing as educated criticism. (Kindly get used to it).
--
There is no premature anti-fascism. -Ernest Hemingway
Linux supports software RAID. You don't have to go to the extent of buying a RAID controller and a huge chunk of disk space to go with it.
--
There is no premature anti-fascism. -Ernest Hemingway
If you do not have the need for any CGI, or your CGI needs are minimal, you may not even want to use your own machine. You may be best off just getting a web access account -- you know, the kind of thing you get with many dial-up accounts, though with better service and the capability for more bandwidth.
Read the post again. He wants to use the machine inside the sculpture as the web server. It'd defeat the purpose to host it somewhere else.
--
Win dain a lotica, en vai tu ri silota
If it were to do this it could stop serving as many pages to Slashdot folks while leaving the site up for other people.
How would you know which people were /. people? We wouldn't even necessarily all link from /.
Maybe any browser running Netscape Linux or Mozilla gets denied? That won't win you many friends around here.
Besides, what's the good of denying /.ers, but letting all the Yahoo!ers in? It's not like they're any sort of better viewer than us. Maybe even the opposite.
Snicker snicker pompous snicker
As we all know, webservers (for static pages) under linux don't need much CPU power, so pentium processors are widely used. To make the server fast, nevertheless, linux needs quite a lot of RAM (for the filesystem caching).
The problem I see is, pentium boards do not support (RAM) caching for more than 64MB. What do you nerds think, might a webserver based on linux running on a pentium be faster with, say, 128MB of RAM, than with the maximum that can be cached? I mean, is there a performance gain when doubling the amount of RAM at the cost of the RAM caching??
-phaethon.
If you want it cheaper go to Hostsave.com. You get 10 meg of space and ?? of bandwidth per month.
I've finally found the off by one erro
For automatic load balancing, distributed web applications, etc etc etc, you might want to check out EddieWare, an open-source project:
http://www.eddieware.org/
From their web page:
http://www.ericsson.com/) sponsored Open Source effort
being undertaken in partnership with The Royal Melbourne Institute of
Technology (http://www.rmit.edu.au/) aimed at delivering a commercial grade,
quality of service driven web server solution. Core development is being
performed by The Ericsson / RMIT Software Engineering Research Centre
(http://www.serc.rmit.edu.au/) and the Ericsson Advanced Services
Application Centre.
Eddie is a 100% software solution written primarily in the functional
programming language Erlang (www.erlang.org) and is available for Solaris,
Linux and FreeBSD, with Windows NT to come soon.
Eddie provides advanced automatic traffic management and configuration of
geographically distributed server sites, consisting of one or more Local Area
Networks.
Can your IM do this?
Here's a little rule of thumb that applies to the VX/TX/HX moboes, and might apply to the PII/Celeron/K6-3, etc.
The amount of ram cacheable (in MB) is equal to one fourth of the available L2 cache (in KB). IE: I have 128mb of ram, which is cached perfectly on my 512k of L2 cache (512 / 4 = 128).
:-)
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
You should definitely consider making it a *BSD
box. A lot of large servers are BSD, and from
what I've heard, it can handle load a little
better.
While I agree that the BP-6 is a wonderful
board (I am using it as well with non-overclocked
C-400's), I think on a production web site
you should play it safe and not overclock your
processors. You want it to be stable.
Your nice processors could burn up, even with a good heatsink and fan, and then you'd be out of a website.
Besides the difference in price between Celeron 333's and 466's is not a whole lot. And, you're
not guaranteed to be able to overclock. Some chips won't do it. And, a kernel compile with a dual C-400 is not much slower than with a C-500.
I'm compiling X right now, or I'd time it.
One good way of avoiding the /. effect is to round robin 2 web servers. A simple addition to the zone file is all that is necessary.
We host several purely dynamic sites, often with loads of 200,000+ hits daily. We had this running with a P133 64M SCSI
and a 486/66 128M SCSI and completely saturated a T1.
The P133 handled Postgres, DNS, NFS, HTTP, SMTP, DHCP for 900+ users... and the 486 handled strictly HTTP. We also pull live
remote images from an ftp server every 10 seconds.
This is also a good idea because if one falls over you still have the other. If money's no object, offload DB services and NFS to a 3rd machine.
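As a rough illustration of the round-robin idea above: the zone-file change amounts to listing more than one address for the same host name, and lookups are then answered in rotating order. The Python sketch below only simulates that rotation. The two addresses are documentation-range placeholders, not the poster's machines, and real-world DNS caching makes the split only approximately even.

```python
from itertools import cycle

# Placeholder addresses from the documentation range (192.0.2.0/24).
SERVERS = ["192.0.2.10", "192.0.2.11"]


def distribute(addresses, n_requests):
    """Hand out addresses in strict rotation and count how many
    requests each one would receive."""
    if not addresses:
        raise ValueError("need at least one address")
    counts = {address: 0 for address in addresses}
    picker = cycle(addresses)
    for _ in range(n_requests):
        counts[next(picker)] += 1
    return counts


if __name__ == "__main__":
    print(distribute(SERVERS, 1000))
```

With two servers each one sees about half the hits, which is the whole trick: no extra software, just a second address for the same name.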
All hail the power of the penguin!!!
One mans opinion...
Peace of mind isn't at all superficial to technical work, it's the whole thing.
cihost.com
They offer dual PII 450 machines with multiple OC-3 connections and 12GB bandwidth. There are also lots of features including SSH Telnet access, compiling C/C++, Perl, PHP, etc. About $20 PCM.
(no, I don't work for them)
reading some of the previous posts, people suggesting NT to run a webserver over linux? hah. well maybe linux isnt the _best_ OS to run on a webserver but its a hell of a lot better than NT. a NT box would require a hell of a lot of ram and probably at least a pII for a processor, that extra cache on the pII makes em nice. in my humble opinion linux should run a webserver fine on a celeron with 128 megs of ram or so. good idea to make a ramdisk for the webcam too. and might as well throw some swap space on for fun. and for god sakes use SuSE :P
tyler
amen... very important to check ping times
If you read the previous post http://slashdot.org/article.pl?sid=99/09/11/1418247&threshold=6 the article says
..he would still use FreeBSD if he could do it over again..
You WILL need more bandwidth than i think you are thinking of =[
But Hardware, I would just go Dual celery 366 with 128 pc100.. You can get each chip for around 65 us, then the stick of 128 will be around 125 (cheeze memory prices have gone up!!)
Motherboard : Abit BP6 (got one, love it!) goes for 130 us.
FreeBSD is definitely worth looking at, but don't make the assumption that it's more efficient than Linux just because Yahoo is using it.
I'd say Yahoo is using FreeBSD because they felt it was the best option to them at the time they put Yahoo together.
I'd also say they're running it on 1000 servers, and have a lot of engineers who know FreeBSD inside and out. Why switch now unless there's a really good reason?
If they were putting Yahoo together today, they'd probably seriously consider Linux.
-- CP
Ramdisk, to my knowledge, is just another filesystem. I would think it's strange if they would not cache it.
On Win32, Apache does not fork, the process is multithreaded.
So... if a Slashdot article referred to a second Slashdot article, and this second article referred back to the first.... we would have a feedback that would destroy mankind as we know it.
Bye!
...but you are also a confrontational troll.
= -=-=-=-=-=-=-=-
Have you even bothered to READ the terms of service at any of the free webspace provider sites? Simply put, if it's not explicitly forbidden by the terms of service then it could NOT possibly be called "theft of service".
If you cannot understand that simple concept, it is because you are either:
a) a strict-constructionalist who believes that anything not explicitly permitted is forbidden.
b) an idiot who cares less about the facts than the chance to throw a hissy-fit.
c) a moron.
As I said before, if any web provider feels "used" for having this kind of one-time usage they are free to modify their terms of service to ban or restrict it.
How can I explain this concept in a way that even a troll like you can understand...?
Let's say that someone on the street is standing under a sign that says "Free T-Shirts". You go over there and discover that all the T-shirts have advertisements written on them. Well now...if I want to take two or three T-shirts is that theft? I think not. Now if the guy changes his sign to read "Free T-Shirts - One per person" then if I took two or three I could believe that could be considered "wrong".
If you have some tangible (read: cut and paste) proof that my suggestion is breaking a written rule by all means post it. Otherwise, stop being such a troll and take this argument to e-mail. I'm not hiding...I have the guts at least to go get a disposable Hotmail account...
Which by the way...I got the Hotmail account so I could deliberately pass it out in case I receive a high volume response (spam). In a couple months I plan to throw it away and get a new one...unless Hotmail decides I am using too much mail space and they yank the account for me. Sound familiar? Me using free webspace to offset the slashdot effect is no different than using a free mail account to offset the spam effect.
Nuff said...end the thread.
- JoeShmoe
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
What an ego. Who would want to work for such an anal-retentive, anti-social software dictator such as yourself?
= -=-=-=-=-=-=-=-
Anyhoo...Mr Coward has reached the level of throwing personal insults like so much rotten fruit. It's clear he obviously hasn't been able to come up with any evidence to prove my original suggestion was in any way wrong or "theft of service". Thank you, we can now go on with our lives...
- JoeShmoe
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
Even a slow computer can handle enough load to saturate a T-1.
A celeron + 128 megs of ram should do it. You can create a ramdisk to hold the entire website + webcam images (say 28 megs ramdisk) and a 10/100 ether should handle anything anyone can throw at it. switch off most of the logging to the bare minimum in apache and strip linux down to its bare minimum..redhat should do ok.
Also turn off reverse DNS lookups if you haven't already... it might only take 1 second for Apache to look up a domain name but in that 1 second, 5000+ hits will come in. Holding open 5000+ connections will kill your server's performance.
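The point about lookups holding connections open is just arrival rate multiplied by time per request. A small Python sketch, using the post's own figures of 5000 hits per second and a 1 second lookup; the 0.01 second figure for a request without a lookup is a made-up comparison value, not a benchmark.

```python
def connections_held_open(hits_per_second: float, seconds_per_request: float) -> float:
    """Average number of connections in flight at once:
    arrival rate times the time each request stays open."""
    if hits_per_second < 0 or seconds_per_request < 0:
        raise ValueError("inputs must be non-negative")
    return hits_per_second * seconds_per_request


if __name__ == "__main__":
    # With a 1 second reverse DNS lookup on every hit (figures from the post).
    print(connections_held_open(5000, 1.0))
    # Hypothetical: lookups disabled, each request done in 10 ms.
    print(connections_held_open(5000, 0.01))
```

The per-request delay matters far more than raw CPU here: shaving the lookup off each hit shrinks the pile of open connections by the same factor.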
From the sounds of it slashdot now employs a layer 4 switch to balance the load amongst the three web servers. Before it was simply one big box running the database and the web server.. obviously not a good scenario and it does NOT scale well at all. It'd be better and more affordable to have 30 small boxes running web servers as front ends than 1 huge million dollar supercomputer. ;-)
Alpha? Athlon? Fast scsi? Geez, what sort of dynamic content do you think he's serving here? It's not like he's having a cgi run the Gimp and generate a customized logo for each person (I've actually considered that, on a low-hit site).
He's having
a) 5 megs of what will be cached, static pages
b) static webcam images, undoubtedly well under 1 meg.
That's *it*
Period.
If he spends more than 150$ on this, he's wasted money.
His total bandwidth is 100 Mbit/s. That means he can unload ~12 megs per second. Are you trying to tell me that a low end pentium can't shove 12 megs *in ram* out per second? Come on, a low end pentium could do half that from a fast scsi disk, let alone from ram. Your main concern is saving bandwidth - shut off reverse DNS, for one. As for the system, turn off unnecessary logging, reduce the max number of forks, etc. to minimize disk accesses and limit how much memory apache will eat up. CPU isn't a problem on static content in this day and age, even for slashdotted pages.
- Rei
Just as a sidenote, my 486-class machine with 16MB of RAM survived the slashdot effect two times without even breaking a sweat. At the time, it was running FreeBSD 2.2.8.
Secondly, consider the BSDs. (Moderators: this is not flamebait. I have a valid point). Their TCP/IP implementation is a good deal faster than the Linux equivalent, and while the Linux stack is maxing out on concurrent TCP/IP connections (which is a possibility, especially with lots of images, etc. on your site) BSD will keep on chugging. I'm not sure how much of an issue this will be here, though. I think for the most part, unless you're Yahoo.com, you should be okay with Linux. But hey, you're the judge.
Finally, be sure to think outside the box when it comes to HTTP servers. There are other servers besides Apache, believe it or not. And in your case, there are ones that are a lot more optimized than Apache for serving up static content (I think it's static, save the webcam. You didn't really say). thttpd and Zeus (it's not free, shoot me) come to mind.
I think there is a world market for maybe five personal web logs.
I believe this is an incorrect statement under Linux. If I recall correctly, Linux' ramdisk driver just allocates pages for the ramdisk's filesystem in the disk-cache and just asks that they never be purged. That way, there aren't two copies of the same data in RAM, and you don't have to copy data back and forth between the ramdisk and the cache.
I imagine that the reason you saw better performance with those bdflush parameters is that more of the OS got cached into memory, fewer things got edged to swap, and logs that weren't on the ramdisk before (eg. everything in /var/adm) are benefitting from the huge flush period. I personally would not recommend going 10 minutes between flushes though. What happens, for instance, when a HD starts getting flaky?
--Joe--
Program Intellivision!
I have noticed that a number of people have mentioned Zeus and thttpd as alternatives to Apache. I know a few people who say they have also had success with the Roxen Challenger server. Does anyone know how Roxen performs in comparison to the other three? From what I've heard it is excellent for static pages, which sounds like what is on the art site described here.
Better yet, make sure the sculpture is /really dumb/, and then you can run it on a dialup line without problems!
I'm sitting here trying to think of a sculpture medium that wouldn't be particularly harsh towards functioning electronics that you wish to stay functioning throughout the process.
You certainly can't use a torch for anything, and an arc welder would fry every component in the computer in about 1/13th of a second. With paper mache you'd have to be very careful you didn't drip any goo inside the vents. Clay would similarly be bad juju.
Epoxy based putties would work alright, as long as you got it right the first time. Flying debris from grinding or chiseling it down could be very bad for the machine, not to mention the vibration problems.
For the same reasons, if you wanted to use wood you'd have to cut it well away from the computer and finish it before attaching it.
As an artist I'm intrigued. Conceptual stuff isn't my bag, so I'd never consider putting a live computer inside something I was working on (more likely a dead one, or I'd save the live system as the last piece of the puzzle), but I'm finally getting around to having a studio to do stuff in, and I wonder how well a computer fits into the game.
You know, I mean, stuff tends to get pretty messy. And live computers aren't the sort of thing that mix well with fits of inspiration that involve picking something up and slamming it back down in a different position. Even excepting a spinning harddrive and fans, cables and cards tend to get knocked loose, sparks fly, etc.
If the computer survives the sculpture, that's art in itself. Go for it. Lemme know how it works out.
This is just like television, only you can see much further.
Ignore most of the hardware recommendations from above, since THERE IS NO ANSWER TO YOUR QUESTION. It *depends* on what you are about to do. But you do not say what you are about to do...
Your question is *way* too generic. "I want a car. What should I buy?"
In your case, it depends on what kind of pages you are about to serve.
You haven't mentioned if your pages are static or dynamic. Dynamic means that they are created "on the fly", e.g. using content from a database. Slashdot itself is a dynamic site. And even then, there are lots of differences - some database engines require more hardware than others, some technologies for dynamic pages require more processing power per page hit than others etc. etc.
Reading your question, it seems that your site is made of static pages only. In that case, you do not need very much processing power and an older CPU will do.
With webcams, that again is something that completely depends on what kind of camera hardware you are about to use. Some of them require a lot of help by your web server, but most of them don't. You don't say what kind of camera you have, so again, no definite answer is possible.
Once you actually know what you will do, feel free to mail me. THEN I could try to help you...
------------------
You may like my a cappella music
That's just about the same config that the folks at thinkgeek had when /. took them down last week.
Of course, the original question is just stated wrong. It's not the hardware config that matters so much, it's the software config, and especially how you serve up dynamic content. And most places that get /.ed don't crash, their bandwidth gets overrun. I'd imagine a webcam site would get clogged pretty quickly.
I've been considering setting up a dedicated box to do work serving pages generated by PHP3 (off of a mysql database). Though it's going to be for local businesses, I'm wondering what the best option would be to hook it up to the net. I'll need to host various pages off of it, including several on different domain names.
Here is a way I came up with to avoid the /. effect. I haven't implemented it, but it is based on an HTTP feature that used to be a security hole that people complained about.
If the server is reaching saturation, it should enable a log keeping track of WHERE the traffic is coming from (as in what site it's being directed from). If it were to do this it could stop serving as many pages to Slashdot folks while leaving the site up for other people. It could also display an error message such as "This page has been swamped. Please try again later."
What do you think?
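For the sake of discussion, the idea above could be sketched roughly like this in Python. The HTTP feature in question is presumably the Referer header; the function names, the 503 status choice, and the throttled-host list are all assumptions for illustration, not an existing server feature. Note that the header is optional, so anyone arriving without it (typed URL, bookmark, or a browser that strips it) passes straight through.

```python
from urllib.parse import urlparse

BUSY_MESSAGE = "This page has been swamped. Please try again later."


def referring_host(referer_header):
    """Extract the lowercase host name from a Referer header value,
    or return an empty string if it is missing or unparseable."""
    if not referer_header:
        return ""
    return (urlparse(referer_header).hostname or "").lower()


def should_serve(referer_header, saturated, throttled_hosts):
    """Serve everyone while there is capacity; once saturated, turn away
    requests that were referred by one of the throttled hosts."""
    if not saturated:
        return True
    host = referring_host(referer_header)
    return not any(host == h or host.endswith("." + h) for h in throttled_hosts)


def respond(referer_header, saturated, throttled_hosts=("slashdot.org",)):
    """Return a (status, body) pair for the request."""
    if should_serve(referer_header, saturated, throttled_hosts):
        return 200, "OK"
    return 503, BUSY_MESSAGE


if __name__ == "__main__":
    print(respond("http://slashdot.org/article.pl", saturated=True))
    print(respond("http://www.yahoo.com/", saturated=True))
```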
www.atacomm.com - The Leader in VoIP Product Distributi
I understand the concept of, "Layer 4 switching," or, "server load-balancing," but how does one implement such a solution using current Linux software? Paging Mr. Malda...Mr. Malda to the comment bin please. -AP
You can run a decent web server with *very* little hardware. Assuming you'll be staying away from dynamically built pages and database access, you could probably get away with a pentium 120, 64MB RAM and (of course), Linux or your fave flavour of BSD. Don't even bother buying a monitor since that'll drive up costs... administer remotely via telnet (if you're not too paranoid) or ssh (if you are)
:)
If you decide to mingle databases and dynamic pages (either in perl or php), I'd pump up the ram to 128 and give it a little more processing juice for good measure. A well tuned apache can be made to not throw up when there's LOTS of requests (ie slashdot), but I'm guessing it'll probably end up puking if it has to produce a page each request.
There's a workaround though. You can write a set of perl scripts that make a static web site every 15 minutes (or whatever time) running as a daemon. That way, you escape building pages for every single request. (correct me if i'm wrong, I think that's how slashdot's built... rob???)
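The post suggests Perl scripts; purely to show the shape of the workaround, here is a minimal sketch in Python instead. The page layout, the function names, and the `fetch_items` callback are hypothetical. The one design point worth noting is writing to a temporary file and renaming it into place, so the web server never serves a half-written page.

```python
import html
import os
import tempfile
import time


def render_page(title, items):
    """Build a complete static HTML page from already-fetched data."""
    rows = "\n".join(f"<li>{html.escape(str(item))}</li>" for item in items)
    return (
        f"<html><head><title>{html.escape(title)}</title></head>"
        f"<body><ul>\n{rows}\n</ul></body></html>\n"
    )


def publish(path, markup):
    """Write the page next to its final location, then rename it into
    place so readers see either the old page or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(markup)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file owner-only
    os.replace(tmp_path, path)


def regenerate_forever(path, fetch_items, interval_seconds=15 * 60):
    """Daemon-style loop: rebuild the page every interval_seconds.
    fetch_items is whatever (hypothetical) function queries the database."""
    while True:
        publish(path, render_page("Latest", fetch_items()))
        time.sleep(interval_seconds)
```

The database is then hit once per interval rather than once per request, and the web server only ever hands out plain files.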
In *all* likelihood, though, I wouldn't worry about it. Let's face it, if you *do* get slashdotted (which is likely), it won't be everyday (which is certain). It *will* force you to configure your server well, though. In your place I'd just go static HTML, or dynamic pages with perl or php3.
Of course, then there's the web cam to take into consideration, but to be honest, I haven't set one up so I'll let that up to someone who has.
And that's my 2 cents
A lot of people have mentioned the fact that your webserver should be optimized, but no one has really said how.. A couple months ago, I put something together on exactly this subject, hopefully it can help people that want to tune a server in preparation for the /. effect.
http://evolt.org/index.cfm?menu=8&cid=193&catid=18.
.djc.
I think that your backend (if you use dynamic content) is just as important as what machine/webserver software.
If you are using some kind of database, think twice and test carefully. Mysql is often used, although methinks that the sql implementation in mysql is _bad_, subselects anyone?
A single badly written cgi-script can also bring down an otherwise good server. (Trust me!)
mod_perl can speed things up if you use perl a lot, but it also puts some special requirements on your perl scripts.
I am afraid that I don't agree with the previous author that the way the /. admins have configured their servers is the best method to provide fast reliable service.
The simple fundamental mistake that always seems to be made is that there has to be a central resource. The problem with a central resource is that it becomes a single point of failure.
Sure, /. could lose one of its web servers and not be affected... but what if it lost:
a) Its database
b) Its switch
c) Its router
d) Its link to the Internet
e) Its uplinked ISP has a BGP problem
etc etc etc...
The only way for a system as popular as slashdot to maintain the availability it deserves and requires is with a fully distributed system.
There are better ways.
noidd
You can have the best box in the world but if you're on a slow connection it doesn't mean much...
I did say heavily dynamic. Very complex pages with realtime generated graphs and tabled data, pulled from SQL databases. Include on top of this user accounting and tracking, a large degree of modularity within the code due to the complexity of the system etc, and 7 hits a second is, in my opinion anyway, quite impressive. A significant number of tricks were utilised to decrease the CPU hit including an httpd accelerator (squid in reverse), various partial pregeneration of pages where possible, and rewrites to cached copies of graphs and data where possible.
It was written in PHP, and I'm more than happy with the performance, in fact (I'm not the author of much of the code itself) I'm more often than not, very impressed with the speed with which it operates. I am looking forward to being able to move it over to Zend, this should increase our capacity even more, without any hardware additions.
You can't win a fight.
Linux uses a paged memory model, hence, RAM doesn't fragment. More precisely, RAM is massively fragmented, but the processor's paging unit hides this from applications. Thus, if you want to make sure that something is in RAM that would usually be on disk, a RAMdisk is a good idea.
Server sizing rules of thumb, culled from a Moshe Bar article in Byte, and immortalized in my palm pilot!
1 7200 RPM SCSI disk per 75 hits per sec. Make sure to have a big enough SCSI bus (Wide/Ultra, etc.) to handle the number of drives you use.
Linux 2.2 kernel much more efficient than the 2.0 kernel.
BSD and Solaris have more efficient memory paging algorithms. This is only an issue if you are serving more data than will comfortably fit in RAM, which you aren't.
As others have said, your connection is probably the weak link. Calculate your average page size and multiply by the number of page hits per second to figure out how much bandwidth you need.
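The two numeric rules above are easy to turn into a quick calculation. This Python sketch uses the 75 hits per second per disk figure quoted above and the standard T1 line rate of 1.544 Mbit/s; the 30 KB page size and the hit rate in the example are made-up inputs, and protocol overhead is ignored.

```python
import math

HITS_PER_DISK = 75              # rule of thumb quoted in the post
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def disks_needed(hits_per_second):
    """Number of 7200 RPM SCSI disks under the one-disk-per-75-hits rule."""
    return max(1, math.ceil(hits_per_second / HITS_PER_DISK))


def bandwidth_bits_per_second(avg_page_bytes, hits_per_second):
    """Average page size times hits per second, converted to bits."""
    return avg_page_bytes * 8 * hits_per_second


def fits_link(avg_page_bytes, hits_per_second, link_bits_per_second):
    """True if the required bandwidth fits within the given link."""
    return bandwidth_bits_per_second(avg_page_bytes, hits_per_second) <= link_bits_per_second


if __name__ == "__main__":
    page_bytes, hits = 30_000, 10  # hypothetical load
    print(disks_needed(hits))
    print(bandwidth_bits_per_second(page_bytes, hits))
    print(fits_link(page_bytes, hits, T1_BITS_PER_SECOND))
```

Even at a modest 10 hits per second with 30 KB pages, the link gives out long before a single disk does, which matches the "connection is the weak link" advice.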
-Loopy
It would seem, judging from various sources, that FreeBSD would be able to handle the load of a high profile site more efficiently than Linux could. I have no experience in the high profile/high demand server field, but from what I've read FreeBSD is made for this sort of thing.
Anyone here have any reasons why Linux would be better than FreeBSD in this situation? After all, yahoo (probably the world's most visited site) uses FreeBSD and seems to really like it.
...
Bitchslapped? Give Rob a bitchslap from bitchslapped.com.
Not to be too blunt, but that's overkill. Way, way (way way way) overkill.
An old pentium can serve sufficient static pages to saturate your bandwidth. For that matter, an old *macintosh* can serve sufficient pages to saturate your bandwidth.
The major thing will be all the side processing that you do to generate the pages and content. In this case, his webcam, probably dynamic generation of archive pages and the like (although a better idea would be to regenerate all the active pages once - your last archive page, the index of the archive page, and the new cam pic page - when the new cam pic comes up).
Especially for a high traffic site, doing it once and then serving from the filesystem will be much more important.
As for your analysis: you forgot the biggest server system speed-up. RAID. Multiple disks on multiple controllers. A single controller and a single disk like you suggested, no matter how fast, will always pale next to this relatively low-cost solution (and, for that matter, his data will be much safer, too).
--
There is no premature anti-fascism. -Ernest Hemingway
That's a well-written response and I don't have that much to add to it except that perhaps Apache isn't the best platform for generating dynamic content (even with mod_perl) in a high-load situation.
This isn't a knock against the apache group - they made a great webserver. But their emphasis has been on modularity and extensibility. The great drawback of apache is that it forks for each new connection - this can eat up a lot of RAM very quickly.
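To put a rough number on the RAM concern, here is a small Python sketch that estimates how many server processes fit in memory. Every figure in the example is an assumption: the 32 MB reserved for the OS and the per-process sizes (2 MB for a plain child, 8 MB for one with an embedded interpreter) are illustrative guesses, not measurements of Apache or mod_perl.

```python
def max_children(total_ram_mb, reserved_mb, per_child_mb):
    """How many server processes fit in RAM once the OS and other
    daemons have taken their (assumed) share."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    return max(int((total_ram_mb - reserved_mb) // per_child_mb), 0)


if __name__ == "__main__":
    # Hypothetical 128 MB box with 32 MB set aside for everything else.
    print(max_children(128, 32, 2))  # lean static-file children
    print(max_children(128, 32, 8))  # fat children with an embedded interpreter
```

Once the process count passes what fits, the box starts swapping, which is usually what "went down FAST" looks like from the outside.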
I would think that a non-forking webserver, such as AOLServer or Zope, would serve you better. Perhaps AOLServer more than Zope, as Zope has to interpret a lot of python on execution.
As endorsements go, Bruce Perens runs Zope for his site (although I'm not sure how much traffic it gets, but it's been mentioned on slashdot at least a half dozen times and should have taken the slashdotting to end all slashdotting by now). Philip Greenspun, the author of Database-Backed Web Sites and Philip and Alex's Guide to Web Publishing (not to mention the brain behind Ars Digita and hence scads of corporate sites), uses AOLServer.
--
There is no premature anti-fascism. -Ernest Hemingway
With all the MS bashing about and with hotmail running BSD, check out who's NOT running Linux.
www.linuxplanet.com is running Apache/1.3.3 (Unix) PHP/3.0.7 AuthMySQL/2.20 on Solaris
www.linuxgeneralstore.com is running Apache/1.3.6 (Unix) mod_frontpage/3.0.4.3 on DIGITAL UNIX
I do realize that these sites are in the minority, but it just goes to show you that linux is not the end all be all of OS's. OS's are like masturbation, everybody has their own way of doing things.
Actually, it depends completely on which free website you choose. As I said in my earlier post, some providers prevent deep-linking by dynamically moving content around (sharing the load among several different servers).
= -=-=-=-=-=-=-=-
Each link to an image is actually a link to another HTML file (I don't know how this works when the link says "/something.jpg"). It's very frustrating when you are trying to "Save Target As..." since you end up with an HTML file and inside is a link to a JPG that has already been moved. You have to basically load the HTML page and then right-click on the image itself to save it.
What's the problem with just linking to the HTML files? Well...that's where the frame+banner+ad trickery all happens. So yes...you could link to a bunch of images on free website providers as long as you don't mind the fact that clicking on one of these links would spawn a new window or frame with some ad content.
Now...I don't like GeoCities because it spawns a new window. I find it is much more "polite" if they tuck the ad content in another frame. Why? Because I know the code to "break" the frame and just give me a plain, unadulterated, page that looks identical to part of my site.
If you'd like to see a GREAT example of how you can build a COMPLETE site out of nothing but free website providers (with hardly an ad anywhere!) check out...
http://mangaheaven.cjb.net/
(Naughty Anime alert). If it wasn't for the host name, you'd never know this entire collection of pages was run completely on the good graces of providers like Xoom and Tripod. Notice how the main page on cjb.net instantly hands off traffic to ten other websites so that if one site goes down, it only takes a day or two for the site owner to mirror the collection to a new host.
- JoeShmoe
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
-- I wonder which will go down in history as the bigger failure: the War on Drugs or the War on Filesharing
...or "How to Cluster for Free"
= -=-=-=-=-=-=-=-
You might consider putting some of your content on any number of the free webpage providers like GeoCities or Xoom. It's not really classy enough for commercial sites, but they are great for defraying some traffic from your primary site. This may be just what a starving artist needs...?
Give basic information and/or samples on the free site and then if people are interested, they can click-through to your primary site. It's also a great way to tell people about mirrors (if Link A doesn't work, try Link B)
Generally speaking...unless you are reaching abuse levels with MB transferred...most webpage providers could even handle link attention of slashdot proportions.
Some providers like Tripod and Web1000 (porn banner alert) already spread your content over several servers to keep other people from "deep-linking" one particular file...handy if you want people to read your statement and not just download your images.
- JoeShmoe
The web daemon will be reading files from cache repeatedly, not building content on the fly, so a 233 may be overkill.
If you're broadcasting a "live" (1+ second refresh) show, you can improve efficiency a bit more by encoding the JPGs on another system (even a win95 box) and ftp-ing them to the web server automatically. This is how most adult streaming sites work.
j
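The encode-elsewhere-and-FTP arrangement described above could look roughly like this in Python. It is only a minimal sketch, not how any particular site does it: the host, login, and file names are hypothetical, and uploading under a temporary name before renaming is an addition of this sketch so the web server is less likely to hand out a half-written frame.

```python
from ftplib import FTP


def upload_frame(ftp, local_path, remote_name):
    """Upload one JPEG, then rename it into place.

    `ftp` is any object offering ftplib.FTP's storbinary() and rename()
    methods. The temporary-name step is an assumption of this sketch,
    not something the post describes.
    """
    tmp_name = remote_name + ".tmp"
    with open(local_path, "rb") as frame:
        ftp.storbinary("STOR " + tmp_name, frame)
    ftp.rename(tmp_name, remote_name)


def push_once(host, user, password, local_path, remote_name="webcam.jpg"):
    """Connect, upload a single frame, and disconnect."""
    with FTP(host) as ftp:  # hypothetical host and account
        ftp.login(user, password)
        upload_frame(ftp, local_path, remote_name)
```

Calling `push_once()` from a loop or a cron job on the capture box would give the "automatic" part; how often to call it depends on the refresh rate you want.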
If you're serving video or audio or images, though, you might need a faster net connection - do the math regarding bandwidth-per-user and how many you can support.
Thanks
Bruce
Bruce Perens.
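The bandwidth-per-user math suggested above can be sketched in a few lines of Python. The image size, refresh interval, and link speed are made-up numbers for illustration, and the calculation ignores TCP/HTTP overhead, so treat the result as an upper bound.

```python
def viewer_kbps(image_bytes, refresh_seconds):
    """Bandwidth one viewer uses pulling one image per refresh, in kbps."""
    return image_bytes * 8 / 1000 / refresh_seconds


def max_concurrent_viewers(link_kbps, per_viewer_kbps):
    """How many simultaneous viewers the link can feed at that rate."""
    return int(link_kbps // per_viewer_kbps)


# Hypothetical numbers: a 15 KB JPEG refreshed every 5 seconds,
# served over a T1 (1544 kbps).
rate = viewer_kbps(15_000, 5)               # 24.0 kbps per viewer
print(max_concurrent_viewers(1544, rate))   # 64
```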
i've been slashdotted. and starwars'd (i mirrored both trailers for TPM). i survived very well. how? a well tuned server. simple: control the instances of your web server to be something reasonable. apache is smart, the factory defaults won't let a slashdot effect take it down.
:)
for fun, turn any SYN flood program (ie portfuck) on port 80 and bang away. make a bazillion servers TRY and start up and see how your box responds. simple as that.
oh yeah, my stats: PII/266, 64 MB ram, OC-3 connection in. my max instances of httpd? around 50.
jose nazario
jose nazario jose@biocserver.cwru.edu
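One rough way to pick "something reasonable" for the number of httpd instances (the MaxClients setting in Apache) is to divide the RAM you can spare by what each child uses. This is a sketch under assumptions: the reserved memory and per-process size below are invented for illustration, so measure your own httpd children before trusting any number.

```python
def max_httpd_instances(total_ram_mb, reserved_mb, per_process_mb):
    """Largest number of httpd children that fit in RAM without swapping."""
    usable = total_ram_mb - reserved_mb
    if usable <= 0 or per_process_mb <= 0:
        return 0
    return int(usable // per_process_mb)


# Hypothetical: a 64 MB box, 14 MB kept for the kernel and other
# daemons, about 1 MB of unshared memory per httpd child.
print(max_httpd_instances(64, 14, 1))  # 50
```

Children running mod_perl or heavy CGI can be many times larger than a static-file child, which pushes the cap down quickly.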
Erm.. didn't /. run for, oh, a few years on a single intel processor?
This is all really, really bad advice, imnsho. You ignore bandwidth, recommend 2 boxes instead of actually getting the most out of a single box, and even go so far as to insist on an alpha or athlon?
Too much infoweek!
--
Blue
i browse at -1 because they're funnier than you are.
The slashdot effect can be minimized if you do the following:
1. Look at the processor/motherboard that the server has. You will be handling a large number of requests. Therefore, you will want to get an Athlon or Compaq Alpha. Both of these models handle multitasking OS's better than the Intel chip. I prefer the Alpha. You will want the fastest memory you can buy, with the best configuration. An Alpha motherboard and chip will give you this.
2. Your disk subsystems are also very important. You will need to maximize bandwidth between the motherboard and the disks. An Adaptec U2W SCSI controller will help you here. I also recommend the Seagate Cheetah series of hard drives. A 9GB drive will not cost you that much, and will have plenty of performance benefits.
2.5. Your networking subsystem. Make sure your network card is directly supported by Linux and has a good chipset. 3Com Fast Etherlink XL PCI cards are my favorite choice here because of their Parallel Tasking chipset, and because installation of them under Red Hat 6 is a snap. They even work well with NT, which you do not want to run a slashdotted site on unless you want to run a Compaq Proliant 8500 and spend as much on it as you would a house.
3. Make sure your Linux installation has a large enough swap partition, and don't run any extra services. Strip it down to what you need, and preferably put your DNS on a small extra machine, as well as other system functions that you might want, but do not need to be on that machine.
4. Check with people here about exploits. Every script kiddie that reads this site will want to crack your box and leave messages. The more immature ones will probably quote DMX or other rap artists. There are many cool people here that are really good with Linux security.
I believe the reason a lot of Linux sites get slashdotted like this is because a lot of hardware that Linux is used to run on is not what you'd want to run a commercial website on.
The reason why NT appears somewhat stable in a lot of cases is because the manufacturers of NT servers bend over backwards to make NT work on the BIOS and hardware level.
Linux can get the same effect and maximize performance off a website by tuning the hardware a bit, and knowing what hardware to use. A Celeron ain't gonna cut it. Alpha processors will do your job just fine for you without the Intel issues.
Plus, the system I quoted there can be had for about $4K and can handle heavy loads. Try doing that with 1 processor on an Intel chipset.
If you look at any of those old (albeit wrong) studies comparing Windows and Linux for web serving... one moderately beefy Linux box will serve up more content than several T1s can carry. The real question is what kind of bandwidth you are going to need.
Perhaps you should really focus on making sure the sculpture delivers up a pretty small, streamlined image, otherwise, the demands on your internet connection are going to kill you...
Let us know when you get the project done... hopefully it will be so great, we'll crash your server no matter how beefy it is... Good luck...
Chris Moyer
/* CDM */
While serving up plain HTML is no biggie, and any old box will do for that, serving dynamic content can be orders of magnitude harder. If you have heavily dynamic material (and your concept suggests that perhaps you do) then you will need a fairly capable box in order to respond quickly to requests. It is here and in bandwidth that the critical bottlenecks lie.
To give a suggestion of the CPU power required, the company I work for has several heavily loaded servers:
A Celeron 350/128MB RAM, maxes out at approximately 7 hits/second (heavily dynamic material).
A Celeron 350/128MB RAM, maxes out at ~17 hits/sec (quite heavily dynamic material).
I just don't know how many hits/sec the /. effect generates :)
Note that these servers have been specially configured to handle the traffic involved, it is unlikely that you will go to the same levels of specialisation, so leave some extra space.
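To turn measured ceilings like the ones above into a planning number, you can knock off a safety margin and scale up to a day. A small sketch; the 50% headroom is an arbitrary assumption, not a figure from the post.

```python
def sustainable_hits_per_sec(measured_max, headroom=0.5):
    """Rate to plan for, keeping `headroom` of the measured maximum spare."""
    return measured_max * (1 - headroom)


def hits_per_day(hits_per_sec):
    """Daily total if that rate were held around the clock."""
    return hits_per_sec * 86_400


# The 7 hits/sec box with half its capacity held in reserve.
planned = sustainable_hits_per_sec(7)   # 3.5
print(hits_per_day(planned))            # 302400.0
```

Real traffic is bursty, so the daily figure is a ceiling on average load rather than a promise about peaks.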
You can't win a fight.
Actually, in your scenario, the most active pages will be in RAM twice -- once in the ramdisk and once in the disk cache. Since the bulk of the serving will be out of the cache, the ramdisk will sit there mostly idle -- a waste of 50 MB of RAM.
Unfortunately, with writable dynamic content, the ramdisk will have to be written to disk periodically, adding complexity, overhead, and, quite possibly, more disk IO than using a disk directly!
My server is a Celeron with 320MB RAM running Linux 2.0.36. I configured it with a 128 MB ramdisk and did a great deal of testing. Performance was significantly better, especially during peak loads, than running straight from the disk. Of course, I had a considerably more complicated set of scripts and still stood to lose some transactions if something bad happened.
As my next exercise, I tried to duplicate that performance without the ramdisk. By tuning the values in /proc/sys/vm/bdflush I was able to actually get the performance higher than when using the ramdisk. The reason for the improvement was that I now had more memory for caching the disk by not double-caching in the ramdisk and the cache as I had been doing.
The trick that worked for me was to increase the percentage of dirty buffers before forcing a flush to 80% and to increase the timeout for dirty buffers before flushing them to disk to 10 minutes. That does include some of the disadvantages of the ramdisk but my UPS is good for over 10 minutes so I don't worry much (the Internet connection drops when power is lost so my machine, while still up, goes idle). My startup/shutdown/backup scripts are much simpler as a result though.
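For anyone wanting to try the same two changes, here is a hedged Python sketch of rewriting the bdflush values. The field positions (dirty-buffer percentage first, buffer age sixth) and the 100 ticks per second used to convert the timeout are assumptions of this sketch; the layout differs between kernel versions and newer kernels may not have this file at all, so check your kernel's own documentation first.

```python
def retune_bdflush(current, nfract_index=0, age_index=5,
                   dirty_percent=80, flush_after_seconds=600, hz=100):
    """Return a new bdflush parameter line with two fields replaced.

    `current` is the whitespace-separated contents of
    /proc/sys/vm/bdflush. The field positions and the HZ value are
    assumptions; verify them for your kernel before writing anything.
    """
    fields = current.split()
    fields[nfract_index] = str(dirty_percent)
    fields[age_index] = str(flush_after_seconds * hz)  # seconds -> ticks
    return " ".join(fields)


def apply_bdflush(path="/proc/sys/vm/bdflush"):
    """Read the current values, retune them, and write them back (needs root)."""
    with open(path) as f:
        current = f.read()
    with open(path, "w") as f:
        f.write(retune_bdflush(current) + "\n")
```

As the post notes, holding dirty buffers for 10 minutes means up to 10 minutes of writes can be lost on a crash, so this only makes sense with a UPS or data you can afford to lose.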
Geeky modern art T-shirts
Where will the site be hosted? Are you planning to host it with an ISP or at the location of the web-cam? If you are hosting it at the location of the web cam, network bandwidth will be by far your biggest concern. At the very least, you are going to need a frac-T1, frame relay, or DSL connection. Chances are, though, that if you are concerned about PC hardware costs, all of these (except perhaps DSL) are out of the question.
More likely, you will have the webcam connected to a PC, which could do nothing but capture images and upload them (via modem, ISDN, or DSL) to a co-located machine with an ISP. The server located at the ISP will then push them out to the teeming millions.
If you do not have the need for any CGI, or your CGI needs are minimal, you may not even want to use your own machine. You may be best off just getting a web access account -- you know, the kind of thing you get with many dial-up accounts, though with better service and the capability for more bandwidth.
Assuming you are doing CGI, and you really do need your own machine, you really ought to answer your own question. By that I mean that you should benchmark your system on whatever hardware you happen to have handy. Depending on the complexity of your site, there are many server-testing tools that can tell you just what type of loads your system is capable of handling, and what type of latency you can expect at those loads.
If those numbers are much more than you expect to receive, then you know a machine like what you have is sufficient. Or, you may discover that a 486 with 32 megs of ram is plenty sufficient. If you have a lot of inefficient CGI, you may need a dual pII with gobs of memory. If you have more time than money, then trial and error will give you by far the most efficient system.
Let me tell you this: building a system to handle a high bandwidth site is not nearly as much fun when money for hardware is no object. Perhaps the e-mail domain may clue you in there...
-p.
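If no server-testing tool is handy, the benchmarking advice above can be approximated with a short Python script. This is a crude sketch, not a replacement for a proper load tester: the URL in the usage comment is hypothetical, and a single client machine may run out of steam before the server does.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_url(url):
    """Fetch one URL and discard the body."""
    with urlopen(url) as response:
        response.read()


def benchmark(fetch, requests=100, concurrency=10):
    """Call fetch() `requests` times across `concurrency` threads.

    Returns (requests per second, sorted list of per-request latencies
    in seconds).
    """
    def timed(_):
        start = time.perf_counter()
        fetch()
        return time.perf_counter() - start

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed, range(requests)))
    elapsed = time.perf_counter() - started
    return requests / elapsed, latencies


# Usage (hypothetical URL; point it at your own box only):
#   rate, lat = benchmark(lambda: fetch_url("http://localhost:8080/"), 200, 20)
#   print(rate, lat[len(lat) // 2])   # throughput and median latency
```

Raising `concurrency` step by step and watching where the latencies start to climb gives a rough idea of the load the box can take.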
just buy a 14.95 account from a place like www.jumpline.com, and have your box upload your webcam images and serve the content.. you get 3 gigs of transfers a month, 50 megs space, even ftp support.. they run the servers and offer bandwidth.. if you need more upgrade accounts..
it's a hell of a lot cheaper than getting your own t-1 & servers.. as most of these people are sitting directly on the mae's and such.
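Whether 3 gigs a month is enough depends on how much each visitor pulls down. A quick Python sketch; the 200 KB per-visit figure is a guess for illustration, and "gig" is taken here as 1024^3 bytes, which the provider may or may not agree with.

```python
def visits_per_month(quota_gb, bytes_per_visit):
    """Whole visits that fit inside a monthly transfer quota."""
    return int(quota_gb * 1024**3 // bytes_per_visit)


# Hypothetical visitor who loads 200 KB of pages and webcam frames.
print(visits_per_month(3, 200 * 1024))       # 15728

# Worst case: every visitor downloads the whole 5 MB site.
print(visits_per_month(3, 5 * 1024 * 1024))  # 614
```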
Details
1. think about the difference between "static" content (just files on the disk) and "dynamic" content (pages generated live, like here at /.). If you are just serving files, a 486 can handle it (assuming T1 speeds). I personally use a Pentium/90 at .3 T1 speeds and CPU never gets high.
1bis. Memory and disk speeds are hugely more critical than CPU speeds (if you are not doing dynamic content). Get a DMA harddisk (SCSI or UltraDMA IDE). 64-meg of RAM should really be enough for your application.
2. the biggest thing that is going to kill you is bandwidth. Now I run a website that gets about 10,000 hits/day (raw) on a 400-kbps link, but I'm just serving HTML and inline GIFs so the link never really gets overloaded. However, you sound like you might be hosting some pretty hefty downloads. One technique is to stick your big-files on a free-hosting website (like GeoCities), but they do monitor their logs and they will kill your download, but hopefully that's after being Slashdotted.
3. Reading other comments, I see a bunch of people suggesting RAMDISKS. That's totally unnecessary; the operating system caches disk access just as well as a RAMDISK. (In fact, a RAMDISK is just a crude way of tuning your disk-cache).
4. Remember to consider your content. Artistic web-designers tend to put way too much layout/graphics in their pages. This can kill your website, as it can easily reach 10-times the bare minimum in size, but moreover kill your site with unnecessary TCP connections (If you put 4 gifs in a web-page, you will cause 4 TCP connections to your site; and the TCP stack within the machine can handle only so many concurrent TCP connections before bogging down).
4bis. Please be polite to readers. You probably will develop your content only on one browser, but slashdotters use a wide variety of browsers; you'll likely piss off a lot of people if, for example, your pages render well on Netscape/4.61 but look like crap on older/alternative versions. This often means reducing layout.
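Points 2 and 4 above can be put into rough numbers with a little Python. This is a sketch under assumptions: it counts one extra connection for the HTML page itself and assumes no keep-alive, and the 10 KB average hit size is invented for the example.

```python
def connections_per_view(inline_images):
    """TCP connections for one page view without keep-alive:
    one for the HTML plus one per inline image."""
    return 1 + inline_images


def link_utilisation(hits_per_day, avg_bytes_per_hit, link_kbps):
    """Average fraction of the link used, with hits spread evenly over a day."""
    avg_bits_per_sec = hits_per_day * avg_bytes_per_hit * 8 / 86_400
    return avg_bits_per_sec / (link_kbps * 1000)


print(connections_per_view(4))  # 5

# 10,000 hits/day on a 400 kbps link, assuming 10 KB per hit:
# roughly 2% average utilisation.
print(link_utilisation(10_000, 10_000, 400))
```

The average hides the peaks; a Slashdot-style burst packs a day's hits into an hour or less, so multiply accordingly.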
One way to absolutely guarantee that Slashdot effect won't overload your server is to set up a Slashdot-like setup.
We've been given some sketchy details on the current setup. It would be interesting if there was a page with all of the specs, software, and tunings, including config files, etc.
Slashdot can take quite a load. If the setup was documented, a lot of us who have projects on the horizon will have something to base them on and can avoid mistakes, etc.
However, our site is very heavy on the dynamic content (and uses a lot of SSL for the ordering system).
The machine could handle about 20 minutes of the /. effect at a time before the CPU time went sky high. Luckily, we were able to bring the machine down, put in a second processor and double the memory (we also updated mod_perl); and get the machine back up in a couple hours with the new configuration.
The machine has been running like an absolute champ for the past few days. It's been able to handle requests numbering in the millions (page hits are in the hundreds of thousands, but our site uses a fair amount of graphics also) and has transferred several GBs of data just this weekend. If you do anything securely (SSL), keep in mind that anything on that secure page will take up about 7 to 8 times the CPU time as a non-secure item. And never, never, never run a site using Perl for dynamic content without installing mod_perl for Apache. The difference between a machine with it and one without it is tremendous (especially in memory usage).
One thing you can do for big gains in speed is disable hostname lookups (this makes a huge difference when being slashdotted). Also, turn the log level down on Apache. Because we have space to spare on this particular machine, we have the logging set at a moderate level. After two days since the mention on Slashdot, the logs are a few hundred MB. Not a problem if you have the disk space, but if you don't it will be a major problem.
Anyway, the configuration now is: dual PII-450, 256MB of ECC PC100 SDRAM, 10MB Ethernet on kernel 2.2.12 running Apache 1.3.9 and Perl 5.005 (along with the latest OpenSSL, SSLeay and mod_perl). It's having no problem keeping up with the load at this point, and the traffic is still pretty heavy.
-Jon
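The 7-to-8-times SSL figure above can be folded into a back-of-the-envelope load estimate. A minimal Python sketch; the 7.5 multiplier is just the midpoint of the range quoted, and the traffic mix is hypothetical.

```python
def cpu_cost(plain_hits, ssl_hits, ssl_factor=7.5):
    """Load expressed in 'plain hit' equivalents."""
    return plain_hits + ssl_hits * ssl_factor


# 1000 hits where 10% go over SSL cost about as much CPU
# as 1650 plain hits would.
print(cpu_cost(900, 100))  # 1650.0
```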