Slashdot Mirror


Ask Slashdot: Art, Linux and the Slashdot Effect?

patSPLAT submitted this artful submission: "I'm asking Slashdot: What kind of box does Linux need to handle the Slashdot Effect? I'm an artist, and I'm working on a sculpture which will be self-documenting with a running server/webcam. Since the server will be a part of the piece, I don't want to spend more than I need. I do want it to be able to handle a heavy load if my piece is well received. I'm planning on getting a 10/100 Ethernet, but I'm wondering about processor and memory. Could I get away with an older Pentium? Would a Celeron running in console mode do the trick? 64MB? 128? What do you think I could get away with? The website on the piece would be no larger than 5 megabytes, and webcam would obviously require some resources. I'm not sure how much the webcam would take yet, so give me the minimum and I'll go up a step to account for the webcam."

14 of 204 comments

  1. Requirements by j · · Score: 3
    We've done events using Boa on a PII 233 and gotten great results. Boa serializes connections rather than forking off multiple processes, so the RAM requirements were a tiny fraction of what Apache (though I love it to death) would require under the same hits-per-second load. For our purposes, 64 meg was plenty.

    The web daemon will be reading files from cache repeatedly, not building content on the fly, so a 233 may be overkill.

    If you're broadcasting a "live" (1+ second refresh) show, you can improve efficiency a bit more by encoding the JPGs on another system (even a win95 box) and ftp-ing them to the web server automatically. This is how most adult streaming sites work.

    j

  2. I get slashdotted all the time. by Bruce+Perens · · Score: 3
    Articles on my site are linked from Slashdot quite often. I have a 768 kilobit-per-second DSL line, which generally doesn't saturate from that, even though it's serving a 2.4 gigabyte U.S. street map database over FTP at the same time. Static web pages live on a Pentium 120, 128M RAM, running Apache. Dynamic pages live on a Pentium III 450, 128M RAM, running Zope and Apache. Zope uses a lot more CPU, as it interprets lots of Python to put up a page. Both systems run Debian GNU/Linux "unstable", which means the unreleased development version, which is darned stable despite that. I recently left these two servers running unattended with no way to reboot them while I took a 2-week vacation, and they didn't crash.

    If you're serving video or audio or images, though, you might need a faster net connection - do the math regarding bandwidth-per-user and how many you can support.

    Thanks

    Bruce
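
Bruce's suggestion to do the math on bandwidth per user can be sketched roughly as follows. This is only an illustration, not something from the original post: the function name, the 20 KB image size, and the 5-second refresh interval are all assumed figures chosen to show the arithmetic.

```python
def max_concurrent_viewers(link_kbps: float, image_kbytes: float, refresh_seconds: float) -> int:
    """Viewers a link can feed if each one pulls one image per refresh."""
    # Kilobytes per image -> kilobits per second needed by a single viewer.
    per_viewer_kbps = image_kbytes * 8 / refresh_seconds
    return int(link_kbps // per_viewer_kbps)


if __name__ == "__main__":
    # Assumed workload: a 20 KB webcam JPEG refreshed every 5 seconds.
    for name, kbps in [("T1", 1544), ("10 Mbit Ethernet", 10_000)]:
        print(name, max_concurrent_viewers(kbps, image_kbytes=20, refresh_seconds=5))
```

With these assumed numbers each viewer needs 32 kbit/s, so a T1 feeds about 48 simultaneous viewers. Halving the image size roughly doubles that, which is why image size matters more than CPU here.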

  3. Re:Tuning, tuning, tuning! by ratchet69 · · Score: 3

    Apache kicks ass, but it IS slow! If you don't need all of Apache's functionality, don't use it. You might want to look at thttpd. I found that with a 2.2.x kernel I could do 1,000 concurrent connections on my P133 laptop w/ 16MB ram! I used Jeff's http_load for the test, which will throttle down to emulate a 28.8 client.
    More performance info can be found here.

  4. the slashdot effect (and starwars effect) by jnazario · · Score: 3

    i've been slashdotted. and starwars'd (i mirrored both trailers for TPM). i survived very well. how? a well tuned server. simple: control the instances of your web server to be something reasonable. apache is smart, the factory defaults won't let a slashdot effect take it down.

    for fun, turn any SYN flood program (ie portfuck) on port 80 and bang away. make a bazillion servers TRY and start up and see how your box responds. simple as that.

    oh yeah, my stats: PII/266, 64 MB ram, OC-3 connection in. my max instances of httpd? around 50. :)

    jose nazario

    --
    jose nazario jose@biocserver.cwru.edu
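
One rough way to pick a "reasonable" cap on server instances is to divide the RAM you can spare by the size of one server process, so that a flood of requests never pushes the box into swap. The sketch below is an assumption-laden illustration: the 16 MB reserve and the 1 MB per-process figure are hypothetical, and the real per-process size should be measured on the actual machine. In Apache the resulting number would typically go into the MaxClients directive.

```python
def max_server_instances(total_ram_mb: float, reserved_mb: float, per_process_mb: float) -> int:
    """Largest number of server processes that fits in RAM without swapping."""
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0 or per_process_mb <= 0:
        return 0
    return int(usable_mb // per_process_mb)


if __name__ == "__main__":
    # Assumed: 64 MB box, 16 MB kept for kernel and disk cache,
    # about 1 MB of unshared memory per httpd child.
    print(max_server_instances(total_ram_mb=64, reserved_mb=16, per_process_mb=1))
```

Under those assumptions the cap comes out at 48, in the same neighborhood as the "around 50" quoted above.
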
  5. Re:Let's look at how to do this systematically :) by Blue+Lang · · Score: 3

    Erm.. didn't /. run for, oh, a few years on a single intel processor?

    This is all really, really bad advice, imnsho. You ignore bandwidth, recommend 2 boxes instead of actually getting the most out of a single box, and even go so far as to insist on an alpha or athlon?

    Too much infoweek!

    --
    Blue

    --
    i browse at -1 because they're funnier than you are.
  6. Let's look at how to do this systematically :) by mbpark · · Score: 3

    The slashdot effect can be minimized if you do the following:

    1. Look at the processor/motherboard that the server has. You will be handling a large number of requests. Therefore, you will want to get an Athlon or Compaq Alpha. Both of these models handle multitasking OS's better than the Intel chip. I prefer the Alpha. You will want the fastest memory you can buy, with the best configuration. An Alpha motherboard and chip will give you this.

    2. Your disk subsystems are also very important. You will need to maximize bandwidth between the motherboard and the disks. An Adaptec U2W SCSI controller will help you here. I also recommend the Seagate Cheetah series of hard drives. A 9GB drive will not cost you that much, and will have plenty of performance benefits.

    2.5. Your networking subsystem. Make sure your network card is directly supported by Linux and has a good chipset. 3Com Fast Etherlink XL PCI cards are my favorite choice here because of their Parallel Tasking chipset, and because installation of them under Red Hat 6 is a snap. They even work well with NT, which you do not want to run a slashdotted site on unless you want to run a Compaq Proliant 8500 and spend as much on it as you would a house.

    3. Make sure your Linux installation has a large enough swap partition, and don't run any extra services. Strip it down to what you need, and preferably put your DNS on a small extra machine, as well as other system functions that you might want, but do not need to be on that machine.

    4. Check with people here about exploits. Every script kiddie that reads this site will want to crack your box and leave messages. The more immature ones will probably quote DMX or other rap artists. There are many cool people here that are really good with Linux security.

    I believe the reason a lot of Linux sites get slashdotted like this is because a lot of hardware that Linux is used to run on is not what you'd want to run a commercial website on.

    The reason why NT appears somewhat stable in a lot of cases is because the manufacturers of NT servers bend over backwards to make NT work on the BIOS and hardware level.

    Linux can get the same effect and maximize performance off a website by tuning the hardware a bit, and knowing what hardware to use. A Celeron ain't gonna cut it. Alpha processors will do your job just fine for you without the Intel issues.

    Plus, the system I quoted there can be had for about $4K and can handle heavy loads. Try doing that with 1 processor on an Intel chipset.

  7. Your Bandwidth is what counts by cdmoyer · · Score: 3

    If you look at any of those old (albeit wrong) studies comparing Windows and Linux for webserving... one moderately beefy Linux box will serve up content for more bandwidth than several T1's. The real question is, what kind of bandwidth are you going to need?

    Perhaps you should really focus on making sure the sculpture delivers up a pretty small, streamlined image, otherwise, the demands on your internet connection are going to kill you...

    Let us know when you get the project done... hopefully it will be so great, we'll crash your server no matter how beefy it is... Good luck...

    Chris Moyer

    --
    /* CDM */
  8. Linux servers with dynamic content by PhiRatE · · Score: 4

    While serving up plain HTML is no biggie, and any old box will do for that, serving dynamic content can be orders of magnitude harder. If you have heavily dynamic material (and your concept suggests that perhaps you do) then you will need a fairly capable box in order to respond quickly to requests. It is here and in bandwidth that the critical bottlenecks lie.

    To give a suggestion of the CPU power required, the company I work for has several heavily loaded servers:

    A celeron 350/128mb ram, maxes out at approximately 7 hits/second (Heavily dynamic material)

    A celeron 350/128mb ram, maxes out at ~17hits/sec (Quite heavily dynamic material).

    I just don't know how many hits/sec the /. effect generates :)

    Note that these servers have been specially configured to handle the traffic involved, it is unlikely that you will go to the same levels of specialisation, so leave some extra space.

    --
    You can't win a fight.
  9. Ramdisk versus cached disk... My experience. by Sun+Tzu · · Score: 4

    Actually, in your scenario, the most active pages will be in RAM twice -- once in the ramdisk and once in the disk cache. Since the bulk of the serving will be out of the cache, the ramdisk will sit there mostly idle -- and a waste of 50 MB of RAM.

    Unfortunately, with writable dynamic content, the ramdisk will have to be written to disk periodically, adding complexity, overhead, and, quite possibly, more disk IO than using a disk directly!

    My server is a Celeron with 320MB RAM running Linux 2.0.36. I configured it with a 128 MB ramdisk and did a great deal of testing. Performance was significantly better, especially during peak loads, than running straight from the disk. Of course, I had a considerably more complicated set of scripts and still stood to lose some transactions if something bad happened.

    As my next exercise, I tried to duplicate that performance without the ramdisk. By tuning the values in /proc/sys/vm/bdflush I was able to actually get the performance higher than when using the ramdisk. The reason for the improvement was that I now had more memory for caching the disk by not double-caching in the ramdisk and the cache as I had been doing.

    The trick that worked for me was to increase the percentage of dirty buffers before forcing a flush to 80% and to increase the timeout for dirty buffers before flushing them to disk to 10 minutes. That does include some of the disadvantages of the ramdisk but my UPS is good for over 10 minutes so I don't worry much (the Internet connection drops when power is lost so my machine, while still up, goes idle). My startup/shutdown/backup scripts are much simpler as a result though.

  10. Bandwidth and Benchmarking by pkj · · Score: 4
    It is absolutely impossible to answer this question without knowing quite a bit more about the details of your site. There has been a lot of good information posted so far that you should definitely take into account, but let me mention a few things I have not seen mentioned so far.

    Where will the site be hosted? Are you planning to host it with an ISP or at the location of the web-cam? If you are hosting it at the location of the web cam, network bandwidth will be by far your biggest concern. At the very least, you are going to need a frac-T1, frame relay, or DSL connection. Chances are, though, that if you are concerned about PC hardware costs, all of these (except perhaps DSL) are out of the question.

    More likely, you will have the webcam connected to a PC, which could do nothing but capture images and upload them (via modem, ISDN, or DSL) to a co-located machine with an ISP. The server located at the ISP will then push them out to the teeming millions.

    If you do not have the need for any CGI, or your CGI needs are minimal, you may not even want to use your own machine. You may be best off just getting a web access account -- you know, the kind of thing you get with many dial-up accounts, though with better service and the capability for more bandwidth.

    Assuming you are doing CGI, and you really do need your own machine, you really ought to answer your own question. By that I mean that you should benchmark your system on whatever hardware you happen to have handy. Depending on the complexity of your site, there are many server-testing tools that can tell you just what type of loads your system is capable of handling, and what type of latency you can expect at those loads.

    If those numbers are much more than you expect to receive, then you know a machine like what you have is sufficient. Or, you may discover that a 486 with 32 megs of ram is plenty sufficient. If you have a lot of inefficient CGI, you may need a dual pII with gobs of memory. If you have more time than money, then trial and error will give you by far the most efficient system.

    Let me tell you this: building a system to handle a high bandwidth site is not nearly as much fun when money for hardware is no object. Perhaps the e-mail domain may clue you in there...

    -p.
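
The benchmarking advice above can be tried with very little tooling. What follows is a minimal sketch of a load test using only the Python standard library; it is not one of the server-testing tools the comment refers to, and the function names and the `fetch` parameter are hypothetical choices made for this example. It fires a fixed number of requests from a pool of worker threads and reports throughput and latency.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_once(url: str) -> float:
    """Fetch the URL once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()
    return time.perf_counter() - start


def run_load_test(url: str, total_requests: int, concurrency: int, fetch=fetch_once) -> dict:
    """Issue total_requests fetches using `concurrency` worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(fetch, [url] * total_requests))
    elapsed = time.perf_counter() - started
    return {
        "requests": len(latencies),
        "requests_per_second": len(latencies) / elapsed,
        "median_latency_s": latencies[len(latencies) // 2],
        "worst_latency_s": latencies[-1],
    }
```

Calling something like `run_load_test("http://testbox/", total_requests=200, concurrency=20)` against a test copy of the site (the hostname is a placeholder) and raising `concurrency` until latency climbs gives a rough idea of where the hardware tops out. Run it from a different machine than the server so the client does not compete for the same CPU.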

  11. just get an account with a webserver & upload by cybrthng · · Score: 5

    just buy a 14.95 account from a place like www.jumpline.com, and have your box upload your webcam images and serve the content.. you get 3 gigs of transfers a month, 50 megs space, even ftp support.. they run the servers and offer bandwidth.. if you need more upgrade accounts..

    it's a hell of a lot cheaper than getting your own t-1 & servers.. as most of these people are sitting directly on the mae's and such.
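
Before relying on a hosted account it is worth checking how far a monthly transfer allowance stretches. The sketch below is a hedged illustration only: the 50 KB per page view is an assumed figure, and 1 GB is taken as 1024 × 1024 KB, which may not match how a given host counts.

```python
def page_views_within_quota(quota_gb: float, kbytes_per_view: float) -> int:
    """Page views that fit inside a monthly transfer quota."""
    quota_kbytes = quota_gb * 1024 * 1024  # assumes binary gigabytes
    return int(quota_kbytes // kbytes_per_view)


if __name__ == "__main__":
    # 3 GB per month comes from the comment above; 50 KB per view is assumed.
    views = page_views_within_quota(quota_gb=3, kbytes_per_view=50)
    print(views, "views per month, about", views // 30, "per day")
```

At 50 KB per view the 3 gigs cover roughly 63,000 views a month, or about 2,100 a day, so a single busy day could plausibly use up the whole allowance.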

  12. My Experience by RobertGraham · · Score: 5
    Any system you choose will likely work, except if you choose dynamic content or your link is too slow.

    Details

    1. think about the difference between "static" content (just files on the disk) and "dynamic" content (pages generated live, like here at /.). If you are just serving files, a 486 can handle it (assuming T1 speeds). I personally use a Pentium/90 at .3 T1 speeds and CPU never gets high.

    1bis. Memory and disk speeds are hugely more critical than CPU speeds (if you are not doing dynamic content). Get a DMA harddisk (SCSI or UltraDMA IDE). 64-meg of RAM should really be enough for your application.

    2. the biggest thing that is going to kill you is bandwidth. Now I run a website that gets about 10,000 hits/day (raw) on a 400-kbps link, but I'm just serving HTML and inline GIFs so the link never really gets overloaded. However, you sound like you might be hosting some pretty hefty downloads. One technique is to stick your big-files on a free-hosting website (like GeoCities), but they do monitor their logs and they will kill your download, but hopefully that's after being Slashdotted.

    3. Reading other comments, I see a bunch of people suggesting RAMDISKS. That's totally unnecessary; the operating system caches disk access just as well as a RAMDISK. (In fact, a RAMDISK is just a crude way of tuning your disk-cache).

    4. Remember to consider your content. Artistic web-designers tend to put way too much layout/graphics in their pages. This can kill your website, as it can easily reach 10-times the bare minimum in size, but moreover kill your site with unnecessary TCP connections (If you put 4 gifs in a web-page, you will cause 4 TCP connections to your site; and the TCP stack within the machine can handle only so many concurrent TCP connections before bogging down).

    4bis. Please be polite to readers. You probably will develop your content only on one browser, but slashdotters use a wide variety of browsers; you'll likely piss off a lot of people if, for example, your pages render well on Netscape/4.61 but look like crap on older/alternative versions. This often means reducing layout.
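
The point in item 4 about inline images and TCP connections can be checked mechanically. This is a minimal sketch, assuming one connection per request (no keep-alive), no browser caching, and that every `<img>` tag points at a distinct file; the class and function names are made up for this example.

```python
from html.parser import HTMLParser


class InlineImageCounter(HTMLParser):
    """Counts <img> tags while parsing an HTML page."""

    def __init__(self):
        super().__init__()
        self.image_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.image_count += 1


def connections_per_page_view(html: str) -> int:
    """One connection for the page itself plus one per inline image."""
    counter = InlineImageCounter()
    counter.feed(html)
    return 1 + counter.image_count


if __name__ == "__main__":
    page = "<html><body>" + '<img src="a.gif">' * 4 + "</body></html>"
    print(connections_per_page_view(page))
```

Under those assumptions a page with 4 GIFs costs 5 connections per view: the 4 image fetches mentioned above plus the HTML page itself.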

  13. How to guarantee you won't be affected by jfunk · · Score: 5

    One way to absolutely guarantee that Slashdot effect won't overload your server is to set up a Slashdot-like setup.

    We've been given some sketchy details on the current setup. It would be interesting if there was a page with all of the specs, software, and tunings, including config files, etc.

    Slashdot can take quite a load. If the setup was documented, a lot of us who have projects on the horizon will have something to base them on and can avoid mistakes, etc.

  14. From our experience.... by highcaffeine · · Score: 5
    I co-own a company that was recently slashdotted. We had the site running on a PII-450 with 128MB of memory, on a 10MB Ethernet connection (direct to the backbone; we also own/run an ISP). For a vast majority of sites out there, this machine would never have had any trouble taking whatever Slashdot could throw at it (so this configuration could be great for your needs).

    However, our site is very heavy on the dynamic content (and uses a lot of SSL for the ordering system).

    The machine could handle about 20 minutes of the /. effect at a time before the CPU time went sky high. Luckily, we were able to bring the machine down, put in a second processor and double the memory (we also updated mod_perl), and get the machine back up in a couple of hours with the new configuration.

    The machine has been running like an absolute champ for the past few days. It's been able to handle requests numbering in the millions (page hits are in the hundreds of thousands, but our site uses a fair amount of graphics also) and has transferred several GBs of data just this weekend. If you do anything securely (SSL), keep in mind that anything on that secure page will take up about 7 to 8 times the CPU time of a non-secure item. And never, never, never run a site using Perl for dynamic content without installing mod_perl for Apache. The difference between a machine with it and one without it is tremendous (especially in memory usage).

    One thing you can do for big gains in speed is disable hostname lookups (this makes a huge difference when being slashdotted). Also, turn the log level down on Apache. Because we have space to spare on this particular machine, we have the logging set at a moderate level. After two days since the mention on Slashdot, the logs are a few hundred MB. Not a problem if you have the disk space, but if you don't it will be a major problem.

    Anyway, the configuration now is: dual PII-450, 256MB of ECC PC100 SDRAM, 10MB Ethernet on kernel 2.2.12 running Apache 1.3.9 and Perl 5.005 (along with the latest OpenSSL, SSLeay and mod_perl). It's having no problem keeping up with the load at this point, and the traffic is still pretty heavy.

    -Jon
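
The "7 to 8 times the CPU time" figure for SSL can be turned into a rough capacity estimate. The sketch below is an assumption-heavy illustration: it treats the server as purely CPU-bound, uses 7.5 as the midpoint of the quoted range, and the 100 hits/sec baseline and 20% SSL share are invented numbers.

```python
def mixed_capacity(plain_hits_per_sec: float, ssl_fraction: float, ssl_cost_factor: float = 7.5) -> float:
    """Hits/sec a CPU-bound server sustains when part of the traffic is SSL."""
    # Average CPU cost per request, in units of one plain request.
    average_cost = (1.0 - ssl_fraction) + ssl_fraction * ssl_cost_factor
    return plain_hits_per_sec / average_cost


if __name__ == "__main__":
    # Assumed: the box handles 100 plain hits/sec and 20% of requests use SSL.
    print(round(mixed_capacity(100, ssl_fraction=0.2), 1))
```

With these assumed figures, putting just a fifth of the traffic on SSL cuts capacity from 100 to roughly 43 hits/sec, which is one argument for keeping images and other static items off the secure pages where possible.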