Large Scale Web Apps Built on Open Source
prostoalex writes "Brad Fitzpatrick presented at OSCON with an overview of his little project. Interesting facts about the evolution of the LiveJournal back-end architecture."
We are using Fedora, Postgres, and PHP for what I consider a rather large-scale application. It is a storage and query system for research on a few million patients. We could have gone with Oracle and Java (...shiver...), or even MSSQL and a Windows server, but why waste money? The only real headache I've had is figuring out that Apache2 is threaded and Postgres/PHP sits on top of some low-level Linux code that is not. I could use Apache instead of Apache2 to avoid the problem, but I fixed the non-threaded code instead.
The previous comment is purposely vague and generalized, but all of the facts are completely true.
Back in the .com days, I worked at a huge (now defunct) porn site. We had about 50,000 active hosted sites, 500,000 hit counters and a bunch of other stuff. We were getting tens of millions of page views daily, maxing out two 100 megabit circuits at times. It was all FreeBSD, a little Red Hat, Perl, MySQL, Squid, Apache, mod_perl and C. The only real closed-source stuff we used was the BigIPs and the traffic monitoring software.
Most of the time those numbers are four or more times that high. It's early in the afternoon, this isn't a peak time.
...
Anyway, those are only the number of entries being posted. For every entry being posted, there are a ton of inserts actually going on:
* log2 table to contain some metadata about the entry
* logtext2 table to contain the actual text
* logprop2 table (multiple rows, 3-5) containing other metadata about the entry
So, four times the traffic, about 6 inserts each, 2400 updates per second--and that's just for posting entries. We get a lot more traffic from people posting comments (which also take 3 or 4 updates/inserts per comment), plus people editing their userinfo, uploading new userpics,
While LiveJournal definitely isn't a huge site, it's no lightweight either, and it's doing pretty well for having around 80 machines and serving 30-40 million fully dynamic page views a day.
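For anyone who wants to check the arithmetic in the comment above, here is a rough Python sketch. The row counts, the 4x multiplier, the 2400/sec figure, the 80 machines and the 30-40 million page views come from the comment; the function names are made up for illustration, and the "implied base rate" is only a back-calculation from those numbers, not a figure anyone quoted.

```python
# Rows written per posted entry, taken from the list in the comment.
LOG2_ROWS = 1            # one metadata row in log2
LOGTEXT2_ROWS = 1        # one row holding the entry text in logtext2
LOGPROP2_ROWS = (3, 5)   # "multiple rows, 3-5" in logprop2

SECONDS_PER_DAY = 86_400


def inserts_per_entry():
    """Return the (low, high) number of inserts for one posted entry."""
    low = LOG2_ROWS + LOGTEXT2_ROWS + LOGPROP2_ROWS[0]
    high = LOG2_ROWS + LOGTEXT2_ROWS + LOGPROP2_ROWS[1]
    return low, high


def implied_base_rate(total_per_sec=2400, multiplier=4, inserts_each=6):
    """Back out the entries/sec figure implied by the comment's numbers.

    This is an inference from 2400 = multiplier * inserts_each * base,
    not a number stated in the thread.
    """
    return total_per_sec / (multiplier * inserts_each)


def page_views_per_second(views_per_day, machines=80):
    """Return (site-wide views/sec, average views/sec per machine)."""
    total = views_per_day / SECONDS_PER_DAY
    return total, total / machines


if __name__ == "__main__":
    print("inserts per entry:", inserts_per_entry())
    print("implied entries/sec:", implied_base_rate())
    for daily in (30_000_000, 40_000_000):
        total, per_box = page_views_per_second(daily)
        print(f"{daily:,} views/day = {total:.0f}/sec, {per_box:.1f}/sec per machine")
```

The per-machine number is only an average over all 80 boxes; the comment doesn't say how many of them are web servers versus database or cache machines, so don't read it as a per-web-server load.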
As a paying subscriber of LiveJournal, I can say the only reason I even have an account is because of the friends I have who use it. I would never use it as a case study for any technology. It's got huge performance problems, data loss issues, and usability issues. This may not be the fault of using OSS, but it definitely doesn't help OSS look good.
There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips