Large Scale Web Apps Built on Open Source
prostoalex writes "Brad Fitzpatrick presented at OSCON with an overview of his little project. Interesting facts about the evolution of the LiveJournal back-end architecture."
My company's backend is mostly Java.
:)
We are using Oracle as the database, and Solaris as the UNIX, but we could be using MySQL and Linux.
In fact, we are investigating that right now.
comment directly in my journal
Back in the .com days, I worked at a huge (now defunct) porn site. We had about 50,000 active hosted sites, 500,000 hit counters and a bunch of other stuff. We were getting tens of millions of page views daily, maxing out two 100 megabit circuits at times. It was all FreeBSD, a little Red Hat, Perl, MySQL, Squid, Apache, mod_perl and C. The only real closed stuff we used were BigIPs and traffic monitoring software.
Uhh, the .jsp at the end of every URL and their CTO's continued public advocacy of BEA WebLogic and Oracle make me think that either I have the wrong idea of what LAMP constitutes, or you do.
Ehhmmm, from the BEA testimonial page..
"Salesforce.com is thrilled to be leveraging WebLogic Workshop 8.1 as an integrated part of our sforce client/service architecture. Using the extreme power of BEA WebLogic Workshop will enable us to make sforce available and accessible to the huge community of mainstream application developers and a wider spectrum of enterprise-class application projects on the industry leading BEA WebLogic Enterprise Platform. I could not imagine a more important combination than BEA and salesforce.com."
- Marc Benioff
Chairman and CEO
salesforce.com
Sure, Slashdot is just Apache, some Linux boxes, some Perl, maybe some C -- not a big deal...
The LJ folks faced scaling problems and had financial limits on how much money they could throw at the problem. So they used smarts and OS software instead of huge piles of money. They also built some new tools that are OS themselves, thus contributing back to the community (I hate that phrase, but this is Slashdot).
The presentation is actually interesting technically, and good news for Linux/MySQL/Perl/etc.
(I guess what I'm saying is that I didn't see a huge call for sarcasm).
-- I browse at +5 with stripped sigs
Anyone got an XSL to make those links clickable? (Linkification doesn't seem to work with XML)
As an employee, I can tell you that this comment is somewhat full of shit.
It still is a very segregated system with tons and tons of front-end boxes that each do specific things. All the "magic" of Amazon happens in Java and C++ anyway.
Most of the time those numbers are four or more times that high. It's early in the afternoon, this isn't a peak time.
...
Anyway, those are only the number of entries being posted. For every entry being posted, there are a ton of inserts actually going on:
* log2 table to contain some metadata about the entry
* logtext2 table to contain the actual text
* logprop2 table (multiple rows, 3-5) containing other metadata about entry
So, four times the traffic, about 6 inserts each, 2400 updates per second--and that's just for posting entries. We get a lot more traffic from people posting comments (which also do 3 or 4 update/inserts each comment), plus people editing their userinfo, uploading new userpics,
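For anyone who wants to check that back-of-the-envelope math, here is a rough Python sketch. Note the assumptions: the base posting rate is not stated anywhere in this comment, so the sketch backs it out of the quoted 2400 updates per second, and the "about 6 inserts" is taken as 1 log2 row + 1 logtext2 row + roughly 4 logprop2 rows (the midpoint of the 3-5 range). The function and constant names are made up for illustration.

```python
# Rough sanity check of the numbers quoted above. Nothing here is measured;
# the base rate is inferred from the stated total, not taken from the site.

PEAK_MULTIPLIER = 4             # "four or more times that high"
STATED_UPDATES_PER_SEC = 2400   # figure quoted in the comment


def inserts_per_entry(logprop_rows):
    """One log2 row + one logtext2 row + N logprop2 rows."""
    return 1 + 1 + logprop_rows


def updates_per_sec(base_entries_per_sec, multiplier, per_entry):
    """Total insert rate for a given base posting rate."""
    return base_entries_per_sec * multiplier * per_entry


# Back out the base posting rate implied by the quoted total,
# using the midpoint of 4 logprop2 rows ("about 6 inserts each").
implied_base = STATED_UPDATES_PER_SEC / (PEAK_MULTIPLIER * inserts_per_entry(4))

# Range of insert rates if every entry had 3 or 5 logprop2 rows instead.
low = updates_per_sec(implied_base, PEAK_MULTIPLIER, inserts_per_entry(3))
high = updates_per_sec(implied_base, PEAK_MULTIPLIER, inserts_per_entry(5))

print(f"implied base rate: {implied_base:.0f} entries/sec")
print(f"insert rate range: {low:.0f} to {high:.0f} per sec")
```

So the 2400 figure sits in the middle of a plausible range, and it still only covers posting entries, not comments or userinfo edits.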
While LiveJournal definitely isn't a huge site, it's no lightweight either, and it's doing pretty well for having around 80 machines and serving 30-40 million fully dynamic page views a day.
As a paying subscriber of Livejournal, I can say the only reason I even have an account is because of the friends that I have who use it. I would never use it as a case study for any technology. It's got huge performance problems, data loss issues, and usability issues. This may not be the fault of using OSS, but it definitely doesn't help it look good.
There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
I agree wholeheartedly. PostgreSQL and Firebird would suit LiveJournal way better than MySQL. However, PostgreSQL's replication is not exactly fail-safe (not sure if that's a requirement here) nor automatic, nor does it have the kind of partitioning features that some of the 'bigger' boys have.
I was thinking mostly of Sybase Replication Server combined with Sybase ASE or Oracle 10g/Oracle Clustering, things that would go really, really nicely in the environment and workload the LiveJournal folk are experiencing.
Thanks,
--
Matt