Slashdot Mirror


Large Scale Web Apps Built on Open Source

prostoalex writes "Brad Fitzpatrick presented at OSCON with an overview of his little project. Interesting facts about the evolution of the Livejournal back-end architecture."

53 of 213 comments

  1. Salesforce.com by Anonymous Coward · · Score: 4, Informative

    It's all LAMP.

    1. Re:Salesforce.com by ccollao · · Score: 2, Informative

      Just a small explanation:

      http://www.onlamp.com/pub/a/onlamp/2001/01/25/lamp.html

    2. Re:Salesforce.com by Anonymous Coward · · Score: 4, Funny

      Hypothetically, if Microsoft ported IIS to Linux, could you actually then run LIMP?

    3. Re:Salesforce.com by Anonymous Coward · · Score: 2, Insightful

      $ HEAD http://www.salesforce.com|grep ^Server:
      Server: Resin/3.0.s040331

    4. Re:Salesforce.com by Citizen+Gold · · Score: 2, Funny

      Why not? You can already run on WIMP.

    5. Re:Salesforce.com by starling · · Score: 3, Funny

      ...leveraging...integrated...client/service architecture...enterprise-class...industry leading..

      BINGO!!!

    6. Re:Salesforce.com by j3110 · · Score: 2, Informative

      It's Java... The only closed part you would need is the Java class library from SUN, which has freely available source that you can modify for your own internal use. The rest you can do with jikes and tomcat. Tomcat is used in all the big J2EE servers... IBM uses it in WebSphere, for example. Tomcat is the reference implementation of JSP/Servlets. You may need to use JBoss, Geronimo, or OpenEJB with SalesForce because I'm pretty sure they use Enterprise JavaBeans.

      My bet on a fully open system would be with Python/ZOPE. It's missing a few things, but it's very elegant.

      --
      Karma Clown
  2. lamp! by Chuck+Bucket · · Score: 2, Informative

    you can do a lot with Lamp, just look at....SLASHDOT!

    CB$@#--C

    1. Re:lamp! by Higman · · Score: 2, Informative

      Just to nitpick

      LAMP is php (and linux, apache, mysql) and last I checked Slashdot used mod_perl...

      but it is still open source.

      --
      -- [insert sig here]
    2. Re:lamp! by nv5 · · Score: 3, Informative


      I thought the P means any or all of the P languages: PHP, Python, Perl

    3. Re:lamp! by Nos. · · Score: 5, Informative

      Actually, LAMP can also refer to PERL and Python as well as PHP.

  3. /. effect by mreed911 · · Score: 2, Funny

    LiveJournal? Not anymore...

  4. Typical Livejournal by Anonymous Coward · · Score: 5, Funny

    OMG! Today I had CEREAL!!!!!

    With MILK!!!! OMG!!

    1. Re:Typical Livejournal by Anonymous Coward · · Score: 5, Funny

      How true

      From link --
      speakin of french and korea did u no they both opposed the war in iraq? 1 is a comunist country and the othr is a no-fight-anytime country. mabye there in this 2gethr to squash the american gold medals an make ppl think there strong! HEY KOREA WE BLEW U UP IN WW2 W/ TEH ATOM BOMB WE'LL DO IT AGAIN. an french ppl suck.

    2. Re:Typical Livejournal by seanmeister · · Score: 4, Funny

      Brad Fitzpatrick apparently agrees with your take on LJ, judging from the sample user data shown on page 24 of the presentation:

      OMG i like totally hate my parents they just dont understand me and i h8 the world omg lol rofl *! :^- ^^; add me as a friend!!!
    3. Re:Typical Livejournal by MC+Negro · · Score: 5, Funny

      Haha. This part got me here --

      without a doubt Jesus would've been the best president ever. When he was only 12 years old he went to the temple to preach to the Jews and was just amazing everyone. Think how much better the U.S.A. would be if Bill Clinton wasn't Bill Clinto but instead was Jesus? Do you think we'd be fighting a way? NO!! We'd all be loving each other because thats what Jesus was about! LOVE! There isn't enough love in the world! Jesus would also be great because hes not only the son of god, HES A PRINCE OF PEACE!!! Hed probably do things differently. Even George W Bush could learn from Jesus (and thats why George W Bush is a christian and we need to keep him in office!!)

      LiveJournal -- Convincing teenage girls that someone cares about what they have to say since 1999.

      --
      "You and your third dimension."
    4. Re:Typical Livejournal by RazzleFrog · · Score: 3, Insightful

      You left the best part out. She was upset because she got a D on the paper (from which that came) and she thinks she is a good writer. Her explanation, of course, is not that she has a greatly inflated opinion of her abilities but that her teacher is anti-Christian.

      We laugh about this but the really scary part is that there are a lot of people who think like her. People hate Bush so much because of the war but I am much more scared about his connection to the zealots of the religious right. The war in Iraq will end someday but these zealots will continue to try to take control of this country.

    5. Re:Typical Livejournal by RazzleFrog · · Score: 3, Insightful

      A zealot would be somebody who blames their problems on somebody else's religion or lack of religion which is what this girl did. If you read what she posted of that essay I think even the most religious among us would still be generous giving her a D.

      And the right likes to call the left communists. Calling them zealots wouldn't make much sense.

      And I had nothing to do with making this girl think there are anti-christian views in this country. She gets plenty of that from the preachers. If you don't think that zealotry is very real in this country then I suggest you listen to Jimmy Swaggart's opinion of gay marriage. He said something along the lines that if a gay man looked at him wrong he would kill him and the crowd cheered. That is zealotry plain and simple.

      Finally, I don't know how you can say religion is being removed from the public eye. You obviously don't watch much TV or visit the bookstore. Religion has had a huge resurgence since September 11. You can't watch a Yankees game without hearing God bless America. In my opinion America the Beautiful would be much more appropriate.

  5. Uh, the Web itself by FunWithHeadlines · · Score: 4, Insightful
    "Large Scale Web Apps Built on Open Source"

    Uh, like, you mean the Web itself? That's large scale, certainly was built, and is most certainly built on open source.

    So, yeah, I reckon it can be done. I'm using the proof-of-concept to submit this comment.

    1. Re:Uh, the Web itself by drouse · · Score: 2, Interesting

      Sure, Slashdot is just Apache, some Linux boxes, some Perl maybe some C -- not a big deal...

      The LJ folks faced scaling problems and had financial limits on how much money they could throw at the problem. So they used smarts and OS software instead of huge piles of money. They also built some new tools that are OS themselves, thus contributing back to the community (I hate that phrase, but this is Slashdot).

      The presentation is actually interesting technically, and good news for Linux/MySQL/Perl/etc.

      (I guess what I'm saying is that I didn't see a huge call for sarcasm).

      --
      -- I browse at +5 with stripped sigs ... Ha! Ha!
  6. .sxi format? by numbski · · Score: 2, Funny

    Anyone know what that document format is since it's roughly half the size of the pdf?

    --

    Karma: Chameleon (mostly due to the fact that you come and go).

    1. Re:.sxi format? by Higman · · Score: 5, Informative

      Are you serious?

      In the off chance that you are, it's one of the OpenOffice.org formats, inherited from StarOffice... it's supposed to be their answer to MS PowerPoint.

      --
      -- [insert sig here]
    2. Re:.sxi format? by Vaevictis666 · · Score: 2, Informative

      According to http://www.cryer.co.uk/filetypes/ it's an Open Office Impress file (think power point)

    3. Re:.sxi format? by xtermz · · Score: 2, Funny

      What are you a Office luser ?

      translation: "what, do you have a job? I wish I was part of the corporate world. Better get back to checking the classifieds..."

      --


      I lost my concept of community when my community lost all concept of me.
  7. opensource sxi? by AssProphet · · Score: 2, Funny

    Why is there a password on this sxi file (star office presentation)... is the file not open source?

    1. Re:opensource sxi? by mgkimsal2 · · Score: 2, Informative

      Password is empty - just click through it.

  8. helixcommunity.org is another big one... by tcopeland · · Score: 4, Informative

    ...right here.

    It's powered by GForge, so it's backed by PHP and PostgreSQL.

    There are a bunch of other sites running GForge listed here...

  9. Re:Java, Tomcat, Apache on UNIX by Kainaw · · Score: 4, Interesting

    We are using Fedora, Postgres, and PHP for what I consider a rather large-scale application. It is a storage and query system for research on a few million patients. We could have gone with Oracle and Java (...shiver...), or even MSSQL and a Windows server, but why waste money? The only real headache I've had is figuring out that Apache2 is threaded and Postgres/PHP sits on top of some low-level linux code that is not. I could use Apache instead of Apache2 to fix the problem, but I fixed the non-threaded code instead.

    --
    The previous comment is purposely vague and generalized, but all of the facts are completely true.
  10. Maypole! by Anonymous Coward · · Score: 5, Informative

    Maypole is a Perl framework for MVC-oriented web applications, similar to Jakarta's Struts. Maypole is designed to minimize coding requirements for creating simple web interfaces to databases, while remaining flexible enough to support enterprise web applications.

  11. Livejournal Images by tinla · · Score: 5, Funny

    Ok, so most of the Journals lack even a scrap of entertainment value... but the data feeds are normally fun. Is there anyone left that hasn't wasted a few bytes on the following url?

    http://www.livejournal.com/stats/latest-img.bml

    Hint - it's a constantly updating list of all the new images posted to journals. After a while you give up waiting for a hot chick to post and decide crazy survey graphics are as good as it gets. And then some hot chick posts her birthday party pictures, but she's only 14 and suddenly you wish you'd spent the day doing something else.

    --
    0daymeme.com: Great stuff.
    1. Re:Livejournal Images by ricotest · · Score: 2, Interesting

      Anyone got an XSL to make those links clickable? (Linkification doesn't seem to work with XML)

    2. Re:Livejournal Images by dq5+studios · · Score: 2, Informative

      Use this site instead, it has 200 images with links to the lj it was posted in.

  12. Porn by Neil+Blender · · Score: 5, Interesting

    Back in the .com days, I worked at a huge (now defunct) porn site. We had about 50,000 active hosted sites, 500,000 hit counters and a bunch of other stuff. We were getting tens of millions of page views daily, maxing out two 100 megabit circuits at times. It was all FreeBSD, a little Redhat, Perl, mysql, squid, apache, mod_perl and C. The only real closed stuff we used were BigIPs and traffic monitoring software.

    1. Re:Porn by ari_j · · Score: 2, Funny
      It was all FreeBSD, a little Redhat...
      So, it wasn't all FreeBSD.
    2. Re:Porn by dmayle · · Score: 3, Funny

      Porn (Score:4, Interesting)

      I worked at a huge (now defunct) porn site.

      The funny thing is, I'm pretty sure the interesting mod is about working for a porn site, and has nothing to do with the hardware or software (for those who even read that far... ;)

  13. layers -- kind a like the onion by yintercept · · Score: 3, Informative

    The web is really a mixed bag that allows a mix of open standards and proprietary software. To claim it is all open source is misleading. It is a dynamic network that allows development on multiple layers.

    The most important aspect of the web is that the interfaces of the different layers were well defined and exposed...not that each line of code in the different layers is exposed.

  14. Re:Get a clue by Anonymous Coward · · Score: 5, Insightful

    It's a pervasive belief among the suddenly famous. IBM, MS, or Sun doesn't need this. It's the small website with a bright idea that is all of a sudden gaining popularity which goes through almost each of the stages described in this document.

    This is for people with absolutely no budget and infinite traffic. This is how to live through that and come out winning like Brad apparently has.

  15. Re:Get a clue by Anonymous Coward · · Score: 2, Informative

    I guess Amazon.com is one of those not-properly-designed websites that doesn't do anything real?

  16. Re:Java, Tomcat, Apache on UNIX by FortKnox · · Score: 2, Interesting

    Absolutely. Having a J2EE project running Linux servers with Apache, JBoss, and PostGRES isn't unheard of... and most J2EE developers prefer to use eclipse.

    That's 100% open source, people... and we are talking large corporate intraweb apps and such.

    I work mostly with financial institutions... they prefer IBM backed Linux servers with WebSphere... but still like eclipse (or WSAD, which is eclipse with a Websphere test server plugin), and a commercial DB (oracle, DB2, or informix are popular)... but they still use frameworks like struts, tapestry, spring, and hibernate... all open source.

    --
    Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
  17. Large scale? by DogDude · · Score: 2

    How is this "large scale?" Maybe it's medium-scale as far as the web goes, but otherwise, it's very much a lightweight app. From livejournal.org:

    Per Hour: 6818
    Per Minute: 114


    That's 2 inserts a second, and maybe a hundred queries a second. Quite honestly, that could be handled by MySQL & PHP. Definitely not what I'd call "large scale".

    --
    I don't respond to AC's.
    1. Re:Large scale? by Anonymous Coward · · Score: 2, Informative

      That's only posts, it doesn't take into account comments (which is probably most of the traffic), userpics, etc.
      Your assumption would be correct if it was 1 select for each page view, but since there are about 4-5 just for 1 page view (userpic, friends, info, etc) then that number is misleading.. Fortunately most of that static content is memcached and not hitting the DBs.

    2. Re:Large scale? by xb95 · · Score: 5, Interesting

      Most of the time those numbers are four or more times that high. It's early in the afternoon, this isn't a peak time.

      Anyway, those are only the number of entries being posted. For every entry being posted, there are a ton of inserts actually going on:

      * log2 table to contain some metadata about the entry
      * logtext2 table to contain the actual text
      * logprop2 table (multiple rows, 3-5) containing other metadata about entry

      So, four times the traffic, about 6 inserts each, 2400 updates per minute--and that's just for posting entries. We get a lot more traffic from people posting comments (which also do 3 or 4 update/inserts each comment), plus people editing their userinfo, uploading new userpics, ...

      While LiveJournal definitely isn't a huge site, it's not a lightweight, and definitely doing pretty good for having around 80 machines and doing 30-40 million fully dynamic page views a day.
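
      A quick back-of-envelope check of the insert estimate, using only figures quoted in this thread. Treating the posted 114 entries per minute as the off-peak baseline is an assumption, as is rounding the per-entry row count to 6.

```python
# Back-of-envelope check using figures quoted in this thread.
entries_per_minute = 114   # posting rate quoted in the parent comment
peak_multiplier = 4        # "four or more times that high" at peak
inserts_per_entry = 6      # log2 + logtext2 + 3-5 logprop2 rows, rounded

inserts_per_minute = entries_per_minute * peak_multiplier * inserts_per_entry
inserts_per_second = inserts_per_minute / 60

print(inserts_per_minute)             # 2736
print(round(inserts_per_second, 1))   # 45.6
```

      That comes to a few thousand inserts per minute, or a few dozen per second, for entry posting alone, before comments and profile edits are counted.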

  18. Zope Enterprise Objects by TheSync · · Score: 4, Informative

    If you are looking for scalable OSS solutions, also look into Zope with Zope Enterprise Objects (ZEO).

  19. Re:Speaking of slashdot... by BridgeBum · · Score: 2, Informative

    I got a 503 error earlier today using IE from work. So it's not limited to Firefox.

    --
    My UID is the product of 2 primes.
  20. Re:Get a clue by Anonymous Coward · · Score: 3, Insightful

    A little harsh considering the guy's starting point, but it is true that most people / companies don't think things through. I put in a lot of startup web sites in the 90's, and used to give lectures on, among other things, why replicating databases doesn't scale. Looks like people still think that replicating databases is a solution, almost ten years later. It makes me glad I opted out of the e-com performance world, or I'd still be solving exactly the same problems.

    Simple lessons:
    -replicating database all over the place doesn't work
    -adding lots of servers doesn't work unless the apps are designed to work that way
    -object-relational and object databases are useful for a narrow class of problems, and Do Not Scale
    -java/perl/etc. are great, but you have to learn some SQL because doing things like sorting data in code is stupid when the database is 10x faster doing it on retrieval than your code
    There's the material I used to get $2000 for in a 1 hour lecture. Share and enjoy.
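
    The last lesson, sorting in the database rather than in application code, can be sketched as follows. This is a minimal illustration using Python's sqlite3 module with an in-memory database; the entries table, its columns, and the index name are hypothetical, and the sketch makes no claim about the actual speed difference.

```python
import sqlite3

# In-memory SQLite database with a hypothetical "entries" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, posted INTEGER)")
conn.executemany("INSERT INTO entries (posted) VALUES (?)", [(30,), (10,), (20,)])

# Sorting in application code: fetch every row, then sort in memory.
rows = conn.execute("SELECT posted FROM entries").fetchall()
sorted_in_code = sorted(r[0] for r in rows)

# Sorting in the database: ORDER BY lets the engine use an index
# and return rows already in order.
conn.execute("CREATE INDEX idx_posted ON entries (posted)")
sorted_in_db = [r[0] for r in conn.execute("SELECT posted FROM entries ORDER BY posted")]

print(sorted_in_code == sorted_in_db)
```

    Both approaches give the same result; the difference is where the work happens and how much data has to leave the database first.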

  21. Server timed out by WreckDiver · · Score: 3, Funny

    Not large enough scale to survive a Slashdotting...

  22. Re:Speaking of slashdot... by Tongo · · Score: 2, Informative

    I was also getting the 503 in Opera. Not for a while though.

  23. Re:Get a clue by Graelin · · Score: 4, Insightful

    You need to get over your favorite language/technology/term you read in the trade-rag you read last week. And then you need to get over yourself.

    Give it up slashdot crowd. mod_perl is not a valid technology for a large scale website! Perl was designed for a task, and that task was NOT enterprise application development.

    Spoken like someone who has never had to build a very large site (doing "real" work) completely in Perl/mod_perl. I can tell you that it most certainly can scale to enterprise needs. Did this guy do it right? I don't think so either but he most certainly learned a valuable lesson. Hopefully other people will study what he has done and improve their own systems based on his work.

    For the record, Java wasn't built for enterprise application development either. As with Perl, people discovered that Java had a future there and here we are today.

    A properly designed website with n-tier separation will be able to handle a large load and scale infinitely. You'll note that large websites who actually do real things besides logging people's daily problems don't use mod_perl and a thousand servers. There's a reason for this.

    You're assuming two dangerous things... (1) That you can't have n-tier and Perl. And (2) that large mod_perl sites require lots of servers. To believe any of these things is to demonstrate your horrific misunderstanding of computer science in general. I pity the company that lets you design their architecture. Wait, no I don't.... I'll gladly take their money for fixing your mistakes.

    Oh yeah, and let us not forget some other languages that are showing promise... specifically Python+Zope. In fact, I know of several people implementing n-tier applications with PHP on the front, Python in the middle and PostgreSQL in the back with much success.

    And for the record, here are some large companies and sites heavily using mod_perl.

    Want more?

  24. What about Livejournal? by tshak · · Score: 3, Interesting

    As a paying subscriber of Livejournal, I can say the only reason I even have an account is because of the friends that I have who use it. I would never use it as a case study for any technology. It's got huge performance problems, data loss issues, and usability issues. This may not be the fault of using OSS, but it definitely doesn't help it look good.

    --

    There is no longer anything that can be done with computers that is nontrivial and clearly legal. -- Paul Phillips
  25. LJ - Memcached - Wikipedia by otisg · · Score: 3, Informative

    Some may find it interesting that Wikipedia (covered earlier today on Slashdot) uses some code that came out of LiveJournal for caching: memcached.

    --
    Simpy
    1. Re:LJ - Memcached - Wikipedia by christowang · · Score: 2, Informative

      Slashdot also uses memcached.

  26. Kenny by Trejkaz · · Score: 4, Funny

    It's somewhat amusing that in the first load balancing example, one of the points of failure was Kenny. Especially since Kenny ALWAYS DIES.

    --
    Karma: It's all a bunch of tree-huggin' hippy crap!
  27. Re:Get a clue by MattRog · · Score: 2, Interesting

    I agree wholeheartedly. PostgreSQL and FireBird would suit LiveJournal way better than MySQL. However, PostgreSQL's replication is not exactly fail-safe (not sure if that's a requirement here) nor automatic, nor does it have the kind of partitioning features that some of the 'bigger' boys have.

    I was thinking mostly of Sybase Replication Server combined with Sybase ASE or Oracle 10g/Oracle Clustering, things that would go really, really nicely in the environment and workload the LiveJournal folk are experiencing.

    --

    Thanks,
    --
    Matt