Slashdot Mirror


Twitter On Scala

machaut writes "Twitter, one of the highest-profile Ruby on Rails-backed websites on the Internet, has in the past year started replacing some of their Ruby infrastructure with an emerging language called Scala, developed by Martin Odersky at Switzerland's École Polytechnique Fédérale de Lausanne. Although they still prefer Ruby on Rails for user-facing web applications, Twitter's developers have started replacing Ruby daemon servers with Scala alternatives, and plan eventually to serve API requests, which comprise the majority of their traffic, with Scala instead of Ruby. This week several articles have appeared that discuss this shift at Twitter. A technical interview with three Twitter developers was published on Artima. One of those developers, Alex Payne, Twitter's API lead, gave a talk on this subject at the Web 2.0 Expo this week, which was covered by Technology Review and The Register."

8 of 324 comments

  1. Should have used PHP. by 0100010001010011 · · Score: 4, Interesting

    Kidding aside, is this a nail in the coffin of scalable Ruby? Five years ago people were saying the same thing about PHP scaling, but Facebook has done a rather nice job of making it scale. Twitter was supposed to be the poster child for how awesome Ruby and RoR were.

    Difference is, Facebook is still using PHP; Twitter is going to Scala.

    1. Re:Should have used PHP. by TeXMaster · · Score: 3, Interesting

      The problem is that most of those compiler/interpreters suck enormously.

      Exactly. MRI (Matz's Ruby Interpreter) is known to have some serious scalability issues. Interestingly, one of the main issues with MRI comes from the way gcc compiles the big delegator switch in MRI's core, with a large sparse stack that causes ridiculous memory consumption (and sometimes even leaks). There's a set of eight patches (the MBARI patchset) that drastically improves the situation. The reduced memory footprint and the much smaller stack also give a noticeable speed increase.

      The good news is, these patches are progressively being merged upstream, so it's very likely that future MRI versions will be much better.

      --
      "I'm never quite so stupid as when I'm being smart" (Linus van Pelt)
    2. Re:Should have used PHP. by lotzmana · · Score: 3, Interesting

      I agree, but wish to add a comment about vertical and horizontal scaling.

      Ruby and Python have poor multi-threading. They don't scale well on multi-CPU platforms.

      from the interview:
      Robey Pointer: Green threads don't use the actual operating system's kernel threads.

      So, a Ruby application can't scale well vertically -- one can't just upgrade the machine with more CPUs for example.

      At the same time, no language inherently prohibits horizontal scaling (adding more machines on which the application can run in parallel), provided the application design allows for it.

      Twitter could've been designed to permit horizontal scaling. Regrettably the article didn't say much about this approach. They are improving the vertical scalability of the application by switching to first-class threads (via the JVM), but are they not eventually going to hit the limits for vertical scaling?!
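
      To make the horizontal case concrete, here is a minimal sketch in Python. It is purely an illustration and not anything the article says Twitter runs. It assumes users can be assigned to machines independently by hashing their ID; the names shard_for and partition are made up for this example.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards).

    Uses a stable hash (SHA-1) so every machine computes the same
    answer; Python's built-in hash() is randomized per process.
    """
    digest = hashlib.sha1(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(user_ids, num_shards):
    """Group user IDs into num_shards buckets, one per (hypothetical) machine."""
    shards = [[] for _ in range(num_shards)]
    for uid in user_ids:
        shards[shard_for(uid, num_shards)].append(uid)
    return shards
```

      With a scheme like this, adding capacity means adding shards, whatever language each machine runs. The catch is that changing num_shards reassigns most users, which is why real deployments tend to use something like consistent hashing instead of a plain modulo.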

  2. Scala is great by burris · · Score: 5, Interesting

    If I want to use any Java software then I'll use Scala. I see people bashing Scala, saying the languages they know are good enough or they can just use Jython/JRuby/Groovy, but they clearly know little about Scala.

    What's nice about Scala, and what Java, Jython, JRuby, and Groovy all lack, is its powerful type system and pattern matching. Once you get used to good pattern matching like in Scala, SML, OCaml, or Haskell, you won't want to go back. Plus you get all the benefits of running on the JVM at high speed (unlike all the aforementioned JVM languages, except Java itself).

    Honestly, you should check out Scala before you bash it. It's a very good choice wherever you might choose Java, which is a good choice for the back end. Twitter's developers are smart and experienced. They didn't choose Scala just to be cool. It is a powerful tool that can get the job done in an elegant way.

  3. Re:Good thinking, by The Slashdolt · · Score: 3, Interesting

    Read this and all will become clear:
    Event-Based Programming without Inversion of Control

    --
    mp3's are only for those with bad memories
  4. Re:There you go again! by Radhruin · · Score: 5, Interesting

    Anyone who thinks Ruby on Rails can't scale is as dogmatic in their anti-hype as the original hypers were. The right tool for the right job and all that.

  5. Re:Good thinking, by burris · · Score: 4, Interesting

    Maybe they use Scala because writing Java code is painful by comparison. Tons of boilerplate, every exception has to be caught in every scope, no pattern matching, no named arguments, and on and on. For people like me, without Scala the JVM wouldn't even be under consideration, though I admit that Java has been more usable since it got generics.

  6. Actually, this is pretty complex by Stu Charlton · · Score: 5, Interesting

    Twitter is not a trivial application to scale, considering the wide disparity in listener-to-follower ratios, that views are dynamically generated by interpolating many-to-many message streams, and that each message is persisted forever.

    As an analogy, it's like managing an IRC server with persistent messages that are full-text indexed, with one channel per user, where an unlimited number of users can join each other's channels. When you join a new user's channel, your chat log is automatically (and quickly) re-woven with messages from that channel according to the relative time series of these messages. And there's a global channel that everyone can watch to see what any user in any channel is saying at any time.

    Now do this, all the while avoiding netsplits (i.e. missing messages), allowing retraction of almost any message, recent or historical, and ensuring the channel history (eventually) reflects that change. And handle sudden bursts of activity among unpredictable sets of channels because they're all attending the same conference, or a burst of network-wide high activity because people are watching the World Cup or Obama's inauguration.

    The point is that, while the idea is simple, the variability of use and disparity of activity are what make life interesting; the messaging & DB architecture that works well for recent activity, for example, doesn't help for having reasonable persistent random-access to historical messages.
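
    The "re-weaving" step in the IRC analogy can be sketched in a few lines of Python. This is a toy under stated assumptions, not Twitter's design: each user's channel is an in-memory list already sorted by timestamp, and weave_timeline and the Message tuple layout are names invented for the example.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

# (timestamp, author, text); the layout is an assumption for this sketch.
Message = Tuple[int, str, str]


def weave_timeline(
    channels: Dict[str, List[Message]], following: Iterable[str]
) -> List[Message]:
    """Merge the per-user channels a reader follows into one time-ordered log.

    Each channel must already be sorted by timestamp; heapq.merge then
    performs a k-way merge without re-sorting everything.
    """
    streams = [channels.get(user, []) for user in following]
    return list(heapq.merge(*streams))
```

    The merge itself is the easy part. Everything else listed above (persistence, retractions, bursts, and random access to old messages) is what this toy leaves out.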

    In all, Twitter has gotten a *lot* more reliable over the past several months than it was a year ago.

    --
    -Stu