Twitter On Scala
machaut writes "Twitter, one of the highest profile Ruby on Rails-backed websites on the Internet, has in the past year started replacing some of their Ruby infrastructure with an emerging language called Scala, developed by Martin Odersky at Switzerland's École Polytechnique Fédérale de Lausanne. Although they still prefer Ruby on Rails for user-facing web applications, Twitter's developers have started replacing Ruby daemon servers with Scala alternatives, and plan eventually to serve API requests, which comprise the majority of their traffic, with Scala instead of Ruby. This week several articles have appeared that discuss this shift at Twitter. A technical interview with three Twitter developers was published on Artima. One of those developers, Alex Payne, Twitter's API lead, gave a talk on this subject at the Web 2.0 Expo this week, which was covered by Technology Review and The Register."
Kidding aside, is this a 'nail' in the coffin of scalable Ruby? 5 years ago people were saying the same thing about PHP scaling, but Facebook has done a rather nice job of making it scale. Twitter was supposed to be the poster child of how awesome Ruby and RoR were.
Difference is, Facebook is still using PHP, while Twitter is going to Scala.
They should have just used Java. Wait--
Javascript + Nintendo DSi = DSiCade
Seriously.
Copyright 2010. All rights reserved. This comment may not be copied in any way including, but not limited to caching.
Scala looks and feels like Java with a tiny bit of Python thrown in for good measure. I'm really not certain why anyone would use it over, say, Groovy or just plain Java.
The really odd part is trying to imagine Scala as a reasonable replacement for Ruby or any other higher level language.
Twitter's developers care more about being cool and hip and using the latest tool so that they remain popular, than they do about having a site that stays up 7 days a week.
Twitter using new-and-fancy programming languages has a way of load testing them for all of us.
I'm not sure I'd have the balls to take a 5 year old development platform/framework and drop it into something that sees so much traffic. Hopefully they share their experiences in some form.
Replace one language that wasn't tested at that scale with another one that wasn't tested at that scale.
Good thinking~
Oh look, Twitter is down... again.
The Kruger Dunning explains most post on
Perl would be better.
P.S. I don't want to hear any of this "Perl is ugly" crap either. It's not my fault I don't need my language to -make- me write readable code. A good Perl programmer's code is readable anyway.
Scala is not for me, I can confidently say. I am too old to learn a new [programming] language. The languages I know will suffice for now.
There is a saying too: "You cannot teach old dogs new tricks."
Isn't she cute :)
If I want to use any Java software then I'll use Scala. I see people bashing Scala, saying the languages they know are good enough or they can just use jython/jruby/groovy, but they clearly know little about Scala.
One thing that's nice about Scala that Java, Jython, JRuby, and Groovy all lack is its powerful type system and pattern matching. Once you get used to good pattern matching like in Scala, SML, OCaml, or Haskell, you won't want to go back. Plus you get all the benefits of running on the JVM at high speed (unlike all the aforementioned JVM languages, except Java itself).
Honestly, you should check out Scala before you bash it. It's a very good choice wherever you might choose Java, which is a good choice for the back end. Twitter's developers are smart and experienced. They didn't choose Scala just to be cool. It is a powerful tool that can get the job done in an elegant way.
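For anyone who hasn't seen it, here is a minimal sketch of the kind of pattern matching meant above. The `Event` types and the `describe` function are invented purely for illustration; they are an assumption, not anything from Twitter's code.

```scala
// Hypothetical event types, made up for this example only.
sealed trait Event
case class Tweet(user: String, text: String) extends Event
case class Follow(follower: String, followee: String) extends Event
case class Retract(user: String, tweetId: Long) extends Event

object PatternDemo {
  // Each case destructures a case class and binds its fields to names.
  // Because Event is sealed, the compiler can warn about a missing case.
  def describe(e: Event): String = e match {
    case Tweet(user, text) if text.length > 140 => s"$user posted something too long"
    case Tweet(user, text)                      => s"$user says: $text"
    case Follow(a, b)                           => s"$a now follows $b"
    case Retract(user, id)                      => s"$user deleted tweet $id"
  }

  def main(args: Array[String]): Unit = {
    println(describe(Tweet("alice", "hello")))
    println(describe(Follow("bob", "alice")))
  }
}
```

The guard on the first `Tweet` case (`if text.length > 140`) shows that a match can test conditions as well as shape, which is where it starts to pull away from a plain `switch`.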
OP is just a twitter sock puppet.
Java and Erlang did. Python isn't particularly 'functional' despite some recent syntactic grafts.
you had me at #!
Ruby does not have a problem scaling. Neither, for that matter, does even Rails. (As the companies that run Basecamp, Campfire, LinkedIn, Lighthouse, and many others will tell you.)
The fact is that the Twitter folks tried to write their own message queue in Ruby, when there was absolutely no reason to do so: there were plenty of pre-made message queues already available for Ruby, and already optimized. Not only did they choose to write their own, unnecessarily, they did it badly.
And not only that, but Alex Payne has a hidden agenda: he is trying to push Scala to boost interest in the book about Scala he just wrote!
Please get some facts before digging up this long-dead and well-buried "Ruby or Rails doesn't scale" bullshit again.
http://unlimitednovelty.com/2009/04/twitter-blaming-ruby-for-their-mistakes.html
This blog post takes the attitude that Twitter didn't move to Scala because ROR had a problem, but because the in-house messaging system Twitter created performed poorly. The author does not work at Twitter but many of the Twitter developers (including Alex Payne) respond in the comments. I found the article to be very interesting and the comments even more so. They give a sense of how much research Twitter did before this change.
'Nuff said. Those Twitter posts were once described as "internet SMS messages", weren't they? Short messages... phone systems and heavy networking... Ericsson... Erlang...
Ezekiel 23:20
The problems they had were not due to Rails. The problems arose from poor implementation of their message queue... some bad engineering decisions.
http://unlimitednovelty.com/2009/04/twitter-blaming-ruby-for-their-mistakes.html
And some of their recent Scala noise might be due to the fact that Alex Payne just wrote a book about Scala that was just released, or is about to be.
At first I read the title as "Twitter on Scalia". Justice Scalia is one of the most conservative US Supreme Court justices, and I'm not sure that this would have been a happy combination.
Well, at a first glance, it looks just as rigid and boring as Java.
I guess that can be considered Flamebait?
Anyone remember this quote?
[..] but it's certainly the case that over the last twenty years or so, many Computer Scientists have come out in opposition to the Art of Programming. In trying to make programming predictable, they've mostly succeeded in making it boring. And in so doing, they've lost sight of the idea that programming is a human pursuit. They've designed languages intended more to keep the computer happy than the programmer. Was any SQL programmer ever happy about having to declare a value to be varchar(255)? Oops, now it's a key, and can't be longer than 60. Who comes up with these numbers?
I think in the beginning it matters a lot what language you use. If the scale of your Web business is small, the salaries of your staff will be the biggest item on the list of expenses. I guess now they'll have quite a number of computers and having to buy 20% more servers will cost them quite a bit. If they are saying now that Scala is good for them, it does not mean anything. At this point it probably would be profitable for them to port a large part of their software to COBOL. A more forward-looking approach would be to improve the Ruby VM. But maybe they need somebody else to do it for them.
I'm going to post anonymously, not to conceal my identity (that will be obvious from the rest of this post) but to deflect accusations of karma whoring.
Twitter is not getting the warranted "occasional Slashdot article". It is getting a lot more than that. A couple of days ago, maybe a week, I mentioned it in another comment. That day Twitter had 3 articles in the front page of Slashdot, arguably none of them being news by itself except for the "novelty" that is to involve Twitter in activities that, otherwise, would be totally unremarkable.
To illustrate that, I will replace the mentions of "Twitter" in those articles with "Instant Messenger" or "Social Network", depending on the context. Note that I understand Twitter is a little bit more than "instant messaging", in the sense that it is more permanent and involves more than one medium, but it fits the definition:
"a form of real-time communication between two or more people based on typed text. The text is conveyed via devices connected over a network such as the Internet"
Proposal Suggests UK Students Study Wikipedia and Social Networks
Build Your Own Open Source Instant messaging Power Meter
Researchers Can ID Anonymous Social Network users
In the first example, Twitter is only one of the many new technologies that would potentially be studied by the UK brats. In the second, it could be any protocol, and it did not impress the audience (only 31 comments).
But the third one is the strangest. It covers something that was already covered here on Slashdot three months earlier, from a more neutral point of view, in the article Google Researchers Warn of Automated Social Info Sharing, while placing enormous emphasis on the fact that it is about "identifying anonymous twitterers".
It is not occasional, it is habitual, and I don't believe it warrants all that coverage. But yet, there are two stories again on the front page of Slashdot covering peripheral interactions of Twitter with the rest of the world: Organized Online, Students Storm Gov't. Buildings In Moldova (people using instant and relatively anonymous communication to organize themselves, a novelty since the creation of the telegraph) and this very one.
Seems like Second Life all over again.
the Commodore Amiga?
I was looking at the syntax of Scala, and it looks like they're trying to do web development in C. Why can't people just learn a new syntax?
It isn't going to kill you. It really won't.
You are correct that Hulu is surpassed by Scribd in rank, but if you look at the pageviews instead of the ranking, you'll see that Twitter and Hulu get more pageviews than Scribd and have gotten more since at least January. Reach incorporates unrelated metrics such as unique visits, which don't have as much of an effect as sheer pageviews. Funnily enough, that would make Scribd's ridiculous bounce rate give it a higher overall rank.
.... irrelevant, now that you lot have gone and Slashdotted it :P
Wondered why I couldn't get a word in edgeways without seeing the over-capacity whale.
The most interesting part of the article to me was that they said they wanted the benefits of a type system, which they then ended up reproducing in large part in their code. They also wanted the stability of the JVM, and they use Java collections from Scala. Hmmm... I wonder what other language they could have chosen that would have had all of those features, an existing messaging system, and developers on every street corner?
I program in PDP-11 assembly, which is then translated into C, compiled into Java bytecode, and executed on a JVM. I call it Assemblacava, and it's the wave of the future.
Every scripting language tries to do what Perl does. None of them have CPAN.
I've tried repeatedly to use Ruby and RoR. For trivial projects, they are fine. Scalability may come, but Ruby GEMS needs to be rewritten in Perl to remove the HUGE memory footprint to maintain your GEMS. You all know what I mean.
Yep, perl is what Twitter needs.
...gone with minifig scale.
But I approve of this move in general. I think putting together little bricks to make something much bigger really fits with the idea of a lot of little comments being "put together" into a person.
"If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
Twitter is not a trivial application to scale, considering the wide disparity in listener-to-follower ratios, that views are dynamically generated by interpolating many-to-many message streams, and that each message is persistent forever.
As an analogy, it's like managing an IRC server with persistent messages that are full-text indexed, with one channel per user, where an unlimited number of users can join each other's channels. When you join a new user's channel, your chat log is automatically (and quickly) re-woven with messages from that channel according to the relative time series of these messages. And there's a global channel that everyone can watch to see what any user in any channel is saying at any time.
Now do this, all the while avoiding netsplits (i.e. missing messages), allowing retraction of almost any message, recent or historical, and ensuring the channel history (eventually) reflects that change. And handle sudden bursts of activity among unpredictable sets of channels because they're all attending the same conference, or a burst of network-wide high activity because people are watching the World Cup or Obama's inauguration.
The point is that, while the idea is simple, the variability of use and disparity of activity is what makes life interesting; the messaging & DB architecture that works well for recent activity, for example, doesn't help for having reasonable persistent random-access to historical messages.
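To make the "re-weaving" step from the IRC analogy concrete: at its simplest it is a merge of two streams that are each already sorted by time. The sketch below is only that toy version, with a made-up `Message` record and `weave` function (both assumptions for illustration). It ignores persistence, retractions, and everything else that makes the real problem hard.

```scala
// Hypothetical message record; the field names are assumptions.
case class Message(timestamp: Long, author: String, text: String)

object TimelineDemo {
  // Merge two timelines, each already sorted by timestamp,
  // into a single timeline that is also sorted by timestamp.
  def weave(a: List[Message], b: List[Message]): List[Message] = {
    @scala.annotation.tailrec
    def loop(xs: List[Message], ys: List[Message], acc: List[Message]): List[Message] =
      (xs, ys) match {
        case (Nil, rest) => acc.reverse ::: rest
        case (rest, Nil) => acc.reverse ::: rest
        case (x :: xt, y :: yt) =>
          // Take the older message first; ties go to the first timeline.
          if (x.timestamp <= y.timestamp) loop(xt, ys, x :: acc)
          else loop(xs, yt, y :: acc)
      }
    loop(a, b, Nil)
  }

  def main(args: Array[String]): Unit = {
    val mine   = List(Message(1, "me", "first"), Message(4, "me", "fourth"))
    val theirs = List(Message(2, "stu", "second"), Message(3, "stu", "third"))
    weave(mine, theirs).foreach(m => println(s"${m.timestamp} ${m.author}: ${m.text}"))
  }
}
```

Doing this once for two short lists is trivial. Doing it on every page view, across thousands of followed channels with wildly uneven activity, is where the scaling trouble described above comes from.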
In all, Twitter has gotten a *lot* more reliable the past several months than it was a year ago.
-Stu
I figured a lot of people would mention Erlang, and thought someone might be interested in this writeup I read the other day: http://yarivsblog.com/articles/2008/05/18/erlang-vs-scala/
Java is a flaming turd of a programming language compared to Scala. About time someone made it outdated. Hopefully other "fad" sites follow the trend and switch to Scala as their language of choice, and then serious dev houses follow. Frankly, after reading the Scala book, I can't see a good reason for a team with little to no legacy Java code to even bother with Java.
Scala-like languages have been around for decades; it's just that now people are finally starting to use them.
I'll buy your book Mister. I'll pay lots and lots of money for your book. Gimme NOW!
Words like "inversion of control" and "Actors" are not new buzzwords. They are part of the functional vocabulary of real software engineers, and they are easily over a decade old. Check out the book with the ISBN 0201633612, which was published in 1994, is considered a classic, and contains what I believe are all of the "buzzwords" you are talking about.
It's the idea of abstracting specifics like "callback functions" to patterns such as IoC that separates "programmers", "hackers", and even "computer scientists" from "software engineers".
I have a degree in Computer Science and I am a professional Software Engineer, and let me tell you, Software Engineering is not the same as Computer Science. Both studies have their place, and both are important.