Scaling Large Projects With Erlang

Posted by Soulskill on Sunday July 6, 2008 @01:28AM from the right-tool-for-the-right-job dept.

Delchanat points out a blog entry which notes, "The two biggest computing-providers of today, Amazon and Google, are building their concurrent offerings on top of really concurrent programming languages and systems. Not only because they want to, but because they need to. If you want to build computing into a utility, you need large real-time systems running as efficiently as possible. You need your technology to be able to scale in a similar way as other, comparable utilities or large real-time systems are scaling — utilities like telephony and electricity. Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style concurrency, and the list goes on."

Erlang: The Movie ! by Enlightenment · 2008-07-06 01:39 · Score: 5, Funny

They were right!
Sufficiently? by Anonymous Coward · 2008-07-06 01:39 · Score: 5, Interesting

Perhaps the systems would be better running efficiently rather than sufficiently?
Who wrote the summary? GWB? by Junior+J.+Junior+III · 2008-07-06 02:08 · Score: 5, Funny

"running as sufficiently as possible"?
Sometimes as a nation we must ask ourselves, is our children learning?

--
You see? You see? Your stupid minds! Stupid! Stupid!
Scala by fils · 2008-07-06 02:09 · Score: 5, Informative

People may also want to check out Scala at:
http://www.scala-lang.org/
It also uses the Erlang-style concurrency approach and runs on the JVM, with class compatibility with other JVM languages, i.e. Java, Groovy, etc.
1. Re:Scala by Richard+W.M.+Jones · 2008-07-06 06:37 · Score: 5, Informative
  
  "Last time you checked" was some time last century in that case. Linux kernels have been able to support at least 100,000 threads for ages.
  That doesn't mean that using shared memory concurrency is a good idea though. When your computer comes with 10s or 100s of cores you'll realise that maybe SMP wasn't the best model of concurrency to choose. That's where models such as map-reduce, Erlang's shared nothing concurrency, message passing, and MPI come into their own. Even today they are useful because you'll be able to scale your program across multiple machines.
  Rich.
  
  --
  libguestfs - tools for accessing and modifying virtual machine disk images
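The shared-nothing, message-passing style mentioned in the reply above can be sketched roughly as follows. This is only an illustration in Python, not how Erlang or MPI implement it: the names `worker` and `sum_with_workers` are made up for the example, and Python threads still live in one address space, so the isolation here is a discipline the code follows rather than something the runtime enforces.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical worker: keeps private state and talks only via messages."""
    total = 0  # private to this worker, never touched by another thread
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: report the result and stop
            outbox.put(total)
            return
        total += msg


def sum_with_workers(numbers, n_workers=4):
    """Deal numbers out to workers round-robin, then combine their replies."""
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in range(n_workers)]
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for inbox in inboxes
    ]
    for t in threads:
        t.start()
    for i, n in enumerate(numbers):
        inboxes[i % n_workers].put(n)
    for inbox in inboxes:
        inbox.put(None)
    # One reply per worker; arrival order does not matter for a sum.
    partials = [outbox.get() for _ in range(n_workers)]
    for t in threads:
        t.join()
    return sum(partials)
```

Because each worker only ever sees its own inbox, the same structure could in principle be moved onto separate processes or machines by swapping the queues for a network transport, which is the scaling argument the reply makes.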
Why Erlang Matters by mpapet · 2008-07-06 02:12 · Score: 5, Insightful

1. Multicore ready.
Erlang will use them. Write your application in Erlang and it's done for you.
2. Scales well.
As an example, http://yaws.hyber.org/ scales very nicely when loads increase. Your basic LAMP/LYMP setup runs much better on vanilla hardware.
3. Designed for telecom
The architects designed the language to run in a telecom environment so things like upgrades can be done while the application is running.
Yaws in particular needs your help. Failover clustering inside the yaws server would be wonderful. Right now, it uses CGI to process other languages. It does it flawlessly, but a more direct solution might be a nice project.

--
http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
Re:hard to read after by jc42 · 2008-07-06 02:14 · Score: 5, Funny

"you need large real-time systems running as sufficiently as possible."
Should that not be efficiently as possible?
You obviously haven't looked very closely at any of the "market leader" software lately.
Software from the Big Guys is more and more designed to sell (think forced upgrades) bigger, faster systems. You don't do this by making your software efficient.
The logic behind many software updates these days is "Will this release require sufficient resources that customers will be persuaded to upgrade to new hardware?"

--
Those who do study history are doomed to stand helplessly by while everyone else repeats it.
Gibberish by SpinyNorman · 2008-07-06 02:18 · Score: 5, Insightful

If you want to build computing into a utility, you need large real-time systems running as sufficiently as possible.
But if you want to build sprockets into a weasel you need small batch-mode systems running as necessarily as possible.
If the poster had anything interesting to say (I'd guess not, but who knows!), it was totally obscured by his lack of grasp of the English language.
Why Erlang doesn't matter by SanityInAnarchy · 2008-07-06 02:37 · Score: 5, Interesting

1. Invariable variables.
This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.
2. Weird syntax.
Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. More so, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?
I don't believe the important features of Erlang are mutually exclusive with the sane syntax of, say, Ruby or Python.
3. Not Unicode-ready.
Strings are defined as ASCII -- maybe Latin-1. But there's no direct Unicode support in the language -- if you're lucky, there are functions you can pipe it through.
There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function reloading cannot be done when you compile natively (with HiPE) for extra speed. My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.

--
Don't thank God, thank a doctor!
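The bind-once behaviour described in point 1 of the comment above can be imitated in a few lines of Python. This is a rough sketch under stated assumptions, not Erlang itself: the class name `SingleAssignment` and its methods are invented for illustration. It also mirrors the fact that Erlang's `=` is a match, so "binding" an already-bound variable to an equal value succeeds while a different value fails.

```python
class SingleAssignment:
    """Hypothetical write-once cell imitating an Erlang variable."""

    _UNBOUND = object()  # private marker meaning "no value yet"

    def __init__(self):
        self._value = self._UNBOUND

    @property
    def bound(self) -> bool:
        return self._value is not self._UNBOUND

    def bind(self, value):
        """Bind if unbound; otherwise behave like a match against the value."""
        if not self.bound:
            self._value = value
        elif self._value != value:
            # Erlang would raise a badmatch error here.
            raise ValueError(f"no match: already bound to {self._value!r}")
        return self._value

    def get(self):
        if not self.bound:
            raise LookupError("variable is unbound")
        return self._value
```

For example, after `x = SingleAssignment()` and `x.bind(1)`, a second `x.bind(1)` is accepted, whereas `x.bind(2)` raises `ValueError`.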
Stupid article by IamTheRealMike · 2008-07-06 02:39 · Score: 5, Informative

Wow, it's not often I strongly criticise articles around here, but that was total garbage.
For the smart ones that didn't RTFA, here's a quick summary:
- I like Erlang.
- Big companies like Google and Amazon make things fast by using concurrency.
- Erlang supports (one type of) concurrency.
- Thus Google and Amazon are [probably] using Erlang.
- Thus everyone should learn Erlang.
For the record, I work for Google and we don't use Erlang anywhere in the codebase. Google Gears restricts you to message passing between threads because JavaScript interpreters are not thread-safe, so it's the only way that can work. Visual Basic threading works the same way for similar reasons. It's not because eliminating shared state is somehow noble and pure, regardless of what the article would have you believe, and in fact systems like BigTable use both shared-state concurrency and message passing based concurrency.
The article says this:

Architects (but also university-professors for that matter) still think they can build current and future industrial-grade and internet-grade systems with the same technologies as they did 10-15 years ago.
But in fact the Google search engine, which is one of the larger "industrial-grade, internet-grade" systems I know of, is written entirely in C++. A language which is much the same as it was 10-15 years ago. Thus the central point of his argument seems flawed to me.
Seeing as the article is merely an advert for Erlang, I'll engage in some advocacy myself. If you have an interest in programming languages, feel free to check out Erlang, but be aware that such languages are taking options away from you, not giving you more. A multi-paradigm language like version two of D is a better way to go imho - it supports primitives needed to write in a functional style like transitive invariance, as well as a simple lambda syntax, easy closures and first class support for laziness.
However, it also compiles down to self-contained native code in an intuitive way, or at least, a way that's intuitive to the 99.9% of programmers used to imperative languages, unlike Erlang or Haskell. It provides garbage collection but doesn't force you to use it, unlike Java. It doesn't rely on a VM or JIT, unlike C#. It provides some measure of C and C++ interoperability, unlike most other languages. And it has lots of time-saving and safety-enhancing features done in a clean way too.
1. Re:Stupid article by IamTheRealMike · 2008-07-06 05:38 · Score: 5, Interesting
  
  Yes, D is very young and has problems. But then again, what language didn't? It's easy to forget but Python was first released in 1991. It took many years before it became mainstream (and some would say it's still not there yet).
  The post-mortem is an interesting document, but I disagree with the author's conclusions. The compilers are buggy; well, C++ had exactly the same problem for a long time but was still a huge success. In particular, the trend seems to be basing new compilers on LLVM, which has a pretty robust optimization core. Frontend bugs are by comparison pretty trivial and easy to fix. Another few years and I think this problem will be licked - and besides, lots of C++ code has workarounds for compiler issues. Same thing for class libraries.
  You're right about C-level FFIs. However D provides a simple C++ FFI which as far as I know is unique. Such a thing would be very useful for a company like Google which has a lot of C++ code, as it'd simplify binding considerably (I don't mean to imply anything about the future direction of the codebase, by the way).
  The argument about parallelism is a more interesting one. But I disagree with that too :) D provides exactly what is needed for automatic sharding of work across cores (or machines). Specifically the combination of transitive invariance, reflection and purity enforcement is a very powerful one.
  Essentially, if you can write your code to consist of non-trivial trees of pure functions, then it's perfectly safe to parallelise something like this:
  foreach (item; list) {
      fooResults[item] = someTransform(item);
      barResults[item] = anotherTransform(item);
  }
  If someTransform and anotherTransform are both pure, by implication their parameters are transitively invariant, and thus the loop iterations can be run in parallel (because the compiler knows "item" can't be changed). What's more, the two calls within each iteration can be invoked simultaneously as well.
  Once the compiler knows these things, making this code run in parallel is simply another compiler optimization. That's the whole theory behind how functional languages can be super easy to parallelize. But in fact the key concepts can be applied to imperative languages as well, with the advantage that you can still have temporary mutable state within the function scopes - you just can't modify the heap, or anything reachable through your arguments.
  D has keywords that let the compiler know and enforce function purity.
  Now as it happens I doubt that any D compiler today implements this optimisation - it's sophisticated and transitive invariance is newly introduced in D2. But all the pieces of the puzzle are there. This also lets the compiler do calculations on data structures available at compile time.
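The argument in the reply above, that calls to pure functions can be reordered or run concurrently without changing the result, can be sketched in Python. This is an illustration only: Python has no purity enforcement, so `some_transform` and `another_transform` below are hypothetical stand-ins that are pure by convention, and the parallelism is explicit here rather than something a compiler derives.

```python
from concurrent.futures import ThreadPoolExecutor


def some_transform(item: int) -> int:
    """Pure by convention: depends only on its argument, no side effects."""
    return item * item


def another_transform(item: int) -> int:
    """Pure by convention as well."""
    return item + 1


def run_sequential(items):
    """Reference version: plain loops, one call at a time."""
    items = tuple(items)
    foo = {item: some_transform(item) for item in items}
    bar = {item: another_transform(item) for item in items}
    return foo, bar


def run_parallel(items, max_workers=4):
    """Same computation with every call handed to a thread pool."""
    items = tuple(items)  # an immutable snapshot of the input
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() submits all calls up front and yields results in input order.
        foo_values = pool.map(some_transform, items)
        bar_values = pool.map(another_transform, items)
        foo = dict(zip(items, foo_values))
        bar = dict(zip(items, bar_values))
    return foo, bar
```

Since neither function reads or writes shared state, `run_parallel` returns the same dictionaries as `run_sequential` whatever order the pool runs the calls in. In CPython the threads will not speed up CPU-bound work like this, so the sketch shows the safety argument, not the performance gain.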