Scaling Large Projects With Erlang
Delchanat points out a blog entry which notes,
"The two biggest computing-providers of today, Amazon and Google, are building their concurrent offerings on top of really concurrent programming languages and systems. Not only because they want to, but because they need to. If you want to build computing into a utility, you need large real-time systems running as efficiently as possible. You need your technology to be able to scale in a similar way as other, comparable utilities or large real-time systems are scaling — utilities like telephony and electricity. Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style concurrency, and the list goes on."
People may also want to check out Scala at:
http://www.scala-lang.org/
It also uses the Erlang-style concurrency approach and runs on the JVM with class compatibility with other JVM languages, e.g. Java, Groovy, etc.
Wow, it's not often I strongly criticise articles around here, but that was total garbage.
For the smart ones that didn't RTFA, here's a quick summary:
For the record, I work for Google and we don't use Erlang anywhere in the codebase. Google Gears restricts you to message passing between threads because JavaScript interpreters are not thread-safe, so it's the only way that can work. Visual Basic threading works the same way for similar reasons. It's not because eliminating shared state is somehow noble and pure, regardless of what the article would have you believe, and in fact systems like BigTable use both shared-state concurrency and message passing based concurrency.
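To make the "message passing between threads" restriction concrete, here is a minimal sketch in Python. It is not Gears or Erlang code: the names `worker` and `square_all` are made up for illustration, and Python's `queue.Queue` uses locks internally. The point is only that the application code shares no mutable state; the two threads talk exclusively through messages.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive numbers as messages, reply with their squares, stop on None."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: shut down
            break
        outbox.put(msg * msg)


def square_all(values):
    """Send each value to a single worker thread and collect the replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    for v in values:
        inbox.put(v)
    inbox.put(None)  # tell the worker to stop
    t.join()
    # One worker and FIFO queues, so replies arrive in send order.
    return [outbox.get() for _ in values]
```

A shared-state version would instead have both threads mutate a common list under a mutex; neither style is shown here as the "right" one.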
The article says this:
But in fact the Google search engine, which is one of the larger "industrial-grade, internet-grade" systems I know of, is written entirely in C++, a language which is much the same as it was 10-15 years ago. Thus the central point of his argument seems flawed to me.
Seeing as the article is merely an advert for Erlang, I'll engage in some advocacy myself. If you have an interest in programming languages, feel free to check out Erlang, but be aware that such languages are taking options away from you, not giving you more. A multi-paradigm language like version two of D is a better way to go imho - it supports primitives needed to write in a functional style like transitive invariance, as well as a simple lambda syntax, easy closures and first class support for laziness.
However, it also compiles down to self-contained native code in an intuitive way, or at least, a way that's intuitive to the 99.9% of programmers used to imperative languages, unlike Erlang or Haskell. It provides garbage collection but doesn't force you to use it, unlike Java. It doesn't rely on a VM or JIT, unlike C#. It provides some measure of C and C++ interoperability, unlike most other languages. And it has lots of time-saving and safety-enhancing features done in a clean way too.
Can't point you to a comparison article, but one language you should consider is Scala. It compiles to the Java platform, and thus can interact almost transparently with existing Java code and libraries, and uses Erlang's concurrency model. It can do both functional and imperative, object-oriented tasks. It's statically typed, but with features I didn't think were possible outside a dynamic language, such as duck typing (only compile-time checked!)
It's very powerful, but sometimes hard to figure out. Not my ideal language, but the closest I've found.
Official site:
http://www.scala-lang.org/
The busy Java developer's guide to Scala:
http://www.ibm.com/developerworks/views/java/libraryview.jsp?search_by=scala+neward
Scala for Java refugees:
http://www.codecommit.com/blog/scala/roundup-scala-for-java-refugees
1) Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks. Just because you don't understand the purpose doesn't mean there wasn't one.
2) Oooooh, a language is faulty because it has a syntax with which you are not familiar. Immediately kill all non-Java clones!
3) They're just lists of numbers; they're neither ASCII nor Latin1. There is Unicode parsing in the xmerl module.
Please wait until you know a language before criticizing it.
StoneCypher is Full of BS
Just a minor correction: Erlang has native code compilation on quite a few architectures -- try the "+native" flag. Most projects seem content with just using the VM interpreter, though.
Best,
Thomas Lindgren
1. Invariable variables.
This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.
On the contrary, there are very good reasons for having single-assignment variables. It makes the code more similar to plain mathematics, which makes it easier to reason about, and significantly reduces the number of programming errors. And you don't have to take that from me - there are some 20 years of experience at Ericsson and elsewhere with writing huge telecom applications in Erlang.
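For readers who haven't seen single assignment, the "unbound, then bound, never re-bound" rule can be sketched in Python. This is only an illustration, not how Erlang implements it; the class name `SingleAssignment` and its methods are invented for the example. The one Erlang behaviour it mimics is that matching an already-bound variable against an equal value succeeds, while a different value fails.

```python
class SingleAssignment:
    """A variable that starts unbound and can be bound exactly once."""

    _UNBOUND = object()  # private marker meaning "no value yet"

    def __init__(self):
        self._value = self._UNBOUND

    def bind(self, value):
        """Bind if unbound; otherwise succeed only if the value matches."""
        if self._value is self._UNBOUND:
            self._value = value
        elif self._value != value:
            raise ValueError(
                f"already bound to {self._value!r}, cannot match {value!r}"
            )

    def get(self):
        """Return the bound value, or fail if still unbound."""
        if self._value is self._UNBOUND:
            raise LookupError("variable is unbound")
        return self._value
```

Because a bound name never changes, any later read of it in the same scope is guaranteed to see the value from the one place it was bound, which is the "easier to reason about" argument above.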
2. Weird syntax.
Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. More so, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?
I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.
The syntax is certainly different from C, Ruby, or Python, but this is because it is derived from the Prolog syntax. Furthermore, it is actually pretty systematic, once you get over those initial differences. It is a poor programmer who cannot master both worlds.
3. Not Unicode-ready.
Strings are defined as ASCII -- maybe latin1. But there's no direct unicode support in the language -- if you're lucky, there are functions you can pipe it through.
A standard Unicode library is still missing, but can be hoped for. At least, there is nothing in the basic representation of strings that prevents full Unicode support (*cough* Java *cough*).
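The "strings are just lists of numbers" point can be illustrated with a short Python sketch. The helper names are made up for this example, and nothing here is Erlang library code. It shows only that a list of integers can hold any Unicode code point, and that the ASCII/Latin-1 limitation comes from functions that assume every element fits in one byte.

```python
def to_code_points(text):
    """Represent a string as a plain list of integer code points."""
    return [ord(ch) for ch in text]


def from_code_points(points):
    """Rebuild the string from a list of integer code points."""
    return "".join(chr(p) for p in points)


def fits_latin1(points):
    """True if every code point could be stored in a single Latin-1 byte."""
    return all(0 <= p <= 0xFF for p in points)
```

For instance, `to_code_points("hé€")` gives `[104, 233, 8364]`: the list representation copes with the euro sign fine, but `fits_latin1` returns `False` for it, so any routine that treats elements as bytes would break on it.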
There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed.
That's simply wrong. Dynamic code upgrade still works, native or not. It's the unloading of older native code from memory that is not being done (this is safe, but could be a memory leak in a very long running server).
My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.
Well, good luck, and see you in 20 years. Meanwhile, the rest of us will be over here, getting stuff done with the language we have. For my part, I don't know anything else that lets me be as productive, at least for general problem solving.
Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks.
...What? No, the elimination of mutexing and locks is made possible by a shared-nothing architecture.
Oooooh, a language is faulty because it has a syntax with which you are not familiar.
Hey, I mentioned Ruby. I don't mind LISP, either.
The point is not that the language is unfamiliar, the point is that it's inconsistent (and unfamiliar) for no good reason. I use English, but I could make a lot of the same criticisms about it.
They're just lists of numbers;
In that case, the argument becomes, "Erlang has very poor text-processing, if any at all."
If Erlang has text-processing functions that are designed to operate on these "lists of numbers", then yeah, it's pretty much going to be ASCII. And how are Erlang source files read? Could be "neither ASCII nor Latin1" if you like, but they can't be Unicode unless the parser is actually Unicode-aware.
Don't thank God, thank a doctor!
TFA more or less says that IMDB is switching from Perl to Erlang. So I looked at the link and here's what I got:
(From the linked page:)
We are looking for developers with experience building web scale distributed systems. We are currently working in Perl but have plans to use Java, Erlang and any other language that we think will suit our purposes. We aren't looking for expertise in any of those, particularly, but we expect that you will be an expert in the systems you know. We do require that you be passionate about testing (unit, integration, fault-injection) and code quality. Experience with relational databases (Oracle, MySQL, etc), embedded databases (BerkeleyDB, CDB, MonetDB, etc) and Linux are a big plus.
I'll leave anyone to draw his own conclusions.
"Erlang vs. Stackless python: a first benchmark" is a very good discussion of lots of niggling details in benchmarking a concurrency language. The comments are quite good.
Every man's island needs an ocean; choose your ocean carefully.
Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks.
...What? No, the elimination of mutexing and locks is made possible by a shared-nothing architecture.
Oooooh, a language is faulty because it has a syntax with which you are not familiar.
Hey, I mentioned Ruby. I don't mind LISP, either.
The point is not that the language is unfamiliar, the point is that it's inconsistent (and unfamiliar) for no good reason. I use English, but I could make a lot of the same criticisms about it.
It's not that its syntax is inconsistent. Erlang is actually incredibly consistent; it's just very different. Once you learn the 3 or 4 quirks that separate it from other languages, you'll find those 3 or 4 quirks are very consistently applied.
Take for instance the punctuation (not line ending characters as is suggested).
Commas separate arguments in function calls, data constructors, and patterns. Periods separate functions.
Semicolons separate clauses. (This is the trickiest, but can be thought of as signifying the existence of multiple cases of pattern matching.)
Have you had a look at Clojure? It is a Lisp dialect that runs on the JVM with good Java interop and has built-in support for STM concurrency.