Slashdot Mirror


Scaling Large Projects With Erlang

Delchanat points out a blog entry which notes, "The two biggest computing-providers of today, Amazon and Google, are building their concurrent offerings on top of really concurrent programming languages and systems. Not only because they want to, but because they need to. If you want to build computing into a utility, you need large real-time systems running as efficiently as possible. You need your technology to be able to scale in a similar way as other, comparable utilities or large real-time systems are scaling — utilities like telephony and electricity. Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style concurrency, and the list goes on."

36 of 200 comments

  1. Erlang: The Movie ! by Enlightenment · · Score: 5, Funny
  2. Sufficiently? by Anonymous Coward · · Score: 5, Interesting

    Perhaps the systems would be better running efficiently rather than sufficiently?

  3. Huh? by The Breeze · · Score: 4, Insightful

    "The two biggest computing providers of today"?

    What the hell does that mean?

    Also, is it just me or does the article intro sound like it was written by someone who has taken way too many marketing classes?

    1. Re:Huh? by K. S. Kyosuke · · Score: 4, Interesting

      Maybe they were referring to Amazon EC2 and Google App Engine?

      --
      Ezekiel 23:20
    2. Re:Huh? by fermion · · Score: 3, Insightful
      Definitely market copy. Extremely general, not useful information, indiscriminate name dropping, with unintended consequences.

      For instance, by dropping the IMDB name, it is now my impression that this Erlang thing is best at destroying otherwise useful sites by making them less reliable and more annoying to users. Who in their right mind would want to do that? Oh, marketing people, that's who!

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
  4. Who wrote the summary? GWB? by Junior J. Junior III · · Score: 5, Funny

    "running as sufficiently as possible"?

    Sometimes as a nation we must ask ourselves, is our children learning?

    --
    You see? You see? Your stupid minds! Stupid! Stupid!
  5. Scala by fils · · Score: 5, Informative

    People may also want to check out Scala at:
    http://www.scala-lang.org/

    It also uses the Erlang style concurrency approach and runs on the JVM with class compatibility with other JVM languages, ie Java, Groovy, etc.

    1. Re:Scala by bonefry · · Score: 4, Informative

      There is a significant difference between Scala and Erlang.

      Erlang uses green threads. And green threads have advantages and disadvantages over native threads.

      For instance Erlang is bad at IO but on the other hand it can spawn millions of threads, something that the JVM has a hard time doing because native threads are limited by the kernel.
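
      The green-thread point above can be illustrated with a rough Python analogy (Python is not mentioned in this thread and is used here only for illustration). asyncio tasks, like Erlang processes, are scheduled in user space rather than by the kernel, so creating tens of thousands of them is cheap. The analogy is loose: asyncio tasks are cooperatively scheduled on a single thread, whereas the Erlang VM preempts its processes. The names worker, spawn_many, and run are hypothetical.

```python
import asyncio


async def worker(n: int) -> int:
    # Yield to the event loop once, so every task is alive at the same time.
    await asyncio.sleep(0)
    return n * 2


async def spawn_many(count: int) -> int:
    # Each task is a user-space coroutine, not a kernel thread,
    # so the kernel's per-process thread limits do not apply.
    tasks = [asyncio.create_task(worker(i)) for i in range(count)]
    results = await asyncio.gather(*tasks)
    return sum(results)


def run(count: int = 100_000) -> int:
    # Start an event loop, run all tasks to completion, return the total.
    return asyncio.run(spawn_many(count))


if __name__ == "__main__":
    print(run())
```

      Trying the same count with one OS thread per task is a quick way to see the cost difference the comment describes.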

    2. Re:Scala by Cyberax · · Score: 3, Informative

      Scala has actors, which allow you to do something _like_ green threads: http://lamp.epfl.ch/~phaller/doc/ActorsTutorial.html

    3. Re:Scala by jonabbey · · Score: 3, Informative

      Modern JVMs on the modern Linux kernel can spawn quite a hellacious number of threads these days, actually.

      The problem with Java is the shared-state synchronization that is often necessary, and the extra work required to distribute state to threads across different VMs. A functional language and programming style could work quite well on top of the JVM, though, and could leverage RMI and some kind of message port facility for the distribution.

    4. Re:Scala by TheRaven64 · · Score: 3, Interesting

      Last time I checked, the Linux kernel was limited to around 8000 threads per process on x86, since it used an LDT entry for TLS on each one (and, if you didn't properly dispose of your threads, would leak the LDT entry, causing some really difficult-to-track bugs). I believe a modern JVM uses an N:M threading model, and removes locks if all threads that can access one are on the same OS thread. I doubt it scales as well as beam, which is designed from the ground up to handle insane numbers of (potentially very short-lived) threads, however.

      --
      I am TheRaven on Soylent News
    5. Re:Scala by Richard W.M. Jones · · Score: 5, Informative

      "Last time you checked" was some time last century in that case. Linux kernels have been able to support at least 100,000 threads for ages.

      That doesn't mean that using shared memory concurrency is a good idea though. When your computer comes with 10s or 100s of cores you'll realise that maybe SMP wasn't the best model of concurrency to choose. That's where models such as map-reduce, Erlang's shared nothing concurrency, message passing, and MPI come into their own. Even today they are useful because you'll be able to scale your program across multiple machines.

      Rich.

    6. Re:Scala by TheRaven64 · · Score: 3, Informative
      Note that I said 'per process' - each process has its own LDT, and so each one can support 8K threads, so you can get 100K threads with 13 processes easily. This might not still be the case - implementing TLS using an extra register would avoid this limitation but would remove one GPR, and they are quite scarce on x86. The other option, updating the LDT every few thread context switches, introduces a lot of TLB churn.

      I quite agree that shared memory concurrency is a bad idea, however. Unfortunately, until you have message passing instructions in the hardware, you're stuck emulating message passing on top of shared memory, which leads to cache coherency issues and a host of other problems.

      --
      I am TheRaven on Soylent News
    7. Re:Scala by Jamie Lokier · · Score: 4, Informative

      Linux threads stopped using the LDT on x86 in 2002. This change went mainstream over subsequent years, and is nowadays always used on x86.

      There was once a limit on the number of processes, too, due to each process having an entry in the GDT. That has long been removed too.

  6. Why Erlang Matters by mpapet · · Score: 5, Insightful

    1. Multicore ready.
    Erlang will use them. Write your application in Erlang and it's done for you.

    2. Scales well.
    As an example, http://yaws.hyber.org/ scales very nicely when loads increase. Your basic LAMP/LYMP setup runs much better on vanilla hardware.

    3. Designed for telecom
    The architects designed the language to run in a telecom environment so things like upgrades can be done while the application is running.

    Yaws in particular needs your help. Failover clustering inside the yaws server would be wonderful. Right now, it uses CGI to process other languages. It does it flawlessly, but a more direct solution might be a nice project.

    --
    http://www.maxineudall.com/2010/02/should-economists-be-sued-for-malpractice.html
  7. Re:hard to read after by jc42 · · Score: 5, Funny

    "you need large real-time systems running as sufficiently as possible."

    Should that not be efficiently as possible?

    You obviously haven't looked very closely at any of the "market leader" software lately.

    Software from the Big Guys is more and more designed to sell (think forced upgrades) bigger, faster systems. You don't do this by making your software efficient.

    The logic behind many software updates these days is "Will this release require sufficient resources that customers will be persuaded to upgrade to new hardware?"

    --
    Those who do study history are doomed to stand helplessly by while everyone else repeats it.
  8. Comparison of functional languages? by radarsat1 · · Score: 4, Interesting

    I think the summary (and article) are somewhat poorly written, but that doesn't overshadow the fact that functional languages are becoming more and more interesting these days with concurrency becoming so important.

    I'd like to learn one, but there are several out there. What I'd like to see is a good in-depth comparison of different concurrent functional languages: why would I choose Haskell, or Erlang, or OCaml, for example? Are they all interpreted? (Does one exist that compiles?) Which ones support concurrency? What language features do they boast, and what are the advantages and disadvantages of these features? Do they have a complete set of libraries?

    Anyone know of an article like this? I've been searching for a while. Every article on functional languages I've found seems to concentrate on a particular one, but I can't find something helping me decide which one is most worth learning.

    1. Re:Comparison of functional languages? by AndyS · · Score: 3, Insightful

      Brief answer:

      All three languages have both interpreters and compilers (ocaml is part of the base distribution, haskell has a number of compilers, and Erlang apparently has a compiler)

      They all support concurrency, all in slightly different ways. They all have a lot of libraries.

      Ocaml is sort of a functional language that includes object oriented features and also has very good performance numbers. It allows mutable updates, including arrays and references. For threading I believe it has the usual mutexes and so on, but nothing more sophisticated (but I could be wrong)

      Haskell is a pure functional language. It tends to be a test bed for programming language ideas. It has some interesting features that can screw with your mind - it's lazy (which means that it only evaluates things when required), and pure (manipulating state can be interesting). It has mutex support, but also (in GHC) support for software transactional memory, which can be used to simulate erlang style concurrency.

      Not really an expert on Erlang, but to my knowledge it pushes you to a 'mini-server' model, where you write each component as a mini server which then performs a single job, and you then spend most of your time sending messages to other processes. The Erlang system then distributes this across multiple machines for you and handles fault tolerance etc.
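
      The 'mini-server' model described above can be sketched in Python (not a language from this thread; an illustration only). A thread with a queue stands in for an Erlang process with a mailbox: the server owns its state and everything else talks to it by sending messages. The names counter_server, start_counter, and ask_total are hypothetical, and the sketch does not attempt distribution across machines or fault tolerance.

```python
import queue
import threading


def counter_server(mailbox: queue.Queue) -> None:
    # The server owns `total`; no other thread can read or write it directly.
    total = 0
    while True:
        message = mailbox.get()
        if message[0] == "add":
            total += message[1]
        elif message[0] == "get":
            # The sender includes its own reply queue so the server can answer.
            message[1].put(total)
        elif message[0] == "stop":
            return


def start_counter() -> queue.Queue:
    # Spawn the server and hand back its mailbox, which acts as its address.
    mailbox = queue.Queue()
    threading.Thread(target=counter_server, args=(mailbox,), daemon=True).start()
    return mailbox


def ask_total(server: queue.Queue) -> int:
    # Synchronous request/reply built from two one-way messages.
    reply = queue.Queue()
    server.put(("get", reply))
    return reply.get(timeout=5)


if __name__ == "__main__":
    server = start_counter()
    for n in (1, 2, 3):
        server.put(("add", n))
    print(ask_total(server))
    server.put(("stop",))
```

      Because all access goes through the mailbox, the state needs no lock; the queue serializes the requests.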

    2. Re:Comparison of functional languages? by Richard W.M. Jones · · Score: 4, Interesting

      OCaml compiles down to native code, which is about 10-20% slower than C. Faster than C in a few (narrow) cases.

      Haskell is also compiled to native code, but difficulties with the execution model mean it's pretty slow for any practical use.

      Erlang is interpreted - the execution model is similar to Perl or Python - which means it's slow on single cores, but of course the whole point of Erlang is to run in highly concurrent, distributed machines. There is a project to use OCaml for the performance-critical, single threaded parts, and Erlang for coordinating the parallelism.

      Of course, this is probably missing the point. Unless you're doing intensive numerical work, you probably don't need the performance. The real advantage of these languages is how your code will be much smaller, easier to understand, safer, and faster to write.

      Rich.

  9. Gibberish by SpinyNorman · · Score: 5, Insightful

    If you want to build computing into a utility, you need large real-time systems running as sufficiently as possible.

    But if you want to build sprockets into a weasel you need small batch-mode systems running as necessarily as possible.

    If the poster had anything interesting to say (I'd guess not, but who knows!), it was totally obscured by his lack of grasp of the English language.

  10. Re:Deceptive by IamTheRealMike · · Score: 4, Insightful

    Actually, Gears doesn't use Erlang either. What he means is that Gears threading doesn't allow for shared state (is it really threading then?). Instead threads communicate back to the browser by message passing.

    It's remarkably deceptive indeed to even imply that Gears and Erlang are connected. Message passing based concurrency isn't exactly new or limited to Erlang, and can be implemented in any language.

    I'm not sure what the point of this piece is. I've looked at Erlang and didn't see much of anything to get me excited. It's a functional language, which, like most of them, has unnecessarily weird syntax and forces immutable state. I don't really see what this buys you over a language like D 2 (or hell, even C++) in which you can write in a functional message passing style if you like, but then still use imperative shared state whenever useful, convenient or performant.

  11. Too late by Fnord666 · · Score: 3, Funny

    Anyhow, this post was not intended to be a rant about old-school technology solutions vs. current and future technology problems.

    Given that this statement appears almost halfway through the blog post, I would say that it was already too late for that.

    --
    'The tyrant will always find pretext for his tyranny.' - Aesop's Fables
  12. Why Erlang doesn't matter by SanityInAnarchy · · Score: 5, Interesting

    1. Invariable variables.
    This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.

    2. Weird syntax.
    Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. Moreso, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?

    I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.

    3. Not Unicode-ready.
    Strings are defined as ASCII -- maybe latin1. But there's no direct unicode support in the language -- if you're lucky, there are functions you can pipe it through.

    There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed. My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.

    --
    Don't thank God, thank a doctor!
    1. Re:Why Erlang doesn't matter by stonecypher · · Score: 4, Informative

      1) Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks. Just because you don't understand the purpose doesn't mean there wasn't one.

      2) Oooooh, a language is faulty because it has a syntax with which you are not familiar. Immediately kill all non-Java clones!

      3) They're just lists of numbers; they're neither ASCII nor Latin1. There is unicode parsing in the XMERL module.

      Please wait until you know a language before criticizing it.

      --
      StoneCypher is Full of BS
    2. Re:Why Erlang doesn't matter by Anonymous Coward · · Score: 3, Informative

      1. Invariable variables.
      This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.

      On the contrary, there are very good reasons for having single-assignment variables. It makes the code more similar to plain mathematics, which makes it easier to reason about, and significantly reduces the number of programming errors. And you don't have to take that from me - there are some 20 years of experience at Ericsson and elsewhere with writing huge telecom applications in Erlang.

      2. Weird syntax.
      Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. Moreso, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?

      I don't believe the important features of Erlang are mutually-exclusive with the sane syntax of, say, Ruby or Python.

      The syntax is certainly different from C, Ruby, or Python, but this is because it is derived from the Prolog syntax. Furthermore, it is actually pretty systematic, once you get over those initial differences. It is a poor programmer who cannot master both worlds.

      3. Not Unicode-ready.
      Strings are defined as ASCII -- maybe latin1. But there's no direct unicode support in the language -- if you're lucky, there are functions you can pipe it through.

      A standard unicode library is still missing, but can be hoped for. At least, there is nothing in the basic representation of strings that prevents full unicode support (*cough* Java *cough*).

      There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed.

      That's simply wrong. Dynamic code upgrade still works, native or not. It's the unloading of older native code from memory that is not being done (this is safe, but could be a memory leak in a very long running server).

      My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.

      Well, good luck, and see you in 20 years. Meanwhile, the rest of us will be over here, getting stuff done with the language we have. For my part, I don't know anything else that lets me be as productive, at least for general problem solving.
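
      The dynamic code upgrade discussed above can be approximated in Python (used here only for illustration; it is not how the Erlang VM loads code). The idea sketched is that a long-running loop resolves its handler on every message, so swapping the handler changes behaviour without stopping the loop. The handlers registry and the server, call, and upgrade functions are all hypothetical.

```python
import queue
import threading

# Hypothetical registry standing in for "the currently loaded version" of a function.
handlers = {"greet": lambda name: f"hello {name}"}


def server(mailbox: queue.Queue) -> None:
    while True:
        name, reply = mailbox.get()
        if name is None:
            return
        # Look the handler up on every message, so a swap takes effect
        # on the next message without restarting this loop.
        reply.put(handlers["greet"](name))


def call(mailbox: queue.Queue, name: str) -> str:
    # Send a request and wait for the server's answer.
    reply = queue.Queue()
    mailbox.put((name, reply))
    return reply.get(timeout=5)


def upgrade(new_handler) -> None:
    # Replace the handler while the server thread keeps running.
    handlers["greet"] = new_handler


if __name__ == "__main__":
    mailbox = queue.Queue()
    threading.Thread(target=server, args=(mailbox,), daemon=True).start()
    print(call(mailbox, "world"))
    upgrade(lambda name: f"HELLO, {name.upper()}!")
    print(call(mailbox, "world"))
    mailbox.put((None, None))
```

      The old handler object is simply dropped here; the comment's point about native code is that the Erlang equivalent of that cleanup was not being done.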

    3. Re:Why Erlang doesn't matter by SanityInAnarchy · · Score: 4, Interesting

      As I understand it, you should look at variables in functional programming languages like Erlang more like those in a mathematical formula; such programs can be proven correct a lot easier, and since variables are effectively immutable

      All of this is based on the premise that Erlang is a functional language. It's not purely-functional, and I just don't see the point of doing it half-assedly. Erlang is effectively an imperative language dressed up like a functional language.

      And they're not immutable -- they can be unbound. As I understand it, this unboundedness is detected at runtime, not compiletime. If it was detected at compiletime, you'd have a valid point.

      it facilitates forking the line of execution in a way that would not be possible without all kinds of semaphores and other concurrency stuff

      Except that's not how Erlang does concurrency. It does concurrency with explicit "processes" (green threads) and message-passing.

      Now, it does make these very easy, and you can get it to distribute processes among a few real OS threads (one per core) -- so it's still very cool. But you're thinking of languages like Haskell, which can be automagically threaded. Erlang is manually threaded, it's just much easier to think in threads (or "processes") -- they're effectively a language feature.

      --
      Don't thank God, thank a doctor!
    4. Re:Why Erlang doesn't matter by SanityInAnarchy · · Score: 3, Informative

      Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks.

      ...What? No, the elimination of mutexing and locks is made possible by a shared-nothing architecture.

      Oooooh, a language is faulty because it has a syntax with which you are not familiar.

      Hey, I mentioned Ruby. I don't mind LISP, either.

      The point is not that the language is unfamiliar, the point is that it's inconsistent (and unfamiliar) for no good reason. I use English, but I could make a lot of the same criticisms about it.

      They're just lists of numbers;

      In that case, the argument becomes, "Erlang has very poor text-processing, if any at all."

      If Erlang has text-processing functions that are designed to operate on these "lists of numbers", then yeah, it's pretty much going to be ASCII. And how are Erlang source files read? Could be "neither ASCII nor Latin1" if you like, but they can't be Unicode unless the parser is actually Unicode-aware.

      --
      Don't thank God, thank a doctor!
    5. Re:Why Erlang doesn't matter by SQL Error · · Score: 4, Insightful

      2) Oooooh, a language is faulty because it has a syntax with which you are not familiar.

      Yes.

      Where is Lisp today? Smalltalk?

      On the other hand, languages that offered the same features with a familiar syntax have taken over the market.

    6. Re:Why Erlang doesn't matter by aconbere · · Score: 3, Informative

      Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks.

      ...What? No, the elimination of mutexing and locks is made possible by a shared-nothing architecture.

      Oooooh, a language is faulty because it has a syntax with which you are not familiar.

      Hey, I mentioned Ruby. I don't mind LISP, either.

      The point is not that the language is unfamiliar, the point is that it's inconsistent (and unfamiliar) for no good reason. I use English, but I could make a lot of the same criticisms about it.

      It's not that its syntax is /inconsistent/; Erlang is actually incredibly consistent, it's just very different. Once you learn the 3 or 4 quirks that separate it from other languages, those 3 or 4 quirks are very consistently applied.

      Take for instance the punctuation (not line ending characters as is suggested).

      Commas separate arguments in function calls, data constructors, and patterns. Periods separate functions.

      Semicolons separate clauses (this is the trickiest, but can be thought of as signifying the existence of multiple cases of pattern matching).

      They're just lists of numbers;

      In that case, the argument becomes, "Erlang has very poor text-processing, if any at all."

      If Erlang has text-processing functions that are designed to operate on these "lists of numbers", then yeah, it's pretty much going to be ASCII. And how are Erlang source files read? Could be "neither ASCII nor Latin1" if you like, but they can't be Unicode unless the parser is actually Unicode-aware.

  13. Stupid article by IamTheRealMike · · Score: 5, Informative

    Wow, it's not often I strongly criticise articles around here, but that was total garbage.

    For the smart ones that didn't RTFA, here's a quick summary:

    • I like Erlang.
    • Big companies like Google and Amazon make things fast by using concurrency.
    • Erlang supports (one type of) concurrency.
    • Thus Google and Amazon are [probably] using Erlang.
    • Thus everyone should learn Erlang.

    For the record, I work for Google and we don't use Erlang anywhere in the codebase. Google Gears restricts you to message passing between threads because JavaScript interpreters are not thread-safe, so it's the only way that can work. Visual Basic threading works the same way for similar reasons. It's not because eliminating shared state is somehow noble and pure, regardless of what the article would have you believe, and in fact systems like BigTable use both shared-state concurrency and message passing based concurrency.

    The article says this:

    Architects (but also university-professors for that matter) still think they can build current and future industrial-grade and internet-grade systems with the same technologies as they did 10-15 years ago.

    But in fact the Google search engine, which is one of the larger "industrial-grade, internet-grade" systems I know of, is written entirely in C++. A language which is much the same as it was 10-15 years ago. Thus the central point of his argument seems flawed to me.

    Seeing as the article is merely an advert for Erlang, I'll engage in some advocacy myself. If you have an interest in programming languages, feel free to check out Erlang, but be aware that such languages are taking options away from you, not giving you more. A multi-paradigm language like version two of D is a better way to go imho - it supports primitives needed to write in a functional style like transitive invariance, as well as a simple lambda syntax, easy closures and first class support for laziness.

    However it also compiles down to self-contained native code in an intuitive way, or at least, a way that's intuitive to the 99.9% of programmers used to imperative languages, unlike Erlang or Haskell. It provides garbage collection but doesn't force you to use it, unlike Java. It doesn't rely on a VM or JIT, unlike C#. It provides some measure of C and C++ interoperability, unlike most other languages. And it has lots of time-saving and safety-enhancing features done in a clean way too.

    1. Re:Stupid article by burris · · Score: 3, Interesting

      I'm not going to disagree with most of your post, I think you're spot on. However, your suggestion of D is totally off. I like the D programming language quite a bit and version 2 is going to be really cool. However, even version 1 of D is not ready for prime-time. Version 2 of D is unstable and not recommended for production by even the author himself. All of the other languages you mentioned such as Erlang or Haskell are much more mature.

      Also, "most other languages" have a foreign function interface for C, including Erlang, Haskell, Python, Java, Perl, Ruby, etc... In fact, I can't think of a well known programming language actually used by people other than the author that does not have an FFI... It is true that in most cases the FFI of other languages is more difficult to use than the one in D, but they are there.

    2. Re:Stupid article by IamTheRealMike · · Score: 5, Interesting

      Yes, D is very young and has problems. But then again, what language didn't? It's easy to forget but Python was first released in 1991. It took many years before it became mainstream (and some would say it's still not there yet).

      The post-mortem is an interesting document, but I disagree with the author's conclusions. The compilers are buggy, well, C++ had exactly the same problem for a long time but still was a huge success. In particular, the trend seems to be basing new compilers on LLVM, which has a pretty robust optimization core. Frontend bugs are by comparison pretty trivial and easy to fix. Another few years and I think this problem will be licked - and besides, lots of C++ code has workarounds for compiler issues. Same thing for class libraries.

      You're right about C-level FFIs. However D provides a simple C++ FFI which as far as I know is unique. Such a thing would be very useful for a company like Google which has a lot of C++ code, as it'd simplify binding considerably (I don't mean to imply anything about the future direction of the codebase, by the way).

      The argument about parallelism is a more interesting one. But I disagree with that too :) D provides exactly what is needed for automatic sharding of work across cores (or machines). Specifically the combination of transitive invariance, reflection and purity enforcement is a very powerful one.

      Essentially, if you can write your code to consist of non-trivial trees of pure functions, then it's perfectly safe to parallelise something like this:

      foreach (item; list) {
          fooResults[item] = someTransform(item);
          barResults[item] = anotherTransform(item);
      }

      If someTransform and anotherTransform are both pure, by implication their parameters are transitively invariant, and thus the loop iterations can be run in parallel (because the compiler knows "item" can't be changed). What's more, the two calls within one iteration can be invoked simultaneously as well.

      Once the compiler knows these things, making this code run in parallel is simply another compiler optimization. That's the whole theory behind how functional languages can be super easy to parallelize. But in fact the key concepts can be applied to imperative languages as well, with the advantage that you can still have temporary mutable state within the function scopes - you just can't modify the heap, or anything reachable through your arguments.

      D has keywords that let the compiler know and enforce function purity.

      Now as it happens I doubt that any D compiler today implements this optimisation - it's sophisticated and transitive invariance is newly introduced in D2. But all the pieces of the puzzle are there. This also lets the compiler do calculations on data structures available at compile time.
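
      The loop above can be mirrored in Python (a different language from the D snippet, used only for illustration) to show the shape of the idea: when both transforms are pure, their calls can be handed to a pool in any order. Python does not enforce purity the way the D keywords are described as doing, so here it is only a convention, and on standard CPython builds threads will not speed up CPU-bound work. The names some_transform, another_transform, and run_parallel are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def some_transform(item: int) -> int:
    # Pure by convention: the result depends only on the argument.
    return item * item


def another_transform(item: int) -> int:
    # Also pure by convention: no shared state is read or written.
    return item + 1


def run_parallel(items):
    with ThreadPoolExecutor() as pool:
        # Submit both batches before collecting anything, so calls to the
        # two transforms can be in flight at the same time.
        foo_iter = pool.map(some_transform, items)
        bar_iter = pool.map(another_transform, items)
        # map() yields results in input order, so zip pairs them correctly.
        foo_results = dict(zip(items, foo_iter))
        bar_results = dict(zip(items, bar_iter))
    return foo_results, bar_results


if __name__ == "__main__":
    print(run_parallel([1, 2, 3]))
```

      In this sketch the programmer decides to parallelise; the comment's claim is that with enforced purity a compiler could make that decision itself.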

  14. no new language needed by speedtux · · Score: 4, Insightful

    Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires.

    Well, except that it's darned inconvenient to actually write the applications in it.

    Google Gears is using Erlang-style concurrency, and the list goes on."

    Yup, and it makes more sense to add "Erlang-style concurrency" to existing languages than to throw out everything and switch to Erlang.

  15. "Cloud computing" is an Xmas artifact by Animats · · Score: 4, Interesting

    The enthusiasm for "cloud computing" may evaporate when Xmas rolls around.

    I went to a talk at Stanford by the architect of Amazon's web services. It came out in questioning that the real motivation behind Amazon's low-priced web services is that their load in the Xmas shopping season is about 4x the load for the rest of the year. Their infrastructure is sized for the November-December peak, so for ten months of the year they have vast excess capacity. That's why Amazon's web services are so cheap.

    Don't expect good response time during the shopping season. Although this Xmas might be OK, due to the recession.

    1. Re:"Cloud computing" is an Xmas artifact by Chang · · Score: 3, Insightful

      While the origin of EC2 in 2006 is certainly related to peak capacity requirements at Amazon, it is certainly way beyond that point now.

      Two Christmas seasons have come and gone without major capacity problems on EC2.

      The reality is that EC2 has grown far beyond its roots as a way for Amazon to amortize their peak capacity by reselling it and it has turned into a small but growing profit center and publicity success for Amazon.

  16. Facebook chat by dristoph · · Score: 3, Interesting

    It should be noted that Facebook's relatively new chat feature, which allows Facebook users to send instant messages to all their online friends as well as see status changes, notifications, and feed stories in near real time, was developed using Erlang. http://www.planeterlang.org/story.php?title=Facebook_chat_is_developed_in_Erlang